VALSE Paper Digest No. 139: Diffusion-based Generation, Optimization, and Planning in 3D Scenes ...

2023-10-20 12:09 | Posted by: 程一-计算所

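To help practitioners in vision and learning keep up with the field's latest developments and frontier techniques, VALSE has launched the "Paper Digest" column. Each week it releases recorded videos for one or two papers from top conferences and journals, each giving a detailed walkthrough of a single piece of frontier work. This issue of VALSE Paper Digest features work on generative models for unified 3D scene understanding from the Beijing Institute for General Artificial Intelligence, Beijing Institute of Technology, Tsinghua University, and Peking University. The work was supervised by Dr. Siyuan Huang and Prof. Wei Liang, and the video was recorded by co-first author Zan Wang, a current Ph.D. student.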

Paper title: Diffusion-based Generation, Optimization, and Planning in 3D Scenes

Authors:

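Siyuan Huang* (Beijing Institute for General Artificial Intelligence), Zan Wang* (Beijing Institute of Technology), Puhao Li (Tsinghua University), Baoxiong Jia (Beijing Institute for General Artificial Intelligence), Tengyu Liu (Beijing Institute for General Artificial Intelligence), Yixin Zhu (Peking University), Wei Liang (Beijing Institute of Technology), Song-Chun Zhu (Beijing Institute for General Artificial Intelligence)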

Watch on Bilibili:

https://www.bilibili.com/video/BV1Rw411r7uW/

Abstract:

We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior works, SceneDiffuser is intrinsically scene-aware, physics-based, and goal-oriented. With an iterative sampling strategy, SceneDiffuser jointly formulates the scene-aware generation, physics-based optimization, and goal-oriented planning via a diffusion-based denoising process in a fully differentiable fashion. Such a design alleviates the discrepancies among different modules and the posterior collapse of previous scene-conditioned generative models. We evaluate SceneDiffuser with various 3D scene understanding tasks, including human pose and motion generation, dexterous grasp generation, path planning for 3D navigation, and motion planning for robot arms. The results show significant improvements compared with previous models, demonstrating the tremendous potential of SceneDiffuser for the broad community of 3D scene understanding.
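To make the idea of folding an objective into the denoising loop concrete, here is a minimal toy sketch in NumPy. It is not the SceneDiffuser implementation. It assumes a 2D problem in which the clean data follow a known Gaussian, so the noise prediction can be written analytically in place of a learned, scene-conditioned network, and it adds a classifier-guidance-style gradient term as a stand-in for the optimization and planning objectives. All names (`make_schedule`, `toy_eps_model`, `goal_gradient`, `sample`) and all numeric settings are hypothetical.

```python
import numpy as np


def make_schedule(num_steps=100, beta_min=1e-4, beta_max=0.1):
    """Linear variance schedule (illustrative values, not from the paper)."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_eps_model(x_t, alpha_bar, mu, sigma0):
    """Exact noise prediction when clean data follow N(mu, sigma0^2 I).

    Stands in for a learned denoiser; a real model would also take the
    scene as a condition.
    """
    var_t = alpha_bar * sigma0**2 + (1.0 - alpha_bar)
    return np.sqrt(1.0 - alpha_bar) * (x_t - np.sqrt(alpha_bar) * mu) / var_t


def goal_gradient(x, goal):
    """Gradient of the toy objective -0.5 * ||x - goal||^2."""
    return goal - x


def sample(num_samples, mu, sigma0, goal, guidance_scale, seed=0):
    """Ancestral sampling with an extra objective-gradient term per step."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule()
    x = rng.standard_normal((num_samples, mu.shape[0]))  # start from noise
    for t in range(len(betas) - 1, -1, -1):
        eps = toy_eps_model(x, alpha_bars[t], mu, sigma0)
        # Standard DDPM posterior mean computed from the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # Guidance: shift the mean along the objective gradient,
        # scaled by the step variance beta_t.
        mean = mean + guidance_scale * betas[t] * goal_gradient(x, goal)
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


mu = np.array([0.0, 0.0])
goal = np.array([2.0, 0.0])
plain = sample(512, mu, sigma0=0.5, goal=goal, guidance_scale=0.0)
guided = sample(512, mu, sigma0=0.5, goal=goal, guidance_scale=4.0)
print("unguided sample mean:", plain.mean(axis=0))
print("guided sample mean:  ", guided.mean(axis=0))
```

With `guidance_scale=0.0` the loop reduces to plain ancestral sampling from the toy prior; with a positive scale the samples drift toward `goal` while still being shaped by the denoiser. Because the objective enters only through its gradient at each step, any differentiable cost could be swapped in under the same assumptions.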

Paper information:

[1] Siyuan Huang, Zan Wang, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu, Diffusion-based Generation, Optimization, and Planning in 3D Scenes, In CVPR, 2023.

Paper link:

[https://arxiv.org/abs/2301.06015]

Code link:

[https://github.com/scenediffuser/Scene-Diffuser]

About the speaker:

Zan Wang is a second-year Ph.D. candidate in the School of Computer Science, Beijing Institute of Technology (BIT), Beijing. He currently works in the Visual Perception & Human-computer Interaction Lab (PI Lab), advised by Prof. Wei Liang. He has been interning at the Beijing Institute for General Artificial Intelligence (BIGAI) since Oct. 2021, under the supervision of the vision team leader, Dr. Siyuan Huang. Before beginning his Ph.D. studies, he received a B.S. in Internet of Things Engineering from Beijing Institute of Technology (BIT) in June 2020. His research interests lie in computer vision, graphics, and deep learning.

Homepage:

https://silvester.wang/

Special thanks to the main organizers of this Paper Digest issue:

Monthly rotating AC: Jianbo Jiao (University of Birmingham)

Quarterly rotating AC: Mang Ye (Wuhan University)

How to participate

1. VALSE's weekly Webinar is hosted on the Bilibili live-streaming platform. Search for VALSE_Webinar on Bilibili to follow us!

Live stream:

https://live.bilibili.com/22300737

Past videos:

https://space.bilibili.com/562085182/

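2. The VALSE Webinar is usually held on Wednesdays at 20:00, though the time is occasionally adjusted to accommodate a speaker's time zone. To make it easier to attend, follow the VALSE WeChat official account (valse_wechat) or join the VALSE QQ S group (group number: 317920537).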

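*Note: Applications to join the VALSE QQ group must state your name, affiliation, and role; all three are required. After joining, please set your display name to your real name in the form name, role, affiliation. Role codes: university and research institute staff T; industry R&D I; Ph.D. student D; Master's student M.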

3. The VALSE WeChat official account usually posts the announcement for the following week's Webinar talk every Thursday.

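4. You can also visit the VALSE homepage, http://valser.org/, to view Webinar information directly. Slides for Webinar talks (with the speaker's permission) are posted at the bottom of each talk's announcement on the VALSE website.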
