VALSE Webinar 22-32 (No. 300 overall): Time Reversed, All Things Come to Life: A Special Session on Diffusion Probabilistic Models ...

2022-12-22 17:52 | Posted by: 程一-计算所


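Time	20:00, Wednesday, December 28, 2022 (Beijing time)
Topic	Time Reversed, All Things Come to Life: A Special Session on Diffusion Probabilistic Models
Host	Chongxuan Li (Renmin University of China)
Livestream	https://live.bilibili.com/22300737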

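Speaker: Shuyang Gu (Microsoft Research Asia)

Talk title: Vector quantized diffusion models

Speaker: Jiaming Song (NVIDIA Research)

Talk title: Efficiently solving inverse problems with diffusion models

Panelists:

Shuyang Gu (Microsoft Research Asia), Jiaming Song (NVIDIA Research), Zhiwu Lu (Renmin University of China), Zhou Zhao (Zhejiang University)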

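Panel topics:

1. Diffusion probabilistic models are developing rapidly, and some applications are already close to deployment. Could each panelist name the one application they find most interesting or most promising?

2. Diffusion probabilistic models have been applied in many areas, such as vision, speech, text-to-image generation, and inverse problem solving. Drawing on your own field, what are their advantages and shortcomings compared with other generative models or traditional methods?

3. Have acceleration algorithms for diffusion probabilistic models hit a bottleneck? Could sampling from diffusion probabilistic models become as fast as sampling from GANs, VAEs, and similar models?

4. Will the dominance of diffusion probabilistic models in large-scale text-to-image generation continue? What issues deserve attention when training large models based on diffusion probabilistic modeling?

5. Diffusion probabilistic models still have room to improve in many respects, such as generalization theory, representation learning, interpretability, and controllable generation. Drawing on the direction that interests you most, how do you see the future of diffusion models?

*You are welcome to leave topic-related questions in the comments below. The host and panelists will select several of the most popular ones to add to the panel agenda!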

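Speaker: Shuyang Gu (Microsoft Research Asia)

Talk time: 20:00, Wednesday, December 28, 2022 (Beijing time)

Talk title: Vector quantized diffusion models

Speaker bio:

Shuyang Gu received his bachelor's and Ph.D. degrees from the Department of Automation, University of Science and Technology of China, in 2017 and 2022 respectively, and is now a researcher at Microsoft Research Asia. His research focuses on generative models in computer vision, in particular the theory of generative adversarial networks and diffusion models and their applications to 2D and 3D data, as well as quality assessment of generated results. He has published multiple papers at CVPR, ICCV, ECCV, and other conferences, and serves as a reviewer for several conferences and journals.

Homepage:

https://cientgu.github.io/

Abstract: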

VQ-Diffusion is based on a vector quantized variational autoencoder (VQ-VAE) whose latent space is modeled by a conditional variant of the recently developed Denoising Diffusion Probabilistic Model (DDPM). We find that this latent-space method is well-suited for text-to-image generation tasks because it not only eliminates the unidirectional bias with existing methods but also allows us to incorporate a mask-and-replace diffusion strategy to avoid the accumulation of errors, which is a serious problem with existing methods. Our experiments show that the VQ-Diffusion produces significantly better text-to-image generation results when compared with conventional autoregressive (AR) models with similar numbers of parameters. Compared with previous GAN-based text-to-image methods, our VQ-Diffusion can handle more complex scenes and improve the synthesized image quality by a large margin. Finally, we show that the image generation computation in our method can be made highly efficient by reparameterization. With traditional AR methods, the text-to-image generation time increases linearly with the output image resolution and hence is quite time consuming even for normal size images. The VQ-Diffusion allows us to achieve a better trade-off between quality and speed. Our experiments indicate that the VQ-Diffusion model with the reparameterization is fifteen times faster than traditional AR methods while achieving a better image quality.
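The abstract names a mask-and-replace diffusion strategy without spelling it out. As a rough illustration only, the sketch below builds a one-step transition matrix over K codebook tokens plus one extra [MASK] state: each ordinary token is kept with extra probability alpha, replaced uniformly with probability beta per token, or masked with probability gamma, and [MASK] stays [MASK]. This follows our reading of reference [1] below; the function names, the row-stochastic layout, and the fixed alpha and gamma values are assumptions for illustration, and the real model uses a time-dependent schedule.

```python
import random


def mask_and_replace_matrix(num_tokens, alpha, gamma):
    """Row-stochastic matrix over num_tokens codebook entries plus [MASK].

    Q[i][j] is the probability of moving from state i to state j in one
    forward (noising) step. Index num_tokens is the [MASK] state.
    """
    # Uniform replacement mass per token, chosen so that each row sums to 1.
    beta = (1.0 - alpha - gamma) / num_tokens
    if beta < 0:
        raise ValueError("alpha + gamma must not exceed 1")
    mask = num_tokens
    size = num_tokens + 1
    Q = [[0.0] * size for _ in range(size)]
    for i in range(num_tokens):
        for j in range(num_tokens):
            Q[i][j] = beta + (alpha if i == j else 0.0)
        Q[i][mask] = gamma
    Q[mask][mask] = 1.0  # [MASK] is absorbing
    return Q


def forward_step(tokens, Q, rng):
    """Sample the next state of every token independently from its row of Q."""
    states = range(len(Q))
    return [rng.choices(states, weights=Q[t])[0] for t in tokens]


if __name__ == "__main__":
    rng = random.Random(0)
    Q = mask_and_replace_matrix(num_tokens=4, alpha=0.9, gamma=0.05)
    tokens = [0, 1, 2, 3, 4]  # the last entry is already [MASK]
    print(forward_step(tokens, Q, rng))
```

Because the [MASK] row puts all of its mass on [MASK], repeated application of `forward_step` eventually masks every position, which gives the reverse process an explicit signal about which positions still need to be filled in.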

References:

[1] Gu, S., Chen, D., Bao, J., Wen, F., Zhang, B., Chen, D., ... & Guo, B. (2022). Vector quantized diffusion model for text-to-image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 10696-10706).

[2] Tang, Z., Gu, S., Bao, J., Chen, D., & Wen, F. (2022). Improved Vector Quantized Diffusion Models. arXiv preprint arXiv:2205.16007.

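Speaker: Jiaming Song (NVIDIA Research)

Talk time: 20:30, Wednesday, December 28, 2022 (Beijing time)

Talk title: Efficiently solving inverse problems with diffusion models

Speaker bio:

Jiaming Song is a research scientist at NVIDIA Research. He received his Ph.D. from Stanford University and his bachelor's degree from Tsinghua University. His recent research interests are in diffusion models. One of his works is DDIM, an algorithm that accelerates the sampling process of diffusion models and is used in many industrial diffusion models, such as OpenAI's DALL-E 2, Google's Imagen, Stable Diffusion, and Baidu's ERNIE-ViLG 2.0. He has received an ICLR Outstanding Paper Award and the Qualcomm Innovation Fellowship.

Homepage:

https://tsong.me/

Abstract: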

Diffusion models have become competitive candidates for solving various inverse problems. Models trained for specific inverse problems work well but are limited to their particular use cases, whereas methods that use problem-agnostic models are general but often perform worse empirically. To address this dilemma, we introduce Pseudoinverse-guided Diffusion Models (ΠGDM), an approach that uses problem-agnostic models to close the gap in performance. ΠGDM directly estimates conditional scores from the measurement model of the inverse problem without additional training. It can address inverse problems with noisy, non-linear, or even non-differentiable measurements, in contrast to many existing approaches that are limited to noiseless linear ones. We illustrate the empirical effectiveness of ΠGDM on several image restoration tasks, including super-resolution, inpainting and JPEG restoration. On ImageNet, ΠGDM is competitive with state-of-the-art diffusion models trained on specific tasks, and is the first to achieve this with problem-agnostic diffusion models. ΠGDM can also solve a wider set of inverse problems where the measurement processes are composed of several simpler ones.
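To make the role of the pseudoinverse concrete, here is a toy sketch for one linear measurement: downsampling a 1D signal by averaging adjacent pairs. For this operator the Moore-Penrose pseudoinverse simply repeats each measured value twice. The residual between the pseudoinverse of the measurement and the pseudoinverse of the re-measured estimate is, as we understand it, the quantity the guidance is built from; the full method also passes it through the denoiser's Jacobian with a time-dependent weight, which is omitted here. All function names are hypothetical, and the 1D setting is an assumption chosen for brevity.

```python
def downsample(x):
    """Measurement h(x): average each adjacent pair (a linear operator H)."""
    if len(x) % 2 != 0:
        raise ValueError("signal length must be even")
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def pseudoinverse(y):
    """H† for pair averaging: repeat every measured value twice."""
    return [v for v in y for _ in range(2)]


def pseudoinverse_residual(y, x_hat):
    """H†y - H†H x_hat, zero when x_hat already reproduces the measurement y."""
    up_y = pseudoinverse(y)
    up_hx = pseudoinverse(downsample(x_hat))
    return [a - b for a, b in zip(up_y, up_hx)]


if __name__ == "__main__":
    x_true = [1.0, 3.0, 2.0, 6.0]
    y = downsample(x_true)
    # Applying h after H† returns the measurement unchanged.
    print(downsample(pseudoinverse(y)) == y)
    # A blank estimate leaves a non-zero residual to correct.
    print(pseudoinverse_residual(y, [0.0, 0.0, 0.0, 0.0]))
```

The check `downsample(pseudoinverse(y)) == y` is the property that lets the same recipe carry over to measurements that have no matrix form: only a function that maps a measurement back to some consistent signal is needed.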

References:

[1] Pseudoinverse-Guided Diffusion Models for Inverse Problems. Jiaming Song, Arash Vahdat, Morteza Mardani, Jan Kautz.

[2] Denoising Diffusion Restoration Models. Bahjat Kawar, Michael Elad, Stefano Ermon, Jiaming Song. NeurIPS 2022.

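Panelist: Zhiwu Lu (Renmin University of China)

Panelist bio:

Dr. Zhiwu Lu is a professor and Ph.D. supervisor at the Gaoling School of Artificial Intelligence, Renmin University of China. He received his M.S. degree from the School of Mathematical Sciences, Peking University, in 2005, and his Ph.D. from the Department of Computer Science, City University of Hong Kong, in 2011. His research areas include machine learning and computer vision. He designed WenLan (BriVL), the first publicly available general-purpose Chinese image-text pre-training model, which was published in Nature Communications.

Homepage: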

https://gsai.ruc.edu.cn/addons/teacher/index/info.html?user_id=18

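Panelist: Zhou Zhao (Zhejiang University)

Panelist bio:

Zhou Zhao is an associate professor and Ph.D. supervisor at the College of Computer Science, Zhejiang University. He received his Ph.D. from the Hong Kong University of Science and Technology in 2015. His main research areas are natural language understanding and multimedia computing, including cross-modal sequence generation and multimodal semantic understanding. He leads several National Natural Science Foundation of China projects and serves as director of the joint research center between the Shanghai Institute for Advanced Study of Zhejiang University and Hikvision. He has published more than 60 papers in international conferences and journals, with over 7,000 citations on Google Scholar. His honors include a First Prize for Scientific and Technological Progress in the Outstanding Scientific Research Achievement Awards for Institutions of Higher Education, a First Prize for Scientific and Technological Progress from the Chinese Institute of Electronics, Forbes China 30 Under 30 (Science), and Hangzhou's Top Ten Young Science and Technology Talents.

Homepage: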

https://person.zju.edu.cn/zhaozhou/686052.html

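Host: Chongxuan Li (Renmin University of China)

Host bio:

Chongxuan Li is a tenure-track assistant professor and Ph.D. supervisor at the Gaoling School of Artificial Intelligence, Renmin University of China. His research area is probabilistic machine learning. His representative works include Triple-GAN, a semi-supervised GAN method that is optimal under consistency theory, and Analytic-DPM, an optimal estimate of the reverse variance of diffusion probabilistic models in the maximum-likelihood sense. He received an Outstanding Paper Award at ICLR 2022, a major international conference in machine learning, the First Prize of the 2021 Wu Wenjun AI Natural Science Award, the 2019 China Computer Federation (CCF) Outstanding Doctoral Dissertation Award, and a 2017 Microsoft Fellowship. He was selected for the 2021 Beijing Nova Program and the 2019 China National Postdoctoral Program for Innovative Talents, and leads a General Program project of the National Natural Science Foundation of China.

Homepage: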

https://zhenxuan00.github.io/

Special thanks to the main organizer of this Webinar:

Organizing AC: Chongxuan Li (Renmin University of China)

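How to participate

1. The weekly VALSE Webinar is hosted on the Bilibili livestreaming platform. Search for VALSE_Webinar on Bilibili to follow us!

Livestream:

https://live.bilibili.com/22300737

Past videos:

https://space.bilibili.com/562085182/

2. VALSE Webinars are usually held on Wednesdays at 20:00, but the time is occasionally adjusted to accommodate speakers' time zones. To keep up with the schedule, please follow the VALSE WeChat official account (valse_wechat) or join VALSE QQ group S (group number: 317920537).

*Note: When applying to join the VALSE QQ group, you must provide your name, affiliation, and role; all three are required. After joining, please set your display name to your real name, role, and affiliation. Role codes: T for university and research institute staff; I for industry R&D; D for Ph.D. students; M for master's students.

3. The VALSE WeChat official account usually posts the announcement for the following week's Webinar every Thursday.

4. You can also visit the VALSE homepage, http://valser.org/, to view Webinar information directly. Slides from Webinar talks (with the speaker's permission) are posted at the bottom of each talk announcement on the VALSE website.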
