VALSE


VALSE Webinar 24-36 (No. 371 overall): High-Efficiency Visual Foundation Models

2024-12-23 19:01 | Posted by: 程一 (Institute of Computing Technology)


Speaker: Gao Huang (Tsinghua University)

Title: Attention Mechanisms for Long Sequences


Speaker: Jiezhang Cao (Harvard University)

Title: Diffusion Model-based Restoration: Sequential Sampling vs Parallel Sampling


Speaker: Gao Huang (Tsinghua University)

Time: Wednesday, December 25, 2024, 20:00 (Beijing Time)

Title: Attention Mechanisms for Long Sequences


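Speaker bio:

Gao Huang is an associate professor and PhD supervisor in the Department of Automation at Tsinghua University. He received his PhD from Tsinghua University and did postdoctoral work at Cornell University in the United States. His main research areas are deep learning, multimodal learning, and intelligent systems, and he proposed DenseNet, a representative deep convolutional network. He has published more than 100 academic papers, which have been cited more than 70,000 times, with the most-cited single paper exceeding 40,000 citations. His honors include the National Science Fund for Excellent Young Scholars, the CVPR Best Paper Award, the DAMO Academy Young Fellow Award, the SAIL Award of the World Artificial Intelligence Conference, the First Prize in Natural Science from the Ministry of Education, the Asian Young Scientist Award, the AI 2000 Most Influential Scholars in artificial intelligence, and MIT Technology Review's "35 Innovators Under 35". He serves on the editorial boards of leading international journals including IEEE T-PAMI, IEEE T-BD, and Pattern Recognition, and as an area chair for top AI conferences including CVPR, ICCV, NIPS, and ICML.


Homepage: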

http://www.gaohuang.net

 

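Abstract:

Transformer-based foundation models have driven important research progress in natural language processing, computer vision, and multimodal learning. However, the long-sequence problem raised by tasks such as long-text processing and the understanding and generation of high-resolution images or video poses a major challenge for the Transformer architecture. On one hand, the computational complexity of self-attention in the Transformer is quadratic in the input sequence length, which leads to long training and inference times and high GPU memory cost. On the other hand, extracting useful information from long-sequence data is like finding a needle in a haystack: existing attention mechanisms are easily disturbed by noise in long-sequence data and struggle to accurately locate and extract key information. This talk will introduce how attention mechanisms with linear complexity can address the computational efficiency problem of long-sequence tasks, and how differential attention can address the noise interference problem of long-sequence tasks.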
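The two ideas in this abstract can be illustrated with a small, dependency-free sketch. It is not taken from the talk or from the referenced papers: the feature map `phi`, the helper names, and the toy sizes are all assumptions chosen for illustration. The first part shows that a kernelized attention can be computed either by scoring every query against every key (quadratic in sequence length) or by summarizing keys and values once (linear), with the same result. The second part shows the general shape of a differential attention weight vector, formed as the difference of two softmax maps scaled by a factor `lam`.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phi(x):
    # Positive feature map elu(x) + 1: an assumed, commonly used choice.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def attention_quadratic(Q, K, V):
    # Scores every query against every key: O(N^2) work in sequence length N.
    out = []
    for q in Q:
        w = [dot(phi(q), phi(k)) for k in K]
        z = sum(w)
        out.append([dot(w, [v[j] for v in V]) / z for j in range(len(V[0]))])
    return out

def attention_linear(Q, K, V):
    # Summarizes keys and values once, then reuses the summary per query: O(N).
    d, dv = len(K[0]), len(V[0])
    kv = [[0.0] * dv for _ in range(d)]  # sum over n of phi(k_n) v_n^T
    ksum = [0.0] * d                     # sum over n of phi(k_n)
    for k, v in zip(K, V):
        fk = phi(k)
        for i in range(d):
            ksum[i] += fk[i]
            for j in range(dv):
                kv[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = phi(q)
        z = dot(fq, ksum)
        out.append([dot(fq, [kv[i][j] for i in range(d)]) / z for j in range(dv)])
    return out

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def differential_weights(q1, q2, K1, K2, lam):
    # Difference of two softmax attention maps for one query position.
    scale = 1.0 / math.sqrt(len(q1))
    a1 = softmax([scale * dot(q1, k) for k in K1])
    a2 = softmax([scale * dot(q2, k) for k in K2])
    return [x - lam * y for x, y in zip(a1, a2)]

# Toy usage with random inputs (sizes are arbitrary).
random.seed(0)
N, d = 8, 4
rand = lambda: [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
Q, K, V = rand(), rand(), rand()
slow, fast = attention_quadratic(Q, K, V), attention_linear(Q, K, V)
max_diff = max(abs(a - b) for ra, rb in zip(slow, fast) for a, b in zip(ra, rb))
Q2, K2 = rand(), rand()
weights = differential_weights(Q[0], Q2[0], K, K2, lam=0.5)
```

In this sketch the two attention functions agree up to floating-point rounding, which is the reordering of the matrix product that linear attention relies on. The differential weights sum to `1 - lam`, since each softmax map sums to one; the intent of the subtraction is that attention mass shared by both maps cancels.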


References:

[1] Han et al. Demystify Mamba in Vision: A Linear Attention Perspective. Neural Information Processing Systems (NeurIPS), 2024.

[2] Ye et al. Differential Transformer. arXiv preprint arXiv:2410.05258, 2024.

[3] Han et al. Agent Attention: On the Integration of Softmax and Linear Attention. European Conference on Computer Vision (ECCV), 2024.

[4] Han et al. FLatten Transformer: Vision Transformer using Focused Linear Attention. IEEE International Conference on Computer Vision (ICCV), 2023.

[5] Han et al. Bridging the Divide: Reconsidering Softmax and Linear Attention. Neural Information Processing Systems (NeurIPS), 2024.


Speaker: Jiezhang Cao (Harvard University)

Time: Wednesday, December 25, 2024, 20:40 (Beijing Time)

Title: Diffusion Model-based Restoration: Sequential Sampling vs Parallel Sampling


Speaker bio:

Jiezhang Cao is currently a postdoctoral researcher in the Department of Psychiatry at Harvard Medical School, Harvard University. He received his PhD from the Department of Information Technology and Electrical Engineering, ETH Zurich, in 2024. His research interests include generative models and low-level vision, especially image and video restoration tasks such as super-resolution, denoising, and deblurring. He has published in several top-tier conferences and journals, including ICML, NeurIPS, CVPR, ECCV, TIP, and TPAMI. His work has been cited over 6000 times according to Google Scholar. As a co-author, he received the Best Paper Prize at the ICCV Advances in Image Manipulation (AIM) Workshop.

 

Homepage:

https://www.jiezhangcao.com

 

Abstract:

In the field of image restoration (IR), diffusion models have emerged as a promising tool for recovering high-quality images from degraded inputs. These models have shown impressive results, but they rely on long serial sampling chains, where each step restores the image incrementally. This process is not only time-consuming but also demands significant computational resources. Additionally, the sequential nature of sampling makes it difficult to capture the relationship between the input noise and the final restoration result, as gradients cannot be efficiently computed across the entire chain. This talk will explore an alternative approach to diffusion model-based IR, shifting from traditional serial sampling to parallel sampling. We will also discuss an interesting question: is parallel sampling better than serial sampling?
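The contrast between serial and parallel sampling can be sketched on a toy chain. This is an illustration under stated assumptions, not the method of the referenced paper: `step` is a hypothetical stand-in for one denoising step, the state is a single number instead of an image, and the parallel variant is a plain fixed-point (Jacobi-style) iteration over the whole trajectory. Serial sampling must run the steps one after another. The parallel variant treats all intermediate states as unknowns and refreshes them together in each sweep, so the updates within a sweep are independent of one another and could run concurrently.

```python
def step(x):
    # Hypothetical stand-in for one denoising step:
    # move 20% of the way toward an assumed clean value of 1.0.
    return 0.8 * x + 0.2 * 1.0

def serial_sampling(x_T, T):
    # T dependent steps, each waiting for the previous one.
    x = x_T
    for _ in range(T):
        x = step(x)
    return x

def parallel_sampling(x_T, T, tol=1e-6):
    # states[i] approximates the state after i steps; states[0] is the input.
    states = [x_T] * (T + 1)
    sweeps = 0
    while True:
        # Every entry is computed from the previous sweep only,
        # so all T updates in one sweep are independent.
        new = [x_T] + [step(states[i - 1]) for i in range(1, T + 1)]
        sweeps += 1
        change = max(abs(a - b) for a, b in zip(new, states))
        states = new
        if change < tol:
            return states[-1], sweeps

x_T, T = 5.0, 200
x_serial = serial_sampling(x_T, T)
x_parallel, sweeps = parallel_sampling(x_T, T)
```

In this toy setting the sweep loop always terminates, because after at most `T + 1` sweeps every state matches the serial trajectory exactly. With a loose tolerance it stops in fewer sweeps than there are serial steps, at the price of a small approximation error in the final state. Whether a similar trade-off pays off for real diffusion-based restoration is the question the talk raises.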

 

References:

[1] Jiezhang Cao, Yue Shi, Kai Zhang, Yulun Zhang, Radu Timofte, Luc Van Gool, “Deep Equilibrium Diffusion Restoration with Parallel Sampling,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.


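Host: Guangwei Gao (Nanjing University of Posts and Telecommunications)


Host bio:

Guangwei Gao is a professor and master's supervisor at Nanjing University of Posts and Telecommunications. In 2014 he received his PhD in Pattern Recognition and Intelligent Systems, a national key discipline, from Nanjing University of Science and Technology. His research focuses on high-efficiency visual representation learning and its applications. He has led multiple projects funded by the National Natural Science Foundation of China and the Natural Science Foundation of Jiangsu Province, and has participated in several others, including a Key Program of the National Natural Science Foundation of China and a Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project. In recent years he has published more than 70 papers in leading international journals (IEEE TIP/TMM/TCSVT/TIFS/TITS, PR) and conferences (AAAI, IJCAI), including 3 ESI Highly Cited Papers. His honors include selection as a high-level talent under Jiangsu Province's "Six Talent Peaks" program and a First Prize of the Jiangsu Province Science and Technology Award (ranked 7th of 11), and he holds 13 granted invention patents. He is a senior member of IEEE/CCF/CSIG/CAAI, and a member of the CCF Technical Committee on Computer Vision, the CAAI Technical Committee on Pattern Recognition, the CAA Technical Committee on Pattern Recognition and Machine Intelligence, the CSIG Technical Committee on Machine Vision, the Youth Working Committee of the Jiangsu Association of Automation, the CSIG Youth Working Committee, and the VALSE Executive Area Chair Committee.


Homepage: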

https://guangweigao.github.io/



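Special thanks to the main organizers of this Webinar:

Organizing AC: Guangwei Gao (Nanjing University of Posts and Telecommunications)

Co-organizing AC: Yulun Zhang (Shanghai Jiao Tong University)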


How to participate

1. VALSE's weekly Webinar is streamed live on Bilibili. Search for VALSE_Webinar on Bilibili to follow us!

Live stream:

https://live.bilibili.com/22300737

Past recordings:

https://space.bilibili.com/562085182/


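2. VALSE Webinars are usually held on Wednesdays at 20:00, though the time is occasionally adjusted to accommodate a speaker's time zone. To stay informed, follow the VALSE WeChat official account (valse_wechat) or join VALSE QQ Group T (group number: 863867505).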


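*Note: when applying to join the VALSE QQ group, you must provide your name, affiliation, and role; all three are required. After joining, please set your display name to your real name, role, and affiliation. Roles: T for university and research institute staff; I for industry R&D; D for PhD students; M for master's students.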


3. The VALSE WeChat official account usually announces the following week's Webinar talks every Thursday.


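4. You can also visit the VALSE homepage, http://valser.org/, to view Webinar information directly. Slides from Webinar talks (with the speaker's permission) are posted at the bottom of each talk announcement on the VALSE website.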

