VALSE Paper Digest No. 95: Learning with Noisy Correspondence for Cross-modal Matching

2022-9-23 11:12 | Posted by: 程一-计算所

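Summary: To help practitioners in the vision and learning community keep up with the field's latest developments and frontier advances, VALSE has launched the Paper Digest column. Each week it releases recorded videos for one or two papers from top conferences and journals, each giving a detailed walkthrough of a single frontier work. This issue of the VALSE Paper ...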


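Paper title: Learning with Noisy Correspondence for Cross-modal Matching

Authors: Zhenyu Huang (Sichuan University), Guocheng Niu (Baidu), Xiao Liu (TAL Education Group), Wenbiao Ding (TAL Education Group), Xinyan Xiao (Baidu), Hua Wu (Baidu), *Xi Peng (Sichuan University)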



Cross-modal matching, which aims to establish the correspondence between two different modalities, is fundamental to a variety of tasks such as cross-modal retrieval and vision-and-language understanding. Although a huge number of cross-modal matching methods have been proposed and have achieved remarkable progress in recent years, almost all of these methods implicitly assume that the multimodal training data are correctly aligned. In practice, however, such an assumption is extremely expensive or even impossible to satisfy. Based on this observation, we reveal and study a latent and challenging direction in cross-modal matching, named noisy correspondence, which could be regarded as a new paradigm of noisy labels. Different from traditional noisy labels, which mainly refer to errors in category labels, our noisy correspondence refers to mismatched paired samples. To solve this new problem, we propose a novel method for learning with noisy correspondence, named Noisy Correspondence Rectifier (NCR). In brief, NCR divides the data into clean and noisy partitions based on the memorization effect of neural networks and then rectifies the correspondence via an adaptive prediction model in a co-teaching manner. To verify the effectiveness of our method, we conduct experiments using image-text matching as a showcase. Extensive experiments on Flickr30K, MS-COCO, and Conceptual Captions verify the effectiveness of our method.
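The division step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small-loss criterion: because of the memorization effect, correctly matched pairs tend to reach low loss earlier than mismatched ones, so per-sample losses can be clustered into a low group (treated as clean) and a high group (treated as noisy). The function names `split_by_loss` and `co_teaching_partitions`, the 1-D two-means clustering, and the loss values are all illustrative assumptions rather than details taken from the paper, and the rectification step is not shown.

```python
def split_by_loss(losses, iters=20):
    """Split sample indices into (clean, noisy) by clustering per-sample loss.

    Hypothetical helper: uses 1-D two-means, which is an assumption made
    for illustration and not necessarily the criterion used by NCR.
    """
    lo, hi = min(losses), max(losses)  # initial cluster centres
    for _ in range(iters):
        low_group = [x for x in losses if abs(x - lo) <= abs(x - hi)]
        high_group = [x for x in losses if abs(x - lo) > abs(x - hi)]
        if not high_group:  # all losses identical: nothing to separate
            break
        new_lo = sum(low_group) / len(low_group)
        new_hi = sum(high_group) / len(high_group)
        if new_lo == lo and new_hi == hi:  # centres stopped moving
            break
        lo, hi = new_lo, new_hi
    clean = [i for i, x in enumerate(losses) if abs(x - lo) <= abs(x - hi)]
    noisy = [i for i, x in enumerate(losses) if abs(x - lo) > abs(x - hi)]
    return clean, noisy


def co_teaching_partitions(losses_a, losses_b):
    """Swap partitions between two networks, in the spirit of co-teaching.

    The partition computed from network A's losses is handed to network B,
    and vice versa, so that neither network selects its own training data.
    """
    part_from_a = split_by_loss(losses_a)
    part_from_b = split_by_loss(losses_b)
    return {"train_a_with": part_from_b, "train_b_with": part_from_a}


if __name__ == "__main__":
    # Made-up per-pair matching losses for six image-text pairs.
    example_losses = [0.1, 0.2, 0.15, 2.0, 1.8, 0.05]
    clean_idx, noisy_idx = split_by_loss(example_losses)
    print("clean pairs:", clean_idx)
    print("noisy pairs:", noisy_idx)
```

In this sketch the two networks exchange partitions so that selection errors made by one network are less likely to be reinforced by its own training signal; how the noisy partition is then rectified is left to the paper.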


[1] Huang, Z., Niu, G., Liu, X., Ding, W., Xiao, X., Wu, H., & Peng, X. (2021). Learning with Noisy Correspondence for Cross-modal Matching. Advances in Neural Information Processing Systems, 34, 29406-29419.







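Zhenyu Huang is a Ph.D. student at Sichuan University, advised by Professor Xi Peng. Main research interests include multimodal learning, learning with noisy labels, and deep clustering. Zhenyu Huang has published multiple papers in CCF-A journals and conferences, including TIP, NeurIPS, ICML, CVPR, and IJCAI.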


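Monthly rotating AC: Zunlei Feng (Zhejiang University), Yi Xu (Dalian University of Technology)

Quarterly responsible AC: Shanshan Zhang (Nanjing University of Science and Technology)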





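2. VALSE Webinar sessions usually take place on Wednesdays at 20:00, although the time is occasionally adjusted to accommodate the speaker's time zone. To make it easier to attend, please follow the VALSE WeChat official account (valse_wechat) or join the VALSE QQ R group (group number: 137634472);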

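*Note: When applying to join the VALSE QQ group, you must provide your name, affiliation, and role for verification; all three are required. After joining, please set your group nickname to your real name, role, and affiliation. Role codes: T for university and research institute staff; I for industry R&D; D for Ph.D. students; M for master's students.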


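4. You can also visit the VALSE homepage to view Webinar information directly. Slides from Webinar talks (with the speaker's permission) are posted at the bottom of each talk's announcement on the VALSE website.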
