
VALSE


20201014-25 Self-Supervised Learning

2020-10-10 14:50 | Posted by: 程一-计算所 | Views: 342 | Comments: 0


Time

October 14, 2020 (Wednesday)

20:00 (Beijing time)

Topic

Self-Supervised Learning

Hosts

Gao Huang (Tsinghua University), Yue Cao (Microsoft Research Asia)


Speaker: Ting Chen (Google Brain)

Talk title: SimCLR V1/V2: a simple framework for unsupervised learning of visual representations


Speaker: Yonglong Tian (MIT)

Talk title: Contrastive Multi-view Learning and the Influence of View Construction



Panel guests:
Ting Chen (Google Brain), Yonglong Tian (MIT), Xiaohua Zhai (Google Brain), Han Hu (Microsoft Research Asia)


Panel topics:

1. Over the past year, self-supervised learning has been almost entirely dominated by contrastive learning. How do you view this trend? Are there other self-supervised learning methods that are being underestimated?

2. A recent paper ("What is being transferred in transfer learning?") argues that what transfers in pretrained models is mainly low-level statistics rather than high-level features. Do you agree with this view, and what does it imply for self-supervised learning?

3. What are the current pain points and challenges in self-supervised learning?

4. Does self-supervised learning necessarily require more data than supervised learning? Which problems in self-supervised learning are better suited to academia, where compute is relatively limited?

5. Pretrained models have achieved great success in natural language processing. Does any model in computer vision today have similar potential? Will computer vision see a unifying pretrained model like BERT?

6. How do you see the relationship between self-supervised learning and transfer learning or few-shot learning? Can network architecture design play an important role in self-supervised learning?


*Feel free to post topic-related questions in the comments below; the hosts and panelists will pick several of the most popular ones and add them to the panel discussion.


Speaker: Ting Chen (Google Brain)

Time: October 14, 2020 (Wednesday), 20:00 (Beijing time)

Talk title: SimCLR V1/V2: a simple framework for unsupervised learning of visual representations


Speaker bio:

Ting Chen is a research scientist on the Google Brain team. He joined Google after obtaining his Ph.D. from the University of California, Los Angeles. His main research interest is representation learning.


Homepage:
http://web.cs.ucla.edu/~tingchen/


Abstract:

SimCLR is a simple framework for contrastive learning of visual representations. It simplifies recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels. I will also talk about SimCLRv2 which is an extension of the SimCLR framework for better semi-supervised learning.
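As a concrete illustration of the contrastive objective described above, here is a minimal NumPy sketch of an NT-Xent-style loss on a toy batch. The function name, batch layout, and random embeddings are our own illustration, not SimCLR's actual (TensorFlow) implementation, which differs in details such as numerical stabilization and distributed negatives.

```python
import numpy as np

def nt_xent_loss(z, temperature=0.5):
    """NT-Xent-style contrastive loss over a batch of embeddings.

    z: (2N, d) array in which rows 2k and 2k+1 hold the projection-head
    outputs of two augmented views of the same image.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # compare in cosine space
    sim = z @ z.T / temperature                        # scaled pairwise similarities
    np.fill_diagonal(sim, -np.inf)                     # a view is never its own negative
    pos = np.arange(len(z)) ^ 1                        # positive partner: 2k <-> 2k+1
    # log-softmax of each row, evaluated at the positive partner
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), pos].mean()

# Toy batch: 4 "images", 2 views each, 16-dimensional embeddings.
rng = np.random.default_rng(0)
views = rng.normal(size=(8, 16))
loss = nt_xent_loss(views)
```

Making the two views of each image identical drives the loss toward its minimum, while unrelated "views" keep it near the log of the number of negatives, which is what the augmentation-composition findings above are probing between these two extremes.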


References:

[1] Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton, “A simple framework for contrastive learning of visual representations”. arXiv preprint arXiv:2002.05709.

[2] Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton, “Big self-supervised models are strong semi-supervised learners”. arXiv preprint arXiv:2006.10029.


Speaker: Yonglong Tian (MIT)

Time: October 14, 2020 (Wednesday), 20:30 (Beijing time)

Talk title: Contrastive Multi-view Learning and the Influence of View Construction


Speaker bio:

Yonglong Tian is a third-year Ph.D. student at MIT CSAIL, working with Prof. Phillip Isola and Prof. Josh Tenenbaum. His main research interests lie at the intersection of machine perception, learning, and reasoning, mainly from the perspective of vision. Prior to MIT, he completed his M.Phil. degree at CUHK, advised by Prof. Xiaoou Tang and Prof. Xiaogang Wang. He did his undergraduate study at Tsinghua University.


Homepage:
http://people.csail.mit.edu/yonglong/


Abstract:

Recently, contrastive learning between multiple views of the data has significantly improved the state of the art in self-supervised learning. Despite this success, the influence of different view choices has been less studied.

First, I will briefly summarize recent progress on contrastive representation learning from a unified multi-view perspective. I will then propose an InfoMin principle: we should reduce the mutual information (MI) between views while keeping task-relevant information intact. To verify this hypothesis, we also devise unsupervised and semi-supervised frameworks to learn effective views. Lastly, under the InfoMin principle, I will extend the application of contrastive learning to the supervised setting.
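The role of mutual information between views can be made concrete through the standard InfoNCE bound that contrastive losses optimize: I(v1; v2) ≥ log N − L_NCE. The NumPy sketch below (function name and synthetic "views" are our own illustration, not code from the talk) shows the bound saturating for nearly redundant views and collapsing toward zero for independent ones:

```python
import numpy as np

def infonce_bound(v1, v2, temperature=0.1):
    """InfoNCE lower bound on I(v1; v2): log N minus the contrastive loss.

    v1, v2: (N, d) paired view embeddings; row i of v1 matches row i of v2.
    """
    v1 = v1 / np.linalg.norm(v1, axis=1, keepdims=True)
    v2 = v2 / np.linalg.norm(v2, axis=1, keepdims=True)
    sim = v1 @ v2.T / temperature                      # cross-view similarities
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    loss = -np.diag(log_prob).mean()                   # match row i of v1 to row i of v2
    return np.log(len(v1)) - loss

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 32))
# Nearly redundant views (very high MI) vs. independent views (zero MI):
redundant = infonce_bound(x + 0.1 * rng.normal(size=x.shape), x)
independent = infonce_bound(rng.normal(size=x.shape), x)
```

InfoMin's sweet spot sits between these extremes: views should share just the task-relevant bits, so the estimable MI is neither maximal nor zero.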


Panel guest: Xiaohua Zhai (Google Brain)


Guest bio:

Xiaohua Zhai is a senior researcher at Google Brain, Zurich. Before that, he received a Ph.D. degree in Computer Science from Peking University in 2014 and a Bachelor's degree from Nanjing University in 2009.

He is a co-founder of the "Visual Task Adaptation Benchmark" (VTAB) project, a large-scale study of representation learning across generative models, self-supervised learning, semi-supervised learning, and supervised learning. In self-supervised learning, he showed that "architecture matters" (Revisiting Self-Supervised Visual Representation Learning) and proposed the "fine-tune with few labels" benchmark (S4L: Self-Supervised Semi-Supervised Learning). He is a co-founder of the "Big Transfer" (BiT) project, which achieves state-of-the-art performance on many vision tasks, especially in the low-data regime. He is a core contributor to the "Compare GANs" project (github.com/google/compare_gan), a framework for training and evaluating GANs that has received 1.6K stars on GitHub.

He has authored papers in refereed conference proceedings and international journals, including ICML, ICCV, ECCV, CVPR, AAAI, ACM-MM, and TCSVT. As the second contributor on the PKU-ICST team led by Prof. Yuxin Peng, he participated in the NIST TRECVID 2012 international video retrieval evaluation and won first place. He is a reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Image Processing (TIP), ICML, NeurIPS, ICLR, CVPR, ECCV, AAAI, and ACM-MM.


Homepage:
https://sites.google.com/site/xzhai89


Panel guest: Han Hu (Microsoft Research Asia)


Guest bio:

Han Hu is a senior researcher in the Visual Computing Group at Microsoft Research Asia. He received his Bachelor's and Ph.D. degrees from the Department of Automation, Tsinghua University, in 2008 and 2014, respectively, and received the Outstanding Doctoral Dissertation Award of the Chinese Association for Artificial Intelligence in 2016. He was a visiting researcher at the GRASP Lab of the University of Pennsylvania in 2012, and worked at Baidu's Institute of Deep Learning before joining Microsoft Research Asia. His current research interests include visual representation learning, joint vision-language representation learning, and visual object recognition. He will serve as an Area Chair for CVPR 2021.


Homepage:
https://ancientmooner.github.io/


Host: Gao Huang (Tsinghua University)


Host bio:

Gao Huang is currently an Assistant Professor with the Department of Automation, Tsinghua University. He received the B.S. degree in automation from Beihang University in 2009, and the Ph.D. degree in automation from Tsinghua University in 2015. He was a Post-Doctoral Researcher with the Department of Computer Science, Cornell University, from 2015 to 2018. His research interests include deep learning and computer vision.


Homepage:
http://www.gaohuang.net/


Host: Yue Cao (Microsoft Research Asia)

Host bio:

Yue Cao is a researcher in the Visual Computing Group at Microsoft Research Asia. He received his Bachelor's and Ph.D. degrees from the School of Software, Tsinghua University, in 2014 and 2019, respectively, and received the Tsinghua University Special Scholarship in 2018. His current research interests include self-supervised learning, multi-modal learning, and self-attention modeling.


Homepage:
http://yue-cao.me/


How to join VALSE online talk 20-25:

Long-press or scan the QR code below to follow the "VALSE" WeChat official account (valse_wechat), then reply "25期" to the account to get the live-stream link.


Special thanks to the main organizers of this Webinar:

Organizing AC: Yue Cao (MSRA)

Co-organizing AC: Gao Huang (Tsinghua University)

Responsible AC: Xinggang Wang (Huazhong University of Science and Technology)



How to participate

1. VALSE Webinars run on an online live-streaming platform. During the event, the speaker uploads slides or shares their screen; the audience can view the slides, hear the speaker, and interact with the speaker through the chat function.

2. To participate, follow the VALSE WeChat official account (valse_wechat) or join a VALSE QQ group (groups A through N are currently full; apart from speakers and other guests, you can only apply to join VALSE group O, group number: 1149026774).

*Note: Applications to join a VALSE QQ group must include your name, affiliation, and role; all three are required. After joining, please set your group nickname to your real name, role, and affiliation. Roles: university or research institute staff, T; industry R&D, I; Ph.D. student, D; Master's student, M.

3. The speaker starts the live stream about 5 minutes before the event; click the live-stream link to join. Windows PCs, Macs, mobile phones, and other devices are supported.

4. Please avoid off-topic messages during the event so as not to disrupt it.

5. If you lose audio or video during the event, leaving and rejoining usually fixes the problem.

6. A fast network connection is strongly recommended; prefer a wired connection.

7. The VALSE WeChat official account publishes the announcement and live-stream link for the following week's Webinar every Thursday.

8. Webinar slides (with the speaker's permission) are posted as [slides] at the bottom of each talk announcement on the VALSE website.

9. Webinar videos (with the speaker's permission) are posted on VALSE's Bilibili and Xigua Video channels; search for "VALSE Webinar" to watch.
