20171206-28: Yan Huang, Improving Image and Sentence Matching with

2017-12-4 11:28 | Posted by: Yi Cheng (Institute of Computing Technology)

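Speaker: Yan Huang (Institute of Automation, Chinese Academy of Sciences)

Time: Wednesday, December 6, 2017, 20:00 (Beijing time)

Title: Improving Image and Sentence Matching with Multimodal Attention and Visual Attributes

Host: Chuanxian Ren (Sun Yat-sen University)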

Abstract: Effective image and sentence matching depends on how well their global visual-semantic similarity is measured. Based on the observation that this global similarity arises from a complex aggregation of multiple local similarities between pairwise instances of the image (objects) and the sentence (words), we propose a selective multimodal Long Short-Term Memory network (sm-LSTM) for instance-aware image and sentence matching. At each timestep, the sm-LSTM applies a multimodal context-modulated attention scheme that selectively attends to a pair of image and sentence instances by predicting pairwise instance-aware saliency maps for the image and the sentence. By measuring multiple local similarities in this way within a few timesteps, the sm-LSTM sequentially aggregates them through its hidden states to obtain a final matching score, which serves as the desired global similarity. Extensive experiments show that our model matches images and sentences with complex content well and achieves state-of-the-art results on two public benchmark datasets. In addition, this talk will introduce our recent progress on using visual attributes for instance-aware image and sentence matching.
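The attend-then-aggregate idea in the abstract can be illustrated with a toy sketch. This is not the sm-LSTM from the talk: the function name `toy_match`, the random untrained weights, the mean-pooled global context, and the plain tanh recurrence (standing in for the LSTM) are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def toy_match(regions, words, steps=3, seed=0):
    """Toy attend-and-aggregate matcher (hypothetical, untrained).

    regions: (R, d) array of image region features.
    words:   (W, d) array of word features.
    Returns a scalar score and the per-step attention weights.
    """
    rng = np.random.default_rng(seed)
    d = regions.shape[1]
    # Assumed parameters; a real model would learn these.
    W_h = rng.normal(scale=0.1, size=(d, d))
    W_x = rng.normal(scale=0.1, size=(d, d))
    v = rng.normal(scale=0.1, size=d)

    # Global context of each modality, here simply the mean feature.
    img_ctx = regions.mean(axis=0)
    sen_ctx = words.mean(axis=0)

    h = np.zeros(d)  # hidden state carrying what has been matched so far
    attn_log = []
    for _ in range(steps):
        # Attention modulated by the global context and the previous state.
        a_img = softmax(regions @ (img_ctx + h))
        a_sen = softmax(words @ (sen_ctx + h))
        # Attended "instance" representation in each modality.
        img_inst = a_img @ regions
        sen_inst = a_sen @ words
        # Local similarity features for this pair of instances.
        local = img_inst * sen_inst
        # Plain tanh recurrence used here in place of an LSTM cell.
        h = np.tanh(W_h @ h + W_x @ local)
        attn_log.append((a_img, a_sen))

    # Final matching score read out from the last hidden state.
    return float(v @ h), attn_log
```

Calling `toy_match(regions, words)` with two feature matrices of the same width returns one score plus, for every step, a pair of attention vectors that each sum to 1. With untrained weights the score itself carries no meaning; the sketch only shows how local similarities from several timesteps flow into a single global value.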

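About the speaker:

Yan Huang is an assistant researcher. He received his bachelor's degree from the University of Electronic Science and Technology of China in 2012 and his Ph.D. from the University of Chinese Academy of Sciences in 2017. In July 2017 he joined the National Laboratory of Pattern Recognition at the Institute of Automation, Chinese Academy of Sciences. His research interests are deep learning, computer vision, and pattern recognition. He has published a number of papers in top conferences and journals in these fields, including TPAMI, TIP, TMM, NIPS, ICCV, and CVPR. His awards include the Best Paper Award at the CVPR 2014 Deep Vision Workshop, the ICPR 2014 Best Student Paper Award, the RACV 2016 Best Poster Award, the CAS President's Special Award, and the Baidu Scholarship.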

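Special thanks to the main organizers of this Webinar:

VOOC responsible committee member: Chuanxian Ren (Sun Yat-sen University)

VODB coordinating council member: Xiaoqiang Lu (Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences)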

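How to participate:

1. VALSE Webinar events are held entirely online through the "group video" feature of the VALSE QQ groups. During the event the speaker uploads slides or shares a screen; attendees can see the slides, hear the speaker, and interact with the speaker by text or voice.

2. To attend, you need to join a VALSE QQ group. Groups A, B, C, D, E, and F are currently full, so apart from speakers and other guests, applicants can only join VALSE group G, group number: 669280237. Your name, affiliation, and identity are all required for verification when you apply. After joining, please set your group nickname to your real name, identity, and affiliation. Identity codes: T for university and research institute staff; I for industry R&D; D for Ph.D. students; M for master's students.

3. To attend, please download and install the latest version of QQ for Windows. Group video does not support non-Windows systems such as Mac or Linux. Mobile QQ can play the audio but cannot display the video slides.

4. About 10 minutes before the event starts, the host will open the group video and send a link inviting members of each group to join. Attendees can click the link to enter directly.

5. During the event, please do not send flowers, lollipops, or other virtual gifts, and do not post off-topic messages, so that the event can proceed normally.

6. If you cannot hear the audio or see the video during the event, try leaving and re-entering, which usually resolves the problem.

7. Please attend over a fast network connection, preferably a wired one.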
