
20171011-24: VALSE ICCV2017 Special Session 2

2017-10-1 02:52 | Posted by: 程一-计算所 | Views: 4760 | Comments: 0


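The VALSE ICCV2017 special sessions are here: ICCV2017, the biennial feast of computer vision, is about to begin. To better promote academic exchange, VALSE Webinar will hold three consecutive ICCV Pre-Conference sessions, presenting the freshest ICCV2017 papers and kicking off this year's ICCV excitement ahead of time.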


Speaker 1: Lingxi Xie (谢凌曦, The Johns Hopkins University)

Talk title: Adversarial Examples for Semantic Segmentation and Object Detection
Host: Cewu Lu (卢策吾, Shanghai Jiao Tong University)


It has been well demonstrated that adversarial examples, i.e., natural images with visually imperceptible perturbations added, generally exist for deep networks to fail on image classification. In this paper, we extend adversarial examples to semantic segmentation and object detection which are much more difficult. Our observation is that both segmentation and detection are based on classifying multiple targets on an image (e.g., the basic target is a pixel or a receptive field in segmentation, and an object proposal in detection), which inspires us to optimize a loss function over a set of pixels/proposals for generating adversarial perturbations. Based on this idea, we propose a novel algorithm named Dense Adversary Generation (DAG), which generates a large family of adversarial examples, and applies to a wide range of state-of-the-art deep networks for segmentation and detection. We also find that the adversarial perturbations can be transferred across networks with different training data, based on different architectures, and even for different recognition tasks. In particular, the transferability across networks with the same architecture is more significant than in other cases. Besides, summing up heterogeneous perturbations often leads to better transfer performance, which provides an effective method of black-box adversarial attack.
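
As a rough illustration of optimizing one loss over a set of targets, the toy sketch below perturbs a four-pixel "image" until every pixel receives a chosen wrong label. It is a sketch under stated assumptions, not the authors' implementation: the per-pixel linear classifier (`W`, `B`), the step size `gamma`, the iteration cap, and all function names are hypothetical, and each pixel is treated as an independent target so that the gradient can be written by hand.

```python
# Toy sketch of a DAG-style attack. The per-pixel linear "network" is
# hypothetical: score of class c for pixel value v is W[c] * v + B[c].
W = [-1.0, 1.0]   # class 0 = "dark", class 1 = "bright"
B = [0.5, -0.5]


def predict(image):
    """Return the arg-max class for every pixel (each pixel is one target)."""
    preds = []
    for v in image:
        s = [w * v + b for w, b in zip(W, B)]
        preds.append(s.index(max(s)))
    return preds


def dag_attack(image, labels, adv_labels, gamma=0.125, max_iter=50):
    """Iteratively push every still-correct target toward its adversarial label."""
    x = list(image)
    for _ in range(max_iter):
        preds = predict(x)
        # Active set: targets that are still classified with their original label.
        active = [n for n, (p, l) in enumerate(zip(preds, labels)) if p == l]
        if not active:
            break
        # Gradient of sum_n [score(adv label) - score(true label)] w.r.t. the input.
        # For this linear toy model it is simply a difference of class weights.
        r = [0.0] * len(x)
        for n in active:
            r[n] += W[adv_labels[n]] - W[labels[n]]
        # Normalize by the L-infinity norm so each step changes a pixel by at most gamma.
        norm = max(abs(g) for g in r)
        x = [v + gamma * g / norm for v, g in zip(x, r)]
    return x


image = [0.875, 0.875, 0.125, 0.125]
labels = predict(image)
adv_labels = [1 - l for l in labels]      # ask for the opposite class everywhere
adversarial = dag_attack(image, labels, adv_labels)
print(predict(adversarial) == adv_labels)
```

In a real segmentation or detection network the targets share receptive fields, so the gradient would come from backpropagation rather than from a closed-form weight difference.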


[1] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie and Alan Yuille, "Adversarial Examples for Semantic Segmentation and Object Detection", in IEEE International Conference on Computer Vision (ICCV, Acceptance Rate ~ 29%), Venice, Italy, 2017.


Lingxi Xie obtained his B.E. and Ph.D. degrees from Tsinghua University in 2010 and 2015, respectively. He is currently a post-doctoral researcher at the Johns Hopkins University, having moved there from the University of California, Los Angeles. From 2013 to 2015, he was a research intern at Microsoft Research Asia. He was a visiting researcher at the University of Texas at San Antonio in 2014. Lingxi has been working on computer vision and multimedia information retrieval, especially in the areas of image classification, image retrieval, and object detection. He is also interested in the theory and application of deep learning. Lingxi received the best paper award at ICMR 2015.


Speaker 2: Hao-Shu Fang (Shanghai Jiao Tong University)

Talk title: RMPE: Regional Multi-person Pose Estimation
Host: Cewu Lu (卢策吾, Shanghai Jiao Tong University)


Multi-person pose estimation in the wild is challenging. Although state-of-the-art human detectors have demonstrated good performance, small errors in localization and recognition are inevitable. These errors can cause failures for a single-person pose estimator (SPPE), especially for methods that depend solely on human detection results. In this paper, we propose a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes. Our framework consists of three components: Symmetric Spatial Transformer Network (SSTN), Parametric Pose Non-Maximum-Suppression (NMS), and Pose-Guided Proposals Generator (PGPG). Our method is able to handle inaccurate bounding boxes and redundant detections, allowing it to achieve state-of-the-art performance on the MPII (multi-person) and COCO keypoint datasets. Our model and source code are publicly available.
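
To make the idea of removing redundant detections at the pose level more concrete, here is a minimal greedy pose NMS sketch. It is only an approximation under stated assumptions: the mean keypoint distance and the fixed `threshold` are hypothetical stand-ins for the parametric elimination criterion used in the paper, and the function names are illustrative.

```python
import math


def pose_distance(a, b):
    """Mean Euclidean distance between corresponding (x, y) keypoints."""
    total = sum(math.hypot(px - qx, py - qy) for (px, py), (qx, qy) in zip(a, b))
    return total / len(a)


def pose_nms(poses, scores, threshold):
    """Keep the highest-scoring pose of each group of near-duplicate poses.

    Returns the indices of the kept poses, ordered by decreasing score.
    """
    order = sorted(range(len(poses)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # A pose survives only if it is far from every pose already kept.
        if all(pose_distance(poses[i], poses[k]) > threshold for k in keep):
            keep.append(i)
    return keep


# Two nearly identical two-keypoint poses and one pose far away.
poses = [
    [(0, 0), (0, 10)],
    [(1, 0), (1, 10)],
    [(50, 0), (50, 10)],
]
scores = [0.9, 0.8, 0.7]
print(pose_nms(poses, scores, threshold=5.0))
```

Working on keypoints rather than bounding boxes means two people whose boxes overlap heavily can both be kept as long as their poses differ.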


[1] Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, Cewu Lu, "RMPE: Regional Multi-person Pose Estimation", in IEEE International Conference on Computer Vision, Venice, Italy, 2017.


Hao-Shu Fang is currently an undergraduate student in computer science at Shanghai Jiao Tong University, Shanghai, China (class of 2018). He is working in the vision research group led by Prof. Cewu Lu. His research interests include computer vision and robotics.

Speaker 3: Xintong Han (韩欣彤, Shanghai Jiao Tong University)

Talk title: Automatic Spatially-aware Fashion Concept Discovery
Host: Weiyao Lin (林巍峣, Shanghai Jiao Tong University)


This paper proposes an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites. We first fine-tune GoogLeNet by jointly modeling clothing images and their corresponding descriptions in a visual-semantic embedding space. Then, for each attribute (word), we generate its spatially-aware representation by combining its semantic word vector representation with its spatial representation derived from the convolutional maps of the fine-tuned network. The resulting spatially-aware representations are further used to cluster attributes into multiple groups to form spatially-aware concepts (e.g., the neckline concept might consist of attributes like v-neck, round-neck, etc.). Finally, we decompose the visual-semantic embedding space into multiple concept-specific subspaces, which facilitates structured browsing and attribute-feedback product retrieval by exploiting multimodal linguistic regularities. We conducted extensive experiments on our newly collected Fashion200K dataset, and the results on clustering quality evaluation and the attribute-feedback product retrieval task demonstrate the effectiveness of our automatically discovered spatially-aware concepts.
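
The attribute-feedback retrieval step can be pictured as vector arithmetic in a joint embedding space: subtract the embedding of the attribute to remove, add the embedding of the attribute wanted, and return the nearest product. The sketch below is illustrative only. The three-dimensional embeddings, the product names, and the function names are all made up, and it searches the full space rather than the concept-specific subspaces described in the abstract.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def attribute_feedback(query_vec, remove_word, add_word, words, products):
    """Return the product closest to: query - remove_word + add_word."""
    target = [
        q - r + a
        for q, r, a in zip(query_vec, words[remove_word], words[add_word])
    ]
    return max(products, key=lambda name: cosine(target, products[name]))


# Hypothetical embeddings: axis 0 ~ "v-neck", axis 1 ~ "round-neck", axis 2 ~ "dress".
WORDS = {"v-neck": [1.0, 0.0, 0.0], "round-neck": [0.0, 1.0, 0.0]}
PRODUCTS = {
    "dress_v": [1.0, 0.0, 1.0],
    "dress_round": [0.0, 1.0, 1.0],
    "shirt_v": [1.0, 0.0, -1.0],
}

result = attribute_feedback(PRODUCTS["dress_v"], "v-neck", "round-neck", WORDS, PRODUCTS)
print(result)
```

Restricting the arithmetic to one concept's subspace, as the abstract describes, would keep a change to the neckline from disturbing unrelated properties such as color or sleeve length.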


[1] Xintong Han, Zuxuan Wu, Phoenix X. Huang, Xiao Zhang, Menglong Zhu, Yuan Li, Yang Zhao, Larry S. Davis, "Automatic Spatially-aware Fashion Concept Discovery", in IEEE International Conference on Computer Vision, Venice, Italy, 2017.


Xintong Han received the B.S. degree in electrical engineering from Shanghai Jiao Tong University, Shanghai, China, in 2013. He is currently pursuing a Ph.D. degree at the University of Maryland, College Park. His research interests include computer vision, machine learning, and multimedia.


Speaker 4: Deng-Ping Fan (Nankai University)

Talk title: A New Way to Evaluate Foreground Maps
Host: Weiyao Lin (林巍峣, Shanghai Jiao Tong University)


Foreground map evaluation is crucial for gauging the progress of object segmentation algorithms, in particular in the field of salient object detection, where the purpose is to accurately detect and segment the most salient object in a scene. Several widely used measures such as Area Under the Curve (AUC), Average Precision (AP), and the recently proposed Fbw have been used to evaluate the similarity between a non-binary saliency map (SM) and a ground-truth (GT) map. These measures are based on pixel-wise errors and often ignore structural similarities. Behavioral vision studies, however, have shown that the human visual system is highly sensitive to structures in scenes. Here, we propose a novel, efficient, and easy-to-calculate measure, the structural similarity measure (Structure-measure), to evaluate non-binary foreground maps. Our new measure simultaneously evaluates region-aware and object-aware structural similarity between an SM and a GT map. We demonstrate the superiority of our measure over existing ones using 5 meta-measures on 5 benchmark datasets.
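
The claim that pixel-wise errors can miss structure can be checked on a tiny example. The sketch below compares two flattened four-pixel maps that have the same mean absolute error against the ground truth but different structural agreement. Note the assumptions: `global_ssim` is a plain single-window SSIM with the commonly used constants, not the Structure-measure proposed in the paper, and the maps and function names are invented for illustration.

```python
def mae(pred, gt):
    """Mean absolute error, a purely pixel-wise measure."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(gt)


def global_ssim(pred, gt, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM over the whole map (values assumed to lie in [0, 1])."""
    n = len(gt)
    mean_p = sum(pred) / n
    mean_g = sum(gt) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gt) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gt)) / n
    luminance = (2 * mean_p * mean_g + c1) / (mean_p ** 2 + mean_g ** 2 + c1)
    contrast_structure = (2 * cov + c2) / (var_p + var_g + c2)
    return luminance * contrast_structure


gt = [1.0, 1.0, 0.0, 0.0]
soft_but_aligned = [0.75, 0.75, 0.25, 0.25]   # right shape, low confidence
sharp_but_wrong = [1.0, 1.0, 0.0, 1.0]        # one background pixel marked foreground

print(mae(soft_but_aligned, gt), mae(sharp_but_wrong, gt))
print(global_ssim(soft_but_aligned, gt), global_ssim(sharp_but_wrong, gt))
```

Both maps receive the same MAE, while the structure-based score prefers the map whose foreground layout matches the ground truth.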

[1] Deng-Ping Fan, Ming-Ming Cheng, Yun Liu, Tao Li, Ali Borji, "Structure-measure: A New Way to Evaluate Foreground Maps", in IEEE International Conference on Computer Vision, Venice, Italy, 2017.

Deng-Ping Fan received his M.S. degree from Guangxi Normal University, Guangxi, China, in 2015. He is currently a Ph.D. candidate at Nankai University, Tianjin, China, working with Prof. Ming-Ming Cheng. His research interests include computer vision and salient object detection.





VODB Coordinating Council Member: Jiwen Lu (鲁继文, Tsinghua University)


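1. All VALSE Webinar sessions are held online using the "group video" feature of the VALSE QQ groups. During a session, the speaker uploads slides or shares their screen; attendees can see the slides, hear the speaker, and interact with the speaker by text or voice.

2. To participate, you must join a VALSE QQ group. Groups A, B, C, D, and E are currently full, so apart from speakers and other guests, you can only apply to join VALSE group F (group number: 594312623). When applying, you must provide your name, affiliation, and role for verification; all three are required. After joining, please set your real-name group card in the form name, role, affiliation. Roles: T for university and research institute staff; I for industry R&D; D for Ph.D. students; M for master's students.

3. To participate, please download and install the latest Windows version of QQ. Group video does not support non-Windows systems such as Mac and Linux. Mobile QQ can play the audio but cannot display the video slides.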





