1. Invited Talks (25 min/person * 3 = 75 min; questions are taken together after the panel)

Speaker 1: Wang Gang (王刚), Nanyang Technological University
Time: 20:00, March 23, 2016 (Beijing time)
Format: Invited expert talk + themed panel
Host: Ruiping Wang (王瑞平)
Title: Recurrent Neural Networks for Semantic Labelling
Abstract: Recurrent neural networks (RNNs) have long been applied successfully to model contextual dependencies in sequential signals such as text and speech. However, little work has been done on adapting RNNs to image data, because images are undirected cyclic graphs (UCGs). Due to the loopy structure of UCGs, RNNs are not directly applicable to UCG-structured images. In this talk, I will first present DAG-RNNs, which decompose images into directed acyclic graphs and then build RNN models on top of them. DAG-RNNs can model the contextual dependencies between local image regions and so extract more powerful image features for labelling. State-of-the-art performance has been achieved on benchmark scene labelling datasets. In the second part, I will introduce how RNNs can be integrated with an attentional mechanism to selectively refine image labelling results in an iterative manner. The RNNs and the attentional networks are learned in an end-to-end framework. Both works were recently accepted to CVPR 2016.
Speaker bio: Wang Gang is currently an assistant professor at Nanyang Technological University, Singapore. He was a research scientist (joint appointment) at the Advanced Digital Science Center between 2010 and 2014. He received his PhD degree from the University of Illinois at Urbana-Champaign and his bachelor's degree from Harbin Institute of Technology. A number of his technologies have been successfully commercialized. He is a recipient of the Harriett & Robert Perry Fellowship, the CS/AI award, the MMSP top 10 percent paper award, and the PREMIA best student paper award. He is an associate editor of Neurocomputing and the general chair of VISVA 2014.
Materials: [Slides]

Speaker 2: Xingjian Shi (施行健), Hong Kong University of Science and Technology
Time: 20:25, March 23, 2016 (Beijing time)
Format: Invited expert talk + themed panel
Host: Naiyan Wang (王乃岩)
Title: Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Paper: [NIPS 2015] Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Abstract: The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time. Very few previous studies have examined this crucial and challenging weather forecasting problem from the machine learning perspective. In this paper, we formulate precipitation nowcasting as a spatiotemporal sequence forecasting problem in which both the input and the prediction target are spatiotemporal sequences. By extending the fully connected LSTM (FC-LSTM) to have convolutional structures in both the input-to-state and state-to-state transitions, we propose the convolutional LSTM (ConvLSTM) and use it to build an end-to-end trainable model for the precipitation nowcasting problem. Experiments show that our ConvLSTM network captures spatiotemporal correlations better and consistently outperforms FC-LSTM and the state-of-the-art operational ROVER algorithm for precipitation nowcasting.
Speaker bio: Xingjian Shi is currently a second-year Ph.D. student at the Hong Kong University of Science and Technology (HKUST), supervised by Prof. Dit-Yan Yeung. Before that, he obtained a B.E. degree from Shanghai Jiao Tong University (SJTU) under the supervision of Prof. Wu-Jun Li. His current research focuses on spatiotemporal analysis and sequence-to-sequence learning.
Materials: [Slides]

Speaker 3: Baoguang Shi (石葆光), Huazhong University of Science and Technology
Time: 20:50, March 23, 2016 (Beijing time)
Format: Invited expert talk + themed panel
Host: Xinggang Wang (王兴刚)
Title: Attention-based Model and Its Application in Scene Text Recognition
Paper: [CVPR’16] Robust scene text recognition with automatic rectification
Abstract: Algorithms built on attention-based models have achieved success in a wide variety of real-world tasks, including speech recognition, machine translation, and image captioning. An attention-based model is a special type of Recurrent Neural Network (RNN). One of its features is sequence-to-sequence learning, where the model learns to map an input sequence to an output sequence, both of arbitrary length. Moreover, through its attention mechanism, the model predicts a soft alignment between the input and output sequences. In this talk, I will first introduce the mechanisms of attention-based models and how they are applied to various tasks. Then, I will introduce our recent work on scene text recognition, where we combine a Convolutional Neural Network (CNN) and an attention-based model into one network. Our hybrid model achieves state-of-the-art or highly competitive performance on several benchmarks. Furthermore, we extend the model with a spatial transformer network that adaptively rectifies images, resulting in a model that effectively recognizes both regular and irregular scene text.
Speaker bio: Baoguang Shi received his B.S. degree in Electronics and Information Engineering from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2012, where he is currently working toward his Ph.D. degree under the supervision of Prof. Xiang Bai at the School of Electronic Information and Communications. His research interests include scene text detection and recognition, script identification, and face alignment.
Materials: [Slides]

2. Panel Discussion (30 min; hosts: Le Dong (董乐), Zhangyang Wang (汪张扬))
Panelists: Wang Gang, Xingjian Shi, Baoguang Shi

3. Audience Q&A (15 min)
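As background for the third talk, the "soft alignment" mentioned in its abstract can be illustrated with a small sketch. This is not the speaker's model: it assumes plain dot-product scoring, and the function names `softmax` and `soft_attention` are hypothetical, chosen only for this illustration. For one output step, each input position receives a weight, the weights sum to 1, and the context vector is the weighted average of the input features.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(query, keys, values):
    """One attention step (assumed dot-product scoring).

    query:  decoder state for the current output step
    keys:   one vector per input position, compared with the query
    values: one feature vector per input position
    Returns (weights, context): the soft alignment over input
    positions and the weighted average of the values.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(dim)]
    return weights, context


if __name__ == "__main__":
    # Toy input: the query matches the first input position far better.
    weights, context = soft_attention(
        query=[10.0, 0.0],
        keys=[[1.0, 0.0], [0.0, 1.0]],
        values=[[2.0, 0.0], [4.0, 0.0]],
    )
    print("alignment weights:", weights)
    print("context vector:", context)
```

Because the weights are soft rather than a hard choice of one position, the whole step stays differentiable, which is what allows such models to be trained end to end.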