VALSE


20210324-07, Issue No. 232: Image and Video Segmentation

Posted 2021-3-19 17:35 by 程一-计算所


Date & Time

Wednesday, March 24, 2021

18:30 (Beijing time)

Topic

Image and Video Segmentation

Host

夏勇 (Northwestern Polytechnical University)


Speaker: 沈春华 (The University of Adelaide)

Talk title: Instance Segmentation Made Simple


Speaker: 王井东 (Microsoft Research Asia, Beijing, China)

Talk title: Learning high-resolution and object-contextual representations for semantic segmentation



Panel guests:

沈春华 (The University of Adelaide), 王井东 (Microsoft Research Asia, Beijing, China), 魏云超 (University of Technology Sydney), 赵恒爽 (University of Oxford), 高常鑫 (Huazhong University of Science and Technology), 许永超 (Wuhan University)


Panel topics:

1. Semantic segmentation already achieves very high scores on many benchmarks. What open problems remain, and what new research directions are worth pursuing?

2. Semantic, instance, and panoptic segmentation have all achieved strong results, while work on traditional generic image segmentation and superpixel segmentation is dwindling. Do those classical problems still hold significant research value?

3. Self-supervised learning is a current research hotspot. How much progress has self-supervised learning made on semantic segmentation?

4. Current semantic segmentation work has become rather homogeneous: most papers design modules to extract contextual information and strengthen features. How do the panelists view this trend, and where should semantic segmentation go next?

5. Transformers are extremely popular right now. What are the advantages and disadvantages of Transformers compared with CNNs, and how can they be applied to semantic segmentation?

6. Some recent work has begun to explore dense prediction tasks with self-supervised/unsupervised learning. What are the panelists' views on its application to semantic segmentation?

7. Current semantic segmentation methods are mostly built on the FCN architecture, which, by analogy with object detection, corresponds to an anchor-free one-stage approach. Could semantic segmentation admit other architectures, analogous to anchor-based or two-stage detectors?


*Feel free to post topic-related questions in the comments below; the host and panel guests will add several of the most popular ones to the panel discussion!


Speaker: 沈春华 (The University of Adelaide)

Talk time: Wednesday, March 24, 2021, 18:30 (Beijing time)

Talk title: Instance Segmentation Made Simple


Speaker bio:

Professor Shen has been a Full Professor of Computer Science at The University of Adelaide since 2014, and Founding Director of the Machine Learning Theory theme at the Australian Institute for Machine Learning. His research mainly focuses on machine learning and computer vision. He was recognised as a top-5 researcher in Engineering and Computer Sciences on The Australian's Lifetime Achievement Leaderboard (Sept. 2020, https://specialreports.theaustralian.com.au/1540291/9/). He has supervised over 20 PhD students to completion. His student alumni include two Australian Research Council DECRA fellows, as well as graduates now in tenured or tenure-track roles at universities including the University of Adelaide, the University of Sydney, Monash University, the University of Wollongong, Nanyang Technological University (Singapore), and several universities in China.


Homepage:

https://cshen.github.io/


Abstract:

Instance segmentation is one of the fundamental vision tasks. I will present several new, simple approaches to instance segmentation in images. Compared to many other dense prediction tasks, e.g., semantic segmentation, it is the arbitrary number of instances that has made instance segmentation much more challenging. In order to predict a mask for each instance, mainstream approaches either follow the "detect-then-segment" strategy used by Mask R-CNN, or predict category masks first and then use clustering techniques to group pixels into individual instances. Recently, fully convolutional instance segmentation methods have drawn much attention, as they are often simpler and more efficient than two-stage approaches like Mask R-CNN. To date, almost all such approaches have fallen behind the two-stage Mask R-CNN in mask precision at similar computational complexity, leaving great room for improvement. First, I will present BlendMask, which improves mask prediction by effectively combining instance-level information with fine-grained, lower-level semantic information. Second, we view instance segmentation from a completely new perspective by introducing the notion of "instance categories", which assigns a category to each pixel within an instance according to the instance's location and size, thus nicely converting instance mask segmentation into a classification-solvable problem. Last, I will present a simple yet effective instance segmentation framework, termed CondInst (conditional convolutions for instance segmentation). Instead of using instance-wise ROIs as inputs to a network of fixed weights, CondInst employs dynamic instance-aware networks, conditioned on instances, thus eliminating ROI operations. Experiments on the COCO dataset show great promise for the proposed methods.
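The dynamic-head idea behind CondInst can be made concrete with a minimal NumPy sketch: a controller predicts one flat parameter vector per detected instance, which is unpacked into the weights of a tiny per-instance mask head applied to shared mask features. Everything here (the function names, the plain three-layer 1x1-conv head, the shapes) is an illustrative assumption, not the authors' released implementation, which works inside a full detection framework and also feeds relative coordinates to the head.

```python
import numpy as np

def dynamic_mask_head(mask_feats, filter_params, hidden=8):
    """CondInst-style sketch: per-instance dynamic mask heads.

    mask_feats:    (C, H, W) shared mask features for one image.
    filter_params: (N, P) flat parameters, one row per detected
                   instance, as predicted by a controller head.
    Returns (N, H, W) mask logits, one map per instance.
    """
    C, H, W = mask_feats.shape
    x0 = mask_feats.reshape(C, H * W)      # 1x1 convs reduce to matmuls
    out = []
    for p in filter_params:                # no ROI cropping: each instance
        i = 0                              # gets its own conv weights instead
        def take(n):
            nonlocal i
            chunk = p[i:i + n]
            i += n
            return chunk
        w1 = take(hidden * C).reshape(hidden, C);           b1 = take(hidden)
        w2 = take(hidden * hidden).reshape(hidden, hidden); b2 = take(hidden)
        w3 = take(hidden).reshape(1, hidden);               b3 = take(1)
        h1 = np.maximum(w1 @ x0 + b1[:, None], 0.0)         # ReLU
        h2 = np.maximum(w2 @ h1 + b2[:, None], 0.0)
        out.append((w3 @ h2 + b3[:, None]).reshape(H, W))
    return np.stack(out)                   # (N, H, W) logits
```

The point of the design is that the head stays tiny (a few hundred parameters per instance) while the number of predicted masks grows with the number of detections, which is what removes the need for ROI operations.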


References:

[1] FCOS: fully convolutional one-stage object detection, ICCV 2019.

[2] BlendMask: top-down meets bottom-up for instance segmentation, CVPR 2020.

[3] Conditional convolutions for instance segmentation, ECCV 2020.

[4] SOLO: segmenting objects by locations, ECCV 2020.

[5] SOLOv2: dynamic and fast instance segmentation, NeurIPS 2020.

[6] BoxInst: High-Performance Instance Segmentation with Box Annotations, CVPR 2021.


Speaker: 王井东 (Microsoft Research Asia, Beijing, China)

Talk time: Wednesday, March 24, 2021, 19:00 (Beijing time)

Talk title: Learning high-resolution and object-contextual representations for semantic segmentation


Speaker bio:

Jingdong Wang is a Senior Principal Research Manager with the Visual Computing Group at Microsoft Research Asia (Beijing, China). He received the B.Eng. and M.Eng. degrees from the Department of Automation at Tsinghua University in 2001 and 2004, respectively, and the PhD degree from the Department of Computer Science and Engineering, the Hong Kong University of Science and Technology, Hong Kong, in 2007. His areas of interest include neural network design, human pose estimation, large-scale indexing, and person re-identification. He is/was an Associate Editor of the IEEE TPAMI, the IEEE TMM and the IEEE TCSVT, and is an area chair of several leading Computer Vision and AI conferences, such as CVPR, ICCV, ECCV, ACM MM, IJCAI, and AAAI. He was elected as an IAPR Fellow, an ACM Distinguished Member, and an Industrial Distinguished Lecturer Program (iDLP) speaker of the IEEE Circuits and Systems Society.

His representative works include the deep high-resolution network (HRNet), interleaved group convolutions, discriminative regional feature integration (DRFI) for supervised saliency detection, neighborhood graph search (NGS, SPTAG) for large-scale similarity search, and composite quantization for compact coding. He has shipped a number of technologies into Microsoft products, including Bing search, Bing Ads, Cognitive Services, and the XiaoIce chatbot. The NGS algorithm developed in his group serves as a basic building block in many Microsoft products; in the Bing image search engine, the key color-filter function is based on the salient-object algorithm developed in his group. He pioneered the development of a commercial color-sketch image search system.


Homepage:

https://jingdongwang2017.github.io/


Abstract:

Semantic segmentation is a fundamental and challenging visual recognition problem: it aims to assign a category to each pixel in an image. Existing solutions have mainly developed along two lines. One improves spatial granularity, e.g., applying dilated convolutions to ResNet or upsampling the low-resolution representation output by ResNet. The other explores context, e.g., the pyramid pooling module (PPM) in PSPNet or atrous spatial pyramid pooling (ASPP) in DeepLab for combining multi-scale information. I will introduce two of our works addressing these two issues. The first is the high-resolution network (HRNet) for learning spatially fine-grained and semantically strong representations. HRNet is designed from scratch rather than derived from a classification network (e.g., ResNet), and maintains high-resolution representations throughout the forward process with repeated multi-scale fusions. The second is the object-contextual representation (OCR). It starts from the intuition that the label of a pixel is the category of the object/stuff the pixel belongs to. The OCR approach aggregates the representations of pixels lying in the same object class while differentiating them from pixels of other object classes. Experiments show that the HRNet and OCR approaches outperform their corresponding competitors. Together with a boundary refinement scheme, HRNet + OCR won first place in semantic segmentation on Cityscapes. In addition, I will give a short introduction to our two CVPR 2021 papers on semantic segmentation: Lite-HRNet: A Lightweight High-Resolution Network and Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision. The code is available at https://github.com/HRNet/.
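The OCR idea described above — pool pixel features into soft object-region representations, then let each pixel attend back over those regions — can be sketched in a few lines of NumPy. This is an illustrative toy under assumed names and shapes (pixel_feats, coarse_logits), not the released implementation, which inserts learned transformation layers at each of the three steps.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def object_contextual_repr(pixel_feats, coarse_logits):
    """OCR-style sketch: augment each pixel with object context.

    pixel_feats:   (C, HW) pixel representations.
    coarse_logits: (K, HW) coarse per-class scores defining soft
                   object regions (K = number of classes).
    Returns (2C, HW): original features concatenated with context.
    """
    # 1) object-region representations: spatially weighted feature sums
    region_w = softmax(coarse_logits, axis=1)              # (K, HW)
    regions = region_w @ pixel_feats.T                     # (K, C)
    # 2) pixel-region relation: each pixel attends over the K regions
    relation = softmax(pixel_feats.T @ regions.T, axis=1)  # (HW, K)
    # 3) object-contextual representation per pixel
    context = (relation @ regions).T                       # (C, HW)
    return np.concatenate([pixel_feats, context], axis=0)
```

The key design choice, as the abstract notes, is that context is gathered per object region rather than from a fixed multi-scale grid as in PPM or ASPP, so a pixel's context is dominated by the class it likely belongs to.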


References:

[1] Lite-HRNet: A Lightweight High-Resolution Network, CVPR 2021.

[2] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision, CVPR 2021.


Panel guest: 魏云超 (University of Technology Sydney)


Guest bio:

Yunchao Wei is a Senior Lecturer at the Australian Artificial Intelligence Institute, University of Technology Sydney. Before joining UTS, he was a postdoctoral researcher at the Beckman Institute, UIUC. He received his Ph.D. degree from Beijing Jiaotong University in 2016. He was named one of the five top early-career researchers in Engineering and Computer Sciences in Australia by The Australian in 2020. He received the Discovery Early Career Researcher Award of the Australian Research Council in 2019, and a First Prize in Science and Technology from the China Society of Image and Graphics (CSIG) in 2019. He has published more than 70 papers in top-tier conferences and journals, with 5,500+ Google Scholar citations. He organized the LID workshops at CVPR 2019/2020 and the RLQ workshops at ICCV 2019 and ECCV 2020. He has won many competition prizes at CVPR/ICCV/ECCV, including winner prizes at ILSVRC 2014 and LIP 2018 and runner-up prizes at ILSVRC 2017 and DAVIS 2020. His current research focuses on applying deep learning to computer vision tasks including classification, object detection, and segmentation.


Homepage:

https://weiyc.github.io/


Panel guest: 赵恒爽 (University of Oxford)


Guest bio:

Dr. Hengshuang Zhao is a postdoctoral researcher at the University of Oxford. Before that, he obtained his Ph.D. degree from the Chinese University of Hong Kong. His general research interests cover the broad areas of computer vision, machine learning, and artificial intelligence, with special emphasis on building intelligent visual systems for 2D image understanding and 3D point cloud understanding. He and his team have won several championships in competitive international challenges such as the ImageNet Scene Parsing Challenge. Some of his research projects are supported by Microsoft, SenseTime, Adobe, Uber, Intel, and Apple. His works have been cited 5,000+ times, with 5,000+ GitHub stars and 80,000+ YouTube views.


Homepage:

https://hszhao.github.io/


Panel guest: 高常鑫 (Huazhong University of Science and Technology)


Guest bio:

高常鑫 is an Associate Professor in the School of Artificial Intelligence and Automation, Huazhong University of Science and Technology. His main research interests include semantic image segmentation, person re-identification, action analysis, and few-shot object recognition. He has published or had accepted more than 80 papers in related international journals and conferences, and has received a First Prize and a Second Prize of the Hubei Province Science and Technology Progress Award. He currently serves as an executive AC member of VALSE.


Homepage:

https://sites.google.com/site/changxingao


Panel guest: 许永超 (Wuhan University)


Guest bio:

许永超 is a Professor in the School of Computer Science, Wuhan University. He was selected for the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology in 2018, and received his Ph.D. from Université Paris-Est in 2013. Before returning to China, he was a tenured Assistant Professor at 巴黎高等信息工程师学院, an engineering school in Paris. His research covers mathematical morphology, object detection and image segmentation, and medical image analysis. He has published more than 40 papers in major international journals and conferences including IEEE TPAMI, IJCV, IEEE TIP, CVPR, ICCV, and MICCAI. He serves on the Young Editorial Board of Frontiers of Computer Science and as a reviewer for venues including IEEE TPAMI, IJCV, and CVPR.


Homepage:

https://sites.google.com/view/yongchaoxu


Host: 夏勇 (Northwestern Polytechnical University)


Host bio:

夏勇, Ph.D., is a Professor and Executive Director of the Multidisciplinary Computing Research Center, School of Computer Science, Northwestern Polytechnical University, and a member of a national-level young-talent program. He is a standing committee member of the Visual Big Data Technical Committee of the China Society of Image and Graphics, deputy head of the Artificial Intelligence Group of the Tumor Imaging Committee of the China Anti-Cancer Association, director of the Artificial Intelligence Technical Committee of the Shaanxi Computer Federation, a VALSE executive Area Chair, and an Area Chair of MICCAI 2019. His research focuses on medical image processing, analysis, and learning. He has published more than 140 papers and has ranked first in international challenges including ISBI 2019 C-NMC, LiTS 2017, and PROMISE12.


Homepage:

http://jszy.nwpu.edu.cn/yongxia.html




How to join VALSE online webinar 21-07:

Long-press or scan the QR code below to follow the "VALSE" WeChat official account (valse_wechat), then reply "07期" to the account to get the live-stream link.


Special thanks to the main organizers of this Webinar:

Organizing AC: 夏勇 (Northwestern Polytechnical University)

Co-organizing ACs: 许永超 (Wuhan University), 高常鑫 (Huazhong University of Science and Technology)

Responsible AC: 王兴刚 (Huazhong University of Science and Technology)



How to participate

1. VALSE Webinar events run on a live-streaming platform. During an event the speaker uploads slides or shares a screen; the audience can see the slides, hear the speaker's voice, and interact with the speaker through the chat function.


2. To participate, follow the VALSE WeChat official account valse_wechat, or join a VALSE QQ group (groups A through N are currently full; apart from speakers and other invited guests, you may only apply to join VALSE group P, group number 1085466722).

*Note: When applying to join a VALSE QQ group, you must provide your name, affiliation, and role; all three are required. After joining, please set your real name in the format name-role-affiliation. Roles: university or research-institute staff, T; industry R&D, I; Ph.D. student, D; master's student, M.


3. About 5 minutes before the event starts, the speaker opens the live stream; click the stream link to join. Windows PCs, Macs, mobile phones, and other devices are supported.


4. During the event, please refrain from off-topic messages so the event can proceed smoothly.


5. If you cannot hear the audio or see the video, leaving and rejoining the stream usually solves the problem.


6. Join from a fast network if at all possible, preferably over a wired connection.


7. The VALSE WeChat official account publishes the announcement and stream link for the following week's Webinar every Thursday.


8. With the speaker's permission, the slides of each Webinar talk are posted as [slides] at the bottom of the corresponding announcement on the VALSE website.


9. With the speaker's permission, the video of each Webinar talk is uploaded to VALSE's Bilibili channel and to Xigua Video; search for "VALSE Webinar" to watch.


沈春华 [slides]

王井东 [slides]

