
20210120-03 (No. 227): 3D Object Detection and Tracking for Autonomous Driving

Published 2021-1-15 18:09 by Cheng Yi (Institute of Computing Technology)


Time

Wednesday, January 20, 2021

20:00 (Beijing time)

Topic

3D Object Detection and Tracking for Autonomous Driving

Host

Chao Ma (Shanghai Jiao Tong University)


Speaker: Xingyi Zhou (UT Austin)

Title: Center-based 3D Object Detection and Tracking


Speaker: Zhe Wang (SenseTime)

Title: Multi-Modality Cut and Paste for 3D Object Detection



Panelists:

Xingyi Zhou (UT Austin), Zhe Wang (SenseTime), Chunhua Shen (The University of Adelaide), Naiyan Wang (TuSimple)


Panel topics:

1. As a key technology for perception in autonomous driving, where are the next breakthroughs for 3D object detection and tracking?

2. For real-world applications, academia can only design algorithms on public autonomous driving datasets, yet these datasets differ substantially from one another. How does this gap affect real-world deployment: is it a data problem or an algorithm problem?

3. How can the few-shot and class-imbalance problems in 3D object detection and tracking be addressed, and how can the interpretability and inference efficiency of these models be improved?

4. In practical autonomous driving scenarios, how can visual perception deliver its full potential, for example through cross-device coordination or multi-modality fusion? We invite the panelists to share their views.

5. What advice would the panelists, each from their own field, give to students who aspire to work on perception for autonomous driving?


*Feel free to post topic-related questions in the comments below; the host and panelists will select some of the most popular ones for the panel discussion!


Speaker: Xingyi Zhou (UT Austin)

Time: Wednesday, January 20, 2021, 20:00 (Beijing time)

Title: Center-based 3D Object Detection and Tracking


Speaker bio:

Xingyi Zhou is a fourth-year Computer Science Ph.D. student at The University of Texas at Austin, supervised by Prof. Philipp Krähenbühl. He obtained his bachelor's degree from the School of Computer Science at Fudan University. He has interned at Microsoft Research Asia, Google Research, and Intel Labs. His research focuses on object-level visual recognition, including object detection, 3D perception, pose estimation, and tracking.


Homepage:

https://www.cs.utexas.edu/~zhouxy/


Abstract:

Three-dimensional objects are commonly represented as 3D boxes in a point cloud. This representation mimics the well-studied image-based 2D bounding-box detection but comes with additional challenges. Objects in a 3D world do not follow any particular orientation, and box-based detectors have difficulty enumerating all orientations or fitting an axis-aligned bounding box to rotated objects. In this paper, we instead propose to represent, detect, and track 3D objects as points. Our framework, CenterPoint, first detects centers of objects using a keypoint detector and regresses to other attributes, including 3D size, 3D orientation, and velocity. In a second stage, it refines these estimates using additional point features on the object. In CenterPoint, 3D object tracking simplifies to greedy closest-point matching. The resulting detection and tracking algorithm is simple, efficient, and effective. In the NeurIPS 2020 nuScenes 3D Detection challenge, CenterPoint was adopted in 3 of the top 4 winning entries. On the Waymo Open Dataset, CenterPoint outperforms all previous single-model methods by a large margin.
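
The tracking step above is simple enough to sketch. The following minimal Python sketch shows greedy closest-point matching under stated assumptions: detections carry predicted bird's-eye-view centers and velocities, and all function and variable names are illustrative rather than taken from the CenterPoint codebase.

import numpy as np

def greedy_match(track_centers, det_centers, det_velocities, dt, max_dist=1.0):
    """Match current detections to existing tracks by center distance.

    track_centers:  (N, 2) bird's-eye-view centers of live tracks
    det_centers:    (M, 2) centers of current detections
    det_velocities: (M, 2) predicted velocities of current detections
    """
    # Project detection centers back to the previous frame using the
    # predicted velocity, so matching reduces to a nearest-center test.
    projected = det_centers - det_velocities * dt  # (M, 2)
    dists = np.linalg.norm(
        track_centers[:, None, :] - projected[None, :, :], axis=-1)  # (N, M)

    matches, used_tracks, used_dets = [], set(), set()
    # Greedily take the globally closest remaining (track, detection) pair.
    for flat in np.argsort(dists, axis=None):
        t, d = np.unravel_index(flat, dists.shape)
        if dists[t, d] > max_dist:
            break  # all remaining pairs are even farther apart
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)
    # Unmatched detections start new tracks; unmatched tracks expire.
    return matches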


References:

[1] Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl. Tracking Objects as Points. In ECCV 2020.

[2] Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl. Center-based 3D Object Detection and Tracking. arXiv:2006.11275.


Speaker: Zhe Wang (SenseTime)

Time: Wednesday, January 20, 2021, 20:30 (Beijing time)

Title: Multi-Modality Cut and Paste for 3D Object Detection


Speaker bio:

Zhe Wang is a Research Deputy Director at SenseTime. He received his Ph.D. from The Chinese University of Hong Kong in 2017 and has since worked in SenseTime's autonomous driving division, where he leads R&D on 3D perception for autonomous driving. He has published nearly 30 papers in top conferences and journals in the field, with over 1,400 Google Scholar citations, and competed for three consecutive years in the ImageNet object detection challenge, winning first place twice and second place once.


Homepage:

wang-zhe.me


Abstract:

Three-dimensional (3D) object detection is essential in autonomous driving. It has been observed that multi-modality methods based on both point-cloud and imagery features perform only marginally better, or sometimes worse, than approaches that use the point-cloud modality alone. This paper investigates the reason behind this counter-intuitive phenomenon through a careful comparison of the augmentation techniques used by single-modality and multi-modality methods. We found that the augmentations practiced in single-modality detection are equally useful for multi-modality detection. We then present a new multi-modality augmentation approach, Multi-mOdality Cut and pAste (MoCa). MoCa boosts detection performance by cutting point-cloud and imagery patches of ground-truth objects and pasting them into different scenes in a consistent manner while avoiding collisions between objects. We also explore beneficial architecture designs and optimization practices for building a good multi-modality detector. Without using an ensemble of detectors, our multi-modality detector achieves new state-of-the-art performance on the nuScenes dataset and competitive performance on the KITTI 3D benchmark. Our method also won the best PKL award in the 3rd nuScenes detection challenge.
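
To make the augmentation concrete, here is a minimal Python sketch of the cut-and-paste idea under simplifying assumptions (an axis-aligned collision test and a precomputed ground-truth object database); the names and data layout are illustrative, not the authors' implementation.

import random
import numpy as np

def bev_overlap(box_a, box_b):
    # Axis-aligned bird's-eye-view overlap test; a real implementation
    # would use rotated-box IoU. Boxes are (x, y, z, w, l, h, yaw).
    return (abs(box_a[0] - box_b[0]) < (box_a[3] + box_b[3]) / 2 and
            abs(box_a[1] - box_b[1]) < (box_a[4] + box_b[4]) / 2)

def moca_paste(points, image, gt_boxes, object_db, num_paste=5):
    """Paste sampled ground-truth objects into a scene in both modalities."""
    for obj in random.sample(object_db, num_paste):
        # Skip candidates that would collide with objects already in the
        # scene, keeping the augmented scene physically plausible.
        if any(bev_overlap(obj["box3d"], b) for b in gt_boxes):
            continue
        # Paste the object's LiDAR points into the point cloud ...
        points = np.concatenate([points, obj["points"]], axis=0)
        # ... and its image patch at the projected 2D location, so the
        # two modalities stay consistent with each other.
        x0, y0, x1, y1 = obj["box2d"]
        image[y0:y1, x0:x1] = obj["patch"]  # patch shaped (y1-y0, x1-x0, 3)
        gt_boxes.append(obj["box3d"])
    return points, image, gt_boxes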


References:

[1] Wenwei Zhang, Zhe Wang, Chen Change Loy. Multi-Modality Cut and Paste for 3D Object Detection. arXiv:2012.12741.


Panelist: Chunhua Shen (The University of Adelaide)


Bio:

Chunhua Shen is a Professor of Computer Science at The University of Adelaide. He was awarded an ARC Future Fellowship in 2012. He is also an Adjunct Professor of Data Science and AI at the Faculty of Information Technology, Monash University. His research interests are in computer vision.


Homepage:

https://git.io/shen


Panelist: Naiyan Wang (TuSimple)


Bio:

Naiyan Wang is Chief Scientist at TuSimple, where he leads algorithm R&D for autonomous trucks. He pioneered the application of deep learning to visual object tracking, contributed to the early development of the well-known deep learning framework MXNet, and received a 2014 Google PhD Fellowship. He has ranked among the top entries in several international data mining and computer vision competitions, and has published over 40 papers in top computer vision and machine learning conferences and journals, cited more than 8,000 times.


Homepage:

http://winsty.net


Host: Chao Ma (Shanghai Jiao Tong University)


Host bio:

Chao Ma is an assistant professor and Ph.D. advisor at the AI Institute and the MoE Key Laboratory of Artificial Intelligence, Shanghai Jiao Tong University. He received his Ph.D. through a joint program between Shanghai Jiao Tong University and the University of California, Merced, and was a postdoctoral researcher at the Australian Centre for Robotic Vision (The University of Adelaide) from 2016 to 2018. He received the Outstanding Doctoral Dissertation Award of the China Society of Image and Graphics, was selected for the Shanghai Pujiang Talent Program, and was a visiting scholar in Microsoft Research Asia's StarTrack Program. His main research interests are computer vision and machine learning, and his work has appeared repeatedly in the field's top journals (TPAMI/IJCV/TIP) and conferences (ICCV/CVPR/ECCV/NIPS). He has served as a guest editor of the international journal Pattern Recognition, as a reviewer for more than twenty international journals including TPAMI, IJCV, and TIP, and as a program committee member and reviewer for international conferences including ICCV, CVPR, ECCV, and IJCAI, receiving outstanding reviewer awards at CVPR 2018 and CVPR 2019. He was a computer vision session chair at IJCAI 2019. His paper on hierarchical deep visual tracking has over 1,400 Google Scholar citations, and his work has been cited over 4,700 times in total.


Homepage:

https://vision.sjtu.edu.cn/




How to join the 21-03 VALSE online seminar:

Long-press or scan the QR code below to follow the "VALSE" WeChat official account (valse_wechat), then reply "03期" to receive the live-stream link.


Special thanks to the main organizers of this Webinar:

Organizing AC: Chao Ma (Shanghai Jiao Tong University)

Responsible SAC: Le Dong (University of Electronic Science and Technology of China)



How to participate

1. VALSE Webinar sessions run on a live-streaming platform. During a session the speaker shares slides or their screen; the audience can view the slides, hear the speaker's voice, and interact with the speaker through the chat function.

2. To participate, follow the VALSE WeChat official account (valse_wechat) or join a VALSE QQ group (groups A through N are currently full; except for speakers and other invited guests, you may only apply to join VALSE group P, group number: 1085466722).

*Note: When applying to join a VALSE QQ group, you must provide your name, affiliation, and status; all three are required. After joining, please set your group nickname to your real name, status, and affiliation. Status codes: university or research institute staff, T; industry R&D, I; Ph.D. student, D; master's student, M.

3. About 5 minutes before a session starts, the speaker opens the live stream; click the link to join. Windows PCs, Macs, phones, and other devices are supported.

4. During a session, please avoid off-topic messages so the session can proceed smoothly.

5. If you cannot hear the audio or see the video during a session, leaving and rejoining usually solves the problem.

6. A fast network connection is strongly recommended, preferably wired.

7. Every Thursday, the VALSE WeChat official account posts the announcement and live-stream link for the following week's Webinar.

8. With the speaker's permission, the slides of each Webinar talk are posted as [slides] at the bottom of the corresponding announcement on the VALSE website.

9. With the speaker's permission, the video of each Webinar talk is posted on VALSE's Bilibili and Xigua Video channels; search for "VALSE Webinar" to watch.


Xingyi Zhou [slides]

Zhe Wang [slides]
