To help practitioners in the vision and learning community stay up to date with the latest developments and frontier advances in the field, VALSE has launched the "Paper Quick Review" (论文速览) column, which releases recorded videos on one or two top-conference or top-journal papers every week, each providing a detailed walkthrough of a single frontier work. This issue of the VALSE Paper Quick Review features FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment, from the University of Science and Technology Beijing and Peking University. The video is presented by the paper's first author, Jinglin Xu.

Paper title: FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment

Authors: Jinglin Xu (University of Science and Technology Beijing), Sibo Yin (Peking University), Guohao Zhao (Peking University), Zishuo Wang (Peking University), Yuxin Peng (Peking University)

Bilibili video link:

Abstract: Existing action quality assessment (AQA) methods mainly learn deep representations at the video level for scoring diverse actions. Due to the lack of a fine-grained understanding of actions in videos, they suffer severely from low credibility and interpretability and are thus insufficient for stringent applications such as Olympic diving events. We argue that a fine-grained understanding of actions requires the model to perceive and parse actions in both time and space, which is also the key to the credibility and interpretability of the AQA technique. Based on this insight, we propose a new fine-grained spatial-temporal action parser named FineParser. It learns human-centric foreground action representations by focusing on target action regions within each frame and exploiting their fine-grained alignments in time and space to minimize the impact of invalid backgrounds during the assessment. In addition, we construct fine-grained annotations of human-centric foreground action masks for the FineDiving dataset, called FineDiving-HM. With refined annotations on diverse target action procedures, FineDiving-HM can promote the development of real-world AQA systems. Through extensive experiments, we demonstrate the effectiveness of FineParser, which outperforms state-of-the-art methods while supporting more tasks of fine-grained action understanding.

References:
[1] Jinglin Xu, Sibo Yin, Guohao Zhao, Zishuo Wang, Yuxin Peng*, FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment, CVPR 2024, Oral (3.3%).

Paper link: https://openaccess.thecvf.com/content/CVPR2024/papers/Xu_FineParser_A_Fine-grained_Spatio-temporal_Action_Parser_for_Human-centric_Action_Quality_CVPR_2024_paper.pdf

Code link: https://github.com/PKU-ICST-MIPL/FinePOSE_CVPR2024

Speaker bio: Jinglin Xu is an Associate Professor in the School of Intelligence Science and Technology at the University of Science and Technology Beijing (USTB), as well as a council member and deputy secretary-general of the Beijing Society of Image and Graphics (BSIG). Her research interests include computer vision, video understanding, and fine-grained action analysis, and she has authored more than 20 papers in top-tier journals and conference proceedings.
Personal homepage: https://xujinglin.github.io/

Special thanks to the main organizer of this Paper Quick Review issue: monthly rotating AC 于茜 (Beihang University).
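The abstract above describes the core idea at a high level: restrict the assessment to human-centric foreground action regions in each frame and align them across time, rather than scoring whole video frames that include irrelevant background. The snippet below is a minimal, hypothetical PyTorch sketch of that general idea (a per-frame foreground mask head, mask-weighted spatial pooling, temporal pooling, and score regression). All module names, the architecture, and the pooling choices are illustrative assumptions for exposition only and do not reproduce the authors' FineParser implementation.

```python
# Toy illustration: weight per-frame features by a predicted foreground mask so that
# background regions contribute less to the final action-quality score.
# Everything here (MaskGuidedScorer, feat_dim, the backbone) is a made-up stand-in.

import torch
import torch.nn as nn


class MaskGuidedScorer(nn.Module):
    """Toy AQA scorer: per-frame features, a foreground mask head,
    mask-weighted spatial pooling, temporal pooling, and score regression."""

    def __init__(self, feat_dim: int = 64):
        super().__init__()
        # Lightweight per-frame encoder (stand-in for a real video backbone).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Predicts a per-pixel foreground probability for each frame.
        self.mask_head = nn.Conv2d(feat_dim, 1, kernel_size=1)
        # Maps the pooled clip representation to a quality score.
        self.score_head = nn.Linear(feat_dim, 1)

    def forward(self, video: torch.Tensor):
        # video: (B, T, 3, H, W)
        b, t, c, h, w = video.shape
        frames = video.reshape(b * t, c, h, w)
        feats = self.encoder(frames)                      # (B*T, D, h', w')
        masks = torch.sigmoid(self.mask_head(feats))      # (B*T, 1, h', w')
        # Mask-weighted spatial pooling suppresses background regions.
        weighted = (feats * masks).sum(dim=(2, 3)) / (masks.sum(dim=(2, 3)) + 1e-6)
        clip_feat = weighted.reshape(b, t, -1).mean(dim=1)  # temporal average pooling
        score = self.score_head(clip_feat).squeeze(-1)      # (B,)
        return score, masks.reshape(b, t, *masks.shape[-2:])


if __name__ == "__main__":
    model = MaskGuidedScorer()
    dummy_video = torch.randn(2, 8, 3, 64, 64)  # 2 clips, 8 frames each
    score, masks = model(dummy_video)
    print(score.shape, masks.shape)  # torch.Size([2]) torch.Size([2, 8, 16, 16])
```

In a real system the predicted foreground masks would be supervised by mask annotations such as those provided in FineDiving-HM, and the temporal pooling would be replaced by a mechanism that aligns action stages across time; this sketch only shows why masking the features can reduce the influence of invalid background on the score.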