VALSE


VALSE Paper Express, Issue 204: Pathological Graph-driven Fine-grained Brain CT Report Generation ...

2025-1-8 19:06 | Posted by: Cheng Yi (程一), Institute of Computing Technology


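To help practitioners in vision and learning keep up with the field's latest developments and frontier techniques, VALSE has launched the Paper Express (《论文速览》) column. Each week it releases recorded videos on one or two papers from top conferences and journals, each giving a detailed walkthrough of a single piece of frontier work. This issue of VALSE Paper Express features work on Medical Report Generation from Beijing University of Technology and other institutions. The corresponding authors are Associate Professor Xiaodan Zhang of Beijing University of Technology and Assistant Professor Liangqiong Qu of The University of Hong Kong, and the video was recorded by the paper's first author, Yanzhao Shi.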


Paper title:

Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation

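Authors:

Yanzhao Shi (Beijing University of Technology), Junzhong Ji (Beijing University of Technology), Xiaodan Zhang (Beijing University of Technology), Liangqiong Qu (The University of Hong Kong), Ying Liu (Peking University Third Hospital)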


Watch on Bilibili:

https://www.bilibili.com/video/BV1oXxNeAE9t/?spm_id_from=333.999.0.0&vd_source=2044ac986b5caa4c8b5f00b525441df3


Copy the link into your browser, or click "Read the original", to go to the viewing page.


Abstract:

The automatic Brain CT reports generation can improve the efficiency and accuracy of diagnosing cranial diseases. However, current methods are limited by 1) coarse-grained supervision: the training data in image-text format lacks detailed supervision for recognizing subtle abnormalities, and 2) coupled cross-modal alignment: visual-textual alignment may be inevitably coupled in a coarse-grained manner, resulting in tangled feature representation for report generation. In this paper, we propose a novel Pathological Graph-driven Cross-modal Alignment (PGCA) model for accurate and robust Brain CT report generation. Our approach effectively decouples the cross-modal alignment by constructing a Pathological Graph to learn fine-grained visual cues and align them with textual words. This graph comprises heterogeneous nodes representing essential pathological attributes (i.e., tissue and lesion) connected by intra- and inter-attribute edges with prior domain knowledge. Through carefully designed graph embedding and updating modules, our model refines the visual features of subtle tissues and lesions and aligns them with textual words using contrastive learning. Extensive experimental results confirm the viability of our method. We believe that our PGCA model holds the potential to greatly enhance the automatic generation of Brain CT reports and ultimately contribute to improved cranial disease diagnosis.
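To make the contrastive alignment step in the abstract more concrete, here is a minimal sketch of a generic symmetric InfoNCE loss that pulls each graph-node feature toward its matching word feature. This is not the authors' implementation: the function names, the temperature value, and the toy 2-D features are all hypothetical, and the actual PGCA objective may differ in its details.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(node_feats, word_feats, temperature=0.1):
    """Symmetric InfoNCE loss; node_feats[i] and word_feats[i] are a positive pair.

    Generic formulation, assumed here for illustration only.
    """
    # Similarity logits between every node and every word, scaled by temperature.
    logits = [[cosine(v, t) / temperature for t in word_feats] for v in node_feats]

    def cross_entropy(rows):
        # Mean of -log softmax(row)[i], where index i is the positive pair.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    node_to_word = cross_entropy(logits)
    # Transpose the logits to score each word against all nodes.
    word_to_node = cross_entropy([list(col) for col in zip(*logits)])
    return 0.5 * (node_to_word + word_to_node)


# Hypothetical toy features: two graph nodes (e.g. one tissue, one lesion)
# and two word embeddings, first correctly paired and then swapped.
nodes = [[1.0, 0.0], [0.0, 1.0]]
aligned_words = [[0.9, 0.1], [0.1, 0.9]]
swapped_words = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(nodes, aligned_words)
loss_swapped = info_nce(nodes, swapped_words)
```

In this toy setup the correctly paired features give a lower loss than the swapped ones, which is the behavior a contrastive objective relies on when aligning fine-grained visual cues with words.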


Reference:

[1] Yanzhao Shi, Junzhong Ji, Xiaodan Zhang, Liangqiong Qu, and Ying Liu. 2023. Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), pages 6617–6630.


Paper link:

https://aclanthology.org/2023.emnlp-main.408.pdf


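About the speaker:

Yanzhao Shi is a second-year master's student at Beijing University of Technology, advised by Professor Junzhong Ji and Associate Professor Xiaodan Zhang. Shi's main research interests are automatic medical report generation and medical vision-text representation learning, with multiple papers published at top international conferences and in SCI journals.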


Homepage:

https://yanzhaoshi.github.io/



Special thanks to the main organizer of this Paper Express issue:

Monthly rotating AC: Yu Wu (Wuhan University)


How to participate

1. VALSE's weekly Webinar is hosted on Bilibili's live-streaming platform. Search for VALSE_Webinar on Bilibili to follow us!

Live stream:

https://live.bilibili.com/22300737

Past videos:

https://space.bilibili.com/562085182/


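2. The VALSE Webinar is usually held on Wednesdays at 20:00, but the time is occasionally adjusted to accommodate a speaker's time zone. To make it easier to attend, follow the VALSE WeChat official account (valse_wechat) or join the VALSE QQ T group (group number: 863867505).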


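*Note: When applying to join a VALSE QQ group, you must provide your name, affiliation, and role; all three are required. After joining, set your display name to your real name, role, and affiliation. Roles: university and research institute staff, T; industry R&D, I; PhD student, D; master's student, M.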


3. The VALSE WeChat official account usually posts the announcement for the following week's Webinar talk every Thursday.


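4. You can also visit the VALSE homepage, http://valser.org/, to view Webinar information directly. Slides from Webinar talks (with the speaker's permission) are posted at the bottom of each talk announcement on the VALSE website.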
