VALSE Paper Digest No. 193: GenSAM: Segmenting Camouflaged Objects with a Single Generic Prompt

2024-9-9 11:08 | Posted by: 程一-计算所 | Views: 1242 | Comments: 0


Paper title:

Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects

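Authors:

Jian Hu (Queen Mary University of London), Jiayi Lin (Queen Mary University of London), Weitong Cai (Queen Mary University of London), Shaogang Gong (Queen Mary University of London)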

Watch on Bilibili:

https://www.bilibili.com/video/BV1pi421a7HY/

Abstract:

Camouflaged object detection (COD) approaches heavily rely on pixel-level annotated datasets. Weakly-supervised COD (WSCOD) approaches use sparse annotations like scribbles or points to reduce annotation efforts, but this can lead to decreased accuracy. The Segment Anything Model (SAM) shows remarkable segmentation ability with sparse prompts like points. However, manual prompt is not always feasible, as it may not be accessible in real-world application. Additionally, it only provides localization information instead of semantic one, which can intrinsically cause ambiguity in interpreting targets. In this work, we aim to eliminate the need for manual prompt. The key idea is to employ Cross-modal Chains of Thought Prompting (CCTP) to reason visual prompts using the semantic information given by a generic text prompt. To that end, we introduce a test-time instance-wise adaptation mechanism called Generalizable SAM (GenSAM) to automatically generate and optimize visual prompts from the generic task prompt for WSCOD.
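As a rough illustration of what "automatically generate and optimize visual prompts" at test time could look like, here is a minimal Python sketch. It is not the authors' implementation. The heatmap, the `toy_segment` stand-in for a promptable segmenter, the re-weighting rule, and all function names are assumptions made for this example. The loop picks point prompts from the strongest heatmap cells, segments with them, and uses the resulting mask to re-weight the heatmap before picking prompts again.

```python
from typing import Callable, List, Tuple

Grid = List[List[float]]
Point = Tuple[int, int]


def top_k_points(heatmap: Grid, k: int) -> List[Point]:
    """Return the (row, col) positions of the k highest heatmap values."""
    cells = [(value, r, c)
             for r, row in enumerate(heatmap)
             for c, value in enumerate(row)]
    # Sort by descending value; ties are broken by row, then column.
    cells.sort(key=lambda cell: (-cell[0], cell[1], cell[2]))
    return [(r, c) for _, r, c in cells[:k]]


def refine_prompts(heatmap: Grid,
                   segment: Callable[[List[Point]], Grid],
                   k: int = 2,
                   steps: int = 3) -> Tuple[List[Point], Grid]:
    """Alternate between picking point prompts and re-weighting the heatmap.

    `segment` is a hypothetical stand-in for a promptable segmenter: it maps
    point prompts to a mask with values in [0, 1].
    """
    current = [row[:] for row in heatmap]
    points: List[Point] = []
    mask: Grid = [[0.0 for _ in row] for row in heatmap]
    for _ in range(steps):
        points = top_k_points(current, k)
        mask = segment(points)
        # Scale the original heatmap: x1.5 inside the last mask, x0.5 outside.
        current = [[h * (0.5 + m) for h, m in zip(h_row, m_row)]
                   for h_row, m_row in zip(heatmap, mask)]
    return points, mask


def toy_segment(points: List[Point]) -> Grid:
    """Toy segmenter: marks each prompted cell and its 4-neighbours (4x4 grid)."""
    mask = [[0.0] * 4 for _ in range(4)]
    for r, c in points:
        for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < 4 and 0 <= cc < 4:
                mask[rr][cc] = 1.0
    return mask


# Assumed toy heatmap standing in for text-derived, per-pixel relevance scores.
toy_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.6, 0.0],
    [0.0, 0.1, 0.1, 0.3],
]
points, mask = refine_prompts(toy_heatmap, toy_segment)
```

In a real pipeline the heatmap would come from a vision-language model driven by the generic text prompt, and `segment` would wrap a promptable segmenter such as SAM. Both are replaced by toy stand-ins here so that the sketch runs on its own.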

Reference:

[1] Hu J, Lin J, Gong S, et al. Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(11): 12511-12518.

Paper link:

[https://arxiv.org/abs/2312.07374]

Code link:

[https://github.com/jyLin8100/GenSAM]

About the speaker:

Jian Hu is pursuing his PhD in the Computer Vision Group at Queen Mary, University of London, under the supervision of Prof. Shaogang Gong. His research focuses on deep learning and computer vision, with particular emphasis on transfer learning and semi-supervised learning. His work is dedicated to devising methods for cross-domain knowledge transfer in uncontrolled, real-world environments. He has published papers at conferences including ECCV, AAAI, and SIGIR.

Homepage:

https://lwpyh.github.io/

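Special thanks to the main organizer of this paper digest:

Monthly rotating AC: Shiming Chen (Mohamed bin Zayed University of Artificial Intelligence)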

How to participate

1. VALSE's weekly Webinar is hosted on the Bilibili live-streaming platform. Search for VALSE_Webinar on Bilibili to follow us!

Live stream:

https://live.bilibili.com/22300737

Past videos:

https://space.bilibili.com/562085182/

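2. VALSE Webinars are usually held on Wednesdays at 20:00, but the time is occasionally adjusted to accommodate speakers' time zones. To make it easier to attend, follow the VALSE WeChat official account (valse_wechat) or join the VALSE QQ group T (group number: 863867505).

*Note: When applying to join the VALSE QQ group, you must verify your name, affiliation, and role; all three are required. After joining, please set your display name to your real name, role, and affiliation. Roles: university and research institute staff T; industry R&D I; PhD student D; master's student M.

3. The VALSE WeChat official account usually posts the announcement for the following week's Webinar talk every Thursday.

4. You can also visit the VALSE homepage, http://valser.org/, to view Webinar information directly. Slides for Webinar talks (with the speaker's permission) are posted at the bottom of each talk announcement on the VALSE website.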
