VALSE


VALSE Paper Digest, Issue 219: Tuning-Free High-Performance Model Merging

2025-8-14 20:42 | Posted by: Cheng Yi (程一), Institute of Computing Technology


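To help practitioners in vision and learning keep up with the latest developments and frontier advances in the field, VALSE has launched the Paper Digest (《论文速览》) column. Each week it publishes recorded videos for one or two papers from top conferences and journals, with each video giving a detailed walkthrough of a single piece of frontier work. This issue of VALSE Paper Digest focuses on model merging research from Fudan University, supervised by Prof. Tao Chen and Prof. Wanli Ouyang, with the video recorded by first author Chenyu Huang.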


Paper title:

EMR-Merging: Tuning-Free High-Performance Model Merging

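Authors:

Chenyu Huang (Fudan University), Peng Ye (Fudan University; The Chinese University of Hong Kong), Tao Chen (Fudan University), Tong He (Shanghai AI Laboratory), Xiangyu Yue (The Chinese University of Hong Kong), Wanli Ouyang (The Chinese University of Hong Kong)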


Watch on Bilibili:

https://www.bilibili.com/video/BV1iko7Y4Eao/


Abstract:

The success of pretrain-finetune paradigm brings about the release of numerous model weights. In this case, merging models finetuned on different tasks to enable a single model with multi-task capabilities is gaining increasing attention for its practicability. Existing model merging methods usually suffer from (1) significant performance degradation or (2) requiring tuning by additional data or training. In this paper, we rethink and analyze the existing model merging paradigm. We discover that using a single model's weights can hardly simulate all the models' performance. To tackle this issue, we propose Elect, Mask & Rescale-Merging (EMR-Merging). We first (a) elect a unified model from all the model weights and then (b) generate extremely lightweight task-specific modulators, including masks and rescalers, to align the direction and magnitude between the unified model and each specific model, respectively. EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance. We find that EMR-Merging shows outstanding performance compared to existing merging methods under different classical and newly-established settings, including merging different numbers of vision models (up to 30), NLP models, PEFT models, and multi-modal models.
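To make the elect / mask / rescale steps in the abstract more concrete, here is a minimal pure-Python sketch on flat weight vectors. The abstract only names the three steps, so the specific rules used here are assumptions for illustration: the unified sign is taken from the sum of task vectors, the unified magnitude is the largest magnitude among entries that agree with that sign, the mask keeps entries where a task vector agrees in sign with the unified vector, and the rescaler matches total absolute magnitude. The function names `emr_merge` and `reconstruct` are hypothetical; the authors' actual implementation is at the code link given on this page.

```python
from typing import List, Tuple

Vector = List[float]


def sign(x: float) -> int:
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def emr_merge(
    pretrained: Vector, finetuned: List[Vector]
) -> Tuple[Vector, List[List[bool]], List[float]]:
    """Sketch of elect, mask, rescale (the concrete rules are assumptions)."""
    # Task vector = finetuned weights minus the shared pretrained weights.
    task_vectors = [[w - p for w, p in zip(model, pretrained)] for model in finetuned]
    n = len(pretrained)

    # Elect: per parameter, pick a sign from the summed task vectors and the
    # largest magnitude among the entries that agree with that sign.
    unified: Vector = []
    for j in range(n):
        column = [tv[j] for tv in task_vectors]
        s = sign(sum(column))
        magnitude = max(
            (abs(v) for v in column if s != 0 and sign(v) == s), default=0.0
        )
        unified.append(s * magnitude)

    # Mask: keep only entries whose direction agrees with the unified vector.
    masks = [[tv[j] * unified[j] > 0 for j in range(n)] for tv in task_vectors]

    # Rescale: one scalar per task so the masked unified vector matches the
    # total absolute magnitude of that task's own vector.
    rescalers: List[float] = []
    for tv, mask in zip(task_vectors, masks):
        kept = sum(abs(u) for u, m in zip(unified, mask) if m)
        rescalers.append(sum(abs(v) for v in tv) / kept if kept > 0 else 0.0)

    return unified, masks, rescalers


def reconstruct(
    pretrained: Vector, unified: Vector, mask: List[bool], rescaler: float
) -> Vector:
    """Approximate one task's weights from the shared unified vector."""
    return [p + rescaler * u if m else p for p, u, m in zip(pretrained, unified, mask)]
```

In this sketch only the unified vector is shared across tasks; each task additionally stores a boolean mask and a single scalar, which is what makes the task-specific modulators lightweight. No data or gradient step is involved, in line with the tuning-free claim.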


References:

[1] EMR-Merging: Tuning-Free High-Performance Model Merging, Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang, NeurIPS 2024.


Paper link:

https://arxiv.org/abs/2405.17461


Code link:

https://github.com/harveyhuang18/EMR_Merging


About the speaker:

Chenyu Huang is currently working towards a Ph.D. degree at the School of Information Science and Technology, Fudan University. He received his M.S. and B.S. degrees from the School of Electronic Science and Engineering, Southeast University. His research interests include computer vision, model merging, supervised finetuning, and model compression.



Special thanks to the main organizer of this Paper Digest issue:

Monthly rotating AC: Xu Yang (杨旭), Xidian University
