VALSE


VALSE Webinar 2024-06-12, Session 16 (No. 351 overall): Trustworthy Foundation Models

Published 2024-6-12 18:10 by 程一 (Institute of Computing Technology, CAS)


Speaker: Yang Liu (UC Santa Cruz)

Talk Title: Large Language Model Unlearning


Speaker: Bo Han (HKBU/RIKEN)

Talk Title: Exploring Trustworthy Foundation Models under Imperfect Data


Panel Topics:

1. Can foundation models fully solve classification and recognition in open-world scenarios?

2. How should a trustworthy foundation model be defined?

3. What challenges arise in the interpretability of foundation models, for example with respect to causality?

4. Are foundation models more robust and reliable than traditional models, and how can their risks be effectively evaluated and mitigated?

5. Training and inference of foundation models may involve private data. How should we address privacy leakage and data misuse, such as model inversion attacks?

6. Massive data is the cornerstone of foundation models. How can we avoid biased automated decisions caused by discrimination present in the data?

7. Can strengthening a model's self-supervision improve the generalization of foundation models to unseen domains?

8. Self-generated data produced by generative models is now widespread on the Internet. In the long run, how will excessive self-generated synthetic data affect the pre-training of foundation models?

 

Panelists:

Chen Gong (Nanjing University of Science and Technology), Mingming Gong (University of Melbourne / Mohamed bin Zayed University of Artificial Intelligence), Yang Liu (UC Santa Cruz), Bo Han (HKBU/RIKEN)


*You are welcome to post topic-related questions in the comments below; the host and panelists will select some of the most popular ones to add to the panel discussion.


Speaker: Yang Liu (UC Santa Cruz)

Time: Wednesday, June 12, 2024, 20:00 (Beijing time)

Talk Title: Large Language Model Unlearning


Speaker Bio:

Yang Liu is currently an Assistant Professor of Computer Science and Engineering at UC Santa Cruz (2019 - present). He was previously a postdoctoral fellow at Harvard University (2016 - 2018) hosted by Yiling Chen. He obtained his Ph.D. degree from the Department of EECS, University of Michigan, Ann Arbor in 2015. He is interested in crowdsourcing and algorithmic fairness in machine learning. He is a recipient of the NSF CAREER Award and the NSF Fairness in AI award (lead PI). He has been selected to participate in several high-profile projects, including DARPA SCORE and IARPA HFC. His research has been covered by media including WIRED and WSJ. His work on using machine learning to forecast future security incidents has been successfully commercialized and acquired by FICO. His recent works have won four best paper awards at relevant workshops.

 

Homepage:

https://yliuu.com/

 

Abstract:

This talk presents large language model unlearning, a study of how to perform unlearning, i.e., forgetting undesirable (mis)behaviors, on large language models (LLMs). We show that at least three scenarios of aligning LLMs with human preferences can benefit from unlearning: (1) removing harmful responses, (2) erasing copyright-protected content as requested, and (3) reducing hallucinations. Unlearning, as an alignment technique, has three advantages. (1) It only requires negative (e.g. harmful) examples, which are much easier and cheaper to collect (e.g. via red teaming or user reporting) than the positive (e.g. helpful and often human-written) examples required in the standard alignment process. (2) It is computationally efficient. (3) It is especially effective when we know which training samples cause the misbehavior.

 

In addition, we envision LLM unlearning becoming a pivotal element in the life-cycle management of LLMs, potentially standing as an essential foundation for developing generative AI that is not only safe, secure, and trustworthy, but also resource-efficient without the need for full retraining. Towards this end, we navigate the unlearning landscape in LLMs across conceptual formulation, methodologies, metrics, and applications.
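As background for the talk, the snippet below is a minimal sketch of the simplest form of unlearning described above: gradient ascent on the language-modeling loss over negative (e.g. harmful) examples, so that their likelihood under the model decreases. The model name and example data are placeholders, and the sketch omits the additional terms practical methods add to preserve utility on normal data; it is not the speaker's exact algorithm.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; real unlearning targets a much larger, already-aligned LLM.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Negative examples to forget, e.g. collected via red teaming or user reports.
negative_texts = ["<an undesirable (e.g. harmful) response to forget>"]

for text in negative_texts:
    batch = tokenizer(text, return_tensors="pt")
    outputs = model(**batch, labels=batch["input_ids"])
    # Standard training minimizes the LM loss; unlearning instead *ascends* it
    # on negative examples, pushing their probability down.
    loss = -outputs.loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()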

 

References:

[1] Yao, Yuanshun, Xiaojun Xu, and Yang Liu. "Large language model unlearning." arXiv preprint arXiv:2310.10683 (2023).

[2] Liu, Sijia, et al. "Rethinking Machine Unlearning for Large Language Models." arXiv preprint arXiv:2402.08787 (2024).


Speaker: Bo Han (HKBU/RIKEN)

Time: Wednesday, June 12, 2024, 20:30 (Beijing time)

Talk Title: Exploring Trustworthy Foundation Models under Imperfect Data


Speaker Bio:

Bo Han is an Assistant Professor in Machine Learning at Hong Kong Baptist University and a BAIHO Visiting Scientist at RIKEN AIP, where his research focuses on machine learning, deep learning, foundation models, and their applications. He was a Visiting Research Scholar at MBZUAI MLD, a Visiting Faculty Researcher at Microsoft Research and Alibaba DAMO Academy, and a Postdoc Fellow at RIKEN AIP. He has co-authored three machine learning monographs published by MIT Press, Springer Nature, and Foundations and Trends. He has served as a Senior Area Chair of NeurIPS and as an Area Chair of NeurIPS, ICML, ICLR, UAI, and AISTATS. He has also served as an Action Editor of MLJ, TMLR, JAIR, and IEEE TNNLS, and as an Editorial Board Member of JMLR and MLJ. He received an Outstanding Paper Award at NeurIPS and was recognized as a Notable Area Chair at NeurIPS, an Outstanding Area Chair at ICLR, and an Outstanding Associate Editor at IEEE TNNLS. He has received the RGC Early CAREER Scheme, an NSFC General Program grant, an IJCAI Early Career Spotlight, the RIKEN BAIHO Award, the Microsoft Research StarTrack Program, and Faculty Research Awards from ByteDance, Baidu, Alibaba, and Tencent.

 

Homepage:

https://bhanml.github.io/

 

Abstract:

In the current landscape of machine learning, it is crucial to build trustworthy foundation models that can operate under imperfect conditions, since real-world data, such as unexpected inputs, image artifacts, and adversarial inputs, is often noisy. These models need human-like capabilities to learn and reason under uncertainty. In this talk, I will focus on three recent research advances, each shedding light on reliability, robustness, or safety in this field. Specifically, reliability is explored by enhancing vision-language models with negative labels, which effectively detect out-of-distribution samples. Robustness is explored through our investigation of image interpolation with diffusion models, addressing the challenge of information loss to ensure the consistency and quality of generated content. Safety is highlighted by our study on hypnotizing large language models, DeepInception, which constructs a novel nested scenario to induce adaptive jailbreak behaviors, revealing vulnerabilities during interactive model engagement. Finally, I will introduce the newly established Trustworthy Machine Learning and Reasoning (TMLR) Group at Hong Kong Baptist University.
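For readers unfamiliar with negative-label-guided OOD detection, here is a rough sketch of the general idea using OpenAI's CLIP: compare an image against both in-distribution class prompts and a set of extra "negative" labels, and score it by how much of the softmax mass falls on the in-distribution prompts. The label lists and model choice are illustrative placeholders; the actual NegLabel method [1] mines a large negative-label set from a corpus and uses its own scoring function, so this is not the paper's exact procedure.

import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP (https://github.com/openai/CLIP); assumed installed

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

id_labels = ["cat", "dog"]                 # in-distribution class names (placeholder)
neg_labels = ["guitar", "bridge", "soup"]  # hand-picked "negative" labels (placeholder)

prompts = [f"a photo of a {w}" for w in id_labels + neg_labels]
with torch.no_grad():
    text_feat = F.normalize(model.encode_text(clip.tokenize(prompts).to(device)), dim=-1)

def id_score(pil_image):
    """Higher score -> image looks in-distribution; lower -> likely OOD."""
    with torch.no_grad():
        img = preprocess(pil_image).unsqueeze(0).to(device)
        img_feat = F.normalize(model.encode_image(img), dim=-1)
    sims = (img_feat @ text_feat.T).softmax(dim=-1).squeeze(0)
    # Fraction of similarity mass on in-distribution prompts vs. negative labels.
    return sims[: len(id_labels)].sum().item()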

 

References:

[1] Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, Bo Han. Negative label guided OOD detection with pretrained vision-language models. In ICLR, 2024. (Spotlight)

[2] Pengfei Zheng, Yonggang Zhang, Zhen Fang, Tongliang Liu, Defu Lian, Bo Han. NoiseDiffusion: Correcting noise for image interpolation with diffusion models beyond spherical linear interpolation. In ICLR, 2024. (Spotlight)

[3] Xuan Li, Zhanke Zhou, Jianing Zhu, Jiangchao Yao, Tongliang Liu, Bo Han. DeepInception: Hypnotize large language model to be jailbreaker. arXiv preprint, 2023.


Panelist: Chen Gong (Nanjing University of Science and Technology)


Panelist Bio:

Chen Gong is a Professor and Ph.D. supervisor in the School of Computer Science and Engineering at Nanjing University of Science and Technology. He is a recipient of a national-level young talent program and the Jiangsu Distinguished Young Scholars award. He has published more than 100 papers in leading journals and conferences, including IEEE T-PAMI, IEEE T-NNLS, IEEE T-IP, ICML, NeurIPS, ICLR, CVPR, ICCV, AAAI, and IJCAI, and holds 7 granted invention patents. He currently serves on the editorial boards of IEEE T-CSVT, NN, and NePL, and as an (S)PC member for international conferences such as ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, AAAI, IJCAI, and ICDM. He has received the Wu Wenjun Outstanding Youth Award in Artificial Intelligence, the CAST Young Elite Scientists Sponsorship Program, the CAAI Outstanding Doctoral Dissertation Award, and the Shanghai Natural Science Award (Second Prize), and has been included in Baidu's list of top global Chinese young scholars in AI and Stanford University's list of the world's top 2% scientists.

 

Homepage:

https://gcatnjust.github.io/ChenGong/index.html


Panelist: Mingming Gong (University of Melbourne / Mohamed bin Zayed University of Artificial Intelligence)


Panelist Bio:

Dr. Mingming Gong is a Senior Lecturer at the University of Melbourne and holds an adjunct appointment as Associate Professor at Mohamed bin Zayed University of Artificial Intelligence. His research interests include causal learning, trustworthy learning, generative models, and computer vision. He has authored and co-authored more than 100 research papers in top venues such as ICML, NeurIPS, ICLR, and CVPR. He has served as an Area Chair for NeurIPS, ICML, and ICLR, received the Australian AI Emerging Research Contribution Award and an Australian Research Council Early Career Award, and was named by Baidu Scholar in 2022 as one of the world's top young Chinese scholars in artificial intelligence.

 

Homepage:

https://mingming-gong.github.io/


Host: Tongliang Liu (University of Sydney)


Host Bio:

Tongliang Liu is an Associate Professor with the School of Computer Science and the Director of the Sydney AI Centre at the University of Sydney. He is broadly interested in trustworthy machine learning and its interdisciplinary applications, with a particular emphasis on learning with noisy labels, adversarial learning, causal representation learning, transfer learning, unsupervised learning, and statistical deep learning theory. He has authored and co-authored more than 200 research articles in venues including ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, AAAI, IJCAI, JMLR, and TPAMI. He is/was a senior meta-reviewer for NeurIPS, AAAI, and IJCAI, a meta-reviewer for many conferences such as ICML, NeurIPS, ICLR, UAI, AAAI, IJCAI, and KDD, and was a notable AC for NeurIPS and ICLR. He is a co-Editor-in-Chief of Neural Networks, an Associate Editor of ACM Computing Surveys, and a member of the Editorial Boards of JMLR and MLJ. He is a recipient of the CORE Award for Outstanding Research Contribution in 2024, the IEEE AI's 10 to Watch Award in 2022, the Future Fellowship Award from the Australian Research Council in 2022, the Top-40 Early Achievers by The Australian in 2020, and the Discovery Early Career Researcher Award from the Australian Research Council in 2018.

Homepage:

https://tongliang-liu.github.io/



Special thanks to the main organizer of this Webinar:

Organizing AC: Tongliang Liu (University of Sydney)


How to Participate

1. The weekly VALSE Webinar is live-streamed on Bilibili. Search for VALSE_Webinar on Bilibili and follow us!

Live stream:

https://live.bilibili.com/22300737;

Past recordings:

https://space.bilibili.com/562085182/ 


2. VALSE Webinars are usually held on Wednesday evenings at 20:00 (Beijing time), but the schedule may occasionally shift to accommodate speakers' time zones. To stay informed, please follow the VALSE WeChat official account (valse_wechat) or join the VALSE QQ T group (group number: 863867505).


*Note: When applying to join the VALSE QQ group, you must provide your name, affiliation, and role; all three are required. After joining, please set your group nickname to your real name, role, and affiliation. Roles: T for faculty and researchers at universities or research institutes; I for industry R&D; D for Ph.D. students; M for Master's students.


3. The VALSE WeChat official account usually publishes the announcement for the following week's Webinar every Thursday.


4. You can also visit the VALSE homepage (http://valser.org/) to view Webinar information directly. The slides of each Webinar talk (with the speaker's permission) are posted at the bottom of the corresponding announcement on the VALSE website.

