Speaker: Xihui Liu (The University of Hong Kong)
Title: Towards Controllable and Compositional Visual Content Generation
Speaker: Kai Chen (The Hong Kong University of Science and Technology)
Title: Geometric-Controllable Visual Generation: A Systematic Solution

Speaker: Xihui Liu (The University of Hong Kong)
Time: Wednesday, May 29, 2024, 20:00 (Beijing time)
Title: Towards Controllable and Compositional Visual Content Generation
Speaker bio: Xihui Liu is an Assistant Professor at the Department of Electrical and Electronic Engineering (EEE) and the Institute of Data Science (IDS), The University of Hong Kong, affiliated with HKU-MMLab. Before joining HKU, she was a postdoctoral scholar at UC Berkeley, advised by Prof. Trevor Darrell. She obtained her Ph.D. degree from the Multimedia Lab (MMLab), the Chinese University of Hong Kong, supervised by Prof. Xiaogang Wang and Prof. Hongsheng Li, and received her bachelor's degree from Tsinghua University. Her research interests cover computer vision, machine learning, and artificial intelligence, with special emphasis on visual synthesis, generative models, vision and language, and multimodal AI. She was awarded the Adobe Research Fellowship 2020, MIT EECS Rising Stars 2021, and the WAIC Rising Stars Award 2022.
Homepage: https://xh-liu.github.io/
Abstract: Visual content generation has achieved great success in the past few years, but current visual generation models still lack controllability and compositionality. In real applications, we want highly controllable visual generation models that let users control the generated content in a fine-grained manner. We also want models that can effectively compose objects with different attributes and relationships into a complex and coherent scene. In this talk, I will introduce several of our works on controllable and compositional visual content generation. I will introduce T2I-CompBench for benchmarking compositional text-to-image generation. I will also introduce our recent works on drag-based video editing, controllable 3D generation, and concept editing for visual generative models.

Speaker: Kai Chen (The Hong Kong University of Science and Technology)
Time: Wednesday, May 29, 2024, 20:30 (Beijing time)
Title: Geometric-Controllable Visual Generation: A Systematic Solution
Speaker bio: Kai Chen is a PhD candidate at HKUST, supervised by Prof. Dit-Yan Yeung. His research aims at building generalizable AI systems from a data-centric perspective, especially controllable generation for visual world modeling, Mixture-of-Experts (MoE), and (M)LLM self-alignment. He has published more than 10 papers in top conferences including CVPR, ICCV, and ECCV, and has actively served as a reviewer and workshop organizer to promote the development of the research community.
Homepage: https://kaichen1998.github.io/
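Abstract: Controllability is essential for using generative models in real-life applications. Text prompts are currently considered the primary condition because of their superior interactivity with humans. However, unlike language modeling, our visual world is a 3D environment with precise geometric constraints. A typical case is that a robot can "turn left" in various ways, but its trajectory cannot be determined without specific geometric information (e.g., angles and distances). In this talk, I will systematically discuss how to introduce geometric controls into foundational text-to-image generative models, which are then generalized to controllable video generation and 3D scene generation, respectively. Finally, I will discuss several open problems raised in our ECCV 2024 W-CODA Workshop, which might eventually lead us toward unified visual world modeling.

Host: Xu Jia (Dalian University of Technology)
Host bio: Xu Jia is a tenured associate professor at the School of Future Technology / School of Artificial Intelligence, Dalian University of Technology, and a core member of the Liaoning Provincial Key Laboratory of Artificial Intelligence for Intelligent Perception and Understanding. He received his Ph.D. from KU Leuven, Belgium, advised by Prof. Tinne Tuytelaars and Prof. Luc Van Gool, and has conducted research at Google Research, SenseTime, and Huawei Noah's Ark Lab, among others. His current research interests include visual content enhancement and generation and brain-inspired vision. In recent years he has published more than 40 papers in top computer vision and machine learning conferences and journals, including CVPR, ICCV, NeurIPS, TPAMI, and TIP, with more than 8,200 Google Scholar citations, and has filed more than 10 patents in China and abroad. He has led or participated in a number of research projects, including a Key Program of the National Natural Science Foundation of China, a Science and Technology Innovation 2030 Major Project of the Ministry of Science and Technology, and projects with Huawei. He serves as an area chair and reviewer for several top international conferences and journals, as an executive committee member of several CCF and CSIG technical committees, and as a member of the 6th and 7th VALSE Executive Committees.
Homepage: https://stephenjia.github.io/

Special thanks to the main organizer of this Webinar:
Organizing AC: Xu Jia (Dalian University of Technology)

How to participate
1. The weekly VALSE Webinar is streamed live on Bilibili. Search for VALSE_Webinar on Bilibili to follow us. Live stream: https://live.bilibili.com/22300737; past recordings: https://space.bilibili.com/562085182/
2. VALSE Webinars usually take place on Wednesdays at 20:00, but the time is occasionally adjusted to accommodate speakers' time zones. To make it easier to attend, follow the VALSE WeChat official account (valse_wechat) or join the VALSE QQ T group (group number: 863867505).
*Note: A request to join the VALSE QQ group must state your name, affiliation, and role; all three are required. After joining, set your display name to your real name, role, and affiliation. Role codes: T for university and research institute staff; I for industry R&D; D for PhD students; M for master's students.
3. The VALSE WeChat official account usually posts the announcement for the following week's Webinar every Thursday.
4. You can also visit the VALSE homepage, http://valser.org/, to view Webinar information directly. Slides for each talk (with the speaker's permission) are posted at the bottom of the corresponding announcement on the VALSE website.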