王宇驰 (Yuchi Wang)

About Me

Hi! I am a first-year PhD student at MMLab, The Chinese University of Hong Kong, supervised by Prof. Hongsheng Li and Prof. Xiaogang Wang. Prior to this, I received my Master's degree from the AAIS, Peking University in 2025, where I was a member of Lanco Lab, Institute of Computational Linguistics (ICL), supervised by Prof. Xu Sun. I obtained my Bachelor's degree from the School of Data Science, Fudan University in 2022.

My research interests lie in multimodal learning, ranging from visual understanding to visual agents, with a focus on MLLMs. I am also interested in diffusion models and their application to visual generation.

News

[2026.05] Excited to join Qwen as a research intern.
[2026.04] Check out our new paper on reasoning-enhanced embedding models, MMEmb-R1.
[2025.10] Released the technical report for SAIL-Embedding, tailored for Douyin recommendation.
[2025.08] Joining MMLab@CUHK as a PhD student.
[2025.08] One paper accepted by EMNLP 2025 – RICO.
[2025.07] One paper accepted by ACM MM 2025.
[2025.06] One paper accepted by ACL 2025.
[2025.02] One paper accepted by CVPR 2025 – VidTwin.
[2024.12] One paper accepted by AAAI 2025.
[2024.12] Two papers accepted by FinNLP@COLING 2025.
[2024.06] We release the InstructAvatar project page!
[2024.05] One paper accepted by ACL 2024 (Findings).
[2024.03] One paper accepted by NAACL 2024 – LaDiC.
[2024.01] One paper accepted by ICLR 2024 – GAIA.
[2023.10] We release the GAIA demo!
[2023.10] One paper accepted by FMDM@NeurIPS 2023.
[2023.05] Starting internship at Microsoft Research Asia ML Group.
[2022.09] Joining Lanco Lab, Peking University.

Education

CUHK

The Chinese University of Hong Kong

AI PhD, MMLab

2025 – Expected 2029

Multimedia Lab · Advisors: Prof. Hongsheng Li and Prof. Xiaogang Wang

PKU

Peking University

Master of Data Science, AAIS

2022 – 2025

Lanco Lab, ICL · Advisor: Prof. Xu Sun

Fudan

Fudan University

Bachelor of Science, School of Data Science

2018 – 2022

GPA Rank: 3 / 85

Selected Publications

Visual Understanding

MMEmb-R1: Reasoning-Enhanced Multimodal Embedding with Pair-Aware Selection and Adaptive Control

Yuchi Wang, Haiyang Yu, Weikang Bian, Jiefeng Long, Xiao Liang, Chao Feng, Hongsheng Li

Preprint

Paper

SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model

Yuchi Wang (Core Contributor), et al.

Technical Report

Paper

RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction

Yuchi Wang, Yishuo Cai, Shuhuai Ren, Sihan Yang, Linli Yao, Yuanxin Liu, Yuanxing Zhang, Pengfei Wan, Xu Sun

EMNLP 2025

Paper Code

LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?

Yuchi Wang*, Shuhuai Ren*, Rundong Gao, Linli Yao, Qingyan Guo, Kaikai An, Jianhong Bai, Xu Sun

NAACL 2024

Paper Code

Visual Generation

VidTwin: Video VAE with Decoupled Structure and Dynamics

Yuchi Wang, Junliang Guo, Xinyi Xie, Tianyu He, Xu Sun, Jiang Bian

CVPR 2025

Paper Code Project

InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

Yuchi Wang, Junliang Guo, Jianhong Bai, Runyi Yu, Tianyu He, Xu Tan, Xu Sun, Jiang Bian

AAAI 2025

Paper Project

GAIA: Zero-shot Talking Avatar Generation

Tianyu He*, Junliang Guo*, Runyi Yu*, Yuchi Wang*, Jialiang Zhu, Kaikai An, Leyi Li, Xu Tan, Chunyu Wang, Han Hu, HsiangTao Wu, Sheng Zhao, Jiang Bian

ICLR 2024

Paper Project

Internship

Alibaba Group

Alibaba Qwen

June 2026 – Present · Hangzhou, China

Tongyi Qwen, Base Large Model Team, worked closely with Rongyao Fang and Shuai Bai

Topic: VL Agents, AI Coding

ByteDance

ByteDance Douyin

June 2025 – April 2026 · Beijing/Shenzhen, China

Douyin (抖音) Content Group, Base Multimodal Model Team, worked closely with Dingkang Yang, Haiyang Yu and Xiao Liang

Topic: Multimodal Representation, MLLM, Recommendation

Kling (可灵) Kuaishou Technology

Kuaishou Kling

November 2024 – May 2025 · Beijing, China

Kling (可灵) AI, Multimodal Understanding Group, worked closely with Yuanxing Zhang

Topic: Multimodal Understanding, Image Captioning

MSRA

Microsoft Research Asia (MSRA)

May 2023 – August 2024 · Beijing, China

ML (Machine Learning) Group, worked closely with Junliang Guo, Tianyu He and Xu Tan

Topic: Multimodal Generative Learning, Talking Avatar, Video VAE

Academic Service

Conference/Journal Reviewer

NeurIPS 2026
PR
CVPR 2026
ACL 2025, 2026
EMNLP 2025, 2026
ARR Oct 2025, Mar 2026
AAAI 2025

Teaching Assistant

PKU Class 2024 Spring: Introduction to Large Language Models · Peking University · 2024
ELEG5760: Machine Learning for Multimedia Applications · CUHK · 2025
AIMS5710: Deep Learning Fundamentals and Theories · CUHK · 2026