About Me
Hi! I am a first-year PhD student at MMLab, The Chinese University of Hong Kong, supervised by Prof. Hongsheng Li and Prof. Xiaogang Wang. Before that, I received my Master's degree from AAIS, Peking University in 2025, where I was a member of Lanco Lab at the Institute of Computational Linguistics (ICL), supervised by Prof. Xu Sun. I obtained my Bachelor's degree from the School of Data Science, Fudan University in 2022.
My research interests lie in multimodal learning, ranging from visual understanding to visual agents, with a focus on MLLMs. I am also interested in diffusion models and their application to visual generation.
News
- [2026.05] Excited to join Qwen as a research intern.
- [2026.04] Check out our new paper on reasoning-enhanced embedding models, MMEmb-R1.
- [2025.10] Released the technical report for SAIL-Embedding, tailored for Douyin recommendation.
- [2025.08] Joining MMLab@CUHK as a PhD student.
- [2025.08] One paper accepted by EMNLP 2025 – RICO.
- [2025.07] One paper accepted by ACM MM 2025.
- [2025.06] One paper accepted by ACL 2025.
- [2025.02] One paper accepted by CVPR 2025 – VidTwin.
- [2024.12] One paper accepted by AAAI 2025.
- [2024.12] Two papers accepted by FinNLP@COLING 2025.
- [2024.06] We release the InstructAvatar project page!
- [2024.05] One paper accepted by ACL 2024 (Findings).
- [2024.03] One paper accepted by NAACL 2024 – LaDiC.
- [2024.01] One paper accepted by ICLR 2024 – GAIA.
- [2023.10] We release the GAIA demo!
- [2023.10] One paper accepted by FMDM@NeurIPS 2023.
- [2023.05] Started an internship at Microsoft Research Asia, ML Group.
- [2022.09] Joining Lanco Lab, Peking University.
Education
CUHK
The Chinese University of Hong Kong
AI PhD, MMLab
2025 – Expected 2029
Multimedia Lab · Advisors: Prof. Hongsheng Li and Prof. Xiaogang Wang
PKU
Fudan
Fudan University
Bachelor of Science, School of Data Science
2018 – 2022
GPA Rank: 3 / 85
Selected Publications
Visual Understanding
MMEmb-R1: Reasoning-Enhanced Multimodal Embedding with Pair-Aware Selection and Adaptive Control
Preprint
Visual Generation
Internship
Alibaba Group
Alibaba Qwen
June 2026 – Now · Hangzhou, China
Tongyi Qwen, Base Large Model Team, worked closely with
Rongyao Fang
and Shuai Bai
Topic: Vision-Language Agents, AI Coding
ByteDance
ByteDance Douyin
June 2025 – April 2026 · Beijing/Shenzhen, China
Douyin (抖音) Content Group, Base Multimodal Model Team, worked closely with
Dingkang Yang, Haiyang Yu
and Xiao Liang
Topic: Multimodal Representation, MLLM, Recommendation
Kling (可灵) Kuaishou Technology
Kuaishou Kling
November 2024 – May 2025 · Beijing, China
Kling (可灵) AI, Multimodal Understanding Group, worked closely with
Yuanxing Zhang
Topic: Multimodal Understanding, Image Captioning
MSRA
Microsoft Research Asia (MSRA)
May 2023 – August 2024 · Beijing, China
Topic: Multimodal Generative Learning, Talking Avatar, Video VAE
Academic Service
Conference/Journal Reviewer
Teaching Assistant
- PKU Class 2024 Spring: Introduction to Large Language Models
- ELEG5760: Machine Learning for Multimedia Applications
- AIMS5710: Deep Learning Fundamentals and Theories