Hi! I’m a final-year master’s student at the AAIS (Academy for Advanced Interdisciplinary Studies), Peking University. Currently, I am a member of Lanco Lab, led by Prof. Xu Sun. Previously, I earned my Bachelor’s degree from the School of Data Science, Fudan University.
My research interests include (1) multimodal learning, including visual understanding and text-guided visual generation (e.g., image/video generation or editing, talking face generation); (2) generative models such as diffusion models and VAEs; and (3) LLMs (especially multimodal large language models) and their applications, such as embodied AI.
I’m currently seeking a PhD position. Feel free to reach out if you are interested!
Master in Data Science, Peking University, 2025 (Expected)
Lanco Lab; Advisor: Xu Sun; AAIS
BSc in Data Science, Fudan University, 2022
School of Data Science; GPA Rank: 3/85
We present InstructAvatar, a novel text-guided approach for generating emotionally expressive 2D avatars, offering fine-grained control over the resulting video along with improved interactivity and generalizability.
We explore the previously untapped advantages of diffusion models over autoregressive (AR) methods in image-to-text generation. Through meticulous design of a latent-based diffusion model tailored for captioning, we achieve performance comparable to some strong AR baselines.
I am currently serving as a teaching assistant for the course Large Language Model in Decision Intelligence (PKU Class 2024 Spring). This course is tailored for undergraduates, aiming to provide them with a foundational understanding of large language models and effective strategies for their utilization.