Sports Research Centre

3D / Human-Centric Scene Understanding & Multimodal Understanding and Generation

Assistant Professor

Department of Computer Science and Engineering, HKUST

Di2Pose: Discrete Diffusion Model for Occluded 3D Human Pose Estimation (NeurIPS'24)

View-Consistent 3D Editing with Gaussian Splatting (ECCV'24)

Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models (NeurIPS'23)

DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism (ECCV'24)

IterIS: Iterative Inference-Solving Alignment for LoRA Merging (CVPR'25)

DisPose: Disentangling Pose Guidance for Controllable Human Image Animation (ICLR'25)

CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation (CVPR'25)