My research focuses on post-training of multimodal foundation models and agents.
I received my M.S. from Nanjing University, advised by Prof. Tong Lu. Previously, I was at Tencent Youtu Lab, collaborating with Dr. Xiaobin Hu and Prof. Ying Tai.
News
- 2026.02: Released MedXIAOHE, a medical vision-language foundation model pretrained on 640B tokens. SOTA across 30+ medical benchmarks, outperforming GPT-5.2 Thinking and Gemini 3.0 Pro (e.g., MMMU-Med 87.53, MedQA-USMLE 97.88, OmniMedVQA 83.40). [Paper]
- 2026.02: Released DEEPMED, a medical DeepResearch agent. +9.79% over base model across 7 medical benchmarks, outperforming larger medical reasoning and DR models. [Paper]
- 2026.02: Three papers accepted to CVPR / ICLR 2026 (MedLesionVQA, etc.).
- 2025.12: Champion in CURE-Bench @ NeurIPS 2025 (1st out of 322 teams), the first competition on Agentic AI reasoning for drug decision-making in precision therapeutics.
- 2025.03: Four papers accepted to CVPR / ACM MM 2025 (Sonic, GroundingFace, HunyuanPortrait, DICE-Talk).
- 2024.07: Two papers accepted to ECCV / IJCAI 2024 (DiffuMatting, UniM-OV3D).
- 2022.03: Two papers accepted to CVPR / ECCV 2022 (ColorFormer, SGPN).
- 2021.09: Three papers accepted to NeurIPS / AAAI / ECCV 2021 (S2K, FCA, AE-TextSpotter).
- 2020.06: Champion in NTIRE 2020 Real-World SR Challenge @ CVPR.
Publications
Selected Publications

MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs
Baorong Shi, Bo Cui, Boyuan Jiang, Deli Yu, Fang Qian, Haihua Yang, Huichao Wang, Jiale Chen, Jianfei Pan, Jieqiong Cao, Jinghao Lin, Kai Wu, Lin Yang, Shengsheng Yao, Tao Chen, Xiaojun Xiao, Xiaozhong Ji, Xu Wang, Yijun He, Zhixiong Yang
arXiv 2026

Zihan Wang, Hao Wang, Shi Feng, Xiaocui Yang, Daling Wang, Yiqun Zhang, Jinghao Lin, Haihua Yang, Xiaozhong Ji
arXiv 2026

CureFlow: An AI Engine for Complex Medication Decision-Making
Xiaozhong Ji, Jinghao Lin, Yuhang Wu, Zihan Wang, Boyuan Jiang, Chao Gao
NeurIPS 2025 CURE-Bench Challenge | Poster | Leaderboard

Good Teachers, Better Students: A Survey of Reward Models for LLM
Linhao Wang, Zihan Wang, Jinghao Lin, …, Xiaozhong Ji, Xiaobin Hu, et al.
TechRxiv 2025

Align and Surpass Human Camouflaged Perception: Visual Refocus Reinforcement Fine-Tuning
Ruolin Shen, Xiaozhong Ji, Kai Wu, Jiangning Zhang, Yijun He, Haihua Yang, Xiaobin Hu, Xiaoyu Sun
arXiv 2025

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji, Xiaobin Hu, Zhihong Xu, Junwei Zhu, Chuming Lin, Qingdong He, Jiangning Zhang, Donghao Luo, Yi Chen, Qin Lin, Qinglin Lu, Chengjie Wang
CVPR 2025 | GitHub

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation
Zunnan Xu, Zhentao Yu, Zixiang Zhou, Jun Zhou, Xiaoyu Jin, Fa-Ting Hong, Xiaozhong Ji, Junwei Zhu, Chengfei Cai, Shiyu Tang, Qin Lin, Xiu Li, Qinglin Lu
CVPR 2025 | GitHub

GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model
Yue Han, Jiangning Zhang, Junwei Zhu, Runze Hou, Xiaozhong Ji, Chuming Lin, Xiaobin Hu, Zhucun Xue, Yong Liu
CVPR 2025

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
Xiaobin Hu, Xu Peng, Donghao Luo, Xiaozhong Ji, Jinlong Peng, Zhengkai Jiang, Jiangning Zhang, Taisong Jin, Chengjie Wang, Rongrong Ji
ECCV 2024 | GitHub

Real-World Super-Resolution via Kernel Estimation and Noise Injection
Xiaozhong Ji, Yun Cao, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang
CVPR 2020 Workshop | GitHub | GitHub (Tencent)
Full Publication List

The Landscape of Medical Agents: A Survey
Xiaobin Hu, Yunhang Qian, Jiaquan Yu, …, Xiaozhong Ji, …, Shuicheng Yan, et al.
TechRxiv 2025

Human-MME: A Holistic Evaluation Benchmark for Human-Centric Multimodal Large Language Models
Yuansen Liu, Haiming Tang, Jinlong Peng, Jiangning Zhang, Xiaozhong Ji, Qingdong He, Wenbin Wu, Donghao Luo, Zhenye Gan, Junwei Zhu, et al.
arXiv 2025

Haozhen Gong, Xiaozhong Ji, Yuansen Liu, Wenbin Wu, Xiaoxiao Yan, Jingjing Liu, Kai Wu, Jiazhen Pan, Bailiang Jian, Jiangning Zhang, et al.
CVPR 2026

Qingdong He, Jinlong Peng, Zhengkai Jiang, Kai Wu, Xiaozhong Ji, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Mingang Chen, Yunsheng Wu
IJCAI 2024 | GitHub

Weipeng Tan, Chuming Lin, Chengming Xu, FeiFan Xu, Xiaobin Hu, Xiaozhong Ji, Junwei Zhu, Chengjie Wang, Yanwei Fu

MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning
Yue Han, Junwei Zhu, Yuxiang Feng, Xiaozhong Ji, Keke He, Xiangtai Li, Zhucun Xue, Yong Liu
arXiv 2024

Xiaozhong Ji, Chuming Lin, Zhonggan Ding, Ying Tai, Junwei Zhu, Xiaobin Hu, Donghao Luo, Yanhao Ge, Chengjie Wang
arXiv 2024

ColorFormer: Image Colorization via Color Memory Assisted Hybrid-Attention Transformer
Xiaozhong Ji, Boyuan Jiang, Donghao Luo, Guangpin Tao, Wenqing Chu, Zhifeng Xie, Chengjie Wang, Ying Tai
ECCV 2022 | GitHub | GitHub (Tencent)

Blind Face Restoration via Integrating Face Shape and Generative Priors
Feida Zhu, Junwei Zhu, Wenqing Chu, Xinyi Zhang, Xiaozhong Ji, Chengjie Wang, Ying Tai
CVPR 2022

Spectrum-to-Kernel Translation for Accurate Blind Image Super-Resolution
Guangpin Tao, Xiaozhong Ji, Wenzhuo Wang, Shuo Chen, Chuming Lin, Yun Cao, Tong Lu, Donghao Luo, Ying Tai
NeurIPS 2021

Frequency Consistent Adaptation for Real World Super Resolution
Xiaozhong Ji, Guangpin Tao, Yun Cao, Ying Tai, Tong Lu, Chengjie Wang, Jilin Li, Feiyue Huang
AAAI 2021

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting
Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, ZhiBo Yang, Tong Lu, Chunhua Shen, Ping Luo
ECCV 2020
Awards & Honors
- Champion, CURE-Bench @ NeurIPS 2025 (November 2025): 1st place out of 322 teams in both the internal reasoning and agentic reasoning tracks. Competition Website | NeurIPS Page
- Outstanding Master’s Thesis Award, Nanjing University (2021). Thesis: “Key Techniques for Super-Resolution of Blurred Data in Real-World Scenarios”
- Champion, NTIRE 2020 Real-World Super-Resolution Challenge @ CVPR (June 2020): 1st place in both Track 1 and Track 2. Competition Website | Code