I am a researcher at ByteDance, working on multimodal foundation models and intelligent agents. My research focuses on post-training models to observe, reason, and act over complex multimodal data, with recent work on medical AI agents and agentic reasoning systems.
I received my M.S. from Nanjing University, advised by Prof. Tong Lu. Previously, I was at Tencent Youtu Lab, collaborating with Dr. Xiaobin Hu and Prof. Ying Tai.
News
| Date | Update |
|---|---|
| 2026.02 | MedXIAOHE: core contributor to a 640B-token medical VLM recipe, reaching SOTA on 30+ medical benchmarks. |
| 2026.02 | DEEPMED: project lead for a medical DeepResearch agent using multi-hop Med-Search data and turn-controlled agent training. |
| 2025.12 | CureFlow: project lead for an agentic medical reasoning system that placed 1st of 322 teams at CURE-Bench @ NeurIPS 2025. |
| 2025.03 | Sonic: first-author CVPR 2025 portrait animation project based on global audio perception, with 3K+ GitHub stars. |
| 2020.06 | RealSR: first-author real-world super-resolution work using kernel estimation and noise injection; NTIRE 2020 @ CVPR dual-track champion with 400+ citations. |
Publications
Selected Publications

MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs
Baorong Shi, Bo Cui, Boyuan Jiang, Deli Yu, Fang Qian, Haihua Yang, Huichao Wang, Jiale Chen, Jianfei Pan, Jieqiong Cao, Jinghao Lin, Kai Wu, Lin Yang, Shengsheng Yao, Tao Chen, Xiaojun Xiao, Xiaozhong Ji, Xu Wang, Yijun He, Zhixiong Yang
arXiv 2026

Zihan Wang, Hao Wang, Shi Feng, Xiaocui Yang, Daling Wang, Yiqun Zhang, Jinghao Lin, Haihua Yang, Xiaozhong Ji
arXiv 2026

CureFlow: An AI Engine for Complex Medication Decision-Making
Xiaozhong Ji, Jinghao Lin, Yuhang Wu, Zihan Wang, Boyuan Jiang, Chao Gao
NeurIPS 2025 CURE-Bench Challenge | Poster | Leaderboard

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji, Xiaobin Hu, Zhihong Xu, Junwei Zhu, Chuming Lin, Qingdong He, Jiangning Zhang, Donghao Luo, Yi Chen, Qin Lin, Qinglin Lu, Chengjie Wang
CVPR 2025 | GitHub

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation
Zunnan Xu, Zhentao Yu, Zixiang Zhou, Jun Zhou, Xiaoyu Jin, Fa-Ting Hong, Xiaozhong Ji, Junwei Zhu, Chengfei Cai, Shiyu Tang, Qin Lin, Xiu Li, Qinglin Lu
CVPR 2025 | GitHub

Real-World Super-Resolution via Kernel Estimation and Noise Injection
Xiaozhong Ji, Yun Cao, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang
CVPR 2020 Workshop | GitHub | GitHub (Tencent)
Full Publication List

The Landscape of Medical Agents: A Survey
Xiaobin Hu, Yunhang Qian, Jiaquan Yu, …, Xiaozhong Ji, …, Shuicheng Yan, et al.
TechRxiv 2025

Good Teachers, Better Students: A Survey of Reward Models for LLM
Linhao Wang, Zihan Wang, Jinghao Lin, …, Xiaozhong Ji, Xiaobin Hu, et al.
TechRxiv 2025

Align and Surpass Human Camouflaged Perception: Visual Refocus Reinforcement Fine-Tuning
Ruolin Shen, Xiaozhong Ji, Kai Wu, Jiangning Zhang, Yijun He, Haihua Yang, Xiaobin Hu, Xiaoyu Sun
arXiv 2025

Human-MME: A Holistic Evaluation Benchmark for Human-Centric Multimodal Large Language Models
Yuansen Liu, Haiming Tang, Jinlong Peng, Jiangning Zhang, Xiaozhong Ji, Qingdong He, Wenbin Wu, Donghao Luo, Zhenye Gan, Junwei Zhu, et al.
arXiv 2025

Haozhen Gong, Xiaozhong Ji, Yuansen Liu, Wenbin Wu, Xiaoxiao Yan, Jingjing Liu, Kai Wu, Jiazhen Pan, Bailiang Jian, Jiangning Zhang, et al.
CVPR 2026

GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model
Yue Han, Jiangning Zhang, Junwei Zhu, Runze Hou, Xiaozhong Ji, Chuming Lin, Xiaobin Hu, Zhucun Xue, Yong Liu
CVPR 2025

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
Xiaobin Hu, Xu Peng, Donghao Luo, Xiaozhong Ji, Jinlong Peng, Zhengkai Jiang, Jiangning Zhang, Taisong Jin, Chengjie Wang, Rongrong Ji
ECCV 2024 | GitHub

Qingdong He, Jinlong Peng, Zhengkai Jiang, Kai Wu, Xiaozhong Ji, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Mingang Chen, Yunsheng Wu
IJCAI 2024 | GitHub

Weipeng Tan, Chuming Lin, Chengming Xu, FeiFan Xu, Xiaobin Hu, Xiaozhong Ji, Junwei Zhu, Chengjie Wang, Yanwei Fu

MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning
Yue Han, Junwei Zhu, Yuxiang Feng, Xiaozhong Ji, Keke He, Xiangtai Li, Zhucun Xue, Yong Liu
arXiv 2024

Xiaozhong Ji, Chuming Lin, Zhonggan Ding, Ying Tai, Junwei Zhu, Xiaobin Hu, Donghao Luo, Yanhao Ge, Chengjie Wang
arXiv 2024

ColorFormer: Image Colorization via Color Memory Assisted Hybrid-Attention Transformer
Xiaozhong Ji, Boyuan Jiang, Donghao Luo, Guangpin Tao, Wenqing Chu, Zhifeng Xie, Chengjie Wang, Ying Tai
ECCV 2022 | GitHub | GitHub (Tencent)

Blind Face Restoration via Integrating Face Shape and Generative Priors
Feida Zhu, Junwei Zhu, Wenqing Chu, Xinyi Zhang, Xiaozhong Ji, Chengjie Wang, Ying Tai
CVPR 2022

Spectrum-to-Kernel Translation for Accurate Blind Image Super-Resolution
Guangpin Tao, Xiaozhong Ji, Wenzhuo Wang, Shuo Chen, Chuming Lin, Yun Cao, Tong Lu, Donghao Luo, Ying Tai
NeurIPS 2021

Frequency Consistent Adaptation for Real World Super Resolution
Xiaozhong Ji, Guangpin Tao, Yun Cao, Ying Tai, Tong Lu, Chengjie Wang, Jilin Li, Feiyue Huang
AAAI 2021

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting
Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, ZhiBo Yang, Tong Lu, Chunhua Shen, Ping Luo
ECCV 2020
Awards & Honors
- Champion, CURE-Bench @ NeurIPS 2025 (November 2025). 1st place out of 322 teams in both the internal reasoning and agentic reasoning tracks. Competition Website | NeurIPS Page
- Outstanding Master’s Thesis Award, Nanjing University (2021). Thesis: “Key Techniques for Super-Resolution of Blurred Data in Real-World Scenarios”
- Champion, NTIRE 2020 Real-World Super-Resolution Challenge @ CVPR (June 2020). 1st place in both Track 1 and Track 2. Competition Website | Code