I am a 3rd-year undergraduate in Mathematics and Computer Science at Arcadia University, currently collaborating with Prof. Zhengzhong Tu at Texas A&M University and Prof. Yue Zhao at USC on Reasoning and Alignment for Multimodal Large Language Models (MLLMs). Previously, I worked with Prof. Zhe Liu and Prof. Victor S. Sheng on Robust Medical Vision and Multimodal Machine Learning. Moving forward, I am eager to continue exploring the magic of Large Vision-Language Models.
Research Interests
My long-term vision is to develop efficient, robust, and generalizable machine learning systems capable of perceiving, understanding, and interacting with the world through multimodal information from both 2D and 3D perspectives. Specifically, my research so far has focused on these topics:
- Generalizable Medical Image Segmentation with Sparse and Noisy Labeled Data
- Modality Competition and Imbalances for Multimodal Machine Learning
- Cross-modal Decoupling and Alignment for Multimodal Representation Learning
- Reasoning and Alignment for Large Vision-Language Models
🔥 News
- 2025.03: Excited to present my first-author work DecAlign, a novel cross-modal decoupling and alignment framework for multimodal representation learning, which is now available on arXiv!
- 2025.02: Excited to present Re-Align, a novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models, which is now available on arXiv!
- 2025.01: I will collaborate with Prof. Zhengzhong Tu on advancing cutting-edge research in the alignment of Multimodal Foundation Models and Multimodal Large Language Models (MLLMs)!
- 2024.11: Excited to present my first-author work DynCIM, a novel dynamic multimodal curriculum learning framework for addressing cross-modal competition and imbalances, which is now available on arXiv!
- 2024.11: My co-authored paper is now under major revision at IEEE Transactions on Medical Imaging (IF: 8.9).
- 2024.10: My co-authored paper is now under major revision at Medical Image Analysis (IF: 10.9).
- 2024.08: Excited to present my first-author work ALC, a novel adaptive label correction framework for medical image segmentation with noisy labels, which is now available on arXiv!
- 2024.06: My project “Dynamic Self-adaptive Fusion Framework for Medical Disease Diagnosis” has been selected for the Chinese National Undergraduate College Students Innovation and Entrepreneurship Program (National Key Project).
Publications

DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning
Chengxuan Qian, Shuo Xing, Shawn Li, Yue Zhao, Zhengzhong Tu†.

Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
Shuo Xing, Yuping Wang, Peiran Li, Ruizheng Bai, Yueqi Wang, Chengxuan Qian, Huaxiu Yao, Zhengzhong Tu†.

DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning
Chengxuan Qian, Kai Han, Jingchao Wang, Zhenlong Yuan, Chongwen Lyu, Jun Chen†, Zhe Liu†.

Adaptive Label Correction Framework for Robust Medical Image Segmentation with Noisy Labels
Chengxuan Qian, Kai Han, Siqi Ma, Chongwen Lyu, Zhenlong Yuan, Jun Chen†, Zhe Liu†.

CLIMD: A Curriculum Learning Framework for Imbalanced Multimodal Diagnosis
Under Review
Kai Han, Chongwen Lyu, Chengxuan Qian, Siqi Ma, Jun Chen†, Zhe Liu†.

Region Uncertainty Estimation for Medical Image Segmentation with Noisy Labels
IEEE Transactions on Medical Imaging (CCF B, IF: 8.9) (Major Revision)
Kai Han, Shuhui Wang, Jun Chen†, Chengxuan Qian, Chongwen Lyu, Siqi Ma, Victor S. Sheng†, Qingming Huang†, Zhe Liu†.

Frequency Domain Unlocks New Perspectives for Medical Image Segmentation
IEEE Transactions on Circuits and Systems for Video Technology (CCF B, IF: 8.3) (Under Review)
Kai Han, Siqi Ma, Chengxuan Qian, Jun Chen†, Chongwen Lyu, Victor S. Sheng†, Zhe Liu†.

Curriculum for Region-guided Automatic Radiology Report Generation
IEEE Transactions on Circuits and Systems for Video Technology (CCF B, IF: 8.3) (Under Review)
Chongwen Lyu, Chengxuan Qian, Kai Han, Jun Chen†, Victor S. Sheng†, Zhe Liu†.

LiMT: A Multi-task Liver Image Benchmark Dataset
Medical Image Analysis (IF: 10.7) (Major Revision)
Z Liu†, K Han, S Ma, J Chen†, …, C Qian, C Lyu, …, V S. Sheng†.
- Dataset and Benchmarking work
- A multi-task medical image benchmark dataset for segmentation, classification, and detection of liver lesions, encompassing CT liver scans annotated for four common liver diseases.
- Collaborated with researchers from Jiangsu University, Texas Tech University, and clinicians from the Affiliated Hospital of Jiangsu University.

Diffusion Contrastive Learning for Image Classification
Under Review
Xincheng Zhu, Yonghan Lu, Kai Han, Chongwen Lyu, Chengxuan Qian, Jun Chen†, Zhe Liu†.
Note: Details of some of the papers above cannot be shown because they are currently under double-blind review. † denotes an advisor.
Academic Services
- Reviewer for IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), IEEE Transactions on Multimedia (TMM), the IEEE International Conference on Multimedia & Expo (ICME 2025), and ICCV 2025.
Open-source Projects
- Re-Align, a novel Direct Preference Optimization (DPO)-based alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models. See the corresponding project website and code for more.
- DecAlign, a novel cross-modal decoupling and alignment framework for multimodal representation learning. See the corresponding project website and code for more (the code will be fully released soon).