DyCo-RL-Qwen2.5-VL-3B

DyCo-RL-Qwen2.5-VL-3B is a GRPO-trained checkpoint built upon Qwen2.5-VL-3B.

The model is developed under the DyCo-RL framework, which aims to improve visual reasoning through dynamic cross-modal coordination between visual perception and language reasoning.

This repository provides the final checkpoint obtained after GRPO training for research and reproducibility purposes.

Model Details

Base Model: Qwen2.5-VL-3B
Training Method: GRPO+DyCo-RL
Task: Visual Reasoning
Paper: DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning

Citation

If you find this work useful for your research, please cite our paper :

@misc{lin2026dycorldynamiccrossmodalcoordination, title={DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning}, author={Hangui Lin and Yan Shu and Zhengyang Liang and Chi Liu and Xiangrui Liu and Minghao Qin and Teng Long and Zheng Liu and Nicu Sebe}, year={2026}, eprint={2606.08035}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2606.08035}, }