DyCo-RL-Qwen2.5-VL-3B

DyCo-RL-Qwen2.5-VL-3B is a GRPO-trained checkpoint built upon Qwen2.5-VL-3B.

The model is developed under the DyCo-RL framework, which aims to improve visual reasoning through dynamic cross-modal coordination between visual perception and language reasoning.

This repository provides the final checkpoint obtained after GRPO training for research and reproducibility purposes.

Model Details

  • Base Model: Qwen2.5-VL-3B
  • Training Method: GRPO+DyCo-RL
  • Task: Visual Reasoning
  • Paper: DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning

Citation

If you find this work useful for your research, please cite our paper :

@misc{lin2026dycorldynamiccrossmodalcoordination, title={DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning}, author={Hangui Lin and Yan Shu and Zhengyang Liang and Chi Liu and Xiangrui Liu and Minghao Qin and Teng Long and Zheng Liu and Nicu Sebe}, year={2026}, eprint={2606.08035}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2606.08035}, }

Downloads last month
15
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for LinYuanMo/DyCo-RL-Qwen2.5-VL-3B

Finetuned
(792)
this model

Collection including LinYuanMo/DyCo-RL-Qwen2.5-VL-3B

Paper for LinYuanMo/DyCo-RL-Qwen2.5-VL-3B