metadata
license: apache-2.0
language:
  - en
tags:
  - video-language-model
  - long-video-understanding
  - reinforcement-learning
  - self-correction
  - reflection
  - qwen2.5-vl

Reflect-R1

Model checkpoints for Reflect-R1: Evidence-Driven Reflection for Self-Correction in Long Video Understanding.

Checkpoints

Reflect-R1-SFT-6000/       Cold-start SFT checkpoint.
Reflect-R1-GRPO-Final/     Final SD-GRPO checkpoint.

Both checkpoints are based on Qwen2.5-VL-7B and include sharded safetensors weights together with the corresponding tokenizer and processor configuration files.
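
Since each checkpoint sits in its own folder with weights, tokenizer, and processor files, one way to load it is to download only that folder and point the Qwen2.5-VL classes from `transformers` at it. The sketch below is a minimal example under stated assumptions, not an official loading recipe: `repo_id` is a placeholder for this repository's Hub ID, the helper names `checkpoint_patterns` and `load_checkpoint` are hypothetical, and it assumes the checkpoints load with the stock `Qwen2_5_VLForConditionalGeneration` class and that `accelerate` is installed for `device_map="auto"`.

```python
from pathlib import Path

# Folder names as listed under "Checkpoints".
CHECKPOINTS = {
    "sft": "Reflect-R1-SFT-6000",
    "grpo": "Reflect-R1-GRPO-Final",
}


def checkpoint_patterns(stage: str) -> list[str]:
    """Return the glob pattern that selects one checkpoint folder for download."""
    return [f"{CHECKPOINTS[stage]}/*"]


def load_checkpoint(repo_id: str, stage: str = "grpo"):
    """Download one checkpoint folder and load its model and processor.

    `repo_id` is a placeholder: pass the Hub ID of this repository.
    """
    # Imported lazily so the helper above works without the heavy dependencies.
    from huggingface_hub import snapshot_download
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    # Fetch only the files that belong to the requested checkpoint.
    local_root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(stage),
    )
    ckpt_dir = Path(local_root) / CHECKPOINTS[stage]

    # Assumption: the checkpoint is compatible with the stock Qwen2.5-VL class.
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        ckpt_dir, torch_dtype="auto", device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(ckpt_dir)
    return model, processor
```

Calling `load_checkpoint("<repo_id>", stage="sft")` would fetch the cold-start SFT checkpoint instead of the final SD-GRPO one. Restricting the download with `allow_patterns` avoids pulling both sets of 7B weights when only one is needed.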

Citation

@article{chen2026reflectr1,
  title   = {Reflect-R1: Evidence-Driven Reflection for Self-Correction in Long Video Understanding},
  author  = {Shuimu Chen and Yuteng Chen and Yuanshen Guan and Zebang Cheng and Zeyu Zhang and Shengqian Qin and Bin Xia and Jiaran Li and Wenming Yang and Fei Ma},
  journal = {arXiv preprint arXiv:2606.27922},
  year    = {2026}
}