yuhangzang committed on
Commit afbf8a4 · verified · 1 Parent(s): af935d2

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -23,4 +23,4 @@ stage uses LVLMs to generate rich and accurate captions. Subsequently, the secon
  caption quality by using a vision-only LLM to perform the QA task. We also created a specific QA
  curation pipeline to ensure the quality of the questions and answers used for the second stage.
 
- By employing our CapRL training framework, initializing with the Qwen2.5-VL-3B model, and using a carefully filtered 75K QA dataset as the training set, we obtained a highly capable captioner, CapRL-3B.
+ By employing CapRL training framework, initializing with the Qwen2.5-VL-3B model, and using a carefully filtered 75K QA dataset as the training set, we obtained a highly capable captioner, CapRL-3B.