ob11 committed on
Commit 793b6e1 · verified · 1 Parent(s): 171a261

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ datasets:
  > Qwen-VL-PRM-7B is a process reward model finetuned from Qwen2.5-7B-Instruct on approximately 300,000 examples. It demonstrates strong test-time scaling performance improvements on various advanced multimodal reasoning benchmarks when used with Qwen2.5-VL and Gemma-3 models despite being trained mainly on abstract reasoning datasets and elementary reasoning datasets.
 
  - **Logs:** https://wandb.ai/aisg-arf/multimodal-reasoning/runs/pj4oc0qh
- - **Repository:** https://github.com/theogbrand/vlprm/
+ - **Repository:** https://github.com/theogbrand/vlprm
  - **Paper:** https://arxiv.org/pdf/2509.23250
 
  # Use