ob11 committed
Commit 171a261 (verified) · 1 Parent(s): c06a11a

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -12,8 +12,8 @@ datasets:
  > Qwen-VL-PRM-7B is a process reward model finetuned from Qwen2.5-7B-Instruct on approximately 300,000 examples. It demonstrates strong test-time scaling improvements on advanced multimodal reasoning benchmarks when used with Qwen2.5-VL and Gemma-3 models, despite being trained mainly on abstract and elementary reasoning datasets.
 
  - **Logs:** https://wandb.ai/aisg-arf/multimodal-reasoning/runs/pj4oc0qh
- - **Repository:** [ob11/vlprm](https://github.com/theogbrand/vlprm/)
- - **Paper:** https://arxiv.org/abs/
+ - **Repository:** https://github.com/theogbrand/vlprm/
+ - **Paper:** https://arxiv.org/pdf/2509.23250
 
  # Use
 
@@ -57,12 +57,12 @@ The model usage is documented [here](https://github.com/theogbrand/vlprm/blob/ma
 
  ```bibtex
  @misc{ong2025vlprms,
- title={VL-PRMs: Vision-Language Process Reward Models},
- author={Brandon Ong, Tej Deep Pala, Vernon Toh, William Chandra Tjhi and Soujanya Poria},
+ title={Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned},
+ author={Brandon Ong, Tej Deep Pala, Vernon Toh, William Chandra Tjhi, and Soujanya Poria},
  year={2025},
- eprint={},
+ eprint={2509.23250},
  archivePrefix={arXiv},
- primaryClass={cs.CL},
- url={},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/pdf/2509.23250},
  }
  ```
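The commit itself only updates links and the citation; the usage instructions it points to live in the linked repository. For orientation only, the sketch below shows one plausible way to load a checkpoint like this with the standard Hugging Face transformers API and ask it to judge a single reasoning step. The hub id `ob11/Qwen-VL-PRM-7B`, the prompt wording, and the `+`/`-` verdict convention are illustrative assumptions, not the documented interface.

```python
# Hedged sketch: one plausible way to load this checkpoint with the standard
# transformers API. The hub id and prompt format are illustrative assumptions;
# the documented usage lives in the repository linked in the README diff above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ob11/Qwen-VL-PRM-7B"  # hypothetical hub id, not confirmed by this commit

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Ask the PRM to judge a single reasoning step (assumed prompt convention).
messages = [{
    "role": "user",
    "content": (
        "Problem: What is 12 * 13?\n"
        "Step 1: 12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156.\n"
        "Is this step correct? Answer + or -."
    ),
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=4)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```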