Update README.md
README.md
CHANGED
@@ -12,8 +12,8 @@ datasets:
 > Qwen-VL-PRM-7B is a process reward model finetuned from Qwen2.5-7B-Instruct on approximately 300,000 examples. It demonstrates strong test-time scaling performance improvements on various advanced multimodal reasoning benchmarks when used with Qwen2.5-VL and Gemma-3 models despite being trained mainly on abstract reasoning datasets and elementary reasoning datasets.
 
 - **Logs:** https://wandb.ai/aisg-arf/multimodal-reasoning/runs/pj4oc0qh
-- **Repository:**
-- **Paper:** https://arxiv.org/
+- **Repository:** https://github.com/theogbrand/vlprm/
+- **Paper:** https://arxiv.org/pdf/2509.23250
 
 # Use
 
@@ -57,12 +57,12 @@ The model usage is documented [here](https://github.com/theogbrand/vlprm/blob/ma
 
 ```bibtex
 @misc{ong2025vlprms,
-title={
-author={Brandon Ong, Tej Deep Pala, Vernon Toh, William Chandra Tjhi and Soujanya Poria},
+title={Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned},
+author={Brandon Ong, Tej Deep Pala, Vernon Toh, William Chandra Tjhi, and Soujanya Poria},
 year={2025},
-eprint={},
+eprint={2509.23250},
 archivePrefix={arXiv},
-primaryClass={cs.
-url={},
+primaryClass={cs.AI},
+url={https://arxiv.org/pdf/2509.23250},
 }
 ```