R-PRM: Reasoning-Driven Process Reward Modeling
Shuaijie She
kevinpro
AI & ML interests
Reasoning, Chain of Thoughts, Alignment, Factual Consistency, Summarization
Recent Activity
Updated a collection 7 days ago: R-PRM
Upvoted a collection 7 days ago: R-PRM
Organizations
Collections
2
MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization
-
3
Open Multilingual Reasoning Leaderboard
🦊Display and search a leaderboard of math models
-
MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization
Paper • 2401.06838 • Published -
kevinpro/MNumGLUESub
Updated • 23 -
kevinpro/MetaMathOctopus-MAPO-DPO-13B
Text Generation • Updated • 100
Papers 1
Spaces 1
Models 15

kevinpro/R-PRM-7B-DPO
Text Generation • Updated • 8

kevinpro/Hydra-LLaMA3-8B-0531-preview-Q4_K_M-GGUF
Text Generation • Updated • 15

kevinpro/MistralMathOctopus-7B
Text Generation • Updated • 8

kevinpro/MetaMathOctopus-MAPO-DPO-13B
Text Generation • Updated • 100

kevinpro/MathOctopus-MAPO-DPO-7B
Text Generation • Updated • 2

kevinpro/MetaMathOctopus-13B
Text Generation • Updated • 19

kevinpro/MetaMathOctopus-MAPO-DPO-7B
Text Generation • Updated • 5

kevinpro/MetaMathOctopus-7B
Text Generation • Updated • 13

kevinpro/MathOctopus-MAPO-DPO-13B
Text Generation • Updated • 3

kevinpro/MistralMathOctopus-MAPO-DPO-7B
Text Generation • Updated