Collection for the paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild"
HKUST NLP Group
university
Collections: 7
Models: 45
hkust-nlp/Llama-3.1-8B-SimpleRL-Zoo • 29
hkust-nlp/Qwen-2.5-32B-SimpleRL-Zoo • 147
hkust-nlp/Qwen-2.5-7B-SimpleRL-Zoo • 454
hkust-nlp/DeepSeek-Math-7B-SimpleRL-Zoo • 72
hkust-nlp/Mistral-7B-v0.1-SimpleRL-Zoo • 15
hkust-nlp/Qwen-2.5-1.5B-SimpleRL-Zoo • 756
hkust-nlp/Qwen-2.5-0.5B-SimpleRL-Zoo • 28
hkust-nlp/Qwen-2.5-14B-SimpleRL-Zoo • 68
hkust-nlp/Mistral-Small-24B-SimpleRL-Zoo • 41
hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zoo • 3.36k
Datasets: 22
hkust-nlp/SimpleRL-Zoo-Data • 53.1k • 733 • 3
hkust-nlp/PreSelect-100B • 54.5M • 866 • 9
hkust-nlp/CodeIO-PyEdu-Reasoning • 181 • 49
hkust-nlp/CodeIO-PyEdu-Reasoning-Raw • 106
hkust-nlp/SynCSE-partial-NLI • 263k • 36 • 2
hkust-nlp/SynCSE-scratch-NLI • 276k • 40 • 2
hkust-nlp/gsm8k-fix • 7.47k • 107 • 2
hkust-nlp/dart-math-uniform • 591k • 59 • 9
hkust-nlp/vrt-baseline • 591k • 40 • 1
hkust-nlp/dart-math-hard • 585k • 113 • 14