arxiv:2410.10563
Dongfu Jiang
DongfuJiang
AI & ML interests
NLP, common sense reasoning
Papers
10
models
20
DongfuJiang/prm_version3_subsample_no_ref_hf • Text Generation
DongfuJiang/prm_version3_hf • Text Generation
DongfuJiang/prm_version3_full_hf
DongfuJiang/prm_version2_subsample_no_ref_hf • Text Generation • 49
DongfuJiang/Qwen2.5-0.5B-Instruct • Text Generation • 150
DongfuJiang/prm_version2_subsample_hf • Text Generation • 1.22k
DongfuJiang/prm_version2_hf
DongfuJiang/PairRM-V2-phi-3-4k-mini-all • 5
DongfuJiang/vapo_lora_all_data_iter_2 • 6
DongfuJiang/vapo_lora_all_data_iter_1 • 7
datasets
11
DongfuJiang/PRM_SFT • 3.26M • 57
DongfuJiang/PRM_prepared • 25k • 35
DongfuJiang/PRM_train • 25.2k • 103
DongfuJiang/PRM_eval • 3.54k • 17
DongfuJiang/zeroeval • 13.5k • 98
DongfuJiang/MATH-500 • 500 • 22
DongfuJiang/simpo_v2_ultrafeedback • 59.9k • 36
DongfuJiang/VAPO • 72.5k • 34
DongfuJiang/PairRM-data • 586k • 33
DongfuJiang/WildFeedback • 26.5k • 34