Holarissun/REPROD_dpo_helpfulhelpful_human_subset-1_modelgemma2b_maxsteps10000_bz8_lr1e-05 Updated 26 days ago
Holarissun/REPROD_dpo_helpfulhelpful_gpt3_subset-1_modelgemma2b_maxsteps10000_bz8_lr1e-05 Updated 26 days ago • 1 • 1
magnifi/phi-3-mini-4k-instruct-attribute-output-4-0524-epoch10 Text Generation • Updated 26 days ago • 6
Holarissun/REPROD_dpo_harmlessharmless_human_subset-1_modelgemma2b_maxsteps6000_bz8_lr5e-06 Updated 26 days ago
magnifi/phi-3-mini-4k-instruct-attribute-output-4-0524-epoch20 Text Generation • Updated 26 days ago • 130
Holarissun/REPROD_dpo_helpfulhelpful_gpt4_subset-1_modelgemma2b_maxsteps10000_bz8_lr1e-05 Updated 26 days ago
magnifi/phi-3-mini-4k-instruct-attribute-output-4-0524-epoch40 Text Generation • Updated 20 days ago • 1
Holarissun/REPROD_dpo_harmlessharmless_human_subset-1_modelgemma2b_maxsteps6000_bz8_lr5e-05 Updated 25 days ago