selfcorrexp2/selfcorrexp2_llama3_openmath_1m_ep1_tmp10_goldrm_labeled Viewer • Updated 1 day ago • 15k • 12
selfcorrexp/llama3_rr40k_2e6_bz32_ep2_moredatatmp10_gold_reward Viewer • Updated 1 day ago • 15k • 10
selfcorrexp2/HanningZhang_Llama3-sft-more-corr-rr60k-3ep_moredatatmp10_vllmexp3 Viewer • Updated 1 day ago • 15k • 21
selfcorrexp2/HanningZhang_Llama3-sft-more-corr-rr60k-3ep_moredatatmp10_gold_reward Viewer • Updated 1 day ago • 15k • 9
selfcorrexp2/balanced_self_rewarding_rm_labeled_llama3_sft_gen_1round_prompt Viewer • Updated 1 day ago • 15k • 9
tmpmodelsave/llamasft_math_ift_balanced_moredata_gold_reward_tmp10_vllmexp Viewer • Updated 2 days ago • 20k • 29
tmpmodelsave/llamasft_math_ift_balanced_moredata_gold_reward_tmp07_vllmexp Viewer • Updated 2 days ago • 30k • 38
tmpmodelsave/llamasft_math_ift_balanced_moredata_100tmp10_vllmexp Viewer • Updated 2 days ago • 20k • 24
tmpmodelsave/llamasft_math_ift_balanced_moredata_100tmp10_vllmexpz Viewer • Updated 2 days ago • 20k • 6