selfrew/train_reward_1_round_filtered_data_sft_2epoch_5e6_bz128 • Text Generation • Updated 27 days ago • 20
selfrew/train_reward_1_round_filtered_data_sft_3epoch_5e6_bz128 • Text Generation • Updated 27 days ago • 22
selfrew/train_reward_1_round_filtered_data_sft_2epoch_2e6_bz128 • Text Generation • Updated 29 days ago • 22
selfrew/train_reward_1_round_filtered_data_sft_3epoch_2e6_bz128 • Text Generation • Updated 29 days ago • 28