final-iteration/run-output/plots/training_log.csv
HF Job: train_grpo run output
17149c8 verified
142 Bytes
phase,round,global_step,use_hint,avg_episode_reward,max_episode_reward,min_episode_reward,avg_grader,max_grader,n_training_samples,train_loss