
ewqr2130/alignment-handbook-zephyr-7b_ppo_5e7step_51: running the SFT model with PPO for 51 steps.

Safetensors
Model size: 7.24B params
Tensor type: F32
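The size and tensor type listed above give a rough idea of what loading this checkpoint takes. The following is a minimal sketch, not an official usage recipe: it assumes the repository can be loaded through the `transformers` auto classes and that it ships a tokenizer, neither of which the card confirms. The helper names `weights_gb` and `load_model` are hypothetical and exist only for this illustration.

```python
# Sketch only: assumes this repo is loadable via the transformers auto classes.
MODEL_ID = "ewqr2130/alignment-handbook-zephyr-7b_ppo_5e7step_51"

N_PARAMS = 7.24e9        # "7.24B params" from the card
BYTES_PER_PARAM = 4      # F32 stores each parameter in 4 bytes


def weights_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


def load_model(model_id: str = MODEL_ID):
    """Hypothetical loader; downloads the full checkpoint when called."""
    # Imported lazily so the size estimate works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo includes tokenizer files alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float32,  # matches the F32 tensor type on the card
    )
    return tokenizer, model


if __name__ == "__main__":
    # Weights only; activations and any generation cache need extra memory.
    print(f"Approximate weight size: {weights_gb(N_PARAMS, BYTES_PER_PARAM):.2f} GB")
```

At F32 the weights alone come to roughly 29 GB, so calling `load_model()` needs at least that much free memory before any inference overhead.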