ewqr2130/alignment-handbook-zephyr-7b_ppo_5e7step_51

Running the SFT with PPO for 51 steps.

Format: Safetensors
Model size: 7.24B params
Tensor type: F32
