rlhf-gpt2-pipeline / ppo_aligned_final/generation_config.json
Nabeel Shan
Added SFT, Reward Model, and PPO-Aligned Model
46724ea
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.43.4"
}
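As a minimal sketch of what this file encodes: `transformers` reads `generation_config.json` (via `GenerationConfig.from_pretrained`) to set default generation behavior. For GPT-2, token 50256 is the single special token `<|endoftext|>`, which is why `bos_token_id` and `eos_token_id` are identical here. The snippet below just parses the config with the standard library to make that explicit; it assumes nothing beyond the JSON shown above.

```python
import json

# The generation config from this file, verbatim.
config_text = """
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.43.4"
}
"""

config = json.loads(config_text)

# GPT-2 uses one special token, <|endoftext|> (id 50256), for both
# beginning- and end-of-sequence, so the two ids coincide.
assert config["bos_token_id"] == config["eos_token_id"] == 50256

# "_from_model_config": true means this config was derived from the
# model's main config.json rather than authored separately.
print(config["transformers_version"])  # → 4.43.4
```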