rlhf-qa-ppo / zero_to_fp32.py

Commit History

Step 3 of 3; First attempt at a PPO fine-tuned model.
959dbed

kastan committed on