Model Card for Model ID
- Summary Length PPO experiment #2
- No KL divergence term in the loss
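
As an illustration of what dropping the KL term means, here is a minimal sketch of a PPO-style loss with an optional KL penalty against a reference model. The function names, the `kl_coef` parameter, and the simple mean-of-log-ratio KL estimate are assumptions chosen for illustration; the training code used for this experiment is not shown in this card and may differ.

```python
def kl_estimate(policy_logprobs, ref_logprobs):
    """Sample-based estimate of KL(policy || reference) on sampled tokens.

    Uses the mean log-probability ratio, which is one common estimator
    (an assumption here, not necessarily what this experiment used).
    """
    if not policy_logprobs or len(policy_logprobs) != len(ref_logprobs):
        raise ValueError("log-prob lists must be non-empty and the same length")
    diffs = [p - r for p, r in zip(policy_logprobs, ref_logprobs)]
    return sum(diffs) / len(diffs)


def total_loss(policy_loss, policy_logprobs, ref_logprobs, kl_coef):
    """Policy loss plus a KL penalty; kl_coef = 0.0 removes the penalty."""
    return policy_loss + kl_coef * kl_estimate(policy_logprobs, ref_logprobs)


# Hypothetical per-token log-probabilities for one sampled summary.
policy_lp = [-0.1, -0.2]
ref_lp = [-1.1, -1.2]

with_kl = total_loss(0.5, policy_lp, ref_lp, kl_coef=0.1)
without_kl = total_loss(0.5, policy_lp, ref_lp, kl_coef=0.0)
```

With `kl_coef=0.0` the loss reduces to the policy loss alone, so nothing in the objective discourages the policy from moving away from the reference model.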
Model Details
- Dataset size: 1024
- Epochs: 2
- Batch size: 4 * 8 = 32 effective (with gradient accumulation)
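
The numbers above can be worked through in a short sketch. It assumes 4 is the micro-batch size and 8 is the number of accumulation steps (the card does not say which factor is which), and it uses a toy one-parameter model with hypothetical helper names to show that averaged micro-batch gradients match a single batch of 32.

```python
DATASET_SIZE = 1024   # from the card
EPOCHS = 2            # from the card
MICRO_BATCH = 4       # assumption: per-step micro-batch size
ACCUM_STEPS = 8       # assumption: gradient accumulation steps

effective_batch = MICRO_BATCH * ACCUM_STEPS            # 32
steps_per_epoch = DATASET_SIZE // effective_batch      # 32
total_optimizer_steps = steps_per_epoch * EPOCHS       # 64


def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average equal-sized micro-batch gradients into one update gradient."""
    n_micro = len(xs) // micro_batch_size
    total = 0.0
    for i in range(n_micro):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Dividing by n_micro makes the sum equal the large-batch mean.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / n_micro
    return total


# Toy data: one effective batch of 32 samples.
xs = [float(i) for i in range(effective_batch)]
ys = [3.0 * x + 1.0 for x in xs]

full = grad_mse(0.5, xs, ys)
accumulated = accumulated_grad(0.5, xs, ys, MICRO_BATCH)
```

Under these assumptions, the run performs 32 optimizer steps per epoch and 64 in total.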
Optimizer args: Torch AdamW default, except
Outcomes
The model only outputs one word: "relationship".