---
library_name: transformers
datasets:
- mlabonne/orpo-dpo-mix-40k
- jondurbin/gutenberg-dpo-v0.1
---
# Orpo-GutenLlama-3-8B-v2

## Training Params
- Learning Rate: 8e-6
- Batch Size: 1
- Eval batch size: 1
- Gradient accumulation steps: 4
- Epochs: 3
- Training Loss: 0.88
Training time: 4 hours on 1x RTX 4090. This is a small 1,800-sample fine-tune to get comfortable with ORPO fine-tuning before scaling up.
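Below is a minimal sketch of how the hyperparameters above map onto an ORPO run with `trl`'s `ORPOTrainer`. The base checkpoint, the way the ~1,800 samples are drawn from the listed datasets, and the absence of any PEFT/quantization settings are assumptions, not the exact recipe used for this model.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

# Assumed base checkpoint; swap in your own Llama-3-8B variant.
base_model = "meta-llama/Meta-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Assumption: a ~1,800-sample slice of one of the listed preference mixes.
# ORPOTrainer expects prompt/chosen/rejected style preference data.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train[:1800]")

config = ORPOConfig(
    output_dir="Orpo-GutenLlama-3-8B-v2",
    learning_rate=8e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=3,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,  # newer trl versions use processing_class= instead
)
trainer.train()
```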