---
library_name: transformers
datasets:
- mlabonne/orpo-dpo-mix-40k
- jondurbin/gutenberg-dpo-v0.1
---
# Orpo-GutenLlama-3-8B-v2

## Training Params

+ Learning rate: 8e-6
+ Batch size: 1
+ Eval batch size: 1
+ Gradient accumulation steps: 4
+ Epochs: 3
+ Training loss: 0.88

Training time: 4 hours on a single RTX 4090. This is a small 1,800-sample fine-tune to get comfortable with ORPO fine-tuning before scaling up; a sketch of the training setup follows.
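
For reference, the hyperparameters above map onto TRL's `ORPOTrainer` roughly as follows. This is a minimal sketch, not the exact script used: the base model id, output path, TRL version, and the 1,800-sample subset selection are assumptions.

```python
# Minimal ORPO fine-tuning sketch using TRL's ORPOConfig/ORPOTrainer
# (assumes trl >= 0.8.2; newer versions rename `tokenizer` to `processing_class`).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Meta-Llama-3-8B"  # assumed base model; substitute as needed
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# The card mentions ~1,800 samples; taking a subset this way is an assumption.
# Also assumes the dataset's prompt/chosen/rejected columns are in a format
# ORPOTrainer accepts.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train").select(range(1800))

config = ORPOConfig(
    output_dir="orpo-gutenllama-3-8b-v2",  # hypothetical path
    learning_rate=8e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=3,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```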

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/q5Okh82tXKgaonwPrT7Gg.png)
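
## Usage

A minimal loading example with `transformers`; the repo id below is a placeholder, substitute the actual Hub path for this model.

```python
# Load the fine-tuned model for inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<username>/Orpo-GutenLlama-3-8B-v2"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Tell me a short story about the sea.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```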