---
library_name: transformers
datasets:
- mlabonne/orpo-dpo-mix-40k
- jondurbin/gutenberg-dpo-v0.1
---
|
# Orpo-GutenLlama-3-8B-v2
|
|
|
## Training Params
|
|
|
+ Learning rate: 8e-6
+ Batch size: 1
+ Eval batch size: 1
+ Gradient accumulation steps: 4
+ Epochs: 3
+ Training loss: 0.88
|
|
|
Training time: 4 hours on 1x RTX 4090. This is a small 1,800-sample fine-tune to get comfortable with ORPO fine-tuning before scaling up.
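For reference, the hyperparameters above map directly onto TRL's `ORPOConfig`. Below is a minimal sketch of how such a run could be set up; the base model ID (`meta-llama/Meta-Llama-3-8B`), the way the 1,800-example subset is drawn, and the exact `ORPOTrainer` arguments are assumptions, not the exact script used for this model.

```python
# Minimal ORPO fine-tuning sketch with the hyperparameters listed above.
# Assumes TRL's ORPOTrainer; depending on your TRL version the tokenizer
# may need to be passed as `processing_class` instead of `tokenizer`.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed base model

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Take a small ~1,800-example subset for a quick test run (sampling strategy
# is an assumption); the dataset must expose prompt/chosen/rejected columns
# in the format ORPOTrainer expects.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")
dataset = dataset.shuffle(seed=42).select(range(1800))

config = ORPOConfig(
    output_dir="Orpo-GutenLlama-3-8B-v2",
    learning_rate=8e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=3,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```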
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/q5Okh82tXKgaonwPrT7Gg.png)