
Llama 3 8B fine-tuned with ORPO on mlabonne/orpo-dpo-mix-40k.
The maximum sequence length was reduced to 1024 tokens, and LoRA (r=16) combined with 4-bit quantization was used to improve memory efficiency.
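
As a rough reference, this setup can be sketched with TRL's `ORPOTrainer` together with PEFT and bitsandbytes. Only the dataset, the LoRA rank (r=16), 4-bit quantization, and the 1024-token max length come from this card; every other hyperparameter below is an assumed placeholder.

```python
# Minimal sketch of the described setup (TRL ORPOTrainer + LoRA + 4-bit).
# Everything not stated on the card (alpha, dropout, target modules,
# learning rate, batch sizes, beta) is an assumed placeholder.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Meta-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships no pad token

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(  # 4-bit NF4 for memory efficiency
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

peft_config = LoraConfig(  # LoRA adapters, r=16 as stated on the card
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# The mix stores chosen/rejected as message lists; ORPOTrainer wants plain
# strings, so render them with a chat template (the base tokenizer has
# none, so assign one, e.g. ChatML, before this step).
def to_text(row):
    row["chosen"] = tokenizer.apply_chat_template(row["chosen"], tokenize=False)
    row["rejected"] = tokenizer.apply_chat_template(row["rejected"], tokenize=False)
    return row

dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train").map(to_text)

trainer = ORPOTrainer(
    model=model,
    args=ORPOConfig(
        max_length=1024,          # reduced context, per the card
        max_prompt_length=512,    # assumed prompt/completion split
        beta=0.1,                 # ORPO's lambda; assumed default
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=8e-6,
        num_train_epochs=1,
        output_dir="llama3-8b-orpo",
    ),
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```

Note that ORPO needs no separate frozen reference model, which is part of its memory appeal alongside LoRA and 4-bit loading.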

| Benchmark | LLaMa 3 8B | LLaMa 3 8B Instruct | LLaMa 3 8B ORPO V1 | LLaMa 3 8B ORPO V2 (WIP) |
|---|---|---|---|---|
| MMLU | 62.12 | 63.92 | 61.87 | |
| BoolQ | 81.04 | 83.21 | 82.42 | |
| Winogrande | 73.24 | 72.06 | 74.43 | |
| ARC-Challenge | 53.24 | 56.91 | 52.90 | |
| TriviaQA | 63.33 | 51.09 | 63.93 | |
| GSM-8K (flexible) | 50.27 | 75.13 | 52.16 | |
| SQuAD V2 (F1) | 32.48 | 29.68 | 33.68 | |
| LogiQA | 29.23 | 32.87 | 30.26 | |

*All scores obtained with lm-evaluation-harness v0.4.2.*
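
If you want to re-run the evaluation, the harness's Python API can be used along these lines. The model id is a placeholder and the task names are assumptions; they can differ between harness versions, so check `lm-eval --tasks list` first.

```python
# Minimal sketch: score the model with lm-evaluation-harness v0.4.2.
# The model id is a placeholder; the task names are assumptions and may
# differ between harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=<your-model-id>,dtype=float16",
    tasks=[
        "mmlu", "boolq", "winogrande", "arc_challenge",
        "triviaqa", "gsm8k", "squadv2", "logiqa",
    ],
    batch_size=8,
)

# Per-task metrics (acc, f1, exact_match, ...) live under "results".
for task, metrics in results["results"].items():
    print(task, metrics)
```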