Edit model card

OrpoLlama-3-8B-GGUF

Model Description

This is an ORPO fine-tune of meta-llama/Meta-Llama-3-8B on 1k samples of mlabonne/orpo-dpo-mix-40k created for this article.

It's a successful fine-tune that follows the ChatML template!

πŸ”Ž Application

This model uses a context window of 8k. It was trained with the ChatML template.

πŸ† Evaluation

Nous

OrpoLlama-4-8B outperforms Llama-3-8B-Instruct on the GPT4All and TruthfulQA datasets.

Evaluation performed using LLM AutoEval, see the entire leaderboard here.

Model Average AGIEval GPT4All TruthfulQA Bigbench
meta-llama/Meta-Llama-3-8B-Instruct πŸ“„ 51.34 41.22 69.86 51.65 42.64
mlabonne/OrpoLlama-3-8B πŸ“„ 48.63 34.17 70.59 52.39 37.36
mlabonne/OrpoLlama-3-8B-1k πŸ“„ 46.76 31.56 70.19 48.11 37.17
meta-llama/Meta-Llama-3-8B πŸ“„ 45.42 31.1 69.95 43.91 36.7

mlabonne/OrpoLlama-3-8B-1k corresponds to a version of this model trained on 1K samples (you can see the parameters in this article).

Open LLM Leaderboard

TBD.

πŸ“ˆ Training curves

You can find the experiment on W&B at this address.

image/png

Downloads last month
1,332
GGUF

Quantized from

Dataset used to train QuantFactory/OrpoLlama-3-8B-GGUF