---
license: llama3
---

[Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) finetuned on [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) with [ORPO](https://arxiv.org/abs/2403.07691).\
Max length was reduced to 1024 tokens. LoRA (r=16) and 4-bit quantization were used to increase memory efficiency.
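
For context, this recipe maps onto TRL's `ORPOTrainer`. The sketch below is illustrative rather than the exact training script: only `max_length=1024`, LoRA `r=16`, 4-bit loading, and the base model/dataset come from this card; every other hyperparameter is an assumption.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"

# Load the base model in 4-bit NF4 to cut memory use (as stated in the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token

# LoRA adapter with rank 16 (from the card); alpha/dropout are assumptions.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# NOTE: the raw dataset stores chat-formatted preference pairs; ORPOTrainer
# expects flat "prompt"/"chosen"/"rejected" text, so a chat-template
# preprocessing pass (omitted here) is assumed.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

# ORPO optimizes preference pairs directly, with no separate reference model.
orpo_args = ORPOConfig(
    output_dir="./llama3-8b-orpo",
    max_length=1024,                 # reduced max length, per the card
    max_prompt_length=512,           # assumption
    per_device_train_batch_size=2,   # assumption
    gradient_accumulation_steps=4,   # assumption
    learning_rate=8e-6,              # assumption
)

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```
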
| **Benchmark**          | **Llama 3 8B** | **Llama 3 8B Inst** | **Llama 3 8B ORPO V1** | **Llama 3 8B ORPO V2 (WIP)** |
|------------------------|:--------------:|:-------------------:|:----------------------:|:----------------------------:|
| **MMLU**               |     62.12      |        63.92        |         61.87          |                              |
| **BoolQ**              |     81.04      |        83.21        |         82.42          |                              |
| **Winogrande**         |     73.24      |        72.06        |         74.43          |                              |
| **ARC-Challenge**      |     53.24      |        56.91        |         52.90          |                              |
| **TriviaQA**           |     63.33      |        51.09        |         63.93          |                              |
| **GSM-8K (flexible)**  |     50.27      |        75.13        |         52.16          |                              |
| **SQuAD V2 (f1)**      |     32.48      |        29.68        |         33.68          |                              |
| **LogiQA**             |     29.23      |        32.87        |         30.26          |                              |

All scores obtained with [lm-evaluation-harness v0.4.2](https://github.com/EleutherAI/lm-evaluation-harness)
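
A hedged sketch of reproducing these rows via the harness's Python API (`lm_eval.simple_evaluate`). The task names are assumptions based on the v0.4.2 task registry (SQuAD V2 is omitted because its task name there is less certain), and the model path is a placeholder.

```python
import lm_eval

# Score the base model on the tasks reported above; swap `pretrained=` for
# the finetuned checkpoint to fill in the ORPO columns.
# "GSM-8K (flexible)" corresponds to gsm8k's flexible-extract filter.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Meta-Llama-3-8B,dtype=bfloat16",
    tasks=["mmlu", "boolq", "winogrande", "arc_challenge",
           "triviaqa", "gsm8k", "logiqa"],
    batch_size=8,
)
print(results["results"])
```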