---
license: llama3
---

[Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) finetuned on [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) with [ORPO](https://arxiv.org/abs/2403.07691).\
Max length was reduced to 1024 tokens. LoRA (r=16) and 4-bit quantization were used to increase memory efficiency.
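
For context, this recipe maps onto TRL's `ORPOTrainer`. The sketch below is illustrative rather than the exact training script: only `max_length=1024`, LoRA `r=16`, 4-bit loading, and the base model/dataset come from this card; every other hyperparameter is an assumption.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"

# Load the base model in 4-bit NF4 to cut memory use (as stated in the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 ships without a pad token

# LoRA adapter with rank 16 (from the card); alpha/dropout are assumptions.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# NOTE: the raw dataset stores chat-formatted preference pairs; ORPOTrainer
# expects flat "prompt"/"chosen"/"rejected" text, so a chat-template
# preprocessing pass (omitted here) is assumed.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

# ORPO optimizes preference pairs directly, with no separate reference model.
orpo_args = ORPOConfig(
    output_dir="./llama3-8b-orpo",
    max_length=1024,                 # reduced max length, per the card
    max_prompt_length=512,           # assumption
    per_device_train_batch_size=2,   # assumption
    gradient_accumulation_steps=4,   # assumption
    learning_rate=8e-6,              # assumption
)

trainer = ORPOTrainer(
    model=model,
    args=orpo_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```
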
| **Benchmark**          | **Llama 3 8B** | **Llama 3 8B Inst** | **Llama 3 8B ORPO V1** | **Llama 3 8B ORPO V2 (WIP)** |
|------------------------|:--------------:|:-------------------:|:----------------------:|:----------------------------:|
| **MMLU**               |     62.12      |        63.92        |         61.87          |                              |
| **BoolQ**              |     81.04      |        83.21        |         82.42          |                              |
| **Winogrande**         |     73.24      |        72.06        |         74.43          |                              |
| **ARC-Challenge**      |     53.24      |        56.91        |         52.90          |                              |
| **TriviaQA**           |     63.33      |        51.09        |         63.93          |                              |
| **GSM-8K (flexible)**  |     50.27      |        75.13        |         52.16          |                              |
| **SQuAD V2 (f1)**      |     32.48      |        29.68        |         33.68          |                              |
| **LogiQA**             |     29.23      |        32.87        |         30.26          |                              |

All scores obtained with [lm-evaluation-harness v0.4.2](https://github.com/EleutherAI/lm-evaluation-harness)
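
A hedged sketch of reproducing these rows via the harness's Python API (`lm_eval.simple_evaluate`). The task names are assumptions based on the v0.4.2 task registry (SQuAD V2 is omitted because its task name there is less certain), and the model path is a placeholder.

```python
import lm_eval

# Score the base model on the tasks reported above; swap `pretrained=` for
# the finetuned checkpoint to fill in the ORPO columns.
# "GSM-8K (flexible)" corresponds to gsm8k's flexible-extract filter.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Meta-Llama-3-8B,dtype=bfloat16",
    tasks=["mmlu", "boolq", "winogrande", "arc_challenge",
           "triviaqa", "gsm8k", "logiqa"],
    batch_size=8,
)
print(results["results"])
```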