radm
/

Llama-3-70B-Instruct-AH-lora

Model card Files Files and versions Community

radm commited on Jun 3, 2024

Commit

ad2b123

•

1 Parent(s): 410f08f

Update README.md

Files changed (1) hide show

README.md +25 -2

README.md CHANGED Viewed

@@ -24,7 +24,7 @@ This is a LORA adapter for NousResearch/Meta-Llama-3-70B-Instruct, fine-tuned to
 ## Uses
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-[More Information Needed]
 ## Training Details
@@ -51,8 +51,31 @@ Datasets:
 ### Results
-[More Information Needed]
 ## Hardware

 ## Uses
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+Use repository (https://github.com/radaevm/arena-hard-local) for evaluate with local judge model.
 ## Training Details
 ### Results
+#### Llama-3-70B-Instruct-GPTQ as judge:
+```console
+Llama-3-Instruct-8B-SimPO                          | score: 78.3  | 95% CI:   (-1.5, 1.2)   | average #tokens: 545
+SELM-Llama-3-8B-Instruct-iter-3                    | score: 72.8  | 95% CI:   (-2.1, 1.4)   | average #tokens: 606
+Meta-Llama-3-8B-Instruct-f16                       | score: 65.3  | 95% CI:   (-1.8, 2.1)   | average #tokens: 560
+suzume-llama-3-8B-multilingual-orpo-borda-half     | score: 63.5  | 95% CI:   (-1.6, 2.1)   | average #tokens: 978
+Phi-3-medium-128k-instruct                         | score: 50.0  | 95% CI:   (0.0, 0.0)    | average #tokens: 801
+suzume-llama-3-8B-multilingual                     | score: 48.1  | 95% CI:   (-2.2, 1.8)   | average #tokens: 767
+aya-23-8B                                          | score: 48.0  | 95% CI:   (-2.0, 2.1)   | average #tokens: 834
+Vikhr-7B-instruct_0.5                              | score: 19.6  | 95% CI:   (-1.3, 1.5)   | average #tokens: 794
+alpindale_gemma-2b-it                              | score: 11.2  | 95% CI:   (-1.0, 0.8)   | average #tokens: 425
+```
+#### Llama-3-70B-Instruct-AH-AWQ as judge:
+```console
+Llama-3-Instruct-8B-SimPO                          | score: 83.8  | 95% CI:   (-1.4, 1.3)   | average #tokens: 545
+SELM-Llama-3-8B-Instruct-iter-3                    | score: 78.8  | 95% CI:   (-1.7, 1.9)   | average #tokens: 606
+suzume-llama-3-8B-multilingual-orpo-borda-half     | score: 71.8  | 95% CI:   (-1.7, 2.4)   | average #tokens: 978
+Meta-Llama-3-8B-Instruct-f16                       | score: 69.8  | 95% CI:   (-1.9, 1.7)   | average #tokens: 560
+suzume-llama-3-8B-multilingual                     | score: 54.0  | 95% CI:   (-2.1, 2.1)   | average #tokens: 767
+aya-23-8B                                          | score: 50.4  | 95% CI:   (-1.7, 1.7)   | average #tokens: 834
+Phi-3-medium-128k-instruct                         | score: 50.0  | 95% CI:   (0.0, 0.0)    | average #tokens: 801
+Vikhr-7B-instruct_0.5                              | score: 14.2  | 95% CI:   (-1.3, 1.0)   | average #tokens: 794
+alpindale_gemma-2b-it                              | score:  7.9  | 95% CI:   (-0.9, 0.8)   | average #tokens: 425
+```
 ## Hardware