Commit adce9d4 by hamishivi
1 parent: 8a1371f

Update README.md

Files changed (1)
  1. README.md +13 -14
README.md CHANGED
@@ -41,20 +41,19 @@ Note that Llama 3.1 is released under the Meta Llama 3 community license, includ
 
  ## Performance
 
- | Model | MMLU 5-shot | GSM8k 8-shot cot | BBH 3-shot cot | TydiQA 1-shot Gold Passage | Codex HumanEval Pass@10 |AlpacaEval 1 | AlpacaEval 2 LC | TruthfulQA %Info+True | IFEval loose acc | XSTest safe but ref. | XSTest unsafe but follow | Average |
- |-|-|-|-|-|-|-|-|-|-|-|-|-|
- | [Llama 3 8b base](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 0.649 | 0.565 | 0.653 | 66.80 | 0.664 | - | - | 0.299 | 0.146 | 0.200 | 0.390 | 54.36 |
- | [Llama 3 8b instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 0.626 | 0.770 | 0.606 | 59.04 | 0.799 | 94.65 | 23.12 | 0.682 | 0.741 | 0.028 | 0.115 | 70.36 |
- | [Llama 3 Tulu 2 8b](https://huggingface.co/allenai/llama-3-tulu-2-8b) | 0.606 | 0.610 | 0.592 | 56.24 | 0.685 | 79.40 | 10.16 | 0.503 | 0.468 | 0.092 | 0.165 | 59.39 |
- | **[Llama 3 Tulu 2+DPO 8b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-8b) (this model)** | 0.609 | 0.650 | 0.584 | 21.18 | 0.688 | 93.02 | 13.94 | 0.698 | 0.518 | 0.092 | 0.165 | 59.61 |
- | [Llama 3 70b base](https://huggingface.co/meta-llama/Meta-Llama-3-70B) | 0.790 | 0.840 | 0.801 | 73.35 | 0.745 | - | - | 0.469 | 0.163 | 0.256 | 0.330 | 65.60 |
- | [Llama 3 70b instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | 0.786 | 0.930 | 0.801 | 59.21 | 0.908 | 96.71 | 39.99 | 0.701 | 0.828 | 0.060 | 0.140 | 79.22 |
- | [Llama 3 Tulu 2 70b](https://huggingface.co/allenai/llama-3-tulu-2-70b) | 0.752 | 0.845 | 0.779 | 69.798 | 0.861 | 86.007 | 17.51 | 0.646 | 0.591 | 0.108 | 0.130 | 73.01 |
- | [Llama 3 Tulu 2+DPO 70b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-70b) | 0.754 | 0.860 | 0.785 | 23.443 | 0.878 | 96.65 | 27.34 | 0.780 | 0.643 | 0.080 | 0.140 | 71.60 |
-
- We also release reward models based off Llama 3 8b and 70b respectively:
- - [Llama 3 Tulu 2 8b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-8b-uf-mean-rm)
- - [Llama 3 Tulu 2 70b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm)
+ | Model | MMLU 5-shot | GSM8k 8-shot cot | BBH 3-shot cot | Codex HumanEval Pass@10 | AlpacaEval 1 | AlpacaEval 2 LC | TruthfulQA %Info+True | IFEval loose acc | XSTest safe but ref. | XSTest unsafe but follow | Average |
+ |-|-|-|-|-|-|-|-|-|-|-|-|
+ | [Llama 3.1 8b base](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) | 65.5 | 57.0 | 65.6 | 61.6 | - | - | 32.7 | 11.1 | 17.2 | 44.0 | - |
+ | [Llama 3.1 8b instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) | 65.6 | 84.5 | 68.5 | 84.5 | 94.8 | 26.0 | 31.1 | 75.6 | 8.8 | 5.5 | 71.8 |
+ | [Tulu 2 Llama 3.1 8b](https://huggingface.co/allenai/llama-3.1-tulu-2-8b) | 61.4 | 68.0 | 59.2 | 67.9 | 80.6 | 9.0 | 56.2 | 46.4 | 11.2 | 13.0 | 63.9 |
+ | **[Tulu 2 Llama 3.1 8b DPO](https://huggingface.co/allenai/llama-3.1-tulu-2-dpo-8b) (this model)** | 62.0 | 66.5 | 60.6 | 69.1 | 93.5 | 14.7 | 70.3 | 52.3 | 8.4 | 15.5 | 67.0 |
+ | [Llama 3.1 70b base](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B) | 78.8 | 85.5 | 82.9 | 94.5 | - | - | - | 10.9 | 12.4 | 41.0 | - |
+ | [Llama 3.1 70b instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) | 81.4 | 96.0 | 83.1 |94.5 | 96.0 | 35.8 | 69.0 | 87.1 | 5.6 | 11.5 | 86.1 |
+ | [Tulu 2 Llama 3.1 70b](https://huggingface.co/allenai/llama-3.1-tulu-2-70b) | 76.0 | 83.5 | 78.5 | 84.1 | 85.9 | 13.2 | 59.7 | 59.1 | 13.6 | 15.5 | 75.2 |
+ | [Tulu 2 Llama 3.1 70b DPO](https://huggingface.co/allenai/llama-3.1-tulu-2-dpo-70b) | 76.0 | 88.5 | 79.9 | 89.0 | 96.8 | 24.8 | 78.3 | 63.6 | 9.2 | 14.0 | 80.5 |
+
+ You can find all models Ai2 trained as part of this family [here](https://huggingface.co/collections/hamishivi/tulu-2-llama-3-update-6674a1cbd1bb4d33b5dec246), alongside our prior Llama 3.0 versions.
+
 
  ## Input Format
 