hamishivi committed on
Commit bdc7ac1
1 Parent(s): 73884e0

Update README.md

Files changed (1)
  1. README.md +12 -5
README.md CHANGED
@@ -40,11 +40,18 @@ For more details on the training mixture, read the paper: [Camels in a Changing
 
 | Model | MMLU 5-shot | GSM8k 8-shot cot | BBH 3-shot cot | TydiQA 1-shot Gold Passage | Codex HumanEval Pass@10 |AlpacaEval 1 | AlpacaEval 2 LC | TruthfulQA %Info+True | IFEval loose acc | XSTest safe but ref. | XSTest unsafe but follow | Average |
 |-|-|-|-|-|-|-|-|-|-|-|-|-|
-| Llama 3 70b base | 0.790 | 0.840 | 0.801 | 73.35 | 0.745 | - | - | 0.469 | 0.163 | 0.256 | 0.330 | 65.60 |
-| Llama 3 70b instruct | 0.786 | 0.930 | 0.801 | 59.21 | 0.908 | 96.71 | 39.99 | 0.701 | 0.828 | 0.060 | 0.140 | 79.22 |
-| Llama 3 Tulu 2 70b | 0.752 | 0.845 | 0.779 | 69.798 | 0.861 | 86.007 | 17.51 | 0.646 | 0.591 | 0.108 | 0.130 | 73.01 |
-| Llama 3 Tulu 2+DPO 70b | 0.754 | 0.860 | 0.785 | 23.443 | 0.878 | 96.65 | 27.34 | 0.780 | 0.643 | 0.080 | 0.140 | 71.60 |
-
+| [Llama 3 8b base](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 0.649 | 0.565 | 0.653 | 66.80 | 0.664 | - | - | 0.299 | 0.146 | 0.200 | 0.390 | 54.36 |
+| [Llama 3 8b instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 0.626 | 0.770 | 0.606 | 59.04 | 0.799 | 94.65 | 23.12 | 0.682 | 0.741 | 0.028 | 0.115 | 70.36 |
+| [Llama 3 Tulu 2 8b](https://huggingface.co/allenai/llama-3-tulu-2-8b) | 0.606 | 0.610 | 0.592 | 56.24 | 0.685 | 79.40 | 10.16 | 0.503 | 0.468 | 0.092 | 0.165 | 59.39 |
+| [Llama 3 Tulu 2+DPO 8b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-8b) | 0.609 | 0.650 | 0.584 | 21.18 | 0.688 | 93.02 | 13.94 | 0.698 | 0.518 | 0.092 | 0.165 | 59.61 |
+| [Llama 3 70b base](https://huggingface.co/meta-llama/Meta-Llama-3-70B) | 0.790 | 0.840 | 0.801 | 73.35 | 0.745 | - | - | 0.469 | 0.163 | 0.256 | 0.330 | 65.60 |
+| [Llama 3 70b instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | 0.786 | 0.930 | 0.801 | 59.21 | 0.908 | 96.71 | 39.99 | 0.701 | 0.828 | 0.060 | 0.140 | 79.22 |
+| [Llama 3 Tulu 2 70b](https://huggingface.co/allenai/llama-3-tulu-2-70b) | 0.752 | 0.845 | 0.779 | 69.798 | 0.861 | 86.007 | 17.51 | 0.646 | 0.591 | 0.108 | 0.130 | 73.01 |
+| **[Llama 3 Tulu 2+DPO 70b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-70b) (this model)** | 0.754 | 0.860 | 0.785 | 23.443 | 0.878 | 96.65 | 27.34 | 0.780 | 0.643 | 0.080 | 0.140 | 71.60 |
+
+We also release reward models based off Llama 3 8b and 70b respectively:
+- [Llama 3 Tulu 2 8b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-8b-uf-mean-rm)
+- [Llama 3 Tulu 2 70b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm)
 
 ## Input Format