Add benchmarks

README.md

---
language:
- en
- fr
- es
- it
- de
library_name: transformers
tags:
- moe
- text-generation-inference
---

# Mixtral-8x22B-v0.1

New MoE model by MistralAI

## Model Details:
- 56 layers
- 8 experts
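
The layer and expert counts can be read straight off the hosted config, without downloading any weights. A minimal sketch, assuming the `transformers` library and that this repo's `config.json` uses the standard `MixtralConfig` field names:

```python
# Sketch: inspect the architecture from the config alone (no weights needed).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistral-community/Mixtral-8x22B-v0.1")
print(config.num_hidden_layers)   # 56 layers
print(config.num_local_experts)   # 8 experts per MoE block
```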

## Benchmarks
```
ARC C (25-shot):     70.5
Hellaswag (10-shot): 88.9
MMLU (5-shot):       77.3
TruthfulQA:          52.3
Winogrande (5-shot): 85.2
GSM8K (5-shot):      76.5

Source: https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1/discussions/4#6616c393b8d25135997cdd45
```
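
Scores of this kind are typically produced with EleutherAI's lm-evaluation-harness. The linked discussion does not say exactly how these numbers were run, so the following is only a sketch of one such setup (v0.4-style API, a single task shown):

```python
# Hedged sketch: reproduce one of the scores above with lm-evaluation-harness.
# Assumes `pip install lm-eval` and hardware large enough to host the model.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mistral-community/Mixtral-8x22B-v0.1,dtype=bfloat16",
    tasks=["arc_challenge"],
    num_fewshot=25,  # matches the "ARC C (25-shot)" setting above
)
print(results["results"]["arc_challenge"])
```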
The model is split into 27 parts, from the original torrent.

Magnet link and checksum: [https://twitter.com/mistralai/status/1777869263778291896](https://twitter.com/mistralai/status/1777869263778291896)

Make sure you have Python 2 or 3 installed (HuggingFace libraries not required):
```
python merge.py
```
This should take approximately 2 hours, and you will be left with a 274GB file.
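
merge.py ships with the download and is not reproduced in this README; the following is only a sketch of what such a script typically does, assuming the 27 parts are raw byte-slices of the final file and that their names (hypothetical below) sort into the correct order:

```python
# Hypothetical sketch of a part-merging script; not the actual merge.py.
import glob

parts = sorted(glob.glob("consolidated.safetensors.part*"))  # hypothetical naming
with open("consolidated.safetensors", "wb") as out:
    for part in parts:
        with open(part, "rb") as f:
            while True:
                chunk = f.read(64 * 1024 * 1024)  # stream 64 MB at a time
                if not chunk:
                    break
                out.write(chunk)
print("Merged %d parts into consolidated.safetensors" % len(parts))
```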

Check the MD5 hash of consolidated.safetensors:
```
3816cd2c4f827b4b868bc6481d5d3ba2
```
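
If you prefer Python over `md5sum`, a chunked read with the standard library's `hashlib` gives the same digest (a sketch; the 64 MB chunk size is arbitrary):

```python
# Compute the MD5 in streaming fashion; a 274GB file cannot be read into memory at once.
import hashlib

md5 = hashlib.md5()
with open("consolidated.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(64 * 1024 * 1024), b""):
        md5.update(chunk)
print(md5.hexdigest())  # expect 3816cd2c4f827b4b868bc6481d5d3ba2
```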

That's it! Now you have the complete torrent download on your computer.
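
Note that the merged consolidated.safetensors is in Mistral's original checkpoint layout; running it through `transformers` (the library this repo is tagged with) requires weights in the converted transformers format. Assuming converted weights are what `from_pretrained` finds under this repo id, a loading sketch looks like:

```python
# Sketch: load converted transformers-format weights and generate.
# An 8x22B model needs a multi-GPU node or aggressive offloading;
# device_map="auto" (requires `accelerate`) places layers automatically.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistral-community/Mixtral-8x22B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

inputs = tokenizer("Mixtral is a mixture-of-experts model that", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```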

## Credit to:

[ASCII art]

## Released by the Mistral AI team:
Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux,
Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault,
Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot,
Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger,
Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona,
Jean-Malo Delignon, Jia Li, Justus Murke, Louis Martin, Louis Ternon,
Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat,
Marie Torelli, Marie-Anne Lachaux, Nicolas Schuhl, Patrick von Platen,
Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak,
Teven Le Scao, Thibaut Lavril, Timothée Lacroix, Théophile Gervet,
Thomas Wang, Valera Nemychnikova, William El Sayed, William Marshall