Files changed (1)
  1. README.md +120 -4
README.md CHANGED
@@ -1,5 +1,4 @@
  ---
- license: apache-2.0
  language:
  - ar
  - he
@@ -62,8 +61,7 @@ language:
  - et
  - fi
  - hu
-
- pipeline_tag: text-generation
+ license: apache-2.0
  tags:
  - multilingual
  - PyTorch
@@ -75,7 +73,111 @@ tags:
  datasets:
  - mc4
  - wikipedia
- thumbnail: "https://github.com/sberbank-ai/mgpt"
+ pipeline_tag: text-generation
+ thumbnail: https://github.com/sberbank-ai/mgpt
+ model-index:
+ - name: mGPT
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 23.81
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai-forever/mGPT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 26.37
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai-forever/mGPT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 25.17
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai-forever/mGPT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 39.62
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai-forever/mGPT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 50.67
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai-forever/mGPT
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 0.0
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ai-forever/mGPT
+       name: Open LLM Leaderboard
  ---

  # Multilingual GPT model
@@ -141,3 +243,17 @@ Languages:
  The model was trained with sequence length 512 using Megatron and Deepspeed libs by [SberDevices](https://sberdevices.ru/) team on a dataset of 600 GB of texts in 61 languages. The model has seen 440 billion BPE tokens in total.

  Total training time was around 14 days on 256 Nvidia V100 GPUs.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ai-forever__mGPT)
+
+ | Metric |Value|
+ |---------------------------------|----:|
+ |Avg. |27.61|
+ |AI2 Reasoning Challenge (25-Shot)|23.81|
+ |HellaSwag (10-Shot) |26.37|
+ |MMLU (5-Shot) |25.17|
+ |TruthfulQA (0-shot) |39.62|
+ |Winogrande (5-shot) |50.67|
+ |GSM8k (5-shot) | 0.00|
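For reference, the Avg. value in the table added above is consistent with an unweighted mean of the six benchmark scores listed in the new model-index entries. A minimal sanity check (illustrative Python, with the scores copied from the table):

```python
# Benchmark scores copied from the Open LLM Leaderboard table above:
# ARC (25-shot), HellaSwag (10-shot), MMLU (5-shot),
# TruthfulQA (0-shot), Winogrande (5-shot), GSM8k (5-shot)
scores = [23.81, 26.37, 25.17, 39.62, 50.67, 0.00]

# Unweighted mean of the six benchmarks, rounded to two decimal places
avg = round(sum(scores) / len(scores), 2)
print(avg)  # 27.61, matching the reported Avg.
```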