ai-forever committed
Commit • cc20058
1 Parent(s): 0fb1ca9
Update README.md
README.md CHANGED
@@ -87,18 +87,42 @@ We reproduce the GPT-3 architecture using GPT-2 sources and the sparse attention
The source code for the mGPT XL model is available on [Github](https://github.com/sberbank-ai/mgpt)

## Paper
mGPT: Few-Shot Learners Go Multilingual

[Abstract](https://arxiv.org/abs/2204.07580) [PDF](https://arxiv.org/pdf/2204.07580.pdf)

![](https://habrastorage.org/webt/1q/ru/yt/1qruytul6m2m-upyk9frq3pgrds.png)
```
@misc{https://doi.org/10.48550/arxiv.2204.07580,
  doi = {10.48550/ARXIV.2204.07580},
  url = {https://arxiv.org/abs/2204.07580},
  author = {Shliazhko, Oleh and Fenogenova, Alena and Tikhonova, Maria and Mikhailov, Vladislav and Kozlova, Anastasia and Shavrina, Tatiana},
  keywords = {Computation and Language (cs.CL), Artificial Intelligence (cs.AI), FOS: Computer and information sciences, I.2; I.2.7, 68-06, 68-04, 68T50, 68T01},
  title = {mGPT: Few-Shot Learners Go Multilingual},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```

## Languages

Model supports 60 languages:

ISO codes:
```az, sw, af, ar, ba, be, bxr, bg, bn, cv, hy, da, de, el, es, eu, fa, fi, fr, he, hi, hu, kk, id, it, ja, ka, ky, ko, lt, lv, mn, ml, os, mr, ms, my, nl, ro, pl, pt, sah, ru, tg, sv, ta, te, tk, th, tr, tl, tt, tyv, uk, en, ur, vi, uz, yo, zh, xal```

Languages:
```Afrikaans, Azerbaijani, Belarusian, Bengali, Chuvash, German, English, Basque, Finnish, Hebrew (modern), Hungarian, Indonesian, Japanese, Kazakh, Kyrgyz, Latvian, Mongolian, Malay, Dutch, Polish, Romanian, Moldovan, Yakut, Swahili, Telugu, Thai, Turkish, Tuvinian, Urdu, Vietnamese, Yoruba, Arabic, Bashkir, Bulgarian, Buriat, Danish, Greek (modern), Spanish (Castilian), Persian, French, Hindi, Armenian, Italian, Georgian, Korean, Lithuanian, Malayalam, Marathi, Burmese, Ossetian, Portuguese, Russian, Swedish, Tamil, Tajik, Turkmen, Tatar, Ukrainian, Uzbek, Kalmyk, Chinese```
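
To show how this language coverage is typically exercised, here is a minimal generation sketch that is not part of the original card: it assumes the checkpoint is published under the `ai-forever/mGPT` identifier and follows the standard `transformers` causal-LM API, so adjust the model id if you use a different mGPT checkpoint.

```python
# Minimal sketch, not from the original card: the model id and sampling settings
# below are assumptions; swap in the checkpoint you actually want to query.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai-forever/mGPT"  # assumed Hugging Face identifier for this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Prompts in two of the supported languages (en and ru from the ISO list above).
prompts = [
    "The Moon is the only natural satellite of the Earth.",
    "Луна является единственным естественным спутником Земли.",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The sampling parameters are illustrative only; greedy decoding or beam search work the same way through `generate`.
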
## Training Data Statistics