bourdoiscatie
committed on
Commit b0ca4c7 • 1 Parent(s): 5593741
Update README.md
To help with referencing the dataset, this commit indicates the languages included in the multilingual model. Only the xx language tags are listed here because they are the only ones currently taken into account by HF (the xx-XX tags are not).
README.md
CHANGED
@@ -1,5 +1,107 @@
 ---
-language:
+language:
+- multilingual
+- af
+- am
+- ar
+- az
+- be
+- bg
+- bn
+- ca
+- ceb
+- co
+- cs
+- cy
+- da
+- de
+- el
+- en
+- eo
+- es
+- et
+- eu
+- fa
+- fi
+- fil
+- fr
+- fy
+- ga
+- gd
+- gl
+- gu
+- ha
+- haw
+- hi
+- hmn
+- ht
+- hu
+- hy
+- ig
+- is
+- it
+- iw
+- ja
+- jv
+- ka
+- kk
+- km
+- kn
+- ko
+- ku
+- ky
+- la
+- lb
+- lo
+- lt
+- lv
+- mg
+- mi
+- mk
+- ml
+- mn
+- mr
+- ms
+- mt
+- my
+- ne
+- nl
+- no
+- ny
+- pa
+- pl
+- ps
+- pt
+- ro
+- ru
+- sd
+- si
+- sk
+- sl
+- sm
+- sn
+- so
+- sq
+- sr
+- st
+- su
+- sv
+- sw
+- ta
+- te
+- tg
+- th
+- tr
+- uk
+- und
+- ur
+- uz
+- vi
+- xh
+- yi
+- yo
+- zh
+- zu
 datasets:
 - mc4
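The commit message notes that only xx tags are taken into account, not xx-XX ones. The following is a minimal sketch of how a tag list could be reduced to that form before being written into the `language:` block. The helper names `base_language_tags` and `front_matter_block` and the sample input are hypothetical and are not part of this commit or of any HF tooling.

```python
import re


def base_language_tags(tags):
    """Reduce tags such as 'pt-BR' to their primary subtag ('pt').

    Hypothetical helper: keeps the first occurrence of each primary
    subtag and preserves the input order.
    """
    seen = []
    for tag in tags:
        # Split on '-' or '_' and keep only the part before the region/script.
        primary = re.split(r"[-_]", tag.strip())[0].lower()
        if primary and primary not in seen:
            seen.append(primary)
    return seen


def front_matter_block(tags):
    """Render the tags as a YAML list under a 'language:' key."""
    lines = ["language:"] + [f"- {tag}" for tag in tags]
    return "\n".join(lines)


# Illustrative input only; these region-qualified tags are not from the commit.
example = ["multilingual", "af", "pt-BR", "pt-PT", "zh-Hans", "zu"]
print(front_matter_block(base_language_tags(example)))
```

Collapsing to the primary subtag loses the region or script distinction, which is acceptable here only because, per the commit message, the xx-XX form is not taken into account anyway.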