Commit 751e963 by m3hrdadfi
1 parent: a8a053b

Create README.md

Files changed (1)
  1. README.md +779 -0
README.md ADDED
@@ -0,0 +1,779 @@
1
+ ---
2
+ language: multilingual
3
+ license: apache-2.0
4
+ datasets:
5
+ - wili_2018
6
+ ---
7
+
8
+ # Transformer Language Detector
9
+
10
+ zabanshenas (زبان‌شناس / zæbænʒenæs) is a Persian word with two meanings:
11
+
12
+ - A person who studies linguistics.
13
+ - A way to identify the type of written language.
14
+
15
+
16
+ ## How to use
17
+
18
+ ### Requirements
19
+
20
+ ```bash
21
+ pip install git+https://github.com/huggingface/datasets.git
22
+ pip install git+https://github.com/huggingface/transformers.git
23
+ ```
24
+
25
+ ### Prediction
26
+
27
+ ```python
28
+ #
29
+ ```
30
+
31
+ ```python
32
+ #
33
+ ```
34
+
35
+ ```python
36
+ #
37
+ ```
38
+
39
+ ```text
40
+ Soon
41
+ ```
42
+
43
+
44
+ ## Evaluation
45
+ The following tables summarize the scores obtained by model overall and per each class.
46
+
47
+
48
+ ### By Paragraph
49
+
50
+ | language | precision | recall | f1-score |
51
+ |:--------------------------------------:|:---------:|:--------:|:--------:|
52
+ | Achinese (ace) | 1.000000 | 0.982143 | 0.990991 |
53
+ | Afrikaans (afr) | 1.000000 | 1.000000 | 1.000000 |
54
+ | Alemannic German (als) | 1.000000 | 0.946429 | 0.972477 |
55
+ | Amharic (amh) | 1.000000 | 0.982143 | 0.990991 |
56
+ | Old English (ang) | 0.981818 | 0.964286 | 0.972973 |
57
+ | Arabic (ara) | 0.846154 | 0.982143 | 0.909091 |
58
+ | Aragonese (arg) | 1.000000 | 1.000000 | 1.000000 |
59
+ | Egyptian Arabic (arz) | 0.979592 | 0.857143 | 0.914286 |
60
+ | Assamese (asm) | 0.981818 | 0.964286 | 0.972973 |
61
+ | Asturian (ast) | 0.964912 | 0.982143 | 0.973451 |
62
+ | Avar (ava) | 0.941176 | 0.905660 | 0.923077 |
63
+ | Aymara (aym) | 0.964912 | 0.982143 | 0.973451 |
64
+ | South Azerbaijani (azb) | 0.965517 | 1.000000 | 0.982456 |
65
+ | Azerbaijani (aze) | 1.000000 | 1.000000 | 1.000000 |
66
+ | Bashkir (bak) | 1.000000 | 0.978261 | 0.989011 |
67
+ | Bavarian (bar) | 0.843750 | 0.964286 | 0.900000 |
68
+ | Central Bikol (bcl) | 1.000000 | 0.982143 | 0.990991 |
69
+ | Belarusian (Taraschkewiza) (be-tarask) | 1.000000 | 0.875000 | 0.933333 |
70
+ | Belarusian (bel) | 0.870968 | 0.964286 | 0.915254 |
71
+ | Bengali (ben) | 0.982143 | 0.982143 | 0.982143 |
72
+ | Bhojpuri (bho) | 1.000000 | 0.928571 | 0.962963 |
73
+ | Banjar (bjn) | 0.981132 | 0.945455 | 0.962963 |
74
+ | Tibetan (bod) | 1.000000 | 0.982143 | 0.990991 |
75
+ | Bosnian (bos) | 0.552632 | 0.375000 | 0.446809 |
76
+ | Bishnupriya (bpy) | 1.000000 | 0.982143 | 0.990991 |
77
+ | Breton (bre) | 1.000000 | 0.964286 | 0.981818 |
78
+ | Bulgarian (bul) | 1.000000 | 0.964286 | 0.981818 |
79
+ | Buryat (bxr) | 0.946429 | 0.946429 | 0.946429 |
80
+ | Catalan (cat) | 0.982143 | 0.982143 | 0.982143 |
81
+ | Chavacano (cbk) | 0.914894 | 0.767857 | 0.834951 |
82
+ | Min Dong (cdo) | 1.000000 | 0.982143 | 0.990991 |
83
+ | Cebuano (ceb) | 1.000000 | 1.000000 | 1.000000 |
84
+ | Czech (ces) | 1.000000 | 1.000000 | 1.000000 |
85
+ | Chechen (che) | 1.000000 | 1.000000 | 1.000000 |
86
+ | Cherokee (chr) | 1.000000 | 0.963636 | 0.981481 |
87
+ | Chuvash (chv) | 0.938776 | 0.958333 | 0.948454 |
88
+ | Central Kurdish (ckb) | 1.000000 | 1.000000 | 1.000000 |
89
+ | Cornish (cor) | 1.000000 | 1.000000 | 1.000000 |
90
+ | Corsican (cos) | 1.000000 | 0.982143 | 0.990991 |
91
+ | Crimean Tatar (crh) | 1.000000 | 0.946429 | 0.972477 |
92
+ | Kashubian (csb) | 1.000000 | 0.963636 | 0.981481 |
93
+ | Welsh (cym) | 1.000000 | 1.000000 | 1.000000 |
94
+ | Danish (dan) | 1.000000 | 1.000000 | 1.000000 |
95
+ | German (deu) | 0.828125 | 0.946429 | 0.883333 |
96
+ | Dimli (diq) | 0.964912 | 0.982143 | 0.973451 |
97
+ | Dhivehi (div) | 1.000000 | 1.000000 | 1.000000 |
98
+ | Lower Sorbian (dsb) | 1.000000 | 0.982143 | 0.990991 |
99
+ | Doteli (dty) | 0.940000 | 0.854545 | 0.895238 |
100
+ | Emilian (egl) | 1.000000 | 0.928571 | 0.962963 |
101
+ | Modern Greek (ell) | 1.000000 | 1.000000 | 1.000000 |
102
+ | English (eng) | 0.588889 | 0.946429 | 0.726027 |
103
+ | Esperanto (epo) | 1.000000 | 0.982143 | 0.990991 |
104
+ | Estonian (est) | 0.963636 | 0.946429 | 0.954955 |
105
+ | Basque (eus) | 1.000000 | 0.982143 | 0.990991 |
106
+ | Extremaduran (ext) | 0.982143 | 0.982143 | 0.982143 |
107
+ | Faroese (fao) | 1.000000 | 1.000000 | 1.000000 |
108
+ | Persian (fas) | 0.948276 | 0.982143 | 0.964912 |
109
+ | Finnish (fin) | 1.000000 | 1.000000 | 1.000000 |
110
+ | French (fra) | 0.710145 | 0.875000 | 0.784000 |
111
+ | Arpitan (frp) | 1.000000 | 0.946429 | 0.972477 |
112
+ | Western Frisian (fry) | 0.982143 | 0.982143 | 0.982143 |
113
+ | Friulian (fur) | 1.000000 | 0.982143 | 0.990991 |
114
+ | Gagauz (gag) | 0.981132 | 0.945455 | 0.962963 |
115
+ | Scottish Gaelic (gla) | 0.982143 | 0.982143 | 0.982143 |
116
+ | Irish (gle) | 0.949153 | 1.000000 | 0.973913 |
117
+ | Galician (glg) | 1.000000 | 1.000000 | 1.000000 |
118
+ | Gilaki (glk) | 0.981132 | 0.945455 | 0.962963 |
119
+ | Manx (glv) | 1.000000 | 1.000000 | 1.000000 |
120
+ | Guarani (grn) | 1.000000 | 0.964286 | 0.981818 |
121
+ | Gujarati (guj) | 1.000000 | 0.982143 | 0.990991 |
122
+ | Hakka Chinese (hak) | 0.981818 | 0.964286 | 0.972973 |
123
+ | Haitian Creole (hat) | 1.000000 | 1.000000 | 1.000000 |
124
+ | Hausa (hau) | 1.000000 | 0.945455 | 0.971963 |
125
+ | Serbo-Croatian (hbs) | 0.448276 | 0.464286 | 0.456140 |
126
+ | Hebrew (heb) | 1.000000 | 0.982143 | 0.990991 |
127
+ | Fiji Hindi (hif) | 0.890909 | 0.890909 | 0.890909 |
128
+ | Hindi (hin) | 0.981481 | 0.946429 | 0.963636 |
129
+ | Croatian (hrv) | 0.500000 | 0.636364 | 0.560000 |
130
+ | Upper Sorbian (hsb) | 0.955556 | 1.000000 | 0.977273 |
131
+ | Hungarian (hun) | 1.000000 | 1.000000 | 1.000000 |
132
+ | Armenian (hye) | 1.000000 | 0.981818 | 0.990826 |
133
+ | Igbo (ibo) | 0.918033 | 1.000000 | 0.957265 |
134
+ | Ido (ido) | 1.000000 | 1.000000 | 1.000000 |
135
+ | Interlingue (ile) | 1.000000 | 0.962264 | 0.980769 |
136
+ | Iloko (ilo) | 0.947368 | 0.964286 | 0.955752 |
137
+ | Interlingua (ina) | 1.000000 | 1.000000 | 1.000000 |
138
+ | Indonesian (ind) | 0.761905 | 0.872727 | 0.813559 |
139
+ | Icelandic (isl) | 1.000000 | 1.000000 | 1.000000 |
140
+ | Italian (ita) | 0.861538 | 1.000000 | 0.925620 |
141
+ | Jamaican Patois (jam) | 1.000000 | 0.946429 | 0.972477 |
142
+ | Javanese (jav) | 0.964912 | 0.982143 | 0.973451 |
143
+ | Lojban (jbo) | 1.000000 | 1.000000 | 1.000000 |
144
+ | Japanese (jpn) | 1.000000 | 1.000000 | 1.000000 |
145
+ | Karakalpak (kaa) | 0.965517 | 1.000000 | 0.982456 |
146
+ | Kabyle (kab) | 1.000000 | 0.964286 | 0.981818 |
147
+ | Kannada (kan) | 0.982143 | 0.982143 | 0.982143 |
148
+ | Georgian (kat) | 1.000000 | 0.964286 | 0.981818 |
149
+ | Kazakh (kaz) | 0.980769 | 0.980769 | 0.980769 |
150
+ | Kabardian (kbd) | 1.000000 | 0.982143 | 0.990991 |
151
+ | Central Khmer (khm) | 0.960784 | 0.875000 | 0.915888 |
152
+ | Kinyarwanda (kin) | 0.981132 | 0.928571 | 0.954128 |
153
+ | Kirghiz (kir) | 1.000000 | 1.000000 | 1.000000 |
154
+ | Komi-Permyak (koi) | 0.962264 | 0.910714 | 0.935780 |
155
+ | Konkani (kok) | 0.964286 | 0.981818 | 0.972973 |
156
+ | Komi (kom) | 1.000000 | 0.962264 | 0.980769 |
157
+ | Korean (kor) | 1.000000 | 1.000000 | 1.000000 |
158
+ | Karachay-Balkar (krc) | 1.000000 | 0.982143 | 0.990991 |
159
+ | Ripuarisch (ksh) | 1.000000 | 0.964286 | 0.981818 |
160
+ | Kurdish (kur) | 1.000000 | 0.964286 | 0.981818 |
161
+ | Ladino (lad) | 1.000000 | 1.000000 | 1.000000 |
162
+ | Lao (lao) | 0.961538 | 0.909091 | 0.934579 |
163
+ | Latin (lat) | 0.877193 | 0.943396 | 0.909091 |
164
+ | Latvian (lav) | 0.963636 | 0.946429 | 0.954955 |
165
+ | Lezghian (lez) | 1.000000 | 0.964286 | 0.981818 |
166
+ | Ligurian (lij) | 1.000000 | 0.964286 | 0.981818 |
167
+ | Limburgan (lim) | 0.938776 | 1.000000 | 0.968421 |
168
+ | Lingala (lin) | 0.980769 | 0.927273 | 0.953271 |
169
+ | Lithuanian (lit) | 0.982456 | 1.000000 | 0.991150 |
170
+ | Lombard (lmo) | 1.000000 | 1.000000 | 1.000000 |
171
+ | Northern Luri (lrc) | 1.000000 | 0.928571 | 0.962963 |
172
+ | Latgalian (ltg) | 1.000000 | 0.982143 | 0.990991 |
173
+ | Luxembourgish (ltz) | 0.949153 | 1.000000 | 0.973913 |
174
+ | Luganda (lug) | 1.000000 | 1.000000 | 1.000000 |
175
+ | Literary Chinese (lzh) | 1.000000 | 1.000000 | 1.000000 |
176
+ | Maithili (mai) | 0.931034 | 0.964286 | 0.947368 |
177
+ | Malayalam (mal) | 1.000000 | 0.982143 | 0.990991 |
178
+ | Banyumasan (map-bms) | 0.977778 | 0.785714 | 0.871287 |
179
+ | Marathi (mar) | 0.949153 | 1.000000 | 0.973913 |
180
+ | Moksha (mdf) | 0.980000 | 0.890909 | 0.933333 |
181
+ | Eastern Mari (mhr) | 0.981818 | 0.964286 | 0.972973 |
182
+ | Minangkabau (min) | 1.000000 | 1.000000 | 1.000000 |
183
+ | Macedonian (mkd) | 1.000000 | 0.981818 | 0.990826 |
184
+ | Malagasy (mlg) | 0.981132 | 1.000000 | 0.990476 |
185
+ | Maltese (mlt) | 0.982456 | 1.000000 | 0.991150 |
186
+ | Min Nan Chinese (nan) | 1.000000 | 1.000000 | 1.000000 |
187
+ | Mongolian (mon) | 1.000000 | 0.981818 | 0.990826 |
188
+ | Maori (mri) | 1.000000 | 1.000000 | 1.000000 |
189
+ | Western Mari (mrj) | 0.982456 | 1.000000 | 0.991150 |
190
+ | Malay (msa) | 0.862069 | 0.892857 | 0.877193 |
191
+ | Mirandese (mwl) | 1.000000 | 0.982143 | 0.990991 |
192
+ | Burmese (mya) | 1.000000 | 1.000000 | 1.000000 |
193
+ | Erzya (myv) | 0.818182 | 0.964286 | 0.885246 |
194
+ | Mazanderani (mzn) | 0.981481 | 1.000000 | 0.990654 |
195
+ | Neapolitan (nap) | 1.000000 | 0.981818 | 0.990826 |
196
+ | Navajo (nav) | 1.000000 | 1.000000 | 1.000000 |
197
+ | Classical Nahuatl (nci) | 0.981481 | 0.946429 | 0.963636 |
198
+ | Low German (nds) | 0.982143 | 0.982143 | 0.982143 |
199
+ | West Low German (nds-nl) | 1.000000 | 1.000000 | 1.000000 |
200
+ | Nepali (macrolanguage) (nep) | 0.881356 | 0.928571 | 0.904348 |
201
+ | Newari (new) | 1.000000 | 0.909091 | 0.952381 |
202
+ | Dutch (nld) | 0.982143 | 0.982143 | 0.982143 |
203
+ | Norwegian Nynorsk (nno) | 1.000000 | 1.000000 | 1.000000 |
204
+ | Bokmål (nob) | 1.000000 | 1.000000 | 1.000000 |
205
+ | Narom (nrm) | 0.981818 | 0.964286 | 0.972973 |
206
+ | Northern Sotho (nso) | 1.000000 | 1.000000 | 1.000000 |
207
+ | Occitan (oci) | 0.903846 | 0.839286 | 0.870370 |
208
+ | Livvi-Karelian (olo) | 0.982456 | 1.000000 | 0.991150 |
209
+ | Oriya (ori) | 0.964912 | 0.982143 | 0.973451 |
210
+ | Oromo (orm) | 0.982143 | 0.982143 | 0.982143 |
211
+ | Ossetian (oss) | 0.982143 | 1.000000 | 0.990991 |
212
+ | Pangasinan (pag) | 0.980000 | 0.875000 | 0.924528 |
213
+ | Pampanga (pam) | 0.928571 | 0.896552 | 0.912281 |
214
+ | Panjabi (pan) | 1.000000 | 1.000000 | 1.000000 |
215
+ | Papiamento (pap) | 1.000000 | 0.964286 | 0.981818 |
216
+ | Picard (pcd) | 0.849057 | 0.849057 | 0.849057 |
217
+ | Pennsylvania German (pdc) | 0.854839 | 0.946429 | 0.898305 |
218
+ | Palatine German (pfl) | 0.946429 | 0.946429 | 0.946429 |
219
+ | Western Panjabi (pnb) | 0.981132 | 0.962963 | 0.971963 |
220
+ | Polish (pol) | 0.933333 | 1.000000 | 0.965517 |
221
+ | Portuguese (por) | 0.774648 | 0.982143 | 0.866142 |
222
+ | Pushto (pus) | 1.000000 | 0.910714 | 0.953271 |
223
+ | Quechua (que) | 0.962963 | 0.928571 | 0.945455 |
224
+ | Tarantino dialect (roa-tara) | 1.000000 | 0.964286 | 0.981818 |
225
+ | Romansh (roh) | 1.000000 | 0.928571 | 0.962963 |
226
+ | Romanian (ron) | 0.965517 | 1.000000 | 0.982456 |
227
+ | Rusyn (rue) | 0.946429 | 0.946429 | 0.946429 |
228
+ | Aromanian (rup) | 0.962963 | 0.928571 | 0.945455 |
229
+ | Russian (rus) | 0.859375 | 0.982143 | 0.916667 |
230
+ | Yakut (sah) | 1.000000 | 0.982143 | 0.990991 |
231
+ | Sanskrit (san) | 0.982143 | 0.982143 | 0.982143 |
232
+ | Sicilian (scn) | 1.000000 | 1.000000 | 1.000000 |
233
+ | Scots (sco) | 0.982143 | 0.982143 | 0.982143 |
234
+ | Samogitian (sgs) | 1.000000 | 0.982143 | 0.990991 |
235
+ | Sinhala (sin) | 0.964912 | 0.982143 | 0.973451 |
236
+ | Slovak (slk) | 1.000000 | 0.982143 | 0.990991 |
237
+ | Slovene (slv) | 1.000000 | 0.981818 | 0.990826 |
238
+ | Northern Sami (sme) | 0.962264 | 0.962264 | 0.962264 |
239
+ | Shona (sna) | 0.933333 | 1.000000 | 0.965517 |
240
+ | Sindhi (snd) | 1.000000 | 1.000000 | 1.000000 |
241
+ | Somali (som) | 0.948276 | 1.000000 | 0.973451 |
242
+ | Spanish (spa) | 0.739130 | 0.910714 | 0.816000 |
243
+ | Albanian (sqi) | 0.982143 | 0.982143 | 0.982143 |
244
+ | Sardinian (srd) | 1.000000 | 0.982143 | 0.990991 |
245
+ | Sranan (srn) | 1.000000 | 1.000000 | 1.000000 |
246
+ | Serbian (srp) | 1.000000 | 0.946429 | 0.972477 |
247
+ | Saterfriesisch (stq) | 1.000000 | 0.964286 | 0.981818 |
248
+ | Sundanese (sun) | 1.000000 | 0.977273 | 0.988506 |
249
+ | Swahili (macrolanguage) (swa) | 1.000000 | 1.000000 | 1.000000 |
250
+ | Swedish (swe) | 1.000000 | 1.000000 | 1.000000 |
251
+ | Silesian (szl) | 1.000000 | 0.981481 | 0.990654 |
252
+ | Tamil (tam) | 0.982143 | 1.000000 | 0.990991 |
253
+ | Tatar (tat) | 1.000000 | 1.000000 | 1.000000 |
254
+ | Tulu (tcy) | 0.982456 | 1.000000 | 0.991150 |
255
+ | Telugu (tel) | 1.000000 | 0.920000 | 0.958333 |
256
+ | Tetum (tet) | 1.000000 | 0.964286 | 0.981818 |
257
+ | Tajik (tgk) | 1.000000 | 1.000000 | 1.000000 |
258
+ | Tagalog (tgl) | 1.000000 | 1.000000 | 1.000000 |
259
+ | Thai (tha) | 0.932203 | 0.982143 | 0.956522 |
260
+ | Tongan (ton) | 1.000000 | 0.964286 | 0.981818 |
261
+ | Tswana (tsn) | 1.000000 | 1.000000 | 1.000000 |
262
+ | Turkmen (tuk) | 1.000000 | 0.982143 | 0.990991 |
263
+ | Turkish (tur) | 0.901639 | 0.982143 | 0.940171 |
264
+ | Tuvan (tyv) | 1.000000 | 0.964286 | 0.981818 |
265
+ | Udmurt (udm) | 1.000000 | 0.982143 | 0.990991 |
266
+ | Uighur (uig) | 1.000000 | 0.982143 | 0.990991 |
267
+ | Ukrainian (ukr) | 0.963636 | 0.946429 | 0.954955 |
268
+ | Urdu (urd) | 1.000000 | 0.982143 | 0.990991 |
269
+ | Uzbek (uzb) | 1.000000 | 1.000000 | 1.000000 |
270
+ | Venetian (vec) | 1.000000 | 0.982143 | 0.990991 |
271
+ | Veps (vep) | 0.982456 | 1.000000 | 0.991150 |
272
+ | Vietnamese (vie) | 0.964912 | 0.982143 | 0.973451 |
273
+ | Vlaams (vls) | 1.000000 | 0.982143 | 0.990991 |
274
+ | Volapük (vol) | 1.000000 | 1.000000 | 1.000000 |
275
+ | Võro (vro) | 0.964286 | 0.964286 | 0.964286 |
276
+ | Waray (war) | 1.000000 | 0.982143 | 0.990991 |
277
+ | Walloon (wln) | 1.000000 | 1.000000 | 1.000000 |
278
+ | Wolof (wol) | 0.981481 | 0.963636 | 0.972477 |
279
+ | Wu Chinese (wuu) | 0.981481 | 0.946429 | 0.963636 |
280
+ | Xhosa (xho) | 1.000000 | 0.964286 | 0.981818 |
281
+ | Mingrelian (xmf) | 1.000000 | 0.964286 | 0.981818 |
282
+ | Yiddish (yid) | 1.000000 | 1.000000 | 1.000000 |
283
+ | Yoruba (yor) | 0.964912 | 0.982143 | 0.973451 |
284
+ | Zeeuws (zea) | 1.000000 | 0.982143 | 0.990991 |
285
+ | Cantonese (zh-yue) | 0.981481 | 0.946429 | 0.963636 |
286
+ | Standard Chinese (zho) | 0.932203 | 0.982143 | 0.956522 |
287
+ | accuracy | 0.963055 | 0.963055 | 0.963055 |
288
+ | macro avg | 0.966424 | 0.963216 | 0.963891 |
289
+ | weighted avg | 0.966040 | 0.963055 | 0.963606 |
290
+
291
+ ### By Sentence
292
+
293
+ | language | precision | recall | f1-score |
294
+ |:--------------------------------------:|:---------:|:--------:|:--------:|
295
+ | Achinese (ace) | 0.754545 | 0.873684 | 0.809756 |
296
+ | Afrikaans (afr) | 0.708955 | 0.940594 | 0.808511 |
297
+ | Alemannic German (als) | 0.870130 | 0.752809 | 0.807229 |
298
+ | Amharic (amh) | 1.000000 | 0.820000 | 0.901099 |
299
+ | Old English (ang) | 0.966667 | 0.906250 | 0.935484 |
300
+ | Arabic (ara) | 0.907692 | 0.967213 | 0.936508 |
301
+ | Aragonese (arg) | 0.921569 | 0.959184 | 0.940000 |
302
+ | Egyptian Arabic (arz) | 0.964286 | 0.843750 | 0.900000 |
303
+ | Assamese (asm) | 0.964286 | 0.870968 | 0.915254 |
304
+ | Asturian (ast) | 0.880000 | 0.795181 | 0.835443 |
305
+ | Avar (ava) | 0.864198 | 0.843373 | 0.853659 |
306
+ | Aymara (aym) | 1.000000 | 0.901961 | 0.948454 |
307
+ | South Azerbaijani (azb) | 0.979381 | 0.989583 | 0.984456 |
308
+ | Azerbaijani (aze) | 0.989899 | 0.960784 | 0.975124 |
309
+ | Bashkir (bak) | 0.837209 | 0.857143 | 0.847059 |
310
+ | Bavarian (bar) | 0.741935 | 0.766667 | 0.754098 |
311
+ | Central Bikol (bcl) | 0.962963 | 0.928571 | 0.945455 |
312
+ | Belarusian (Taraschkewiza) (be-tarask) | 0.857143 | 0.733333 | 0.790419 |
313
+ | Belarusian (bel) | 0.775510 | 0.752475 | 0.763819 |
314
+ | Bengali (ben) | 0.861111 | 0.911765 | 0.885714 |
315
+ | Bhojpuri (bho) | 0.965517 | 0.933333 | 0.949153 |
316
+ | Banjar (bjn) | 0.891566 | 0.880952 | 0.886228 |
317
+ | Tibetan (bod) | 1.000000 | 1.000000 | 1.000000 |
318
+ | Bosnian (bos) | 0.375000 | 0.323077 | 0.347107 |
319
+ | Bishnupriya (bpy) | 0.986301 | 1.000000 | 0.993103 |
320
+ | Breton (bre) | 0.951613 | 0.893939 | 0.921875 |
321
+ | Bulgarian (bul) | 0.945055 | 0.877551 | 0.910053 |
322
+ | Buryat (bxr) | 0.955556 | 0.843137 | 0.895833 |
323
+ | Catalan (cat) | 0.692308 | 0.750000 | 0.720000 |
324
+ | Chavacano (cbk) | 0.842857 | 0.641304 | 0.728395 |
325
+ | Min Dong (cdo) | 0.972973 | 1.000000 | 0.986301 |
326
+ | Cebuano (ceb) | 0.981308 | 0.954545 | 0.967742 |
327
+ | Czech (ces) | 0.944444 | 0.915385 | 0.929687 |
328
+ | Chechen (che) | 0.875000 | 0.700000 | 0.777778 |
329
+ | Cherokee (chr) | 1.000000 | 0.970588 | 0.985075 |
330
+ | Chuvash (chv) | 0.875000 | 0.836957 | 0.855556 |
331
+ | Central Kurdish (ckb) | 1.000000 | 0.983051 | 0.991453 |
332
+ | Cornish (cor) | 0.979592 | 0.969697 | 0.974619 |
333
+ | Corsican (cos) | 0.986842 | 0.925926 | 0.955414 |
334
+ | Crimean Tatar (crh) | 0.958333 | 0.907895 | 0.932432 |
335
+ | Kashubian (csb) | 0.920354 | 0.904348 | 0.912281 |
336
+ | Welsh (cym) | 0.971014 | 0.943662 | 0.957143 |
337
+ | Danish (dan) | 0.865169 | 0.777778 | 0.819149 |
338
+ | German (deu) | 0.721311 | 0.822430 | 0.768559 |
339
+ | Dimli (diq) | 0.915966 | 0.923729 | 0.919831 |
340
+ | Dhivehi (div) | 1.000000 | 0.991228 | 0.995595 |
341
+ | Lower Sorbian (dsb) | 0.898876 | 0.879121 | 0.888889 |
342
+ | Doteli (dty) | 0.821429 | 0.638889 | 0.718750 |
343
+ | Emilian (egl) | 0.988095 | 0.922222 | 0.954023 |
344
+ | Modern Greek (ell) | 0.988636 | 0.966667 | 0.977528 |
345
+ | English (eng) | 0.522727 | 0.784091 | 0.627273 |
346
+ | Esperanto (epo) | 0.963855 | 0.930233 | 0.946746 |
347
+ | Estonian (est) | 0.922222 | 0.873684 | 0.897297 |
348
+ | Basque (eus) | 1.000000 | 0.941176 | 0.969697 |
349
+ | Extremaduran (ext) | 0.925373 | 0.885714 | 0.905109 |
350
+ | Faroese (fao) | 0.855072 | 0.887218 | 0.870849 |
351
+ | Persian (fas) | 0.879630 | 0.979381 | 0.926829 |
352
+ | Finnish (fin) | 0.952830 | 0.943925 | 0.948357 |
353
+ | French (fra) | 0.676768 | 0.943662 | 0.788235 |
354
+ | Arpitan (frp) | 0.867925 | 0.807018 | 0.836364 |
355
+ | Western Frisian (fry) | 0.956989 | 0.890000 | 0.922280 |
356
+ | Friulian (fur) | 1.000000 | 0.857143 | 0.923077 |
357
+ | Gagauz (gag) | 0.939024 | 0.802083 | 0.865169 |
358
+ | Scottish Gaelic (gla) | 1.000000 | 0.879121 | 0.935673 |
359
+ | Irish (gle) | 0.989247 | 0.958333 | 0.973545 |
360
+ | Galician (glg) | 0.910256 | 0.922078 | 0.916129 |
361
+ | Gilaki (glk) | 0.964706 | 0.872340 | 0.916201 |
362
+ | Manx (glv) | 1.000000 | 0.965517 | 0.982456 |
363
+ | Guarani (grn) | 0.983333 | 1.000000 | 0.991597 |
364
+ | Gujarati (guj) | 1.000000 | 0.991525 | 0.995745 |
365
+ | Hakka Chinese (hak) | 0.955224 | 0.955224 | 0.955224 |
366
+ | Haitian Creole (hat) | 0.833333 | 0.666667 | 0.740741 |
367
+ | Hausa (hau) | 0.936709 | 0.913580 | 0.925000 |
368
+ | Serbo-Croatian (hbs) | 0.452830 | 0.410256 | 0.430493 |
369
+ | Hebrew (heb) | 0.988235 | 0.976744 | 0.982456 |
370
+ | Fiji Hindi (hif) | 0.936709 | 0.840909 | 0.886228 |
371
+ | Hindi (hin) | 0.965517 | 0.756757 | 0.848485 |
372
+ | Croatian (hrv) | 0.443820 | 0.537415 | 0.486154 |
373
+ | Upper Sorbian (hsb) | 0.951613 | 0.830986 | 0.887218 |
374
+ | Hungarian (hun) | 0.854701 | 0.909091 | 0.881057 |
375
+ | Armenian (hye) | 1.000000 | 0.816327 | 0.898876 |
376
+ | Igbo (ibo) | 0.974359 | 0.926829 | 0.950000 |
377
+ | Ido (ido) | 0.975000 | 0.987342 | 0.981132 |
378
+ | Interlingue (ile) | 0.880597 | 0.921875 | 0.900763 |
379
+ | Iloko (ilo) | 0.882353 | 0.821918 | 0.851064 |
380
+ | Interlingua (ina) | 0.952381 | 0.895522 | 0.923077 |
381
+ | Indonesian (ind) | 0.606383 | 0.695122 | 0.647727 |
382
+ | Icelandic (isl) | 0.978261 | 0.882353 | 0.927835 |
383
+ | Italian (ita) | 0.910448 | 0.910448 | 0.910448 |
384
+ | Jamaican Patois (jam) | 0.988764 | 0.967033 | 0.977778 |
385
+ | Javanese (jav) | 0.903614 | 0.862069 | 0.882353 |
386
+ | Lojban (jbo) | 0.943878 | 0.929648 | 0.936709 |
387
+ | Japanese (jpn) | 1.000000 | 0.764706 | 0.866667 |
388
+ | Karakalpak (kaa) | 0.940171 | 0.901639 | 0.920502 |
389
+ | Kabyle (kab) | 0.985294 | 0.837500 | 0.905405 |
390
+ | Kannada (kan) | 0.975806 | 0.975806 | 0.975806 |
391
+ | Georgian (kat) | 0.953704 | 0.903509 | 0.927928 |
392
+ | Kazakh (kaz) | 0.934579 | 0.877193 | 0.904977 |
393
+ | Kabardian (kbd) | 0.987952 | 0.953488 | 0.970414 |
394
+ | Central Khmer (khm) | 0.928571 | 0.829787 | 0.876404 |
395
+ | Kinyarwanda (kin) | 0.953125 | 0.938462 | 0.945736 |
396
+ | Kirghiz (kir) | 0.927632 | 0.881250 | 0.903846 |
397
+ | Komi-Permyak (koi) | 0.750000 | 0.776786 | 0.763158 |
398
+ | Konkani (kok) | 0.893491 | 0.872832 | 0.883041 |
399
+ | Komi (kom) | 0.734177 | 0.690476 | 0.711656 |
400
+ | Korean (kor) | 0.989899 | 0.989899 | 0.989899 |
401
+ | Karachay-Balkar (krc) | 0.928571 | 0.917647 | 0.923077 |
402
+ | Ripuarisch (ksh) | 0.915789 | 0.896907 | 0.906250 |
403
+ | Kurdish (kur) | 0.977528 | 0.935484 | 0.956044 |
404
+ | Ladino (lad) | 0.985075 | 0.904110 | 0.942857 |
405
+ | Lao (lao) | 0.896552 | 0.812500 | 0.852459 |
406
+ | Latin (lat) | 0.741935 | 0.831325 | 0.784091 |
407
+ | Latvian (lav) | 0.710526 | 0.878049 | 0.785455 |
408
+ | Lezghian (lez) | 0.975309 | 0.877778 | 0.923977 |
409
+ | Ligurian (lij) | 0.951807 | 0.897727 | 0.923977 |
410
+ | Limburgan (lim) | 0.909091 | 0.921053 | 0.915033 |
411
+ | Lingala (lin) | 0.942857 | 0.814815 | 0.874172 |
412
+ | Lithuanian (lit) | 0.892857 | 0.925926 | 0.909091 |
413
+ | Lombard (lmo) | 0.766234 | 0.951613 | 0.848921 |
414
+ | Northern Luri (lrc) | 0.972222 | 0.875000 | 0.921053 |
415
+ | Latgalian (ltg) | 0.895349 | 0.865169 | 0.880000 |
416
+ | Luxembourgish (ltz) | 0.882353 | 0.750000 | 0.810811 |
417
+ | Luganda (lug) | 0.946429 | 0.883333 | 0.913793 |
418
+ | Literary Chinese (lzh) | 1.000000 | 1.000000 | 1.000000 |
419
+ | Maithili (mai) | 0.893617 | 0.823529 | 0.857143 |
420
+ | Malayalam (mal) | 1.000000 | 0.975000 | 0.987342 |
421
+ | Banyumasan (map-bms) | 0.924242 | 0.772152 | 0.841379 |
422
+ | Marathi (mar) | 0.874126 | 0.919118 | 0.896057 |
423
+ | Moksha (mdf) | 0.771242 | 0.830986 | 0.800000 |
424
+ | Eastern Mari (mhr) | 0.820000 | 0.860140 | 0.839590 |
425
+ | Minangkabau (min) | 0.973684 | 0.973684 | 0.973684 |
426
+ | Macedonian (mkd) | 0.895652 | 0.953704 | 0.923767 |
427
+ | Malagasy (mlg) | 1.000000 | 0.966102 | 0.982759 |
428
+ | Maltese (mlt) | 0.987952 | 0.964706 | 0.976190 |
429
+ | Min Nan Chinese (nan) | 0.975000 | 1.000000 | 0.987342 |
430
+ | Mongolian (mon) | 0.954545 | 0.933333 | 0.943820 |
431
+ | Maori (mri) | 0.985294 | 1.000000 | 0.992593 |
432
+ | Western Mari (mrj) | 0.966292 | 0.914894 | 0.939891 |
433
+ | Malay (msa) | 0.770270 | 0.695122 | 0.730769 |
434
+ | Mirandese (mwl) | 0.970588 | 0.891892 | 0.929577 |
435
+ | Burmese (mya) | 1.000000 | 0.964286 | 0.981818 |
436
+ | Erzya (myv) | 0.535714 | 0.681818 | 0.600000 |
437
+ | Mazanderani (mzn) | 0.968750 | 0.898551 | 0.932331 |
438
+ | Neapolitan (nap) | 0.892308 | 0.865672 | 0.878788 |
439
+ | Navajo (nav) | 0.984375 | 0.984375 | 0.984375 |
440
+ | Classical Nahuatl (nci) | 0.901408 | 0.761905 | 0.825806 |
441
+ | Low German (nds) | 0.896226 | 0.913462 | 0.904762 |
442
+ | West Low German (nds-nl) | 0.873563 | 0.835165 | 0.853933 |
443
+ | Nepali (macrolanguage) (nep) | 0.704545 | 0.861111 | 0.775000 |
444
+ | Newari (new) | 0.920000 | 0.741935 | 0.821429 |
445
+ | Dutch (nld) | 0.925926 | 0.872093 | 0.898204 |
446
+ | Norwegian Nynorsk (nno) | 0.847059 | 0.808989 | 0.827586 |
447
+ | Bokmål (nob) | 0.861386 | 0.852941 | 0.857143 |
448
+ | Narom (nrm) | 0.966667 | 0.983051 | 0.974790 |
449
+ | Northern Sotho (nso) | 0.897436 | 0.921053 | 0.909091 |
450
+ | Occitan (oci) | 0.958333 | 0.696970 | 0.807018 |
451
+ | Livvi-Karelian (olo) | 0.967742 | 0.937500 | 0.952381 |
452
+ | Oriya (ori) | 0.933333 | 1.000000 | 0.965517 |
453
+ | Oromo (orm) | 0.977528 | 0.915789 | 0.945652 |
454
+ | Ossetian (oss) | 0.958333 | 0.841463 | 0.896104 |
455
+ | Pangasinan (pag) | 0.847328 | 0.909836 | 0.877470 |
456
+ | Pampanga (pam) | 0.969697 | 0.780488 | 0.864865 |
457
+ | Panjabi (pan) | 1.000000 | 1.000000 | 1.000000 |
458
+ | Papiamento (pap) | 0.876190 | 0.920000 | 0.897561 |
459
+ | Picard (pcd) | 0.707317 | 0.568627 | 0.630435 |
460
+ | Pennsylvania German (pdc) | 0.827273 | 0.827273 | 0.827273 |
461
+ | Palatine German (pfl) | 0.882353 | 0.914634 | 0.898204 |
462
+ | Western Panjabi (pnb) | 0.964286 | 0.931034 | 0.947368 |
463
+ | Polish (pol) | 0.859813 | 0.910891 | 0.884615 |
464
+ | Portuguese (por) | 0.535714 | 0.833333 | 0.652174 |
465
+ | Pushto (pus) | 0.989362 | 0.902913 | 0.944162 |
466
+ | Quechua (que) | 0.979167 | 0.903846 | 0.940000 |
467
+ | Tarantino dialect (roa-tara) | 0.964912 | 0.901639 | 0.932203 |
468
+ | Romansh (roh) | 0.914894 | 0.895833 | 0.905263 |
469
+ | Romanian (ron) | 0.880597 | 0.880597 | 0.880597 |
470
+ | Rusyn (rue) | 0.932584 | 0.805825 | 0.864583 |
471
+ | Aromanian (rup) | 0.783333 | 0.758065 | 0.770492 |
472
+ | Russian (rus) | 0.517986 | 0.765957 | 0.618026 |
473
+ | Yakut (sah) | 0.954023 | 0.922222 | 0.937853 |
474
+ | Sanskrit (san) | 0.866667 | 0.951220 | 0.906977 |
475
+ | Sicilian (scn) | 0.984375 | 0.940299 | 0.961832 |
476
+ | Scots (sco) | 0.851351 | 0.900000 | 0.875000 |
477
+ | Samogitian (sgs) | 0.977011 | 0.876289 | 0.923913 |
478
+ | Sinhala (sin) | 0.406154 | 0.985075 | 0.575163 |
479
+ | Slovak (slk) | 0.956989 | 0.872549 | 0.912821 |
480
+ | Slovene (slv) | 0.907216 | 0.854369 | 0.880000 |
481
+ | Northern Sami (sme) | 0.949367 | 0.892857 | 0.920245 |
482
+ | Shona (sna) | 0.936508 | 0.855072 | 0.893939 |
483
+ | Sindhi (snd) | 0.984962 | 0.992424 | 0.988679 |
484
+ | Somali (som) | 0.949153 | 0.848485 | 0.896000 |
485
+ | Spanish (spa) | 0.584158 | 0.746835 | 0.655556 |
486
+ | Albanian (sqi) | 0.988095 | 0.912088 | 0.948571 |
487
+ | Sardinian (srd) | 0.957746 | 0.931507 | 0.944444 |
488
+ | Sranan (srn) | 0.985714 | 0.945205 | 0.965035 |
489
+ | Serbian (srp) | 0.950980 | 0.889908 | 0.919431 |
490
+ | Saterfriesisch (stq) | 0.962500 | 0.875000 | 0.916667 |
491
+ | Sundanese (sun) | 0.778846 | 0.910112 | 0.839378 |
492
+ | Swahili (macrolanguage) (swa) | 0.915493 | 0.878378 | 0.896552 |
493
+ | Swedish (swe) | 0.989247 | 0.958333 | 0.973545 |
494
+ | Silesian (szl) | 0.944444 | 0.904255 | 0.923913 |
495
+ | Tamil (tam) | 0.990000 | 0.970588 | 0.980198 |
496
+ | Tatar (tat) | 0.942029 | 0.902778 | 0.921986 |
497
+ | Tulu (tcy) | 0.980519 | 0.967949 | 0.974194 |
498
+ | Telugu (tel) | 0.965986 | 0.965986 | 0.965986 |
499
+ | Tetum (tet) | 0.898734 | 0.855422 | 0.876543 |
500
+ | Tajik (tgk) | 0.974684 | 0.939024 | 0.956522 |
501
+ | Tagalog (tgl) | 0.965909 | 0.934066 | 0.949721 |
502
+ | Thai (tha) | 0.923077 | 0.882353 | 0.902256 |
503
+ | Tongan (ton) | 0.970149 | 0.890411 | 0.928571 |
504
+ | Tswana (tsn) | 0.888889 | 0.926316 | 0.907216 |
505
+ | Turkmen (tuk) | 0.968000 | 0.889706 | 0.927203 |
506
+ | Turkish (tur) | 0.871287 | 0.926316 | 0.897959 |
507
+ | Tuvan (tyv) | 0.948454 | 0.859813 | 0.901961 |
508
+ | Udmurt (udm) | 0.989362 | 0.894231 | 0.939394 |
509
+ | Uighur (uig) | 1.000000 | 0.953333 | 0.976109 |
510
+ | Ukrainian (ukr) | 0.893617 | 0.875000 | 0.884211 |
511
+ | Urdu (urd) | 1.000000 | 1.000000 | 1.000000 |
512
+ | Uzbek (uzb) | 0.636042 | 0.886700 | 0.740741 |
513
+ | Venetian (vec) | 1.000000 | 0.941176 | 0.969697 |
514
+ | Veps (vep) | 0.858586 | 0.965909 | 0.909091 |
515
+ | Vietnamese (vie) | 1.000000 | 0.940476 | 0.969325 |
516
+ | Vlaams (vls) | 0.885714 | 0.898551 | 0.892086 |
517
+ | Volapük (vol) | 0.975309 | 0.975309 | 0.975309 |
518
+ | Võro (vro) | 0.855670 | 0.864583 | 0.860104 |
519
+ | Waray (war) | 0.972222 | 0.909091 | 0.939597 |
520
+ | Walloon (wln) | 0.742138 | 0.893939 | 0.810997 |
521
+ | Wolof (wol) | 0.882979 | 0.954023 | 0.917127 |
522
+ | Wu Chinese (wuu) | 0.961538 | 0.833333 | 0.892857 |
523
+ | Xhosa (xho) | 0.934066 | 0.867347 | 0.899471 |
524
+ | Mingrelian (xmf) | 0.958333 | 0.929293 | 0.943590 |
525
+ | Yiddish (yid) | 0.984375 | 0.875000 | 0.926471 |
526
+ | Yoruba (yor) | 0.868421 | 0.857143 | 0.862745 |
527
+ | Zeeuws (zea) | 0.879518 | 0.793478 | 0.834286 |
528
+ | Cantonese (zh-yue) | 0.896552 | 0.812500 | 0.852459 |
529
+ | Standard Chinese (zho) | 0.906250 | 0.935484 | 0.920635 |
530
+ | accuracy | 0.881051 | 0.881051 | 0.881051 |
531
+ | macro avg | 0.903245 | 0.880618 | 0.888996 |
532
+ | weighted avg | 0.894174 | 0.881051 | 0.884520 |
533
+
534
+ ### By Token (3 to 5)
535
+
536
+ | language | precision | recall | f1-score |
537
+ |:--------------------------------------:|:---------:|:--------:|:--------:|
538
+ | Achinese (ace) | 0.873846 | 0.827988 | 0.850299 |
539
+ | Afrikaans (afr) | 0.638060 | 0.732334 | 0.681954 |
540
+ | Alemannic German (als) | 0.673780 | 0.547030 | 0.603825 |
541
+ | Amharic (amh) | 0.997743 | 0.954644 | 0.975717 |
542
+ | Old English (ang) | 0.840816 | 0.693603 | 0.760148 |
543
+ | Arabic (ara) | 0.768737 | 0.840749 | 0.803132 |
544
+ | Aragonese (arg) | 0.493671 | 0.505181 | 0.499360 |
545
+ | Egyptian Arabic (arz) | 0.823529 | 0.741935 | 0.780606 |
546
+ | Assamese (asm) | 0.948454 | 0.893204 | 0.920000 |
547
+ | Asturian (ast) | 0.490000 | 0.508299 | 0.498982 |
548
+ | Avar (ava) | 0.813636 | 0.655678 | 0.726166 |
549
+ | Aymara (aym) | 0.795833 | 0.779592 | 0.787629 |
550
+ | South Azerbaijani (azb) | 0.832836 | 0.863777 | 0.848024 |
551
+ | Azerbaijani (aze) | 0.867470 | 0.800000 | 0.832370 |
552
+ | Bashkir (bak) | 0.851852 | 0.750000 | 0.797688 |
553
+ | Bavarian (bar) | 0.560897 | 0.522388 | 0.540958 |
554
+ | Central Bikol (bcl) | 0.708229 | 0.668235 | 0.687651 |
555
+ | Belarusian (Taraschkewiza) (be-tarask) | 0.615635 | 0.526462 | 0.567568 |
556
+ | Belarusian (bel) | 0.539952 | 0.597855 | 0.567430 |
557
+ | Bengali (ben) | 0.830275 | 0.885086 | 0.856805 |
558
+ | Bhojpuri (bho) | 0.723118 | 0.691517 | 0.706965 |
559
+ | Banjar (bjn) | 0.619586 | 0.726269 | 0.668699 |
560
+ | Tibetan (bod) | 0.999537 | 0.991728 | 0.995617 |
561
+ | Bosnian (bos) | 0.330849 | 0.403636 | 0.363636 |
562
+ | Bishnupriya (bpy) | 0.941634 | 0.949020 | 0.945312 |
563
+ | Breton (bre) | 0.772222 | 0.745308 | 0.758527 |
564
+ | Bulgarian (bul) | 0.771505 | 0.706897 | 0.737789 |
565
+ | Buryat (bxr) | 0.741935 | 0.753149 | 0.747500 |
566
+ | Catalan (cat) | 0.528716 | 0.610136 | 0.566516 |
567
+ | Chavacano (cbk) | 0.409449 | 0.312625 | 0.354545 |
568
+ | Min Dong (cdo) | 0.951264 | 0.936057 | 0.943599 |
569
+ | Cebuano (ceb) | 0.888298 | 0.876640 | 0.882431 |
570
+ | Czech (ces) | 0.806045 | 0.758294 | 0.781441 |
571
+ | Chechen (che) | 0.857143 | 0.600000 | 0.705882 |
572
+ | Cherokee (chr) | 0.997840 | 0.952577 | 0.974684 |
573
+ | Chuvash (chv) | 0.874346 | 0.776744 | 0.822660 |
574
+ | Central Kurdish (ckb) | 0.984848 | 0.953545 | 0.968944 |
575
+ | Cornish (cor) | 0.747596 | 0.807792 | 0.776529 |
576
+ | Corsican (cos) | 0.673913 | 0.708571 | 0.690808 |
577
+ | Crimean Tatar (crh) | 0.498801 | 0.700337 | 0.582633 |
578
+ | Kashubian (csb) | 0.797059 | 0.794721 | 0.795888 |
579
+ | Welsh (cym) | 0.829609 | 0.841360 | 0.835443 |
580
+ | Danish (dan) | 0.649789 | 0.622222 | 0.635707 |
581
+ | German (deu) | 0.559406 | 0.763514 | 0.645714 |
582
+ | Dimli (diq) | 0.835580 | 0.763547 | 0.797941 |
583
+ | Dhivehi (div) | 1.000000 | 0.980645 | 0.990228 |
584
+ | Lower Sorbian (dsb) | 0.740484 | 0.694805 | 0.716918 |
585
+ | Doteli (dty) | 0.616314 | 0.527132 | 0.568245 |
586
+ | Emilian (egl) | 0.822993 | 0.769625 | 0.795414 |
587
+ | Modern Greek (ell) | 0.972043 | 0.963753 | 0.967880 |
588
+ | English (eng) | 0.260492 | 0.724346 | 0.383183 |
589
+ | Esperanto (epo) | 0.766764 | 0.716621 | 0.740845 |
590
+ | Estonian (est) | 0.698885 | 0.673835 | 0.686131 |
591
+ | Basque (eus) | 0.882716 | 0.841176 | 0.861446 |
592
+ | Extremaduran (ext) | 0.570605 | 0.511628 | 0.539510 |
593
+ | Faroese (fao) | 0.773987 | 0.784017 | 0.778970 |
594
+ | Persian (fas) | 0.709836 | 0.809346 | 0.756332 |
595
+ | Finnish (fin) | 0.866261 | 0.796089 | 0.829694 |
596
+ | French (fra) | 0.496263 | 0.700422 | 0.580927 |
597
+ | Arpitan (frp) | 0.663366 | 0.584302 | 0.621329 |
598
+ | Western Frisian (fry) | 0.750000 | 0.756148 | 0.753061 |
599
+ | Friulian (fur) | 0.713555 | 0.675545 | 0.694030 |
600
+ | Gagauz (gag) | 0.728125 | 0.677326 | 0.701807 |
601
+ | Scottish Gaelic (gla) | 0.831601 | 0.817996 | 0.824742 |
602
+ | Irish (gle) | 0.868852 | 0.801296 | 0.833708 |
603
+ | Galician (glg) | 0.469816 | 0.454315 | 0.461935 |
604
+ | Gilaki (glk) | 0.703883 | 0.687204 | 0.695444 |
605
+ | Manx (glv) | 0.873047 | 0.886905 | 0.879921 |
606
+ | Guarani (grn) | 0.848580 | 0.793510 | 0.820122 |
607
+ | Gujarati (guj) | 0.995643 | 0.926978 | 0.960084 |
608
+ | Hakka Chinese (hak) | 0.898403 | 0.904971 | 0.901675 |
609
+ | Haitian Creole (hat) | 0.719298 | 0.518987 | 0.602941 |
610
+ | Hausa (hau) | 0.815353 | 0.829114 | 0.822176 |
611
+ | Serbo-Croatian (hbs) | 0.343465 | 0.244589 | 0.285714 |
612
+ | Hebrew (heb) | 0.891304 | 0.933941 | 0.912125 |
613
+ | Fiji Hindi (hif) | 0.662577 | 0.664615 | 0.663594 |
614
+ | Hindi (hin) | 0.782301 | 0.778169 | 0.780229 |
615
+ | Croatian (hrv) | 0.360308 | 0.374000 | 0.367026 |
616
+ | Upper Sorbian (hsb) | 0.745763 | 0.611111 | 0.671756 |
617
+ | Hungarian (hun) | 0.876812 | 0.846154 | 0.861210 |
618
+ | Armenian (hye) | 0.988201 | 0.917808 | 0.951705 |
619
+ | Igbo (ibo) | 0.825397 | 0.696429 | 0.755448 |
620
+ | Ido (ido) | 0.760479 | 0.814103 | 0.786378 |
621
+ | Interlingue (ile) | 0.701299 | 0.580645 | 0.635294 |
622
+ | Iloko (ilo) | 0.688356 | 0.844538 | 0.758491 |
623
+ | Interlingua (ina) | 0.577889 | 0.588235 | 0.583016 |
624
+ | Indonesian (ind) | 0.415879 | 0.514019 | 0.459770 |
625
+ | Icelandic (isl) | 0.855263 | 0.790754 | 0.821745 |
626
+ | Italian (ita) | 0.474576 | 0.561247 | 0.514286 |
627
+ | Jamaican Patois (jam) | 0.826087 | 0.791667 | 0.808511 |
628
+ | Javanese (jav) | 0.670130 | 0.658163 | 0.664093 |
629
+ | Lojban (jbo) | 0.896861 | 0.917431 | 0.907029 |
630
+ | Japanese (jpn) | 0.931373 | 0.848214 | 0.887850 |
631
+ | Karakalpak (kaa) | 0.790393 | 0.827744 | 0.808637 |
632
+ | Kabyle (kab) | 0.828571 | 0.759162 | 0.792350 |
633
+ | Kannada (kan) | 0.879357 | 0.847545 | 0.863158 |
634
+ | Georgian (kat) | 0.916399 | 0.907643 | 0.912000 |
635
+ | Kazakh (kaz) | 0.900901 | 0.819672 | 0.858369 |
636
+ | Kabardian (kbd) | 0.923345 | 0.892256 | 0.907534 |
637
+ | Central Khmer (khm) | 0.976667 | 0.816156 | 0.889226 |
638
+ | Kinyarwanda (kin) | 0.824324 | 0.726190 | 0.772152 |
639
+ | Kirghiz (kir) | 0.674766 | 0.779698 | 0.723447 |
640
+ | Komi-Permyak (koi) | 0.652830 | 0.633700 | 0.643123 |
641
+ | Konkani (kok) | 0.778865 | 0.728938 | 0.753075 |
642
+ | Komi (kom) | 0.737374 | 0.572549 | 0.644592 |
643
+ | Korean (kor) | 0.984615 | 0.967603 | 0.976035 |
644
+ | Karachay-Balkar (krc) | 0.869416 | 0.857627 | 0.863481 |
645
+ | Ripuarisch (ksh) | 0.709859 | 0.649485 | 0.678331 |
646
+ | Kurdish (kur) | 0.883777 | 0.862884 | 0.873206 |
647
+ | Ladino (lad) | 0.660920 | 0.576441 | 0.615797 |
648
+ | Lao (lao) | 0.986175 | 0.918455 | 0.951111 |
649
+ | Latin (lat) | 0.581250 | 0.636986 | 0.607843 |
650
+ | Latvian (lav) | 0.824513 | 0.797844 | 0.810959 |
651
+ | Lezghian (lez) | 0.898955 | 0.793846 | 0.843137 |
652
+ | Ligurian (lij) | 0.662903 | 0.677100 | 0.669927 |
653
+ | Limburgan (lim) | 0.615385 | 0.581818 | 0.598131 |
654
+ | Lingala (lin) | 0.836207 | 0.763780 | 0.798354 |
655
+ | Lithuanian (lit) | 0.756329 | 0.804714 | 0.779772 |
656
+ | Lombard (lmo) | 0.556818 | 0.536986 | 0.546722 |
657
+ | Northern Luri (lrc) | 0.838574 | 0.753296 | 0.793651 |
658
+ | Latgalian (ltg) | 0.759531 | 0.755102 | 0.757310 |
659
+ | Luxembourgish (ltz) | 0.645062 | 0.614706 | 0.629518 |
660
+ | Luganda (lug) | 0.787535 | 0.805797 | 0.796562 |
661
+ | Literary Chinese (lzh) | 0.921951 | 0.949749 | 0.935644 |
662
+ | Maithili (mai) | 0.777778 | 0.761658 | 0.769634 |
663
+ | Malayalam (mal) | 0.993377 | 0.949367 | 0.970874 |
664
+ | Banyumasan (map-bms) | 0.531429 | 0.453659 | 0.489474 |
665
+ | Marathi (mar) | 0.748744 | 0.818681 | 0.782152 |
666
+ | Moksha (mdf) | 0.728745 | 0.800000 | 0.762712 |
667
+ | Eastern Mari (mhr) | 0.790323 | 0.760870 | 0.775316 |
668
+ | Minangkabau (min) | 0.953271 | 0.886957 | 0.918919 |
669
+ | Macedonian (mkd) | 0.816399 | 0.849722 | 0.832727 |
670
+ | Malagasy (mlg) | 0.925187 | 0.918317 | 0.921739 |
671
+ | Maltese (mlt) | 0.869421 | 0.890017 | 0.879599 |
672
+ | Min Nan Chinese (nan) | 0.743707 | 0.820707 | 0.780312 |
673
+ | Mongolian (mon) | 0.852194 | 0.838636 | 0.845361 |
674
+ | Maori (mri) | 0.934726 | 0.937173 | 0.935948 |
675
+ | Western Mari (mrj) | 0.818792 | 0.827119 | 0.822934 |
676
+ | Malay (msa) | 0.508065 | 0.376119 | 0.432247 |
677
+ | Mirandese (mwl) | 0.650407 | 0.685225 | 0.667362 |
678
+ | Burmese (mya) | 0.995968 | 0.972441 | 0.984064 |
679
+ | Erzya (myv) | 0.475783 | 0.503012 | 0.489019 |
680
+ | Mazanderani (mzn) | 0.775362 | 0.701639 | 0.736661 |
681
+ | Neapolitan (nap) | 0.628993 | 0.595349 | 0.611708 |
682
+ | Navajo (nav) | 0.955882 | 0.937500 | 0.946602 |
683
+ | Classical Nahuatl (nci) | 0.679758 | 0.589005 | 0.631136 |
684
+ | Low German (nds) | 0.669789 | 0.690821 | 0.680143 |
685
+ | West Low German (nds-nl) | 0.513889 | 0.504545 | 0.509174 |
686
+ | Nepali (macrolanguage) (nep) | 0.640476 | 0.649758 | 0.645084 |
687
+ | Newari (new) | 0.928571 | 0.745902 | 0.827273 |
688
+ | Dutch (nld) | 0.553763 | 0.553763 | 0.553763 |
689
+ | Norwegian Nynorsk (nno) | 0.569277 | 0.519231 | 0.543103 |
690
+ | Bokmål (nob) | 0.519856 | 0.562500 | 0.540338 |
691
+ | Narom (nrm) | 0.691275 | 0.605882 | 0.645768 |
692
+ | Northern Sotho (nso) | 0.950276 | 0.815166 | 0.877551 |
693
+ | Occitan (oci) | 0.483444 | 0.366834 | 0.417143 |
694
+ | Livvi-Karelian (olo) | 0.816850 | 0.790780 | 0.803604 |
695
+ | Oriya (ori) | 0.981481 | 0.963636 | 0.972477 |
696
+ | Oromo (orm) | 0.885714 | 0.829218 | 0.856536 |
697
+ | Ossetian (oss) | 0.822006 | 0.855219 | 0.838284 |
698
+ | Pangasinan (pag) | 0.842105 | 0.715655 | 0.773748 |
699
+ | Pampanga (pam) | 0.770000 | 0.435028 | 0.555957 |
700
+ | Panjabi (pan) | 0.996154 | 0.984791 | 0.990440 |
701
+ | Papiamento (pap) | 0.674672 | 0.661670 | 0.668108 |
702
+ | Picard (pcd) | 0.407895 | 0.356322 | 0.380368 |
703
+ | Pennsylvania German (pdc) | 0.487047 | 0.509485 | 0.498013 |
704
+ | Palatine German (pfl) | 0.614173 | 0.570732 | 0.591656 |
705
+ | Western Panjabi (pnb) | 0.926267 | 0.887417 | 0.906426 |
706
+ | Polish (pol) | 0.797059 | 0.734417 | 0.764457 |
707
+ | Portuguese (por) | 0.500914 | 0.586724 | 0.540434 |
708
+ | Pushto (pus) | 0.941489 | 0.898477 | 0.919481 |
709
+ | Quechua (que) | 0.854167 | 0.797665 | 0.824950 |
710
+ | Tarantino dialect (roa-tara) | 0.669794 | 0.724138 | 0.695906 |
711
+ | Romansh (roh) | 0.745527 | 0.760649 | 0.753012 |
712
+ | Romanian (ron) | 0.805486 | 0.769048 | 0.786845 |
713
+ | Rusyn (rue) | 0.718543 | 0.645833 | 0.680251 |
714
+ | Aromanian (rup) | 0.288482 | 0.730245 | 0.413580 |
715
+ | Russian (rus) | 0.530120 | 0.690583 | 0.599805 |
716
+ | Yakut (sah) | 0.853521 | 0.865714 | 0.859574 |
717
+ | Sanskrit (san) | 0.931343 | 0.896552 | 0.913616 |
718
+ | Sicilian (scn) | 0.734139 | 0.618321 | 0.671271 |
719
+ | Scots (sco) | 0.571429 | 0.540816 | 0.555701 |
720
+ | Samogitian (sgs) | 0.829167 | 0.748120 | 0.786561 |
721
+ | Sinhala (sin) | 0.909474 | 0.935065 | 0.922092 |
722
+ | Slovak (slk) | 0.738235 | 0.665782 | 0.700139 |
723
+ | Slovene (slv) | 0.671123 | 0.662269 | 0.666667 |
724
+ | Northern Sami (sme) | 0.800676 | 0.825784 | 0.813036 |
725
+ | Shona (sna) | 0.761702 | 0.724696 | 0.742739 |
726
+ | Sindhi (snd) | 0.950172 | 0.946918 | 0.948542 |
727
+ | Somali (som) | 0.849462 | 0.802030 | 0.825065 |
728
+ | Spanish (spa) | 0.325234 | 0.413302 | 0.364017 |
729
+ | Albanian (sqi) | 0.875899 | 0.832479 | 0.853637 |
730
+ | Sardinian (srd) | 0.750000 | 0.711061 | 0.730012 |
731
+ | Sranan (srn) | 0.888889 | 0.771084 | 0.825806 |
732
+ | Serbian (srp) | 0.824561 | 0.814356 | 0.819427 |
733
+ | Saterfriesisch (stq) | 0.790087 | 0.734417 | 0.761236 |
734
+ | Sundanese (sun) | 0.764192 | 0.631769 | 0.691700 |
735
+ | Swahili (macrolanguage) (swa) | 0.763496 | 0.796247 | 0.779528 |
736
+ | Swedish (swe) | 0.838284 | 0.723647 | 0.776758 |
737
+ | Silesian (szl) | 0.819788 | 0.750809 | 0.783784 |
738
+ | Tamil (tam) | 0.985765 | 0.955172 | 0.970228 |
739
+ | Tatar (tat) | 0.469780 | 0.795349 | 0.590674 |
740
+ | Tulu (tcy) | 0.893300 | 0.873786 | 0.883436 |
741
+ | Telugu (tel) | 1.000000 | 0.913690 | 0.954899 |
742
+ | Tetum (tet) | 0.765116 | 0.744344 | 0.754587 |
743
+ | Tajik (tgk) | 0.828418 | 0.813158 | 0.820717 |
744
+ | Tagalog (tgl) | 0.751468 | 0.757396 | 0.754420 |
745
+ | Thai (tha) | 0.933884 | 0.807143 | 0.865900 |
746
+ | Tongan (ton) | 0.920245 | 0.923077 | 0.921659 |
747
+ | Tswana (tsn) | 0.873397 | 0.889070 | 0.881164 |
748
+ | Turkmen (tuk) | 0.898438 | 0.837887 | 0.867107 |
749
+ | Turkish (tur) | 0.666667 | 0.716981 | 0.690909 |
750
+ | Tuvan (tyv) | 0.857143 | 0.805063 | 0.830287 |
751
+ | Udmurt (udm) | 0.865517 | 0.756024 | 0.807074 |
752
+ | Uighur (uig) | 0.991597 | 0.967213 | 0.979253 |
753
+ | Ukrainian (ukr) | 0.771341 | 0.702778 | 0.735465 |
754
+ | Urdu (urd) | 0.877647 | 0.855505 | 0.866434 |
755
+ | Uzbek (uzb) | 0.655652 | 0.797040 | 0.719466 |
756
+ | Venetian (vec) | 0.611111 | 0.527233 | 0.566082 |
757
+ | Veps (vep) | 0.672862 | 0.688213 | 0.680451 |
758
+ | Vietnamese (vie) | 0.932406 | 0.914230 | 0.923228 |
759
+ | Vlaams (vls) | 0.594427 | 0.501305 | 0.543909 |
760
+ | Volapük (vol) | 0.765625 | 0.942308 | 0.844828 |
761
+ | Võro (vro) | 0.797203 | 0.740260 | 0.767677 |
762
+ | Waray (war) | 0.930876 | 0.930876 | 0.930876 |
763
+ | Walloon (wln) | 0.636804 | 0.693931 | 0.664141 |
764
+ | Wolof (wol) | 0.864220 | 0.845601 | 0.854809 |
765
+ | Wu Chinese (wuu) | 0.848921 | 0.830986 | 0.839858 |
766
+ | Xhosa (xho) | 0.837398 | 0.759214 | 0.796392 |
767
+ | Mingrelian (xmf) | 0.943396 | 0.874126 | 0.907441 |
768
+ | Yiddish (yid) | 0.955729 | 0.897311 | 0.925599 |
769
+ | Yoruba (yor) | 0.812010 | 0.719907 | 0.763190 |
770
+ | Zeeuws (zea) | 0.617737 | 0.550409 | 0.582133 |
771
+ | Cantonese (zh-yue) | 0.859649 | 0.649007 | 0.739623 |
772
+ | Standard Chinese (zho) | 0.845528 | 0.781955 | 0.812500 |
773
+ | accuracy | 0.749527 | 0.749527 | 0.749527 |
774
+ | macro avg | 0.762866 | 0.742101 | 0.749261 |
775
+ | weighted avg | 0.762006 | 0.749527 | 0.752910 |
776
+
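The three summary rows follow what appears to be the layout of scikit-learn's `classification_report`: `accuracy` is a single number repeated across the columns, `macro avg` is the unweighted mean over languages, and `weighted avg` weights each language by its number of test samples (its support). The tables here do not show support, so the sketch below uses made-up toy values only to illustrate the difference between the two averages. The function names are hypothetical.

```python
def macro_average(scores):
    """Unweighted mean: every class counts equally."""
    return sum(scores) / len(scores)


def weighted_average(scores, supports):
    """Mean weighted by the number of samples in each class."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total


# Toy values, NOT taken from the tables: three classes with unequal support.
toy_f1 = [0.90, 0.60, 0.30]
toy_support = [80, 15, 5]

macro = macro_average(toy_f1)                      # rare classes pull this down
weighted = weighted_average(toy_f1, toy_support)   # frequent classes dominate
print(macro, weighted)
```

When the two averages are close, as they are in the tables above, the per-language test sets are likely of similar size, though that cannot be confirmed from the reported columns alone.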
777
+
778
+ ## Questions?
779
+ Post a GitHub issue [here](https://github.com/m3hrdadfi/zabanshenas/issues).