#34 opened by samkenxstream

README.md CHANGED
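Each `@@ -a,b +c,d @@` hunk header in the diff gives the start line and line count of the changed region on the old and new side of README.md (so `@@ -1,10 +1,65 @@` means old lines 1–10 become new lines 1–65). As a minimal sketch, assuming both counts are always written out (as they are in this diff; unified diffs may omit a count of 1), the headers can be parsed like this:

```python
import re

def parse_hunk_header(line):
    """Parse a unified-diff hunk header "@@ -a,b +c,d @@ ...".

    Returns (old_start, old_count, new_start, new_count).
    """
    m = re.match(r"@@ -(\d+),(\d+) \+(\d+),(\d+) @@", line)
    if m is None:
        raise ValueError("not a hunk header: %r" % line)
    return tuple(int(g) for g in m.groups())

# First hunk of this PR: old lines 1-10 replaced by new lines 1-65.
print(parse_hunk_header("@@ -1,10 +1,65 @@"))  # (1, 10, 1, 65)
```

The trailing text after the second `@@` (e.g. `license: apache-2.0`) is just context from the nearest enclosing section and is ignored by the parser.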
@@ -1,10 +1,65 @@
 ---
 language:
 - en
+- sp
+- ja
+- pe
+- hi
 - fr
+- ch
+- be
+- gu
+- ge
+- te
+- it
+- ar
+- po
+- ta
+- ma
+- ma
+- or
+- pa
+- po
+- ur
+- ga
+- he
+- ko
+- ca
+- th
+- du
+- in
+- vi
+- bu
+- fi
+- ce
+- la
+- tu
+- ru
+- cr
+- sw
+- yo
+- ku
+- bu
+- ma
+- cz
+- fi
+- so
+- ta
+- sw
+- si
+- ka
+- zh
+- ig
+- xh
 - ro
-- de
-- multilingual
+- ha
+- es
+- sl
+- li
+- gr
+- ne
+- as
+- no
 
 widget:
 - text: "Translate to German: My name is Arthur"
@@ -47,8 +102,7 @@ license: apache-2.0
 
 # Model Card for FLAN-T5 XXL
 
-<img src="https://
-alt="drawing" width="600"/>
+![model image](https://s3.amazonaws.com/moonup/production/uploads/1666363435475-62441d1d9fdefb55a0b7d12c.png)
 
 # Table of Contents
 
@@ -76,7 +130,7 @@ As mentioned in the first few lines of the abstract :
 
 
 - **Model type:** Language model
-- **Language(s) (NLP):** English, German, French
+- **Language(s) (NLP):** English, Spanish, Japanese, Persian, Hindi, French, Chinese, Bengali, Gujarati, German, Telugu, Italian, Arabic, Polish, Tamil, Marathi, Malayalam, Oriya, Panjabi, Portuguese, Urdu, Galician, Hebrew, Korean, Catalan, Thai, Dutch, Indonesian, Vietnamese, Bulgarian, Filipino, Central Khmer, Lao, Turkish, Russian, Croatian, Swedish, Yoruba, Kurdish, Burmese, Malay, Czech, Finnish, Somali, Tagalog, Swahili, Sinhala, Kannada, Zhuang, Igbo, Xhosa, Romanian, Haitian, Estonian, Slovak, Lithuanian, Greek, Nepali, Assamese, Norwegian
 - **License:** Apache 2.0
 - **Related Models:** [All FLAN-T5 Checkpoints](https://huggingface.co/models?search=flan-t5)
 - **Original Checkpoints:** [All Original FLAN-T5 Checkpoints](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)
@@ -216,7 +270,7 @@ The information below in this section are copied from the model's [official mode
 
 The model was trained on a mixture of tasks, that includes the tasks described in the table below (from the original paper, figure 2):
 
-![table.png](https://
+![table.png](https://s3.amazonaws.com/moonup/production/uploads/1666363265279-62441d1d9fdefb55a0b7d12c.png)
 
 
 ## Training Procedure
@@ -233,7 +287,7 @@ The model has been trained on TPU v3 or TPU v4 pods, using [`t5x`](https://githu
 ## Testing Data, Factors & Metrics
 
 The authors evaluated the model on various tasks covering several languages (1836 in total). See the table below for some quantitative evaluation:
-![image.png](https://
+![image.png](https://s3.amazonaws.com/moonup/production/uploads/1668072995230-62441d1d9fdefb55a0b7d12c.png)
 For full details, please check the [research paper](https://arxiv.org/pdf/2210.11416.pdf).
 
 ## Results
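The `language:` field in Hugging Face model-card front matter is expected to hold ISO 639-1 codes, and several of the codes this diff adds are not valid ISO 639-1 (for example `sp` — Spanish is `es` — and `ge` — German is `de`). A minimal reviewer-side sanity check, using a deliberately abbreviated hand-picked subset of ISO 639-1 for illustration rather than the full standard:

```python
# Abbreviated, hand-picked subset of ISO 639-1 codes (illustrative only,
# not the complete standard).
VALID_SUBSET = {
    "en", "fr", "ro", "de", "es", "ja", "fa", "hi", "zh", "bn",
    "gu", "te", "it", "ar", "pl", "pt", "ta", "mr", "ur", "he",
    "ko", "ca", "th", "nl", "id", "vi", "bg", "fi", "tr", "ru",
}

def invalid_codes(codes):
    """Return the codes not found in the (abbreviated) ISO 639-1 set."""
    return [c for c in codes if c not in VALID_SUBSET]

# Some entries this PR adds: "en", "ja", "de" check out; "sp", "pe", "ge" do not.
print(invalid_codes(["en", "sp", "ja", "pe", "ge", "de"]))  # ['sp', 'pe', 'ge']
```

A real review would check against the complete ISO 639-1 table (or a library that ships it) instead of a hardcoded subset, but even this sketch is enough to flag most of the codes introduced here.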