This model adapts T5 to the Arabic language by pre-training T5 on:
Total corpora size is 17GB. This model uses an efficient implementation of T5 that reduces fine-tuning time and memory usage ([Link](https://arxiv.org/abs/2109.10686)) and is pre-trained with T5x ([Link](https://github.com/google-research/t5x)).
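For orientation, below is a minimal sketch of loading one of these checkpoints with Hugging Face `transformers`. The model id is an assumption for illustration; substitute the actual repo id of the checkpoint in this card.

```python
# Minimal loading sketch; the model id below is a placeholder, not necessarily
# the exact repo id of the checkpoint described in this card.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "sultan/ArabicT5-17GB-base"  # hypothetical id; replace with the card's checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# T5 is text-to-text: wrap the task in a prompt and generate the answer as text.
inputs = tokenizer("question: ... context: ...", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```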
## Pre-training Settings and Results on TyDi QA Development Dataset (model in this card is highlighted in bold)
| Model | Hidden Layer | Atten. Head | Atten. Layers | Vocab | Hardware | Training Steps | Batch | Train x Batch Factor | Corpora |
|-------|--------------|-------------|---------------|-------|----------|----------------|-------|----------------------|---------|
## Results on TyDi QA, HARD, Sentiment Analysis, Sarcasm Detection (best score is highlighted in bold)
| Model Type | Model | <center>TyDi QA | <center>HARD | <center>ArSarcasm-v2-Sentiment | <center>ArSarcasm-v2-Sarcasm | <center>XL-SUM |
|------------|------------------------|-------------------|--------------|--------------------------------|------------------------------|----------------|
| Generative | AraT5-base | <center>70.4/84.2 |<center>96.5|<center>69.7/72.6|<center>60.4|<center>30.3|
| Generative | AraT5-msa-base | <center>70.9/84.0 |<center>96.5|<center>70.0/72.7|<center>60.7|<center>27.4|
| Generative | AraT5-tweets-base | <center>65.1/79.0 |<center>96.3|<center>70.7/73.5|<center>61.1|<center>25.1|
| Generative | mT5-base | <center>72.2/84.1 |<center>96.2|<center>67.3/68.8|<center>52.2|<center>25.7|
| Generative | AraBART-base | <center>48.8/71.2 |<center>96.1|<center>66.2/68.2|<center>56.3|<center>31.2|
| Generative | ArabicT5-17GB-small | <center>70.8/84.8 |<center>96.4|<center>68.9/71.2|<center>58.9|<center>29.2|
| Generative | ArabicT5-49GB-small | <center>72.4/85.1 |<center>96.4|<center>70.2/73.4|<center>61.0|<center>30.2|
| Generative | ArabicT5-17GB-base | <center>73.3/86.1 |<center>96.4|<center>70.4/73.0|<center>59.8|<center>30.3|
| Generative | ArabicT5-49GB-base | <center>72.1/85.1 |<center>96.5|<center>71.3/74.1|<center>60.4|<center>30.9|
| Generative | ArabicT5-17GB-large | <center>75.5/87.1 |<center>96.5|<center>72.2/75.2|<center>61.7|<center>31.7|
| Extractive | AraBERTv02-Large | <center>73.7/86.0 |<center>96.4|<center>69.5/71.8|<center>-|<center>N/A|
| Extractive | AraBERTv2-Large | <center>64.5/82.2 |<center>96.5|<center>70.0/72.4|<center>-|<center>N/A|
| Extractive | AraELECTRA-base | <center>74.9/86.7 |<center>96.4|<center>69.6/72.3|<center>-|<center>N/A|
| Extractive | ArabicTransformer-base | <center>75.4/87.2 |<center>96.6|<center>70.8/74.0|<center>-|<center>N/A|
Evaluation Metrics: TyDi QA (EM/F1), HARD (Accuracy), Sentiment Analysis (Accuracy / F1-PN positive-negative), Sarcasm Detection (F1-sarcastic), XL-SUM (Rouge-L with Stemmer).
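To make the TyDi QA column concrete, here is a hedged sketch of SQuAD-style EM and token-overlap F1, which are the standard definitions of these metrics; the whitespace tokenization and lack of answer normalization are simplifications, not our exact evaluation code.

```python
# SQuAD-style EM and token-overlap F1; whitespace tokenization is a simplification.
from collections import Counter

def exact_match(prediction: str, gold: str) -> float:
    # 1.0 if the predicted answer string equals the gold answer, else 0.0
    return float(prediction.strip() == gold.strip())

def token_f1(prediction: str, gold: str) -> float:
    # F1 over the multiset of shared tokens between prediction and gold answer
    pred_tokens, gold_tokens = prediction.split(), gold.split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```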
You can download the full details of our grid search for all models in all tasks.
For the XL-Sum task, we chose the best run for each model on the eval set. We use the official XL-Sum evaluation script, which applies a stemmer; this may yield higher scores than papers that evaluate without the stemmer. The official XL-Sum paper uses the stemmer.
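For intuition on why the stemmer matters, here is a minimal sketch of the `use_stemmer` flag in the `rouge_score` package. This English example only illustrates the flag's effect; the official XL-Sum script uses a multilingual fork of this scorer, so it does not reproduce our numbers.

```python
# Illustrative only: shows how the stemmer flag changes Rouge-L matching.
from rouge_score import rouge_scorer

reference = "the models were trained on large corpora"
prediction = "the model was training on a large corpus"

for use_stemmer in (False, True):
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=use_stemmer)
    score = scorer.score(reference, prediction)["rougeL"]
    # With stemming, "models"/"model" and "trained"/"training" now match,
    # so the stemmed Rouge-L F1 is higher.
    print(f"use_stemmer={use_stemmer}: rougeL F1 = {score.fmeasure:.3f}")
```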
Reported numbers for the extractive models are taken from the ArabicTransformer paper: https://aclanthology.org/2021.findings-emnlp.108/
# Fine-tuning our efficient ArabicT5-49GB-Small model with PyTorch on a 3070 laptop GPU
[![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/ArabicT5/blob/main/ArabicT5_49GB_Small_on_3070_Laptop_GPU.ipynb)
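The notebook walks through the full setup; for orientation, a compressed sketch of a single fine-tuning step is shown below. The model id, the toy example, and the learning rate are placeholders, not the notebook's exact values.

```python
# A minimal single-step fine-tuning sketch, not the notebook's exact code.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "sultan/ArabicT5-49GB-small"  # hypothetical id; use the card's checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# T5 is text-to-text, so both inputs and targets are plain strings.
inputs = tokenizer(["question: ... context: ..."], return_tensors="pt",
                   padding=True).to(device)
labels = tokenizer(["..."], return_tensors="pt", padding=True).input_ids.to(device)
labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss

outputs = model(**inputs, labels=labels)  # seq2seq cross-entropy loss
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```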