sultan committed
Commit 4137950
1 Parent(s): 604708e

Update README.md

Files changed (1)
  1. README.md +18 -12
README.md CHANGED
@@ -31,18 +31,22 @@ Total Corpora size is 49GB. This model uses an efficient implementation of T5 wh
 
  ## Results on TyDi QA, HARD, Sentiment Analysis, Sarcasm Detection ( Best Score is highlighted in bold )
 
- | Model | <center>TyDi QA| <center>HARD| <center>ArSarcasm-v2-Sentiment| <center>ArSarcasm-v2-Sarcasm| XL-SUM |
- |----------------------|---------------|---------------------|-------------------------------------|----------------------------------|----------------------------------
- | AraT5-base | <center>70.4/84.2 |<center>**96.5**|<center>69.7/72.6|<center>60.4|<center>30.3|
- | AraT5-msa-base | <center>70.9/84.0 |<center>**96.5**|<center>70.0/72.7|<center>60.7|<center>27.4|
- | AraT5-tweets-base | <center>65.1/79.0 |<center>96.3|<center>70.7/73.5|<center>61.1|<center>25.1|
- | mT5-base | <center>72.2/84.1 |<center>96.2|<center>67.3/68.8|<center>52.2|<center>25.7|
- | AraBART-base | <center>48.8/71.2 |<center>96.1|<center>66.2/68.2|<center>56.3|<center>31.2|
- | ArabicT5-17GB-small | <center>70.8/84.8 |<center>96.4|<center>68.9/71.2|<center>58.9|<center>29.2|
- | ArabicT5-49GB-small | <center>72.4/85.1 |<center>96.4|<center>70.2/73.4|<center>61.0|<center>30.2|
- | ArabicT5-17GB-base | <center>73.3/86.1 |<center>96.4|<center>70.4/73.0|<center>59.8|<center>30.3|
- | ArabicT5-49GB-base | <center>72.1/85.1 |<center>**96.5**|<center>71.3/74.1|<center>60.4|<center>30.9|
- | ArabicT5-17GB-large | <center>**75.5/87.1** |<center>**96.5**| <center>**72.2/75.2**|<center>**61.7**|<center>**31.7**|
 
  Evaluation Metrics: TyDi QA (EM/F1), HARD (Accuracy), Sentiment Analysis (Accuracy / F1-PN positive-negative), Sarcasm Detection (F1-sarcastic), XL-SUM (Rouge-L with Stemmer).
 
@@ -50,6 +54,8 @@ You can download the full details of our grid search for all models in all tasks
 
  For the XL-Sum task, we choose our best run for each model using the eval set. We use the official evaluation script from XL-Sum, which uses the stemmer function, which may show better results than papers that don't use the stemmer function. The official XL-Sum paper uses a stemmer function.
 
  # FineTuning our efficient ArabicT5-49GB-Small model with Torch on 3070 laptop GPU ###
 
  [![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/ArabicT5/blob/main/ArabicT5_49GB_Small_on_3070_Laptop_GPU.ipynb)
 
 
  ## Results on TyDi QA, HARD, Sentiment Analysis, Sarcasm Detection ( Best Score is highlighted in bold )
 
+ | Model Type | Model | <center>TyDi QA| <center>HARD| <center>ArSarcasm-v2-Sentiment| <center>ArSarcasm-v2-Sarcasm| XL-SUM |
+ |--------------|------------------------|---------------------|----------------|-----------------|------------|------------|
+ | Generative | AraT5-base | <center>70.4/84.2 |<center>96.5|<center>69.7/72.6|<center>60.4|<center>30.3|
+ | Generative | AraT5-msa-base | <center>70.9/84.0 |<center>96.5|<center>70.0/72.7|<center>60.7|<center>27.4|
+ | Generative | AraT5-tweets-base | <center>65.1/79.0 |<center>96.3|<center>70.7/73.5|<center>61.1|<center>25.1|
+ | Generative | mT5-base | <center>72.2/84.1 |<center>96.2|<center>67.3/68.8|<center>52.2|<center>25.7|
+ | Generative | AraBART-base | <center>48.8/71.2 |<center>96.1|<center>66.2/68.2|<center>56.3|<center>31.2|
+ | Generative | ArabicT5-17GB-small | <center>70.8/84.8 |<center>96.4|<center>68.9/71.2|<center>58.9|<center>29.2|
+ | Generative | ArabicT5-49GB-small | <center>72.4/85.1 |<center>96.4|<center>70.2/73.4|<center>61.0|<center>30.2|
+ | Generative | ArabicT5-17GB-base | <center>73.3/86.1 |<center>96.4|<center>70.4/73.0|<center>59.8|<center>30.3|
+ | Generative | ArabicT5-49GB-base | <center>72.1/85.1 |<center>96.5|<center>71.3/74.1|<center>60.4|<center>30.9|
+ | Generative | ArabicT5-17GB-large | <center>75.5/87.1 |<center>96.5| <center>72.2/75.2|<center>61.7|<center>31.7|
+ | Extractive | AraBERTv02-Large | <center>73.7/86.0 |<center>96.4|<center>69.5/71.8|<center>-|<center> N/A|
+ | Extractive | AraBERTv2-Large | <center>64.5/82.2 |<center>96.5|<center>70.0/72.4|<center>-|<center> N/A|
+ | Extractive | AraELECTRA-base | <center>74.9/86.7 |<center>96.4|<center>69.6/72.3|<center>-|<center>N/A|
+ | Extractive | ArabicTransformer-base | <center>75.4/87.2 |<center>96.6|<center>70.8/74.0|<center>-|<center> N/A|
 
  Evaluation Metrics: TyDi QA (EM/F1), HARD (Accuracy), Sentiment Analysis (Accuracy / F1-PN positive-negative), Sarcasm Detection (F1-sarcastic), XL-SUM (Rouge-L with Stemmer).
 
 
 
  For the XL-Sum task, we choose our best run for each model using the eval set. We use the official evaluation script from XL-Sum, which uses the stemmer function, which may show better results than papers that don't use the stemmer function. The official XL-Sum paper uses a stemmer function.
 
+ Reported numbers for extractive models are taken from the ArabicTransformer paper: https://aclanthology.org/2021.findings-emnlp.108/
+
  # FineTuning our efficient ArabicT5-49GB-Small model with Torch on 3070 laptop GPU ###
 
  [![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/ArabicT5/blob/main/ArabicT5_49GB_Small_on_3070_Laptop_GPU.ipynb)
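
The evaluation note in the README says XL-SUM is scored with Rouge-L using a stemmer, and that stemming may give higher scores than evaluations without it. The sketch below illustrates why, under stated assumptions: it is a plain-Python Rouge-L F-measure based on the longest common subsequence, not the official XL-Sum script, and `toy_stem` is a hypothetical English suffix stripper used only for illustration (a real evaluation would use the stemmer shipped with the official script).

```python
from typing import Callable, List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str,
               stem: Callable[[str], str] = lambda t: t) -> float:
    """Rouge-L F-measure on whitespace tokens, with an optional stemmer."""
    ref = [stem(t) for t in reference.split()]
    cand = [stem(t) for t in candidate.split()]
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def toy_stem(token: str) -> str:
    """Hypothetical stemmer (assumption): drop a trailing 's' from longer tokens."""
    return token[:-1] if len(token) > 3 and token.endswith("s") else token


reference = "the models summarize articles"
candidate = "the model summarizes article"
plain_score = rouge_l_f1(reference, candidate)
stemmed_score = rouge_l_f1(reference, candidate, stem=toy_stem)
print(plain_score, stemmed_score)
```

On this toy pair only "the" matches without stemming, so the score is 0.25. With the toy stemmer all four tokens align and the score rises to 1.0. This is the reason stemmed and unstemmed Rouge-L numbers are not directly comparable across papers.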