m-nagoudi committed on
Commit
8212b28
1 Parent(s): 50b99b9

Update README.md

Files changed (1)
  1. README.md +21 -10
README.md CHANGED
@@ -1,12 +1,15 @@
1
  # AraT5-msa-base
2
- <img src="https://raw.githubusercontent.com/UBC-NLP/araT5/main/AraT5_logo.jpg" alt="drawing" width="30%" height="30%" align="right"/>
3
 
4
- **AraT5-msa-base** is one of three models described in our [**AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation**
5
- ](https://arxiv.org/abs/2109.12068). In this paper, we introduce three powerful Arabic-specific text-to-text transformer models trained on large Modern Standard Arabic (MSA) and/or Dialectal Arabic (DA) data. **AraT5** is trained on 248GB of text (29B tokens) of MSA and DA, **AraT5-msa** is trained on 70GB of text (7.1B tokens) from MSA data, and **AraT5-tweet** is trained on 178Gb of text (21.9B tokens) from 1.5B Arabic tweets which contains multiple varieties of dialectical Arabic.
6
 
7
- In addition, we provide the three models on two architectures small and base. For all models, we use a learning rate of 0.01, a batch size of 128 sequences, and a maximum sequence length of 512 whereas AraT5-tweet 128 maximum sequence is used. Hence, the original implementation of T5 in the TensorFlow framework is used to train the models. We train the models for 1M steps.8 Training took ∼ 80 days on 1 on Google Cloud TPU with 8 cores (v3.8) from TensorFlow Research Cloud (TFRC).
 
 
 
8
 
9
 
 
10
  # How to use AraT5 models
11
  Below is an example for fine-tuning **AraT5-base** for News Title Generation on the Aranews dataset
12
  ``` bash
@@ -29,6 +32,10 @@ In addition, we release the fine-tuned checkpoint of the News Title Generation (
29
 
30
  For more details, please visit our own [GitHub](https://github.com/UBC-NLP/araT5).
31
 
 
 
 
 
32
  # AraT5 Models Checkpoints
33
 
34
  AraT5 Pytorch and TensorFlow checkpoints are available on the Huggingface website for direct download and use ```exclusively for research```. ```For commercial use, please contact the authors via email @ (muhammad.mageed[at]ubc[dot]ca).```
@@ -45,15 +52,19 @@ AraT5 Pytorch and TensorFlow checkpoints are available on the Huggingface websit
45
 
46
  If you use our models (Arat5-base, Arat5-msa-base, Arat5-tweet-base, Arat5-msa-small, or Arat5-tweet-small ) for your scientific publication, or if you find the resources in this repository useful, please cite our paper as follows (to be updated):
47
  ```bibtex
48
- @inproceedings{araT5-2021,
49
- title = "{AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation",
50
  author = "Nagoudi, El Moatez Billah and
51
  Elmadany, AbdelRahim and
52
  Abdul-Mageed, Muhammad",
53
- booktitle = "https://arxiv.org/abs/2109.12068",
54
- month = aug,
55
- year = "2021"}
 
 
 
56
  ```
57
 
58
  ## Acknowledgments
59
- We gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, Canadian Foundation for Innovation, [ComputeCanada](www.computecanada.ca) and [UBC ARC-Sockeye](https://doi.org/10.14288/SOCKEYE). We also thank the [Google TensorFlow Research Cloud (TFRC)](https://www.tensorflow.org/tfrc) program for providing us with free TPU access.
 
 
1
  # AraT5-msa-base
2
+ # AraT5: Text-to-Text Transformers for Arabic Language Generation
3
 
4
+ <img src="https://huggingface.co/UBC-NLP/AraT5-base/resolve/main/AraT5_CR_new.png" alt="AraT5" width="45%" height="35%" align="right"/>
 
5
 
6
+ This is the repository accompanying our paper [AraT5: Text-to-Text Transformers for Arabic Language Understanding and Generation](https://arxiv.org/abs/2109.12068). In this is the repository we introduce:
7
+ * Introduce **AraT5<sub>MSA</sub>**, **AraT5<sub>Tweet</sub>**, and **AraT5**: three powerful Arabic-specific text-to-text Transformer based models;
8
+ * Introduce **ARGEN**: A new benchmark for Arabic language generation and evaluation for four Arabic NLP tasks, namely, ```machine translation```, ```summarization```, ```news title generation```, ```question generation```, , ```paraphrasing```, ```transliteration```, and ```code-switched translation```.
9
+ * Evaluate ```AraT5``` models on ```ARGEN``` and compare against available language models.
10
 
11
 
12
+ ---
13
  # How to use AraT5 models
14
  Below is an example for fine-tuning **AraT5-base** for News Title Generation on the Aranews dataset
15
  ``` bash
 
32
 
33
  For more details, please visit our own [GitHub](https://github.com/UBC-NLP/araT5).
34
 
35
+
36
+
37
+
38
+
39
  # AraT5 Models Checkpoints
40
 
41
  AraT5 Pytorch and TensorFlow checkpoints are available on the Huggingface website for direct download and use ```exclusively for research```. ```For commercial use, please contact the authors via email @ (muhammad.mageed[at]ubc[dot]ca).```
 
52
 
53
  If you use our models (Arat5-base, Arat5-msa-base, Arat5-tweet-base, Arat5-msa-small, or Arat5-tweet-small ) for your scientific publication, or if you find the resources in this repository useful, please cite our paper as follows (to be updated):
54
  ```bibtex
55
+ @inproceedings{nagoudi-2022-arat5,
56
+ title = "{AraT5: Text-to-Text Transformers for Arabic Language Generation",
57
  author = "Nagoudi, El Moatez Billah and
58
  Elmadany, AbdelRahim and
59
  Abdul-Mageed, Muhammad",
60
+ booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics",
61
+ month = May,
62
+ year = "2022",
63
+ address = "Online",
64
+ publisher = "Association for Computational Linguistics",
65
+ }
66
  ```
67
 
68
  ## Acknowledgments
69
+ We gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, Canadian Foundation for Innovation, [ComputeCanada](www.computecanada.ca) and [UBC ARC-Sockeye](https://doi.org/10.14288/SOCKEYE). We also thank the [Google TensorFlow Research Cloud (TFRC)](https://www.tensorflow.org/tfrc) program for providing us with free TPU access.
70
+ (TFRC)](https://www.tensorflow.org/tfrc) program for providing us with free TPU access.
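
The checkpoint paragraph in this README says the PyTorch checkpoints can be downloaded directly from the Hugging Face website. A minimal sketch of loading one with the `transformers` library might look like the following. The model ID `UBC-NLP/AraT5-msa-base` is an assumption inferred from the README title and the `UBC-NLP` organization in the image URL, and `load_arat5` and `generate_text` are hypothetical helper names, not part of the AraT5 release. Since this is a pre-trained checkpoint, its raw output is probably not useful until the model has been fine-tuned on a downstream task.

```python
# Assumed Hub model ID (inferred from the README title; not confirmed by this diff).
MODEL_ID = "UBC-NLP/AraT5-msa-base"


def load_arat5(model_id: str = MODEL_ID):
    """Download the tokenizer and PyTorch checkpoint from the Hugging Face Hub."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def generate_text(text: str, tokenizer, model, max_new_tokens: int = 32) -> str:
    """Encode `text`, run generation, and decode the first output sequence."""
    # 512 mirrors the maximum sequence length stated in the earlier README text.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage would be along the lines of `tokenizer, model = load_arat5()` followed by `generate_text(some_arabic_text, tokenizer, model)`. The first call downloads the checkpoint, and running it requires `transformers` and `torch` (and likely `sentencepiece` for the tokenizer).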