wissamantoun committed on
Commit
ce67844
1 Parent(s): ff757c0

added citation

Files changed (1)
  1. README.md +33 -28
README.md CHANGED
@@ -65,7 +65,7 @@ Follow the guide linked [here](https://towardsdatascience.com/fine-tuning-gpt2-o

  ## Finetuning using our code with TF 1.15.4:

- - Create the Training TFRecords:
+ Create the Training TFRecords:
  ```bash
  python create_pretraining_data.py
  --input_file=<RAW TEXT FILE with documents/article separated by an empty line>
@@ -73,26 +73,26 @@ python create_pretraining_data.py
  --tokenizer_dir=<Directory with the GPT2 Tokenizer files>
  ```

- - Finetuning:
+ Finetuning:
  ```bash
- python3 run_pretraining.py \
- --input_file="gs://<GS_BUCKET>/pretraining_data/*" \
- --output_dir="gs://<GS_BUCKET>/pretraining_model/" \
- --config_file="config/small_hparams.json" \
- --batch_size=128 \
- --eval_batch_size=8 \
- --num_train_steps= \
- --num_warmup_steps= \
- --learning_rate= \
- --save_checkpoints_steps= \
- --max_seq_length=1024 \
- --max_eval_steps= \
- --optimizer="lamb" \
- --iterations_per_loop=5000 \
- --keep_checkpoint_max=10 \
- --use_tpu=True \
- --tpu_name=<TPU NAME> \
- --do_train=True \
+ python3 run_pretraining.py \\
+ --input_file="gs://<GS_BUCKET>/pretraining_data/*" \\
+ --output_dir="gs://<GS_BUCKET>/pretraining_model/" \\
+ --config_file="config/small_hparams.json" \\
+ --batch_size=128 \\
+ --eval_batch_size=8 \\
+ --num_train_steps= \\
+ --num_warmup_steps= \\
+ --learning_rate= \\
+ --save_checkpoints_steps= \\
+ --max_seq_length=1024 \\
+ --max_eval_steps= \\
+ --optimizer="lamb" \\
+ --iterations_per_loop=5000 \\
+ --keep_checkpoint_max=10 \\
+ --use_tpu=True \\
+ --tpu_name=<TPU NAME> \\
+ --do_train=True \\
  --do_eval=False
  ```
  # Model Sizes
@@ -137,18 +137,23 @@ For the new dataset we added the unshuffled OSCAR corpus, after we thoroughly fi
  # If you used this model please cite us as :

  ```
- @misc{antoun2020aragpt2,
- title={AraGPT2: Pre-Trained Transformer for Arabic Language Generation},
- author={Wissam Antoun and Fady Baly and Hazem Hajj},
- year={2020},
- eprint={2012.15520},
- archivePrefix={arXiv},
- primaryClass={cs.CL}
+ @inproceedings{antoun-etal-2021-aragpt2,
+ title = "{A}ra{GPT}2: Pre-Trained Transformer for {A}rabic Language Generation",
+ author = "Antoun, Wissam and
+ Baly, Fady and
+ Hajj, Hazem",
+ booktitle = "Proceedings of the Sixth Arabic Natural Language Processing Workshop",
+ month = apr,
+ year = "2021",
+ address = "Kyiv, Ukraine (Virtual)",
+ publisher = "Association for Computational Linguistics",
+ url = "https://www.aclweb.org/anthology/2021.wanlp-1.21",
+ pages = "196--207",
  }
  ```

  # Acknowledgments
- Thanks to TensorFlow Research Cloud (TFRC) for the free access to Cloud TPUs, couldn't have done it without this program, and to the [AUB MIND Lab](https://sites.aub.edu.lb/mindlab/) Members for the continous support. Also thanks to [Yakshof](https://www.yakshof.com/#/) and Assafir for data and storage access. Another thanks for Habib Rahal (https://www.behance.net/rahalhabib), for putting a face to AraBERT.
+ Thanks to TensorFlow Research Cloud (TFRC) for the free access to Cloud TPUs, couldn't have done it without this program, and to the [AUB MIND Lab](https://sites.aub.edu.lb/mindlab/) Members for the continuous support. Also thanks to [Yakshof](https://www.yakshof.com/#/) and Assafir for data and storage access. Another thanks for Habib Rahal (https://www.behance.net/rahalhabib), for putting a face to AraBERT.

  # Contacts
  **Wissam Antoun**: [Linkedin](https://www.linkedin.com/in/wissam-antoun-622142b4/) | [Twitter](https://twitter.com/wissam_antoun) | [Github](https://github.com/WissamAntoun) | <wfa07@mail.aub.edu> | <wissam.antoun@gmail.com>
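
For readers following the TFRecord step shown in the diff above, the `--input_file` argument expects a plain-text corpus in which each document or article is separated by an empty line. A minimal sketch of such a file follows; the file name `corpus.txt` and the sample sentences are illustrative assumptions, and the script's remaining flags are left exactly as the diff shows them.

```bash
# Illustrative only: a tiny raw-text corpus in the layout expected by the
# --input_file flag of create_pretraining_data.py (one document per block,
# blocks separated by an empty line).
cat > corpus.txt <<'EOF'
هذا مثال قصير على المقال الأول، ويتكون من جملتين متتاليتين. الجملة الثانية تكمل الفقرة نفسها.

هذا هو المقال الثاني، وهو مفصول عن المقال الأول بسطر فارغ واحد.
EOF
```

A file like this would then be passed as `--input_file`, with `--tokenizer_dir` pointing at the GPT2 tokenizer files, as in the diff.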
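The finetuning command in the diff intentionally leaves several flags blank for the user to fill in. The sketch below shows one fully specified invocation; the bucket name, TPU name, step counts, learning rate, and evaluation settings are illustrative assumptions, not values taken from the repository.

```bash
# A hypothetical, fully filled-in variant of the run_pretraining.py call from
# the diff. All numeric values and resource names below are placeholders.
python3 run_pretraining.py \
  --input_file="gs://my-bucket/pretraining_data/*" \
  --output_dir="gs://my-bucket/pretraining_model/" \
  --config_file="config/small_hparams.json" \
  --batch_size=128 \
  --eval_batch_size=8 \
  --num_train_steps=100000 \
  --num_warmup_steps=10000 \
  --learning_rate=1e-4 \
  --save_checkpoints_steps=5000 \
  --max_seq_length=1024 \
  --max_eval_steps=100 \
  --optimizer="lamb" \
  --iterations_per_loop=5000 \
  --keep_checkpoint_max=10 \
  --use_tpu=True \
  --tpu_name=my-tpu \
  --do_train=True \
  --do_eval=False
```

Each line ends with a single trailing backslash so that bash reads the flags as one command.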