wissamantoun committed
Commit: 5c01fd5 • Parent(s): aac1054
added citation

README.md CHANGED
@@ -63,7 +63,7 @@ Follow the guide linked [here](https://towardsdatascience.com/fine-tuning-gpt2-o
 
 ## Finetuning using our code with TF 1.15.4:
 
-
+Create the Training TFRecords:
 ```bash
 python create_pretraining_data.py
 --input_file=<RAW TEXT FILE with documents/article sperated by an empty line>
@@ -71,26 +71,26 @@ python create_pretraining_data.py
 --tokenizer_dir=<Directory with the GPT2 Tokenizer files>
 ```
 
-
+Finetuning:
 ```bash
-python3 run_pretraining.py
---input_file="gs://<GS_BUCKET>/pretraining_data/*"
---output_dir="gs://<GS_BUCKET>/pretraining_model/"
---config_file="config/small_hparams.json"
---batch_size=128
---eval_batch_size=8
---num_train_steps=
---num_warmup_steps=
---learning_rate=
---save_checkpoints_steps=
---max_seq_length=1024
---max_eval_steps=
---optimizer="lamb"
---iterations_per_loop=5000
---keep_checkpoint_max=10
---use_tpu=True
---tpu_name=<TPU NAME>
---do_train=True
+python3 run_pretraining.py \\
+--input_file="gs://<GS_BUCKET>/pretraining_data/*" \\
+--output_dir="gs://<GS_BUCKET>/pretraining_model/" \\
+--config_file="config/small_hparams.json" \\
+--batch_size=128 \\
+--eval_batch_size=8 \\
+--num_train_steps= \\
+--num_warmup_steps= \\
+--learning_rate= \\
+--save_checkpoints_steps= \\
+--max_seq_length=1024 \\
+--max_eval_steps= \\
+--optimizer="lamb" \\
+--iterations_per_loop=5000 \\
+--keep_checkpoint_max=10 \\
+--use_tpu=True \\
+--tpu_name=<TPU NAME> \\
+--do_train=True \\
 --do_eval=False
 ```
 # Model Sizes
@@ -133,13 +133,18 @@ The text generated by AraGPT2 is automatically generated by a neural network mod
 
 # If you used this model please cite us as :
 
 ```
-@
-
-
-
-
-
-
+@inproceedings{antoun-etal-2021-aragpt2,
+    title = "{A}ra{GPT}2: Pre-Trained Transformer for {A}rabic Language Generation",
+    author = "Antoun, Wissam and
+      Baly, Fady and
+      Hajj, Hazem",
+    booktitle = "Proceedings of the Sixth Arabic Natural Language Processing Workshop",
+    month = apr,
+    year = "2021",
+    address = "Kyiv, Ukraine (Virtual)",
+    publisher = "Association for Computational Linguistics",
+    url = "https://www.aclweb.org/anthology/2021.wanlp-1.21",
+    pages = "196--207",
 }
 ```
 
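For context on the first hunk: `create_pretraining_data.py` takes a raw text file in which each document or article is a block of lines and blocks are separated by an empty line. A minimal sketch of preparing such a file and calling the script, assuming hypothetical paths (`corpus_raw.txt`, `./aragpt2_tokenizer`) and using only the flags visible in the hunk above:

```bash
# Build a tiny raw corpus: one document per block, blocks separated by an empty line.
cat > corpus_raw.txt <<'EOF'
First document, which may span
several lines of text.

Second document.

Third document.
EOF

# Mirrors the flags shown in the diff; both paths are placeholders, and any
# flags not shown there (e.g. where the TFRecords are written) are omitted here.
python create_pretraining_data.py \
  --input_file=corpus_raw.txt \
  --tokenizer_dir=./aragpt2_tokenizer
```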
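The second hunk leaves several `run_pretraining.py` hyperparameters blank (`--num_train_steps`, `--num_warmup_steps`, `--learning_rate`, `--save_checkpoints_steps`, `--max_eval_steps`) for the user to choose. A hypothetical invocation with illustrative values only; the bucket name, TPU name, and every number below are assumptions, not the settings used for the released AraGPT2 checkpoints:

```bash
# Illustrative values only: the schedule numbers (num_train_steps, num_warmup_steps,
# learning_rate, save_checkpoints_steps, max_eval_steps), the bucket "my-bucket" and
# the TPU name "my-tpu" are assumptions and must be replaced for a real run.
python3 run_pretraining.py \
  --input_file="gs://my-bucket/pretraining_data/*" \
  --output_dir="gs://my-bucket/pretraining_model/" \
  --config_file="config/small_hparams.json" \
  --batch_size=128 \
  --eval_batch_size=8 \
  --num_train_steps=100000 \
  --num_warmup_steps=10000 \
  --learning_rate=1e-4 \
  --save_checkpoints_steps=5000 \
  --max_seq_length=1024 \
  --max_eval_steps=100 \
  --optimizer="lamb" \
  --iterations_per_loop=5000 \
  --keep_checkpoint_max=10 \
  --use_tpu=True \
  --tpu_name=my-tpu \
  --do_train=True \
  --do_eval=False
```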