wissamantoun committed
Commit: 5c01fd5
Parent: aac1054

added citation

Files changed (1)
  1. README.md +32 -27
README.md CHANGED
@@ -63,7 +63,7 @@ Follow the guide linked [here](https://towardsdatascience.com/fine-tuning-gpt2-o
 
 ## Finetuning using our code with TF 1.15.4:
 
- - Create the Training TFRecords:
+ Create the Training TFRecords:
 ```bash
 python create_pretraining_data.py
 --input_file=<RAW TEXT FILE with documents/article separated by an empty line>
@@ -71,26 +71,26 @@ python create_pretraining_data.py
 --tokenizer_dir=<Directory with the GPT2 Tokenizer files>
 ```
 
- - Finetuning:
+ Finetuning:
 ```bash
- python3 run_pretraining.py \
- --input_file="gs://<GS_BUCKET>/pretraining_data/*" \
- --output_dir="gs://<GS_BUCKET>/pretraining_model/" \
- --config_file="config/small_hparams.json" \
- --batch_size=128 \
- --eval_batch_size=8 \
- --num_train_steps= \
- --num_warmup_steps= \
- --learning_rate= \
- --save_checkpoints_steps= \
- --max_seq_length=1024 \
- --max_eval_steps= \
- --optimizer="lamb" \
- --iterations_per_loop=5000 \
- --keep_checkpoint_max=10 \
- --use_tpu=True \
- --tpu_name=<TPU NAME> \
- --do_train=True \
+ python3 run_pretraining.py \\
+ --input_file="gs://<GS_BUCKET>/pretraining_data/*" \\
+ --output_dir="gs://<GS_BUCKET>/pretraining_model/" \\
+ --config_file="config/small_hparams.json" \\
+ --batch_size=128 \\
+ --eval_batch_size=8 \\
+ --num_train_steps= \\
+ --num_warmup_steps= \\
+ --learning_rate= \\
+ --save_checkpoints_steps= \\
+ --max_seq_length=1024 \\
+ --max_eval_steps= \\
+ --optimizer="lamb" \\
+ --iterations_per_loop=5000 \\
+ --keep_checkpoint_max=10 \\
+ --use_tpu=True \\
+ --tpu_name=<TPU NAME> \\
+ --do_train=True \\
 --do_eval=False
 ```
 # Model Sizes
@@ -133,13 +133,18 @@ The text generated by AraGPT2 is automatically generated by a neural network mod
 # If you used this model please cite us as :
 
 ```
- @misc{antoun2020aragpt2,
- title={AraGPT2: Pre-Trained Transformer for Arabic Language Generation},
- author={Wissam Antoun and Fady Baly and Hazem Hajj},
- year={2020},
- eprint={2012.15520},
- archivePrefix={arXiv},
- primaryClass={cs.CL}
+ @inproceedings{antoun-etal-2021-aragpt2,
+ title = "{A}ra{GPT}2: Pre-Trained Transformer for {A}rabic Language Generation",
+ author = "Antoun, Wissam and
+ Baly, Fady and
+ Hajj, Hazem",
+ booktitle = "Proceedings of the Sixth Arabic Natural Language Processing Workshop",
+ month = apr,
+ year = "2021",
+ address = "Kyiv, Ukraine (Virtual)",
+ publisher = "Association for Computational Linguistics",
+ url = "https://www.aclweb.org/anthology/2021.wanlp-1.21",
+ pages = "196--207",
 }
 ```
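For readers following the snippet above, here is a minimal sketch of how the TFRecord-creation step might be invoked. The paths are hypothetical placeholders, and only the flags visible in the diff are shown; any flags on the README lines elided between the two hunks are not reproduced here.

```bash
# Hypothetical invocation of create_pretraining_data.py (paths are illustrative,
# not taken from the README). Only the flags visible in the diff above are used.
python create_pretraining_data.py \
  --input_file=/data/arabic_corpus.txt \
  --tokenizer_dir=/models/aragpt2-base/tokenizer
```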
 
 
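The finetuning command in the diff leaves several hyperparameters blank (--num_train_steps, --num_warmup_steps, --learning_rate, --save_checkpoints_steps, --max_eval_steps). As a rough sketch only, a filled-in invocation might look like the following; the values chosen for those blank flags, and the bucket and TPU names, are illustrative assumptions rather than settings recommended by the README or the paper.

```bash
# Sketch of a filled-in finetuning run. The step counts, learning rate, and
# checkpoint/eval settings below are assumed example values, not recommendations
# from the source; <GS_BUCKET> and the TPU name are replaced with placeholders.
python3 run_pretraining.py \
  --input_file="gs://my-bucket/pretraining_data/*" \
  --output_dir="gs://my-bucket/pretraining_model/" \
  --config_file="config/small_hparams.json" \
  --batch_size=128 \
  --eval_batch_size=8 \
  --num_train_steps=10000 \
  --num_warmup_steps=1000 \
  --learning_rate=1e-4 \
  --save_checkpoints_steps=1000 \
  --max_seq_length=1024 \
  --max_eval_steps=100 \
  --optimizer="lamb" \
  --iterations_per_loop=5000 \
  --keep_checkpoint_max=10 \
  --use_tpu=True \
  --tpu_name=my-tpu \
  --do_train=True \
  --do_eval=False
```

Single backslashes are used here for shell line continuation.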