---
license: mit
datasets:
- bookcorpus/bookcorpus
language:
- en
library_name: transformers
---
* This GPT-2 model was trained on the BookCorpus dataset for 60K steps.
* No position embeddings were used (NoPE); see the illustrative sketch after this list.
* The Weights & Biases training report is available [here](https://wandb.ai/a-arun283-iit-madras/gpt-2-BooKcorpus-WarmUpLr/reports/Pretraining-GPT-2---Vmlldzo5MDY3MDk5).
* This model is intended for educational purposes only.
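A minimal usage sketch with the `transformers` library is shown below. The repository id `your-username/gpt2-bookcorpus-nope` is a placeholder, not the real model id; substitute the actual id from the Hub.

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Placeholder repository id -- substitute the actual model id on the Hub.
model_id = "your-username/gpt2-bookcorpus-nope"

tokenizer = GPT2TokenizerFast.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```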
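On the NoPE point: the stock `GPT2Model` always adds a learned position embedding table (`wpe`) to the token embeddings, so one common way to approximate NoPE with the vanilla architecture is to zero out and freeze that table. The sketch below illustrates that approach under that assumption; it is not necessarily the exact setup used to train this model.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Fresh GPT-2 for pretraining from scratch.
config = GPT2Config()
model = GPT2LMHeadModel(config)

# Approximate NoPE: zero the learned position embedding table (wpe)
# and freeze it so it contributes nothing and is never updated.
with torch.no_grad():
    model.transformer.wpe.weight.zero_()
model.transformer.wpe.weight.requires_grad = False
```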