---
datasets:
- wikitext
language:
- en
metrics:
- perplexity
---
GPT-2 pretrained on WikiText-103 (180M sentences) on a single 32GB V100 GPU for roughly 110,000 (1.10 lakh) iterations.
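A minimal usage sketch with the `transformers` library is below. The repo ID is a placeholder, not the actual path of this checkpoint; loading follows the standard API for any GPT-2 model on the Hub.

```python
# Hypothetical usage sketch: "your-username/gpt2-wikitext103" is a
# placeholder repo ID, not this model's actual Hub path.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "your-username/gpt2-wikitext103"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The history of natural language processing", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```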
Validation loss vs. training loss: (plot not included)
Perplexity: 22.87
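Perplexity here is the exponential of the mean per-token cross-entropy loss, so 22.87 corresponds to a validation loss of about ln(22.87) ≈ 3.13 nats per token. A minimal sketch of computing it, assuming the model and tokenizer from the snippet above (this is not the exact evaluation script behind the reported number):

```python
# Sketch: perplexity = exp(mean per-token cross-entropy loss).
import math
import torch

text = "Wikipedia articles are a common pretraining corpus."
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy loss
    out = model(**enc, labels=enc["input_ids"])
perplexity = math.exp(out.loss.item())
print(f"loss={out.loss.item():.3f}  perplexity={perplexity:.2f}")
```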
This is just a test model, so please don't expect good results.