Model Details

GPT-2 pretrained from scratch on WikiText-103 (~103M training tokens) on a single 32 GB V100 GPU for around 110,000 (1.10 lakh) iterations.

Validation loss vs. training loss: see the loss-curve plot in the repository.

Model Description

Perplexity: 22.87
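Perplexity is the exponential of the mean per-token cross-entropy loss, so the reported value implies a validation loss of roughly ln(22.87) ≈ 3.13 nats per token. A minimal sketch of the conversion (the loss value here is derived from the reported perplexity, not taken from the training logs):

```python
import math

def perplexity_from_loss(loss: float) -> float:
    """Perplexity = exp of the mean cross-entropy loss (nats/token)."""
    return math.exp(loss)

def loss_from_perplexity(ppl: float) -> float:
    """Inverse: mean cross-entropy loss implied by a perplexity value."""
    return math.log(ppl)

reported_ppl = 22.87  # value reported in this model card
implied_loss = loss_from_perplexity(reported_ppl)
print(f"implied val loss: {implied_loss:.3f} nats/token")  # ≈ 3.130
```

This is why the loss curve and the perplexity number carry the same information: a drop of 0.1 in validation loss corresponds to roughly a 10% reduction in perplexity.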

Out-of-Scope Use

This is a test model; do not expect production-quality results.

Dataset used to train himanshubeniwal/gpt2-wikitext103: WikiText-103