- The GPT-2 model was trained on the BookCorpus dataset for 60K steps.
- No positional embeddings were used (NoPE); see the configuration sketch after this list.
- Here is the wandb report.
- This is for educational purposes only.
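Since stock GPT-2 in `transformers` always adds learned absolute position embeddings (`wpe`), a minimal sketch of one way to approximate the NoPE setup is to zero out and freeze that embedding table so the model receives no positional signal. This is an illustrative assumption, not necessarily the exact training code used for this checkpoint:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Default GPT-2 small hyperparameters (assumption; the actual config may differ).
config = GPT2Config()
model = GPT2LMHeadModel(config)

# Zero the learned position embeddings so they contribute nothing...
with torch.no_grad():
    model.transformer.wpe.weight.zero_()
# ...and freeze them so they stay zero during training (NoPE-style).
model.transformer.wpe.weight.requires_grad = False
```

With this change, token order is only distinguishable through the causal attention mask, which is the property NoPE relies on.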