Model Details
- Developed by: Rafael Espinosa Mena
- Model type: GPT-2 (124M parameters)
- Language(s) (NLP): English
Uses
Pretrained from scratch on 200 Wikipedia articles and intended as a base model for fine-tuning.
Training Details
Trained for 200 epochs on 200 Wikipedia articles with a learning rate of 3e-5, using a sliding-window approach with 1024 tokens per chunk and a 200-token window.
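The sliding-window chunking described above can be sketched in plain Python. This is an illustrative reconstruction, not the actual training script: it assumes the "200-token window" means a 200-token overlap between consecutive 1024-token chunks.

```python
def chunk_tokens(token_ids, chunk_size=1024, overlap=200):
    """Split a token sequence into overlapping fixed-size chunks.

    Consecutive chunks share `overlap` tokens, so the stride between
    chunk starts is chunk_size - overlap. The final chunk may be
    shorter than chunk_size.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    stride = chunk_size - overlap
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + chunk_size])
        if start + chunk_size >= len(token_ids):
            break  # this chunk already reaches the end of the sequence
    return chunks

# Example: a 2500-token article yields three chunks starting at 0, 824, 1648.
chunks = chunk_tokens(list(range(2500)))
```

Each chunk (padded where needed) can then be fed to the model as one training sequence.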
Training Data
https://huggingface.co/datasets/wikipedia
Hardware
Trained on a single NVIDIA V100 GPU.
Model Card Authors
Rafael Espinosa Mena rafaelespinosamena@gmail.com