
Model Details

  • Developed by: Rafael Espinosa Mena
  • Model type: GPT-2 (124M parameters, F32 weights in Safetensors format)
  • Language(s) (NLP): English

Uses

Pre-trained from scratch on 200 Wikipedia articles and intended as a base checkpoint for fine-tuning. A minimal fine-tuning sketch is shown below.

Training Details

Trained for 200 epochs on 200 Wikipedia articles with a learning rate of 3e-5, using a sliding-window approach with 1,024 tokens per chunk and a 200-token window.

Training Data

https://huggingface.co/datasets/wikipedia

Hardware

Trained on a single NVIDIA V100 GPU.

Model Card Authors

Rafael Espinosa Mena (rafaelespinosamena@gmail.com)
