
Model Details

  • Developed by: Rafael Espinosa Mena
  • Model type: GPT-2 (124M parameters, F32 weights in Safetensors format)
  • Language(s) (NLP): English

Uses

Pre-trained from scratch on 200 Wikipedia articles and intended for fine-tuning on downstream tasks.
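
As a minimal sketch of loading the checkpoint for fine-tuning with 🤗 Transformers (the repository ID below is a placeholder; substitute this model's actual Hub ID):

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model_id = "your-username/your-model-id"  # placeholder: replace with this repo's Hub ID
tokenizer = GPT2TokenizerFast.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)

# From here, fine-tune with the standard Trainer API or a custom training loop.
```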

Training Details

Trained for 200 epochs on 200 Wikipedia articles with a learning rate of 3e-5, using a sliding-window approach with 1024 tokens per chunk and a 200-token window.
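
A minimal sketch of the sliding-window chunking described above, assuming the 200-token window refers to the overlap between consecutive 1024-token chunks:

```python
def sliding_window_chunks(token_ids, chunk_size=1024, overlap=200):
    """Split a token sequence into overlapping fixed-size chunks."""
    stride = chunk_size - overlap  # 824 new tokens per step
    chunks = []
    for start in range(0, max(len(token_ids) - overlap, 1), stride):
        chunks.append(token_ids[start:start + chunk_size])
    return chunks
```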

Training Data

https://huggingface.co/datasets/wikipedia
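
As an illustrative sketch of pulling a slice of this dataset with 🤗 Datasets (the card does not specify which snapshot or which 200 articles were used; the config and slice below are assumptions):

```python
from datasets import load_dataset

# "20220301.en" is an assumed config; the training snapshot is not specified.
wiki = load_dataset("wikipedia", "20220301.en", split="train[:200]")
print(wiki[0]["title"])
```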

Hardware

Trained on a single NVIDIA V100 GPU.

Model Card Authors

Rafael Espinosa Mena rafaelespinosamena@gmail.com
