
kant-gpt2-large

This model is a fine-tuned version of benjamin/gerpt2-large, trained on the Akademie-Ausgabe (Academy Edition) of the works of Immanuel Kant. It achieves the following result on the evaluation set:

  • Loss: 3.4257

Model description

A GPT-2 large model: the German-language benjamin/gerpt2-large, fine-tuned on the collected works of Immanuel Kant.

Intended uses & limitations

The model could be used to analyze how knowledge is represented in, and can be extracted from, large language models.
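
As a starting point, the model can be loaded for text generation with the standard Transformers API. A minimal sketch, assuming the model is published under the repo id of this card (replace `kant-gpt2-large` with the full hub path, e.g. `<user>/kant-gpt2-large`); the prompt and sampling parameters are illustrative choices, not taken from the card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with the actual hub path of this model.
model_id = "kant-gpt2-large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Die Pflicht ist"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_p=0.9,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```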

Training and evaluation data

The Akademie-Ausgabe (Academy Edition) of the collected works of Immanuel Kant.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the reproduction sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 4
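
For reference, the list above maps onto Hugging Face `TrainingArguments` roughly as follows. This is a hedged reproduction sketch: the output directory is an assumption, and the dataset loading, tokenization, and `Trainer` wiring are omitted because the card does not specify them.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters listed above.
# output_dir is an assumption; all other values are taken from the card.
training_args = TrainingArguments(
    output_dir="kant-gpt2-large",
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=4,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```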

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 3.4094        | 1.0   | 11308 | 3.3838          |
| 3.0445        | 2.0   | 22616 | 3.3107          |
| 2.7161        | 3.0   | 33924 | 3.3409          |
| 2.4793        | 4.0   | 45232 | 3.4257          |
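
Validation loss is lowest after epoch 2 (3.3107) and rises again in epochs 3 and 4 while training loss keeps falling, which suggests the later epochs overfit; the reported final loss of 3.4257 is the epoch-4 value. The sketch below shows how to recompute loss and perplexity on a sample passage; the repo id and text are hypothetical:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kant-gpt2-large"  # hypothetical; replace with the full hub path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "Der kategorische Imperativ ist also nur ein einziger."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # For causal LM models, passing labels returns the mean
    # cross-entropy loss over the (shifted) tokens.
    out = model(**inputs, labels=inputs["input_ids"])
print(f"loss = {out.loss.item():.4f}")
# For comparison: the reported eval loss corresponds to exp(3.4257) ≈ 30.7 perplexity.
print(f"perplexity = {torch.exp(out.loss).item():.1f}")
```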

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.11.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1
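
A quick check that the runtime roughly matches the versions above; newer versions will usually load the model fine, but defaults (e.g. of `generate()`) can differ:

```python
import transformers, torch, datasets, tokenizers

# Versions used during training, from the list above.
print("Transformers:", transformers.__version__)  # trained with 4.20.1
print("PyTorch:     ", torch.__version__)         # trained with 1.11.0+cu113
print("Datasets:    ", datasets.__version__)      # trained with 2.3.2
print("Tokenizers:  ", tokenizers.__version__)    # trained with 0.12.1
```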