metadata
library_name: transformers
license: mit
base_model: openai-community/gpt2
tags:
  - generated_from_trainer
model-index:
  - name: debiased_gpt_2
    results: []

debiased_gpt_2

This model is a fine-tuned version of openai-community/gpt2. The fine-tuning dataset was not recorded in the training metadata. It achieves the following result on the evaluation set:

  • Loss: 0.5492

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 50
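
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It takes the base rate of 1e-05 from the list above and the total of 200 steps from the last row of the results table. The card does not list a warmup setting, so `warmup_steps=0` is an assumption, and `linear_lr` is a hypothetical helper written for illustration, not a function from the Transformers library.

```python
def linear_lr(step, base_lr=1e-5, total_steps=200, warmup_steps=0):
    """Learning rate after `step` optimizer updates.

    Linear warmup from 0 to base_lr over `warmup_steps` (assumed 0 here),
    then linear decay from base_lr down to 0 at `total_steps`.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# The results table shows 4 optimizer steps per epoch (step 200 at epoch 50).
steps_per_epoch = 4
for epoch in (1, 10, 25, 50):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch:2d} (step {step:3d}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate falls to half its starting value midway through training and reaches zero at the final step, which is consistent with the validation loss flattening over the last few epochs.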

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 4    | 1.8397          |
| No log        | 2.0   | 8    | 1.7641          |
| No log        | 3.0   | 12   | 1.5924          |
| No log        | 4.0   | 16   | 1.4858          |
| No log        | 5.0   | 20   | 1.4142          |
| No log        | 6.0   | 24   | 1.3215          |
| No log        | 7.0   | 28   | 1.2303          |
| No log        | 8.0   | 32   | 1.1587          |
| No log        | 9.0   | 36   | 1.1007          |
| No log        | 10.0  | 40   | 1.0435          |
| No log        | 11.0  | 44   | 0.9819          |
| No log        | 12.0  | 48   | 0.9346          |
| No log        | 13.0  | 52   | 0.8925          |
| No log        | 14.0  | 56   | 0.8457          |
| No log        | 15.0  | 60   | 0.8009          |
| No log        | 16.0  | 64   | 0.7653          |
| No log        | 17.0  | 68   | 0.7394          |
| No log        | 18.0  | 72   | 0.7197          |
| No log        | 19.0  | 76   | 0.6985          |
| No log        | 20.0  | 80   | 0.6767          |
| No log        | 21.0  | 84   | 0.6623          |
| No log        | 22.0  | 88   | 0.6489          |
| No log        | 23.0  | 92   | 0.6306          |
| No log        | 24.0  | 96   | 0.6136          |
| No log        | 25.0  | 100  | 0.6019          |
| No log        | 26.0  | 104  | 0.5947          |
| No log        | 27.0  | 108  | 0.5925          |
| No log        | 28.0  | 112  | 0.5917          |
| No log        | 29.0  | 116  | 0.5887          |
| No log        | 30.0  | 120  | 0.5837          |
| No log        | 31.0  | 124  | 0.5771          |
| No log        | 32.0  | 128  | 0.5718          |
| No log        | 33.0  | 132  | 0.5679          |
| No log        | 34.0  | 136  | 0.5652          |
| No log        | 35.0  | 140  | 0.5611          |
| No log        | 36.0  | 144  | 0.5581          |
| No log        | 37.0  | 148  | 0.5562          |
| No log        | 38.0  | 152  | 0.5568          |
| No log        | 39.0  | 156  | 0.5556          |
| No log        | 40.0  | 160  | 0.5559          |
| No log        | 41.0  | 164  | 0.5555          |
| No log        | 42.0  | 168  | 0.5534          |
| No log        | 43.0  | 172  | 0.5516          |
| No log        | 44.0  | 176  | 0.5506          |
| No log        | 45.0  | 180  | 0.5503          |
| No log        | 46.0  | 184  | 0.5499          |
| No log        | 47.0  | 188  | 0.5496          |
| No log        | 48.0  | 192  | 0.5492          |
| No log        | 49.0  | 196  | 0.5492          |
| No log        | 50.0  | 200  | 0.5492          |
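
The validation losses can be easier to interpret as perplexities. The sketch below converts a few values copied from the table, assuming the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language-model training with the Transformers `Trainer`, but not stated in this card). The `perplexity` helper is hypothetical and only wraps `math.exp`.

```python
import math

# Validation losses copied from selected rows of the results table (epoch -> loss).
val_loss = {1: 1.8397, 10: 1.0435, 25: 0.6019, 50: 0.5492}


def perplexity(loss):
    """Perplexity for a mean per-token cross-entropy loss given in nats."""
    return math.exp(loss)


for epoch, loss in val_loss.items():
    print(f"epoch {epoch:2d}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Because the evaluation data is not documented, these perplexities are only comparable across epochs of this run, not against other models or benchmarks.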

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.20.3