gpt2-ear_1-hs_cn_decay

This model is a fine-tuned version of gpt2-medium on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5369
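If the reported evaluation loss is a mean token-level cross-entropy, it corresponds to a perplexity of roughly exp(0.5369) ≈ 1.71. This is only indicative: the negative training-loss values in the table below suggest the training objective includes an additional regularization term, so the evaluation loss may not be a plain cross-entropy.

```python
import math

# Hedged sketch: convert the reported evaluation loss to perplexity,
# ASSUMING it is a mean token-level cross-entropy (not confirmed by the
# card; the negative training losses hint at an extra regularizer).
eval_loss = 0.5369
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 1.71
```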

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 21
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3.0
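As a sketch only, the hyperparameters above would map onto a `transformers.TrainingArguments` configuration roughly as follows. The `output_dir` value is a placeholder, and the actual training script used for this model is not given in the card.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments configuration mirroring the listed
# hyperparameters. "output_dir" is a placeholder, not taken from the card.
training_args = TrainingArguments(
    output_dir="gpt2-ear_1-hs_cn_decay",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=21,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3.0,
)
```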

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 74.0475 | 0.02 | 10 | 72.6562 |
| 45.5005 | 0.04 | 20 | 34.4060 |
| 10.921 | 0.06 | 30 | 11.2525 |
| 2.7976 | 0.08 | 40 | 4.4890 |
| 0.0358 | 0.1 | 50 | 2.1808 |
| -1.4128 | 0.12 | 60 | 1.1029 |
| -1.7651 | 0.14 | 70 | 0.9684 |
| -1.9542 | 0.16 | 80 | 0.7785 |
| -2.1013 | 0.18 | 90 | 0.6533 |
| -2.14 | 0.2 | 100 | 0.6666 |
| -2.1001 | 0.22 | 110 | 0.6334 |
| -2.1169 | 0.24 | 120 | 0.5926 |
| -2.1216 | 0.26 | 130 | 0.5903 |
| -2.1191 | 0.28 | 140 | 0.5741 |
| -2.1319 | 0.3 | 150 | 0.5702 |
| -2.122 | 0.32 | 160 | 0.5679 |
| -2.0754 | 0.34 | 170 | 0.5671 |
| -2.06 | 0.36 | 180 | 0.5630 |
| -2.0477 | 0.38 | 190 | 0.5591 |
| -2.0569 | 0.4 | 200 | 0.5546 |
| -1.9666 | 0.42 | 210 | 0.5513 |
| -1.9673 | 0.44 | 220 | 0.5558 |
| -2.0044 | 0.46 | 230 | 0.5560 |
| -1.9923 | 0.48 | 240 | 0.5507 |
| -1.9056 | 0.5 | 250 | 0.5494 |
| -1.9658 | 0.52 | 260 | 0.5498 |
| -1.9104 | 0.54 | 270 | 0.5474 |
| -1.8967 | 0.56 | 280 | 0.5459 |
| -1.8759 | 0.58 | 290 | 0.5458 |
| -1.8432 | 0.6 | 300 | 0.5477 |
| -1.835 | 0.62 | 310 | 0.5446 |
| -1.7823 | 0.64 | 320 | 0.5448 |
| -1.8058 | 0.66 | 330 | 0.5412 |
| -1.8138 | 0.68 | 340 | 0.5375 |
| -1.7656 | 0.7 | 350 | 0.5385 |
| -1.7174 | 0.72 | 360 | 0.5376 |
| -1.7461 | 0.74 | 370 | 0.5365 |
| -1.7145 | 0.76 | 380 | 0.5342 |
| -1.6737 | 0.78 | 390 | 0.5326 |
| -1.7162 | 0.8 | 400 | 0.5342 |
| -1.6792 | 0.82 | 410 | 0.5361 |
| -1.6783 | 0.84 | 420 | 0.5369 |
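A few back-of-the-envelope figures can be inferred from the log above: the epoch counter advances by 0.02 every 10 steps, which implies 500 steps per epoch, 1500 total steps over the 3 planned epochs, and 150 warmup steps at `warmup_ratio = 0.1`. These are inferences from the logged ratios, not values stated in the card.

```python
# Figures inferred from the logged epoch/step ratio
# (epoch advances by 0.02 every 10 steps => 500 steps per epoch).
steps_per_epoch = round(10 / 0.02)          # 500
total_steps = steps_per_epoch * 3           # 1500, from num_epochs = 3.0
warmup_steps = round(0.1 * total_steps)     # 150, from warmup_ratio = 0.1
train_examples = steps_per_epoch * 8        # ~4000, from train_batch_size = 8
print(steps_per_epoch, total_steps, warmup_steps, train_examples)
```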

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.6.1
  • Tokenizers 0.12.1