
model

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6646

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
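With `lr_scheduler_type: linear` and the Trainer's default of zero warmup steps, the learning rate decays linearly from 5e-05 to 0 over the run. A minimal sketch of that schedule (assuming 950 total optimizer steps, the final step count in the results table; this is an illustration, not the Trainer's actual scheduler object):

```python
def linear_lr(step: int, base_lr: float = 5e-05, total_steps: int = 950) -> float:
    """Linear decay from base_lr to 0 with no warmup (illustrative sketch)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # start of training: 5e-05
print(linear_lr(475))  # halfway: 2.5e-05
print(linear_lr(950))  # end of training: 0.0
```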

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 19   | 1.3686          |
| No log        | 2.0   | 38   | 1.2421          |
| No log        | 3.0   | 57   | 1.1842          |
| No log        | 4.0   | 76   | 1.1521          |
| No log        | 5.0   | 95   | 1.1380          |
| 1.0495        | 6.0   | 114  | 1.1546          |
| 1.0495        | 7.0   | 133  | 1.1386          |
| 1.0495        | 8.0   | 152  | 1.1506          |
| 1.0495        | 9.0   | 171  | 1.1830          |
| 1.0495        | 10.0  | 190  | 1.1794          |
| 0.5118        | 11.0  | 209  | 1.2288          |
| 0.5118        | 12.0  | 228  | 1.2412          |
| 0.5118        | 13.0  | 247  | 1.2557          |
| 0.5118        | 14.0  | 266  | 1.2424          |
| 0.5118        | 15.0  | 285  | 1.2588          |
| 0.3204        | 16.0  | 304  | 1.3023          |
| 0.3204        | 17.0  | 323  | 1.3486          |
| 0.3204        | 18.0  | 342  | 1.3655          |
| 0.3204        | 19.0  | 361  | 1.3789          |
| 0.3204        | 20.0  | 380  | 1.4344          |
| 0.3204        | 21.0  | 399  | 1.4447          |
| 0.2116        | 22.0  | 418  | 1.4595          |
| 0.2116        | 23.0  | 437  | 1.5016          |
| 0.2116        | 24.0  | 456  | 1.4811          |
| 0.2116        | 25.0  | 475  | 1.5362          |
| 0.2116        | 26.0  | 494  | 1.5024          |
| 0.1447        | 27.0  | 513  | 1.5767          |
| 0.1447        | 28.0  | 532  | 1.5252          |
| 0.1447        | 29.0  | 551  | 1.5311          |
| 0.1447        | 30.0  | 570  | 1.5737          |
| 0.1447        | 31.0  | 589  | 1.5664          |
| 0.1165        | 32.0  | 608  | 1.5666          |
| 0.1165        | 33.0  | 627  | 1.6021          |
| 0.1165        | 34.0  | 646  | 1.5742          |
| 0.1165        | 35.0  | 665  | 1.6422          |
| 0.1165        | 36.0  | 684  | 1.6173          |
| 0.0943        | 37.0  | 703  | 1.6439          |
| 0.0943        | 38.0  | 722  | 1.6504          |
| 0.0943        | 39.0  | 741  | 1.6185          |
| 0.0943        | 40.0  | 760  | 1.6250          |
| 0.0943        | 41.0  | 779  | 1.6312          |
| 0.0943        | 42.0  | 798  | 1.6583          |
| 0.0786        | 43.0  | 817  | 1.6477          |
| 0.0786        | 44.0  | 836  | 1.6553          |
| 0.0786        | 45.0  | 855  | 1.6627          |
| 0.0786        | 46.0  | 874  | 1.6686          |
| 0.0786        | 47.0  | 893  | 1.6655          |
| 0.0722        | 48.0  | 912  | 1.6617          |
| 0.0722        | 49.0  | 931  | 1.6661          |
| 0.0722        | 50.0  | 950  | 1.6646          |
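Validation loss reaches its minimum of 1.1380 at epoch 5 and trends upward afterward while training loss keeps falling, so the final checkpoint (loss 1.6646) is not the best one. A minimal sketch of picking the best epoch from a logged history (the values are transcribed from the table above, first ten epochs only; `history` is an illustrative list, not a Trainer API):

```python
# (epoch, validation_loss) pairs transcribed from the results table (epochs 1-10)
history = [
    (1, 1.3686), (2, 1.2421), (3, 1.1842), (4, 1.1521), (5, 1.1380),
    (6, 1.1546), (7, 1.1386), (8, 1.1506), (9, 1.1830), (10, 1.1794),
]

# Pick the epoch with the lowest validation loss, which is what
# setting load_best_model_at_end=True in TrainingArguments would
# restore automatically at the end of training.
best_epoch, best_loss = min(history, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # → 5 1.138
```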

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.0.1+cu117
  • Datasets 2.19.1
  • Tokenizers 0.19.1