
gpt2english98

This model is a fine-tuned version of an unspecified base model (the base checkpoint is not recorded in this auto-generated card) on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.9145
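
For a causal language model, this cross-entropy loss corresponds to a perplexity of exp(4.9145) ≈ 136. A minimal sketch of that standard conversion (the loss value is taken from the results above):

```python
import math

eval_loss = 4.9145  # final validation loss reported above
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 136.2
```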

Model description

More information needed

Intended uses & limitations

More information needed
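
While the documentation is incomplete, a hypothetical loading-and-generation sketch is shown below. The repo id `gpt2english98` is an assumption (the card does not state where the model is hosted), so substitute the actual path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2english98"  # hypothetical repo id; replace with the real one

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation from an English prompt.
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```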

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
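
These settings map directly onto the `TrainingArguments` class in Transformers. A minimal sketch, assuming the `Trainer` API was used (consistent with this auto-generated card, though not stated explicitly); `output_dir` is a hypothetical placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2english98",     # hypothetical; not given in the card
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```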

Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|--------------:|-------:|-------:|----------------:|
| 6.4337 | 0.0102 | 10000  | 6.3333 |
| 6.1630 | 0.0204 | 20000  | 6.0858 |
| 6.0254 | 0.0306 | 30000  | 5.9577 |
| 5.9641 | 0.0408 | 40000  | 5.8612 |
| 5.9122 | 0.0510 | 50000  | 5.7882 |
| 5.7888 | 0.0612 | 60000  | 5.7222 |
| 5.7087 | 0.0714 | 70000  | 5.6873 |
| 5.7093 | 0.0816 | 80000  | 5.6355 |
| 5.6654 | 0.0918 | 90000  | 5.5949 |
| 5.6532 | 0.1020 | 100000 | 5.5734 |
| 5.6147 | 0.1122 | 110000 | 5.5327 |
| 5.6729 | 0.1224 | 120000 | 5.5168 |
| 5.5226 | 0.1327 | 130000 | 5.4808 |
| 5.4966 | 0.1429 | 140000 | 5.4611 |
| 5.4708 | 0.1531 | 150000 | 5.4348 |
| 5.4733 | 0.1633 | 160000 | 5.4144 |
| 5.4892 | 0.1735 | 170000 | 5.3892 |
| 5.5325 | 0.1837 | 180000 | 5.3728 |
| 5.4798 | 0.1939 | 190000 | 5.3587 |
| 5.4556 | 0.2041 | 200000 | 5.3404 |
| 5.3283 | 0.2143 | 210000 | 5.3255 |
| 5.3728 | 0.2245 | 220000 | 5.3096 |
| 5.4200 | 0.2347 | 230000 | 5.3045 |
| 5.4695 | 0.2449 | 240000 | 5.2806 |
| 5.3529 | 0.2551 | 250000 | 5.2689 |
| 5.3567 | 0.2653 | 260000 | 5.2556 |
| 5.2927 | 0.2755 | 270000 | 5.2452 |
| 5.3838 | 0.2857 | 280000 | 5.2276 |
| 5.3734 | 0.2959 | 290000 | 5.2212 |
| 5.3353 | 0.3061 | 300000 | 5.2074 |
| 5.3920 | 0.3163 | 310000 | 5.2042 |
| 5.3286 | 0.3265 | 320000 | 5.2012 |
| 5.3774 | 0.3367 | 330000 | 5.1871 |
| 5.2164 | 0.3469 | 340000 | 5.1758 |
| 5.3587 | 0.3571 | 350000 | 5.1685 |
| 5.3574 | 0.3673 | 360000 | 5.1575 |
| 5.2087 | 0.3776 | 370000 | 5.1482 |
| 5.1612 | 0.3878 | 380000 | 5.1489 |
| 5.2250 | 0.3980 | 390000 | 5.1398 |
| 5.2410 | 0.4082 | 400000 | 5.1222 |
| 5.2267 | 0.4184 | 410000 | 5.1288 |
| 5.1924 | 0.4286 | 420000 | 5.1110 |
| 5.2413 | 0.4388 | 430000 | 5.1047 |
| 5.2015 | 0.4490 | 440000 | 5.1049 |
| 5.2847 | 0.4592 | 450000 | 5.0944 |
| 5.1406 | 0.4694 | 460000 | 5.0888 |
| 5.1992 | 0.4796 | 470000 | 5.0786 |
| 5.0754 | 0.4898 | 480000 | 5.0810 |
| 5.1644 | 0.5000 | 490000 | 5.0697 |
| 5.1464 | 0.5102 | 500000 | 5.0609 |
| 5.1771 | 0.5204 | 510000 | 5.0560 |
| 5.1896 | 0.5306 | 520000 | 5.0574 |
| 5.1355 | 0.5408 | 530000 | 5.0498 |
| 5.1150 | 0.5510 | 540000 | 5.0494 |
| 5.1575 | 0.5612 | 550000 | 5.0357 |
| 5.1910 | 0.5714 | 560000 | 5.0305 |
| 5.1694 | 0.5816 | 570000 | 5.0303 |
| 5.1591 | 0.5918 | 580000 | 5.0267 |
| 5.1380 | 0.6020 | 590000 | 5.0264 |
| 5.0825 | 0.6122 | 600000 | 5.0195 |
| 5.1669 | 0.6224 | 610000 | 5.0147 |
| 5.0309 | 0.6327 | 620000 | 5.0156 |
| 5.0886 | 0.6429 | 630000 | 5.0077 |
| 5.1049 | 0.6531 | 640000 | 5.0021 |
| 5.1385 | 0.6633 | 650000 | 5.0052 |
| 5.1294 | 0.6735 | 660000 | 4.9955 |
| 5.0726 | 0.6837 | 670000 | 4.9947 |
| 5.1084 | 0.6939 | 680000 | 4.9912 |
| 5.0205 | 0.7041 | 690000 | 4.9869 |
| 5.1110 | 0.7143 | 700000 | 4.9826 |
| 5.0809 | 0.7245 | 710000 | 4.9773 |
| 5.1221 | 0.7347 | 720000 | 4.9775 |
| 5.1516 | 0.7449 | 730000 | 4.9721 |
| 5.1347 | 0.7551 | 740000 | 4.9655 |
| 5.0744 | 0.7653 | 750000 | 4.9664 |
| 5.0715 | 0.7755 | 760000 | 4.9626 |
| 5.1118 | 0.7857 | 770000 | 4.9592 |
| 5.0933 | 0.7959 | 780000 | 4.9558 |
| 5.0685 | 0.8061 | 790000 | 4.9543 |
| 5.1237 | 0.8163 | 800000 | 4.9514 |
| 4.9532 | 0.8265 | 810000 | 4.9493 |
| 5.0854 | 0.8367 | 820000 | 4.9478 |
| 5.0865 | 0.8469 | 830000 | 4.9417 |
| 5.0850 | 0.8571 | 840000 | 4.9419 |
| 5.0835 | 0.8673 | 850000 | 4.9385 |
| 5.0347 | 0.8776 | 860000 | 4.9345 |
| 4.9784 | 0.8878 | 870000 | 4.9332 |
| 5.0046 | 0.8980 | 880000 | 4.9317 |
| 4.9069 | 0.9082 | 890000 | 4.9296 |
| 5.0209 | 0.9184 | 900000 | 4.9270 |
| 5.1551 | 0.9286 | 910000 | 4.9234 |
| 5.1849 | 0.9388 | 920000 | 4.9230 |
| 5.0700 | 0.9490 | 930000 | 4.9200 |
| 4.9804 | 0.9592 | 940000 | 4.9195 |
| 5.0419 | 0.9694 | 950000 | 4.9174 |
| 5.0447 | 0.9796 | 960000 | 4.9165 |
| 5.0839 | 0.9898 | 970000 | 4.9161 |
| 4.9989 | 1.0000 | 980000 | 4.9145 |

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1