gpt2-lichess-uci-202306

This model is a fine-tuned version of austindavis/gpt2-lichess-uci-2016-01_11. The card does not record which dataset was used for fine-tuning. It achieves the following results on the evaluation set:

  • Loss: 0.8839
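
For a sense of scale, the loss can be converted to a perplexity. The sketch below assumes the reported value is a mean per-token cross-entropy in nats, which is how the Transformers Trainer usually reports loss for causal language models; the card itself does not state this.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.8839

# Assumption: the loss is mean cross-entropy per token, measured in nats.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 2.4, meaning the model is on average about as uncertain as a choice between two or three equally likely next tokens.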

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.002
  • train_batch_size: 20
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 1
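
The cosine schedule listed above can be sketched in plain Python. This is an illustration, not the training code: `TOTAL_STEPS` is an approximation read off the step and epoch columns of the training results, and `WARMUP_STEPS = 0` is an assumption, since the card lists no warmup setting.

```python
import math

BASE_LR = 0.002          # learning_rate from the card
TOTAL_STEPS = 1_247_000  # assumption: approximate step count for the single epoch
WARMUP_STEPS = 0         # assumption: the card lists no warmup

def cosine_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)  # clamp so steps past the end stay at zero
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Learning rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{cosine_lr(step):.6f}")
```

With these values the rate starts at 0.002, passes through 0.001 at the midpoint, and reaches zero at the end of the epoch.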

Training results

Training Loss  Epoch   Step     Validation Loss
1.0220         0.1323  165000   1.0013
1.0204         0.1443  180000   1.0001
1.0186         0.1563  195000   0.9973
1.0155         0.1684  210000   0.9954
1.0133         0.1804  225000   0.9935
1.0118         0.1924  240000   0.9924
1.0092         0.2044  255000   0.9893
1.0070         0.2165  270000   0.9881
1.0057         0.2285  285000   0.9868
1.0035         0.2405  300000   0.9879
1.0040         0.2525  315000   0.9843
1.0005         0.2646  330000   0.9807
0.9986         0.2766  345000   0.9805
0.9983         0.2886  360000   0.9776
0.9965         0.3006  375000   0.9781
0.9935         0.3127  390000   0.9754
0.9935         0.3247  405000   0.9761
0.9916         0.3367  420000   0.9743
0.9890         0.3487  435000   0.9712
0.9880         0.3608  450000   0.9702
0.9862         0.3728  465000   0.9703
0.9837         0.3848  480000   0.9680
0.9830         0.3968  495000   0.9643
0.9816         0.4089  510000   0.9634
0.9796         0.4209  525000   0.9628
0.9777         0.4329  540000   0.9612
0.9744         0.4449  555000   0.9587
0.9733         0.4570  570000   0.9590
0.9700         0.4690  585000   0.9566
0.9693         0.4810  600000   0.9539
0.9684         0.4930  615000   0.9532
0.9652         0.5051  630000   0.9509
0.9644         0.5171  645000   0.9501
0.9614         0.5291  660000   0.9479
0.9606         0.5411  675000   0.9466
0.9597         0.5532  690000   0.9444
0.9556         0.5652  705000   0.9416
0.9541         0.5772  720000   0.9413
0.9522         0.5892  735000   0.9382
0.9491         0.6013  750000   0.9367
0.9471         0.6133  765000   0.9354
0.9459         0.6253  780000   0.9321
0.9416         0.6373  795000   0.9309
0.9401         0.6494  810000   0.9287
0.9383         0.6614  825000   0.9265
0.9375         0.6734  840000   0.9238
0.9354         0.6854  855000   0.9225
0.9323         0.6975  870000   0.9196
0.9291         0.7095  885000   0.9189
0.9276         0.7215  900000   0.9165
0.9266         0.7335  915000   0.9142
0.9221         0.7456  930000   0.9130
0.9216         0.7576  945000   0.9106
0.9191         0.7696  960000   0.9084
0.9152         0.7816  975000   0.9062
0.9127         0.7937  990000   0.9039
0.9133         0.8057  1005000  0.9014
0.9086         0.8177  1020000  0.8997
0.9078         0.8297  1035000  0.8978
0.9054         0.8418  1050000  0.8955
0.9037         0.8538  1065000  0.8943
0.9015         0.8658  1080000  0.8926
0.9006         0.8778  1095000  0.8912
0.8991         0.8899  1110000  0.8897
0.8970         0.9019  1125000  0.8885
0.8971         0.9139  1140000  0.8873
0.8940         0.9259  1155000  0.8864
0.8938         0.9380  1170000  0.8854
0.8930         0.9500  1185000  0.8848
0.8922         0.9620  1200000  0.8844
0.8936         0.9740  1215000  0.8841
0.8923         0.9861  1230000  0.8840
0.8922         0.9981  1245000  0.8839
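
The step and epoch columns give a rough estimate of how large one epoch was. The sketch below divides step by epoch for the first and last logged rows, then multiplies by the train batch size; that last step assumes a single device and no gradient accumulation, neither of which the card confirms.

```python
# (epoch, step) pairs copied from the first and last rows of the log above.
rows = [(0.1323, 165_000), (0.9981, 1_245_000)]

TRAIN_BATCH_SIZE = 20  # train_batch_size from the card

estimates = []
for epoch, step in rows:
    steps_per_epoch = step / epoch
    estimates.append(steps_per_epoch)
    # Assumption: one device, no gradient accumulation, so one step = one batch of 20.
    examples = steps_per_epoch * TRAIN_BATCH_SIZE
    print(f"{steps_per_epoch:,.0f} steps/epoch, about {examples / 1e6:.1f}M examples")
```

Both rows point to roughly 1.25 million optimizer steps per epoch, or on the order of 25 million training sequences if the batch-size assumption holds.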

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size: 25.5M params (Safetensors, tensor type F32)
