hkpr_cricket_1

This model is a fine-tuned version of distilgpt2; the fine-tuning dataset is not specified. After 50 epochs of training it achieves the following result on the evaluation set:

  • Loss: 2.1668
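
A raw loss value is easier to interpret as a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in this card. The names `EVAL_LOSS` and `perplexity` are illustrative only.

```python
import math

# Final evaluation loss reported in this card.
EVAL_LOSS = 2.1668


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption the evaluation perplexity is roughly 8.73.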

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
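
To show what the linear scheduler implies for these settings, here is a minimal pure-Python sketch of a linear warmup/decay schedule. It rests on two assumptions that the card does not state: zero warmup steps, and 50 total optimizer steps (read off the results table, where each epoch is a single step). `HPARAMS`, `TOTAL_STEPS`, `WARMUP_STEPS` and `linear_lr` are hypothetical names, not part of any library.

```python
# Hyperparameters copied from the list above.
HPARAMS = {
    "learning_rate": 2e-05,
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 50,
}

# Assumptions (not stated in the card): no warmup, and 50 optimizer steps
# in total, i.e. one step per epoch as in the results table.
TOTAL_STEPS = 50
WARMUP_STEPS = 0


def linear_lr(step, base_lr=HPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate after `step` optimizer steps under linear warmup then decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 10, 25, 40, 50):
        print(f"step {step:2d}: lr = {linear_lr(step):.2e}")
```

With these assumptions the learning rate starts at 2e-05, is halved by step 25, and reaches zero at step 50.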

Training results

Training Loss   Epoch   Step   Validation Loss
No log          1.0     1      3.0003
No log          2.0     2      2.9375
No log          3.0     3      2.8773
No log          4.0     4      2.8241
No log          5.0     5      2.7757
No log          6.0     6      2.7318
No log          7.0     7      2.6911
No log          8.0     8      2.6538
No log          9.0     9      2.6194
No log          10.0    10     2.5878
No log          11.0    11     2.5584
No log          12.0    12     2.5304
No log          13.0    13     2.5038
No log          14.0    14     2.4789
No log          15.0    15     2.4554
No log          16.0    16     2.4340
No log          17.0    17     2.4163
No log          18.0    18     2.4008
No log          19.0    19     2.3853
No log          20.0    20     2.3716
No log          21.0    21     2.3577
No log          22.0    22     2.3425
No log          23.0    23     2.3279
No log          24.0    24     2.3151
No log          25.0    25     2.3031
No log          26.0    26     2.2914
No log          27.0    27     2.2794
No log          28.0    28     2.2679
No log          29.0    29     2.2572
No log          30.0    30     2.2468
No log          31.0    31     2.2381
No log          32.0    32     2.2299
No log          33.0    33     2.2220
No log          34.0    34     2.2150
No log          35.0    35     2.2089
No log          36.0    36     2.2028
No log          37.0    37     2.1969
No log          38.0    38     2.1915
No log          39.0    39     2.1869
No log          40.0    40     2.1832
No log          41.0    41     2.1798
No log          42.0    42     2.1768
No log          43.0    43     2.1744
No log          44.0    44     2.1724
No log          45.0    45     2.1707
No log          46.0    46     2.1693
No log          47.0    47     2.1683
No log          48.0    48     2.1676
No log          49.0    49     2.1671
No log          50.0    50     2.1668
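
The validation losses in the table above flatten out toward the end of training. The short sketch below quantifies this by comparing the loss drop across successive blocks of epochs, using six values copied from the table (note that the first block spans 9 epochs and the others 10). `VAL_LOSS` and `improvements` are illustrative names.

```python
# Validation loss at selected epochs, copied from the results table.
VAL_LOSS = {1: 3.0003, 10: 2.5878, 20: 2.3716, 30: 2.2468, 40: 2.1832, 50: 2.1668}


def improvements(losses):
    """Return (start_epoch, end_epoch, loss_drop) for consecutive checkpoints."""
    epochs = sorted(losses)
    return [(a, b, losses[a] - losses[b]) for a, b in zip(epochs, epochs[1:])]


if __name__ == "__main__":
    for start, end, drop in improvements(VAL_LOSS):
        print(f"epochs {start:2d} -> {end:2d}: loss fell by {drop:.4f}")
```

Each block improves less than the one before it, and the last ten epochs lower the loss by only about 0.016, which is consistent with the learning rate decaying toward the end of a linear schedule.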

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Safetensors

  • Model size: 81.9M params
  • Tensor type: F32