ckt1

This model is a fine-tuned version of distilbert/distilgpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9517
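Assuming this loss is the mean per-token cross-entropy (in nats), it corresponds to a perplexity of about 2.59, since perplexity = exp(loss):

```python
import math

# Evaluation loss reported above (assumed to be mean per-token cross-entropy, in nats)
eval_loss = 0.9517

# Perplexity is the exponential of the cross-entropy loss
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # → 2.59
```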

Model description

More information needed

Intended uses & limitations

More information needed
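No usage details were provided. As an illustration only, a causal-LM checkpoint like this can be loaded for text generation with the transformers pipeline (the repository id below is a placeholder, not the actual location of this model):

```python
from transformers import pipeline

# NOTE: "your-username/ckt1" is a placeholder repository id — substitute the
# real path of this checkpoint before running.
generator = pipeline("text-generation", model="your-username/ckt1")

output = generator("Once upon a time", max_new_tokens=40, do_sample=True)
print(output[0]["generated_text"])
```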

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
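The list above corresponds roughly to the following transformers TrainingArguments; this is a sketch for reference, and `output_dir` (plus any setting not listed above) is an assumption:

```python
from transformers import TrainingArguments

# Sketch of the training configuration listed above.
# output_dir is assumed; all other values come from the card.
training_args = TrainingArguments(
    output_dir="ckt1",              # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
)
```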

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 1    | 1.9943          |
| No log        | 2.0   | 2    | 1.9329          |
| No log        | 3.0   | 3    | 1.8768          |
| No log        | 4.0   | 4    | 1.8236          |
| No log        | 5.0   | 5    | 1.7748          |
| No log        | 6.0   | 6    | 1.7300          |
| No log        | 7.0   | 7    | 1.6890          |
| No log        | 8.0   | 8    | 1.6514          |
| No log        | 9.0   | 9    | 1.6152          |
| No log        | 10.0  | 10   | 1.5820          |
| No log        | 11.0  | 11   | 1.5501          |
| No log        | 12.0  | 12   | 1.5189          |
| No log        | 13.0  | 13   | 1.4888          |
| No log        | 14.0  | 14   | 1.4590          |
| No log        | 15.0  | 15   | 1.4322          |
| No log        | 16.0  | 16   | 1.4058          |
| No log        | 17.0  | 17   | 1.3810          |
| No log        | 18.0  | 18   | 1.3556          |
| No log        | 19.0  | 19   | 1.3308          |
| No log        | 20.0  | 20   | 1.3066          |
| No log        | 21.0  | 21   | 1.2823          |
| No log        | 22.0  | 22   | 1.2586          |
| No log        | 23.0  | 23   | 1.2353          |
| No log        | 24.0  | 24   | 1.2135          |
| No log        | 25.0  | 25   | 1.1919          |
| No log        | 26.0  | 26   | 1.1719          |
| No log        | 27.0  | 27   | 1.1529          |
| No log        | 28.0  | 28   | 1.1349          |
| No log        | 29.0  | 29   | 1.1180          |
| No log        | 30.0  | 30   | 1.1012          |
| No log        | 31.0  | 31   | 1.0854          |
| No log        | 32.0  | 32   | 1.0700          |
| No log        | 33.0  | 33   | 1.0561          |
| No log        | 34.0  | 34   | 1.0429          |
| No log        | 35.0  | 35   | 1.0313          |
| No log        | 36.0  | 36   | 1.0208          |
| No log        | 37.0  | 37   | 1.0112          |
| No log        | 38.0  | 38   | 1.0024          |
| No log        | 39.0  | 39   | 0.9942          |
| No log        | 40.0  | 40   | 0.9869          |
| No log        | 41.0  | 41   | 0.9804          |
| No log        | 42.0  | 42   | 0.9745          |
| No log        | 43.0  | 43   | 0.9694          |
| No log        | 44.0  | 44   | 0.9649          |
| No log        | 45.0  | 45   | 0.9611          |
| No log        | 46.0  | 46   | 0.9580          |
| No log        | 47.0  | 47   | 0.9547          |
| No log        | 48.0  | 48   | 0.9536          |
| No log        | 49.0  | 49   | 0.9523          |
| No log        | 50.0  | 50   | 0.9517          |

The training loss reads "No log" likely because the run totals only 50 optimizer steps (one per epoch), fewer than the default logging interval, so no training-loss value was ever recorded.
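The run looks close to convergence: the per-epoch improvement in validation loss shrinks from about 0.061 between epochs 1 and 2 to about 0.0006 between epochs 49 and 50. A quick check using the first and last pairs of values from the table:

```python
# Validation losses copied from the first and last rows of the table above
first_two = [1.9943, 1.9329]  # epochs 1 and 2
last_two = [0.9523, 0.9517]   # epochs 49 and 50

first_delta = first_two[0] - first_two[1]
last_delta = last_two[0] - last_two[1]

print(round(first_delta, 4))  # → 0.0614
print(round(last_delta, 4))   # → 0.0006
```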

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
Model details

  • Model size: 81.9M params
  • Tensor type: F32
  • Format: Safetensors