
lig_model_1

This model is a fine-tuned version of gpt2 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.7688

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 15
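The schedule above (cosine with 10 warmup steps, peak learning rate 0.0005) and the effective batch size can be sketched in plain Python. This is a minimal illustration, not the trainer's actual implementation; the helper name and the assumed run length (~520 steps, slightly past the last logged step of 504) are hypothetical.

```python
import math

def lr_at_step(step, base_lr=5e-4, warmup_steps=10, total_steps=520):
    """Hypothetical helper mirroring lr_scheduler_type=cosine,
    lr_scheduler_warmup_steps=10, learning_rate=0.0005.
    total_steps=520 is an assumption; the log ends at step 504."""
    if step < warmup_steps:
        # linear warmup from 0 to base_lr
        return base_lr * step / warmup_steps
    # cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# effective (total) train batch size = per-device batch * accumulation steps
effective_batch = 32 * 8  # matches total_train_batch_size: 256
```

This also makes explicit why `total_train_batch_size` is 256: gradients are accumulated over 8 steps of batch size 32 before each optimizer update.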

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 6.8679        | 0.2310  | 8    | 6.6211          |
| 6.3894        | 0.4621  | 16   | 6.3666          |
| 6.2641        | 0.6931  | 24   | 6.2481          |
| 6.1285        | 0.9242  | 32   | 6.0829          |
| 5.9436        | 1.1552  | 40   | 5.8900          |
| 5.8073        | 1.3863  | 48   | 5.7490          |
| 5.7164        | 1.6173  | 56   | 5.6617          |
| 5.6019        | 1.8484  | 64   | 5.5778          |
| 5.5427        | 2.0794  | 72   | 5.4886          |
| 5.454         | 2.3105  | 80   | 5.3954          |
| 5.3546        | 2.5415  | 88   | 5.3066          |
| 5.3014        | 2.7726  | 96   | 5.2124          |
| 5.2448        | 3.0036  | 104  | 5.1365          |
| 5.1185        | 3.2347  | 112  | 5.0765          |
| 5.0938        | 3.4657  | 120  | 5.0071          |
| 5.0347        | 3.6968  | 128  | 4.9339          |
| 4.9681        | 3.9278  | 136  | 4.8552          |
| 4.8323        | 4.1588  | 144  | 4.7821          |
| 4.7912        | 4.3899  | 152  | 4.7215          |
| 4.7225        | 4.6209  | 160  | 4.6431          |
| 4.6433        | 4.8520  | 168  | 4.5701          |
| 4.5309        | 5.0830  | 176  | 4.5002          |
| 4.4506        | 5.3141  | 184  | 4.4442          |
| 4.4097        | 5.5451  | 192  | 4.3820          |
| 4.3871        | 5.7762  | 200  | 4.3290          |
| 4.3345        | 6.0072  | 208  | 4.2869          |
| 4.2004        | 6.2383  | 216  | 4.2412          |
| 4.1716        | 6.4693  | 224  | 4.1978          |
| 4.1536        | 6.7004  | 232  | 4.1607          |
| 4.0975        | 6.9314  | 240  | 4.1294          |
| 3.9743        | 7.1625  | 248  | 4.1014          |
| 3.922         | 7.3935  | 256  | 4.0654          |
| 3.939         | 7.6245  | 264  | 4.0378          |
| 3.9208        | 7.8556  | 272  | 4.0102          |
| 3.8083        | 8.0866  | 280  | 3.9812          |
| 3.7611        | 8.3177  | 288  | 3.9630          |
| 3.7668        | 8.5487  | 296  | 3.9407          |
| 3.7285        | 8.7798  | 304  | 3.9183          |
| 3.6996        | 9.0108  | 312  | 3.8958          |
| 3.5754        | 9.2419  | 320  | 3.8825          |
| 3.5708        | 9.4729  | 328  | 3.8702          |
| 3.5607        | 9.7040  | 336  | 3.8510          |
| 3.5688        | 9.9350  | 344  | 3.8387          |
| 3.4188        | 10.1661 | 352  | 3.8350          |
| 3.432         | 10.3971 | 360  | 3.8261          |
| 3.4236        | 10.6282 | 368  | 3.8131          |
| 3.3985        | 10.8592 | 376  | 3.8026          |
| 3.306         | 11.0903 | 384  | 3.7934          |
| 3.3196        | 11.3213 | 392  | 3.7919          |
| 3.3031        | 11.5523 | 400  | 3.7908          |
| 3.2851        | 11.7834 | 408  | 3.7817          |
| 3.2703        | 12.0144 | 416  | 3.7789          |
| 3.2132        | 12.2455 | 424  | 3.7818          |
| 3.1829        | 12.4765 | 432  | 3.7778          |
| 3.1968        | 12.7076 | 440  | 3.7749          |
| 3.2206        | 12.9386 | 448  | 3.7711          |
| 3.1521        | 13.1697 | 456  | 3.7694          |
| 3.1412        | 13.4007 | 464  | 3.7700          |
| 3.1415        | 13.6318 | 472  | 3.7709          |
| 3.1402        | 13.8628 | 480  | 3.7694          |
| 3.129         | 14.0939 | 488  | 3.7689          |
| 3.1221        | 14.3249 | 496  | 3.7687          |
| 3.1576        | 14.5560 | 504  | 3.7688          |

Framework versions

  • Transformers 4.41.0
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
Model size: 87.8M parameters (F32, safetensors)