---
tags:
- generated_from_trainer
model-index:
- name: calculator_model_test_third_version
  results: []
---

# calculator_model_test_third_version

This model is a fine-tuned version of an unspecified base model on an undocumented dataset. It achieves the following results on the evaluation set:

- Loss: 0.2789

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 512
- eval_batch_size: 512
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
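The `linear` scheduler decays the learning rate from its initial value to zero over the course of training. A minimal sketch of that decay, assuming no warmup and using the 600 total optimizer steps visible in the results table (6 steps per epoch × 100 epochs):

```python
# Sketch of the linear LR schedule implied by the hyperparameters above.
# Assumption: no warmup; linear decay from the base rate to zero.
BASE_LR = 0.001        # learning_rate
STEPS_PER_EPOCH = 6    # step 6 completes epoch 1 in the results table
NUM_EPOCHS = 100
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 600

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear decay."""
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / TOTAL_STEPS

print(linear_lr(0), linear_lr(300), linear_lr(600))
```

Under these settings the rate starts at 0.001, passes 0.0005 at the halfway point (step 300), and reaches zero at step 600.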

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4163        | 1.0   | 6    | 2.7578          |
| 2.3764        | 2.0   | 12   | 1.9673          |
| 1.794         | 3.0   | 18   | 1.6871          |
| 1.682         | 4.0   | 24   | 1.6571          |
| 1.6127        | 5.0   | 30   | 1.5569          |
| 1.5398        | 6.0   | 36   | 1.5012          |
| 1.5125        | 7.0   | 42   | 1.5110          |
| 1.521         | 8.0   | 48   | 1.4432          |
| 1.4421        | 9.0   | 54   | 1.3939          |
| 1.3958        | 10.0  | 60   | 1.3717          |
| 1.3615        | 11.0  | 66   | 1.2885          |
| 1.2466        | 12.0  | 72   | 1.2391          |
| 1.2277        | 13.0  | 78   | 1.1908          |
| 1.1821        | 14.0  | 84   | 1.2211          |
| 1.1512        | 15.0  | 90   | 1.1313          |
| 1.1268        | 16.0  | 96   | 1.0571          |
| 1.0752        | 17.0  | 102  | 1.0546          |
| 1.0988        | 18.0  | 108  | 1.0526          |
| 1.0499        | 19.0  | 114  | 1.0063          |
| 0.9957        | 20.0  | 120  | 0.9662          |
| 1.0018        | 21.0  | 126  | 0.9062          |
| 0.9747        | 22.0  | 132  | 0.9273          |
| 0.9545        | 23.0  | 138  | 1.0973          |
| 1.0842        | 24.0  | 144  | 1.1854          |
| 1.0838        | 25.0  | 150  | 1.0031          |
| 0.9812        | 26.0  | 156  | 1.0129          |
| 1.0154        | 27.0  | 162  | 0.9693          |
| 0.9441        | 28.0  | 168  | 0.8257          |
| 0.863         | 29.0  | 174  | 0.8138          |
| 0.8591        | 30.0  | 180  | 0.8746          |
| 0.8913        | 31.0  | 186  | 0.8592          |
| 0.8719        | 32.0  | 192  | 0.7773          |
| 0.8187        | 33.0  | 198  | 0.7999          |
| 0.8013        | 34.0  | 204  | 0.7491          |
| 0.7976        | 35.0  | 210  | 0.7171          |
| 0.8033        | 36.0  | 216  | 0.7610          |
| 0.7785        | 37.0  | 222  | 0.8047          |
| 0.8141        | 38.0  | 228  | 0.7245          |
| 0.7726        | 39.0  | 234  | 0.6725          |
| 0.7832        | 40.0  | 240  | 0.8410          |
| 0.8625        | 41.0  | 246  | 0.7093          |
| 0.7307        | 42.0  | 252  | 0.6675          |
| 0.6832        | 43.0  | 258  | 0.6891          |
| 0.7126        | 44.0  | 264  | 0.7927          |
| 0.7808        | 45.0  | 270  | 0.7757          |
| 0.8031        | 46.0  | 276  | 0.6749          |
| 0.728         | 47.0  | 282  | 0.7299          |
| 0.741         | 48.0  | 288  | 0.6062          |
| 0.6772        | 49.0  | 294  | 0.6449          |
| 0.6505        | 50.0  | 300  | 0.6041          |
| 0.6315        | 51.0  | 306  | 0.5769          |
| 0.6339        | 52.0  | 312  | 0.6003          |
| 0.6388        | 53.0  | 318  | 0.6115          |
| 0.6436        | 54.0  | 324  | 0.5778          |
| 0.5993        | 55.0  | 330  | 0.5777          |
| 0.5936        | 56.0  | 336  | 0.5614          |
| 0.5873        | 57.0  | 342  | 0.5445          |
| 0.5819        | 58.0  | 348  | 0.5437          |
| 0.5523        | 59.0  | 354  | 0.4961          |
| 0.5527        | 60.0  | 360  | 0.4939          |
| 0.5622        | 61.0  | 366  | 0.4922          |
| 0.5603        | 62.0  | 372  | 0.5252          |
| 0.612         | 63.0  | 378  | 0.5024          |
| 0.6284        | 64.0  | 384  | 0.5152          |
| 0.573         | 65.0  | 390  | 0.5300          |
| 0.5407        | 66.0  | 396  | 0.4879          |
| 0.5266        | 67.0  | 402  | 0.4813          |
| 0.526         | 68.0  | 408  | 0.4341          |
| 0.5306        | 69.0  | 414  | 0.4817          |
| 0.5108        | 70.0  | 420  | 0.4127          |
| 0.5079        | 71.0  | 426  | 0.5083          |
| 0.5237        | 72.0  | 432  | 0.4423          |
| 0.5049        | 73.0  | 438  | 0.4948          |
| 0.491         | 74.0  | 444  | 0.4121          |
| 0.484         | 75.0  | 450  | 0.4047          |
| 0.4668        | 76.0  | 456  | 0.4041          |
| 0.4669        | 77.0  | 462  | 0.3987          |
| 0.4524        | 78.0  | 468  | 0.4115          |
| 0.4604        | 79.0  | 474  | 0.3926          |
| 0.4536        | 80.0  | 480  | 0.3970          |
| 0.4747        | 81.0  | 486  | 0.3674          |
| 0.4417        | 82.0  | 492  | 0.3905          |
| 0.458         | 83.0  | 498  | 0.4045          |
| 0.4393        | 84.0  | 504  | 0.3889          |
| 0.431         | 85.0  | 510  | 0.3427          |
| 0.4076        | 86.0  | 516  | 0.3621          |
| 0.4239        | 87.0  | 522  | 0.3368          |
| 0.4089        | 88.0  | 528  | 0.3353          |
| 0.3936        | 89.0  | 534  | 0.3253          |
| 0.3899        | 90.0  | 540  | 0.3173          |
| 0.3792        | 91.0  | 546  | 0.3065          |
| 0.3774        | 92.0  | 552  | 0.3060          |
| 0.3677        | 93.0  | 558  | 0.3063          |
| 0.3655        | 94.0  | 564  | 0.2909          |
| 0.3608        | 95.0  | 570  | 0.2944          |
| 0.3561        | 96.0  | 576  | 0.2842          |
| 0.3593        | 97.0  | 582  | 0.2906          |
| 0.357         | 98.0  | 588  | 0.2800          |
| 0.3558        | 99.0  | 594  | 0.2751          |
| 0.3456        | 100.0 | 600  | 0.2789          |
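Although the training data is undocumented, the step counts above bound the training-set size: with a batch size of 512 and 6 optimizer steps per epoch, the set holds at most 512 × 6 = 3072 examples. A quick sanity check, assuming standard batching where the last partial batch is kept:

```python
# Bound on the (undocumented) training-set size, inferred from the
# hyperparameters and results table above. Assumes each epoch runs
# ceil(num_examples / batch_size) optimizer steps.
TRAIN_BATCH_SIZE = 512  # train_batch_size
STEPS_PER_EPOCH = 6     # step 6 completes epoch 1 in the table

min_examples = TRAIN_BATCH_SIZE * (STEPS_PER_EPOCH - 1) + 1  # smallest set needing 6 steps
max_examples = TRAIN_BATCH_SIZE * STEPS_PER_EPOCH            # largest set fitting in 6 steps
print(min_examples, max_examples)
```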

## Framework versions

- Transformers 4.38.2
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
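To reproduce this environment, the versions above can be pinned directly. This sketch assumes the standard PyPI package names; the `+cu121` PyTorch build comes from the PyTorch CUDA 12.1 wheel index rather than PyPI:

```shell
pip install transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu121
```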