---
tags:
  - generated_from_trainer
model-index:
  - name: calculator_model_test
    results: []
---

calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7800
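If the reported loss is a mean token-level cross-entropy (an assumption; the card does not state which loss the trainer used), it corresponds to a perplexity of roughly exp(0.78) ≈ 2.18:

```python
import math

# Convert the reported evaluation loss to perplexity, assuming the
# loss is a mean token-level cross-entropy (not stated in the card).
eval_loss = 0.7800
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # → 2.18
```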

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
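For reference, the hyperparameters above map onto a transformers TrainingArguments configuration along these lines. This is a sketch: the original training script is not part of the card, so the argument names (taken from the transformers API) and the assumption that Native AMP corresponds to fp16=True are not confirmed by the source.

```python
# Hyperparameters from the card, expressed as keyword arguments in the
# style of transformers.TrainingArguments (assumed mapping; the actual
# training script is not included with this model card).
training_args = dict(
    learning_rate=2e-5,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,  # "Native AMP" mixed-precision training (assumed)
)
print(training_args["num_train_epochs"])  # → 100
```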

Training results

Training Loss   Epoch   Step   Validation Loss
No log          1.0     6      6.5672
No log          2.0     12     5.9069
No log          3.0     18     5.4671
No log          4.0     24     5.0855
No log          5.0     30     4.7085
No log          6.0     36     4.3935
No log          7.0     42     4.1354
No log          8.0     48     3.9330
No log          9.0     54     3.7647
No log          10.0    60     3.6286
No log          11.0    66     3.5173
No log          12.0    72     3.4148
No log          13.0    78     3.3148
No log          14.0    84     3.2218
No log          15.0    90     3.1353
No log          16.0    96     3.0524
No log          17.0    102    2.9751
No log          18.0    108    2.9017
No log          19.0    114    2.8277
No log          20.0    120    2.7534
No log          21.0    126    2.6865
No log          22.0    132    2.6260
No log          23.0    138    2.5597
No log          24.0    144    2.4913
No log          25.0    150    2.4295
No log          26.0    156    2.3668
No log          27.0    162    2.3040
No log          28.0    168    2.2391
No log          29.0    174    2.1745
No log          30.0    180    2.1036
No log          31.0    186    2.0443
No log          32.0    192    1.9919
No log          33.0    198    1.9321
No log          34.0    204    1.8771
No log          35.0    210    1.8386
No log          36.0    216    1.7951
No log          37.0    222    1.7460
No log          38.0    228    1.6974
No log          39.0    234    1.6576
No log          40.0    240    1.6112
No log          41.0    246    1.5811
No log          42.0    252    1.5540
No log          43.0    258    1.5268
No log          44.0    264    1.4873
No log          45.0    270    1.4500
No log          46.0    276    1.4161
No log          47.0    282    1.3738
No log          48.0    288    1.3495
No log          49.0    294    1.3182
No log          50.0    300    1.2899
No log          51.0    306    1.2610
No log          52.0    312    1.2478
No log          53.0    318    1.2238
No log          54.0    324    1.2060
No log          55.0    330    1.1794
No log          56.0    336    1.1774
No log          57.0    342    1.1425
No log          58.0    348    1.1166
No log          59.0    354    1.1044
No log          60.0    360    1.0913
No log          61.0    366    1.0775
No log          62.0    372    1.0694
No log          63.0    378    1.0311
No log          64.0    384    1.0272
No log          65.0    390    1.0249
No log          66.0    396    0.9923
No log          67.0    402    0.9892
No log          68.0    408    0.9762
No log          69.0    414    0.9704
No log          70.0    420    0.9405
No log          71.0    426    0.9394
No log          72.0    432    0.9237
No log          73.0    438    0.9180
No log          74.0    444    0.8926
No log          75.0    450    0.9081
No log          76.0    456    0.8778
No log          77.0    462    0.8785
No log          78.0    468    0.8580
No log          79.0    474    0.8593
No log          80.0    480    0.8553
No log          81.0    486    0.8671
No log          82.0    492    0.8422
No log          83.0    498    0.8403
2.0956          84.0    504    0.8355
2.0956          85.0    510    0.8188
2.0956          86.0    516    0.8149
2.0956          87.0    522    0.8285
2.0956          88.0    528    0.8063
2.0956          89.0    534    0.8166
2.0956          90.0    540    0.8008
2.0956          91.0    546    0.8127
2.0956          92.0    552    0.7921
2.0956          93.0    558    0.8015
2.0956          94.0    564    0.7882
2.0956          95.0    570    0.7844
2.0956          96.0    576    0.7862
2.0956          97.0    582    0.7810
2.0956          98.0    588    0.7808
2.0956          99.0    594    0.7810
2.0956          100.0   600    0.7800
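Two details of this log can be read off directly. First, 600 optimizer steps over 100 epochs is 6 steps per epoch, which with a batch size of 512 bounds the training set at roughly 6 × 512 = 3,072 examples (an upper bound, and it assumes no gradient accumulation, which the card does not mention). Second, the training-loss column shows "No log" until step 504 because the transformers Trainer only records training loss every logging_steps updates (500 by default), so the first logged value, 2.0956, is an average over the first 500 steps.

```python
# Back out the approximate training-set size from the log above,
# assuming one optimizer step per batch (no gradient accumulation --
# an assumption, since the card does not state it).
total_steps = 600
num_epochs = 100
train_batch_size = 512

steps_per_epoch = total_steps // num_epochs              # 6
max_train_examples = steps_per_epoch * train_batch_size  # 3072
print(steps_per_epoch, max_train_examples)  # → 6 3072
```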

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2