metadata
tags:
  - generated_from_trainer
model-index:
  - name: calculator_model_test
    results: []

calculator_model_test

This model is a fine-tuned version of an unspecified base model (the name is missing from this card), trained on an unspecified dataset (recorded as None). It achieves the following results on the evaluation set:

  • Loss: 0.8846

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
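The linear scheduler listed above can be sketched in plain Python. This is a minimal illustration, not the training code: it assumes zero warmup steps (the card lists none) and takes the total of 560 optimizer steps from the training results table. The names `linear_lr`, `BASE_LR`, `TOTAL_STEPS`, and `WARMUP_STEPS` are hypothetical.

```python
BASE_LR = 1e-3       # learning_rate from the card
TOTAL_STEPS = 560    # final step in the results table (40 epochs x 14 steps)
WARMUP_STEPS = 0     # assumption: the card does not mention warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 280, 560):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.001, is halved by step 280 (epoch 20), and reaches zero at step 560.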

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.7375 | 1.0 | 14 | 2.8446 |
| 2.5309 | 2.0 | 28 | 2.3889 |
| 2.3406 | 3.0 | 42 | 2.3073 |
| 2.2691 | 4.0 | 56 | 2.2098 |
| 2.1412 | 5.0 | 70 | 2.0464 |
| 1.9372 | 6.0 | 84 | 1.7744 |
| 1.6761 | 7.0 | 98 | 1.5399 |
| 1.4725 | 8.0 | 112 | 1.3886 |
| 1.368 | 9.0 | 126 | 1.3246 |
| 1.33 | 10.0 | 140 | 1.3355 |
| 1.3119 | 11.0 | 154 | 1.2886 |
| 1.2836 | 12.0 | 168 | 1.2712 |
| 1.2668 | 13.0 | 182 | 1.2703 |
| 1.2526 | 14.0 | 196 | 1.2477 |
| 1.2292 | 15.0 | 210 | 1.2339 |
| 1.203 | 16.0 | 224 | 1.1997 |
| 1.1686 | 17.0 | 238 | 1.1764 |
| 1.1308 | 18.0 | 252 | 1.1424 |
| 1.0866 | 19.0 | 266 | 1.1034 |
| 1.0355 | 20.0 | 280 | 1.0546 |
| 1.0031 | 21.0 | 294 | 1.0241 |
| 0.9608 | 22.0 | 308 | 0.9925 |
| 0.924 | 23.0 | 322 | 0.9673 |
| 0.9022 | 24.0 | 336 | 0.9555 |
| 0.8733 | 25.0 | 350 | 0.9381 |
| 0.8549 | 26.0 | 364 | 0.9394 |
| 0.8363 | 27.0 | 378 | 0.9274 |
| 0.8129 | 28.0 | 392 | 0.9211 |
| 0.7894 | 29.0 | 406 | 0.9149 |
| 0.7705 | 30.0 | 420 | 0.9042 |
| 0.7509 | 31.0 | 434 | 0.8962 |
| 0.7363 | 32.0 | 448 | 0.9003 |
| 0.7261 | 33.0 | 462 | 0.8935 |
| 0.7135 | 34.0 | 476 | 0.8923 |
| 0.6988 | 35.0 | 490 | 0.8961 |
| 0.6883 | 36.0 | 504 | 0.8883 |
| 0.6768 | 37.0 | 518 | 0.8905 |
| 0.6686 | 38.0 | 532 | 0.8885 |
| 0.6625 | 39.0 | 546 | 0.8865 |
| 0.6566 | 40.0 | 560 | 0.8846 |
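Two quantities can be read off these results with simple arithmetic. The sketch below is illustrative only. It assumes one optimizer step per batch on a single device with the partial last batch kept, none of which the card confirms. The helper name `example_count_bounds` is hypothetical.

```python
TRAIN_BATCH_SIZE = 512   # train_batch_size from the card
STEPS_PER_EPOCH = 14     # step count after epoch 1.0 in the results table


def example_count_bounds(steps_per_epoch, batch_size):
    """Range of training-set sizes consistent with the observed step count.

    Assumes one optimizer step per batch and that a partial final batch
    still counts as a step.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


# Final-epoch losses copied from the last row of the table.
FINAL_TRAIN_LOSS = 0.6566
FINAL_VAL_LOSS = 0.8846
final_gap = round(FINAL_VAL_LOSS - FINAL_TRAIN_LOSS, 4)

if __name__ == "__main__":
    print(example_count_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE))
    print(final_gap)
```

Under those assumptions the training set holds between 6,657 and 7,168 examples. The gap between validation and training loss at epoch 40 is about 0.228, and validation loss is still at its lowest value in the final epoch.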

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
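To compare a local environment against the versions listed above, a small check built on the standard library's `importlib.metadata` may help. This is a sketch: the function name `compare_versions` and the `EXPECTED` mapping are hypothetical, and `torch` is used as the distribution name under which PyTorch is installed.

```python
from importlib import metadata

# Versions copied from the card; keys are the installed distribution names.
EXPECTED = {
    "transformers": "4.38.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected):
    """Map each package to (expected, installed or None, exact match?)."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        report[name] = (want, have, have == want)
    return report


if __name__ == "__main__":
    for name, (want, have, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {want}, found {have} ({status})")
```

An exact string match is deliberately strict. A different CUDA build suffix on PyTorch, for example, is reported as a difference even when the release number matches.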