---
library_name: transformers
tags:
  - generated_from_trainer
model-index:
  - name: calculator_model_test
    results: []
---

# calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.7120

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 512
- eval_batch_size: 512
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 40
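The linear scheduler decays the learning rate from its initial value to zero over the whole run. The training log shows 6 optimizer steps per epoch over 40 epochs, i.e. 240 steps in total. A minimal sketch of the resulting schedule (assuming no warmup, since no warmup steps are listed):

```python
# Sketch of a warmup-free linear learning-rate schedule.
# TOTAL_STEPS is inferred from the training log: 6 steps/epoch * 40 epochs.
TOTAL_STEPS = 240
INITIAL_LR = 1e-3


def linear_lr(step: int, initial_lr: float = INITIAL_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps, decaying linearly to 0."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return initial_lr * remaining


# Starts at 1e-3, halves by step 120, and reaches 0 at step 240.
```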

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3426 | 1.0 | 6 | 2.7993 |
| 2.4209 | 2.0 | 12 | 1.9973 |
| 1.8734 | 3.0 | 18 | 1.7473 |
| 1.6692 | 4.0 | 24 | 1.6342 |
| 1.5799 | 5.0 | 30 | 1.6449 |
| 1.5671 | 6.0 | 36 | 1.5829 |
| 1.5089 | 7.0 | 42 | 1.5246 |
| 1.4829 | 8.0 | 48 | 1.5022 |
| 1.4327 | 9.0 | 54 | 1.4243 |
| 1.4399 | 10.0 | 60 | 1.3937 |
| 1.3700 | 11.0 | 66 | 1.3579 |
| 1.3430 | 12.0 | 72 | 1.2976 |
| 1.2724 | 13.0 | 78 | 1.2658 |
| 1.2386 | 14.0 | 84 | 1.1594 |
| 1.2057 | 15.0 | 90 | 1.2266 |
| 1.2069 | 16.0 | 96 | 1.3501 |
| 1.2408 | 17.0 | 102 | 1.1047 |
| 1.1625 | 18.0 | 108 | 1.1621 |
| 1.1029 | 19.0 | 114 | 1.1712 |
| 1.1209 | 20.0 | 120 | 1.0636 |
| 1.0304 | 21.0 | 126 | 0.9785 |
| 0.9679 | 22.0 | 132 | 0.9535 |
| 0.9591 | 23.0 | 138 | 0.8968 |
| 0.9017 | 24.0 | 144 | 0.8817 |
| 0.8773 | 25.0 | 150 | 0.9545 |
| 0.9173 | 26.0 | 156 | 1.0227 |
| 0.9503 | 27.0 | 162 | 0.8290 |
| 0.8785 | 28.0 | 168 | 0.8701 |
| 0.8594 | 29.0 | 174 | 0.8212 |
| 0.8462 | 30.0 | 180 | 0.8228 |
| 0.8191 | 31.0 | 186 | 0.8144 |
| 0.8301 | 32.0 | 192 | 0.7736 |
| 0.7794 | 33.0 | 198 | 0.7820 |
| 0.7795 | 34.0 | 204 | 0.7523 |
| 0.7806 | 35.0 | 210 | 0.7386 |
| 0.7463 | 36.0 | 216 | 0.7327 |
| 0.7594 | 37.0 | 222 | 0.7222 |
| 0.7774 | 38.0 | 228 | 0.7165 |
| 0.7488 | 39.0 | 234 | 0.7132 |
| 0.7370 | 40.0 | 240 | 0.7120 |
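Assuming the reported loss is a mean per-token cross-entropy in nats, the final validation loss corresponds to a perplexity of roughly 2.04:

```python
import math

# Perplexity = exp(cross-entropy loss), assuming the validation loss above
# is a mean per-token cross-entropy in nats.
final_val_loss = 0.7120
perplexity = math.exp(final_val_loss)  # ≈ 2.04
```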

### Framework versions

- Transformers 5.0.0
- Pytorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2