metadata
tags:
  - generated_from_trainer
model-index:
  - name: calculator_model_test
    results: []

calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 0.5972

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
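As a rough sketch of how the values above could be reproduced, the dictionary below maps them onto `transformers.TrainingArguments` field names, and `linear_lr` shows the learning rate a linear schedule would yield at a given step. Several things here are assumptions rather than facts from this card: a single device (so the batch size of 512 is per device), zero warmup steps (none are listed), and 240 total steps, taken from the last row of the training results table. `training_kwargs` and `linear_lr` are illustrative names only.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments field names.
training_kwargs = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,  # assumption: single device
    "per_device_eval_batch_size": 512,   # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
    "fp16": True,  # "Native AMP"; needs a CUDA device at run time
}

TOTAL_STEPS = 240  # final step reported in the training results table


def linear_lr(step, base_lr=1e-3, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the rate starts at 0.001, is halved by step 120 (epoch 20), and reaches zero at step 240. The dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`, where `output_dir` is whatever path you choose.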

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.7046        | 1.0   | 6    | 3.1566          |
| 2.7296        | 2.0   | 12   | 2.2354          |
| 1.97          | 3.0   | 18   | 1.7402          |
| 1.6522        | 4.0   | 24   | 1.6050          |
| 1.5915        | 5.0   | 30   | 1.5117          |
| 1.497         | 6.0   | 36   | 1.4859          |
| 1.4994        | 7.0   | 42   | 1.4515          |
| 1.4372        | 8.0   | 48   | 1.4207          |
| 1.4099        | 9.0   | 54   | 1.3809          |
| 1.37          | 10.0  | 60   | 1.3981          |
| 1.3361        | 11.0  | 66   | 1.2905          |
| 1.2942        | 12.0  | 72   | 1.2986          |
| 1.2437        | 13.0  | 78   | 1.2145          |
| 1.18          | 14.0  | 84   | 1.1069          |
| 1.0947        | 15.0  | 90   | 1.0619          |
| 1.0435        | 16.0  | 96   | 0.9873          |
| 0.9961        | 17.0  | 102  | 0.9470          |
| 0.9408        | 18.0  | 108  | 0.9126          |
| 0.9119        | 19.0  | 114  | 0.9238          |
| 0.9158        | 20.0  | 120  | 0.8937          |
| 0.8981        | 21.0  | 126  | 0.8486          |
| 0.862         | 22.0  | 132  | 0.8756          |
| 0.8577        | 23.0  | 138  | 0.8344          |
| 0.8243        | 24.0  | 144  | 0.8168          |
| 0.8018        | 25.0  | 150  | 0.7711          |
| 0.7861        | 26.0  | 156  | 0.7986          |
| 0.7838        | 27.0  | 162  | 0.7765          |
| 0.7753        | 28.0  | 168  | 0.7504          |
| 0.7602        | 29.0  | 174  | 0.7205          |
| 0.7215        | 30.0  | 180  | 0.7216          |
| 0.7148        | 31.0  | 186  | 0.6973          |
| 0.7082        | 32.0  | 192  | 0.6753          |
| 0.7017        | 33.0  | 198  | 0.6480          |
| 0.6784        | 34.0  | 204  | 0.6394          |
| 0.6702        | 35.0  | 210  | 0.6333          |
| 0.6663        | 36.0  | 216  | 0.6221          |
| 0.6415        | 37.0  | 222  | 0.6133          |
| 0.6377        | 38.0  | 228  | 0.6065          |
| 0.6291        | 39.0  | 234  | 0.6049          |
| 0.6288        | 40.0  | 240  | 0.5972          |
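The step counts above advance by 6 per epoch, which, combined with the batch size of 512, gives a rough bound on the size of the otherwise undocumented training set. The sketch below works that out. It assumes a single device, no gradient accumulation, and that the last partial batch is kept (the Trainer default); none of these is confirmed by the card, and `dataset_size_bounds` is a hypothetical helper.

```python
def dataset_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that yield `steps_per_epoch`
    optimizer steps when the final partial batch is kept, i.e.
    steps_per_epoch == ceil(n_examples / batch_size).
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


# 6 steps per epoch (step 6 at epoch 1.0) with train_batch_size 512.
low, high = dataset_size_bounds(steps_per_epoch=6, batch_size=512)
print(f"training set size is between {low} and {high} examples")
```

Under those assumptions the training set would hold between 2561 and 3072 examples, so each reported validation loss follows only six parameter updates.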

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
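To compare a local environment against the versions listed above, a small check along these lines may help. It uses only the standard library; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here rather than stated by the card, and `check_versions` is an illustrative helper.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.38.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def check_versions(expected=EXPECTED):
    """Return {name: (wanted, found, matches)} for each package.

    `found` is None when the package is not installed.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, found {found} ({status})")
```

A mismatch does not necessarily prevent the model from loading; it only flags where the environment departs from the one recorded during training.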