
calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6527

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
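As a hedged illustration only, the hyperparameters above correspond roughly to the following transformers TrainingArguments. The output directory is a placeholder, the reported batch size of 512 is assumed to be per device (single GPU), and the per-epoch evaluation/logging strategies are inferred from the results table below.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters listed above.
# Assumptions: a single device (so the per-device batch size equals 512)
# and a placeholder output directory.
training_args = TrainingArguments(
    output_dir="calculator_model_test",   # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    fp16=True,                            # "Native AMP" mixed precision
    evaluation_strategy="epoch",          # one evaluation per epoch, as in the results table
    logging_strategy="epoch",
)
```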

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.6778        | 1.0   | 6    | 3.1191          |
| 2.7094        | 2.0   | 12   | 2.1966          |
| 1.9597        | 3.0   | 18   | 1.7446          |
| 1.7331        | 4.0   | 24   | 1.6583          |
| 1.5938        | 5.0   | 30   | 1.6425          |
| 1.5338        | 6.0   | 36   | 1.5592          |
| 1.5077        | 7.0   | 42   | 1.5055          |
| 1.4896        | 8.0   | 48   | 1.4880          |
| 1.4419        | 9.0   | 54   | 1.4727          |
| 1.4062        | 10.0  | 60   | 1.3960          |
| 1.3459        | 11.0  | 66   | 1.3129          |
| 1.2961        | 12.0  | 72   | 1.3037          |
| 1.2268        | 13.0  | 78   | 1.2964          |
| 1.2251        | 14.0  | 84   | 1.1677          |
| 1.1559        | 15.0  | 90   | 1.1312          |
| 1.1157        | 16.0  | 96   | 1.1714          |
| 1.1385        | 17.0  | 102  | 1.1348          |
| 1.0996        | 18.0  | 108  | 1.1113          |
| 1.0407        | 19.0  | 114  | 0.9871          |
| 0.9734        | 20.0  | 120  | 0.9324          |
| 0.9512        | 21.0  | 126  | 0.9743          |
| 0.951         | 22.0  | 132  | 0.9441          |
| 0.917         | 23.0  | 138  | 0.8909          |
| 0.8726        | 24.0  | 144  | 0.9193          |
| 0.8937        | 25.0  | 150  | 0.8686          |
| 0.8351        | 26.0  | 156  | 0.8182          |
| 0.8397        | 27.0  | 162  | 0.7957          |
| 0.8148        | 28.0  | 168  | 0.7851          |
| 0.7866        | 29.0  | 174  | 0.7707          |
| 0.7579        | 30.0  | 180  | 0.7610          |
| 0.7516        | 31.0  | 186  | 0.7259          |
| 0.734         | 32.0  | 192  | 0.7193          |
| 0.7375        | 33.0  | 198  | 0.7392          |
| 0.7284        | 34.0  | 204  | 0.7019          |
| 0.7283        | 35.0  | 210  | 0.6881          |
| 0.6968        | 36.0  | 216  | 0.6745          |
| 0.69          | 37.0  | 222  | 0.6672          |
| 0.6877        | 38.0  | 228  | 0.6606          |
| 0.6741        | 39.0  | 234  | 0.6575          |
| 0.6741        | 40.0  | 240  | 0.6527          |
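
The loss curve can be inspected directly from the numbers above; the following self-contained sketch simply re-plots the per-epoch validation losses reported in the table (matplotlib is assumed to be available).

```python
import matplotlib.pyplot as plt

# Validation loss per epoch, copied from the table above.
val_loss = [
    3.1191, 2.1966, 1.7446, 1.6583, 1.6425, 1.5592, 1.5055, 1.4880,
    1.4727, 1.3960, 1.3129, 1.3037, 1.2964, 1.1677, 1.1312, 1.1714,
    1.1348, 1.1113, 0.9871, 0.9324, 0.9743, 0.9441, 0.8909, 0.9193,
    0.8686, 0.8182, 0.7957, 0.7851, 0.7707, 0.7610, 0.7259, 0.7193,
    0.7392, 0.7019, 0.6881, 0.6745, 0.6672, 0.6606, 0.6575, 0.6527,
]
epochs = range(1, len(val_loss) + 1)

plt.plot(epochs, val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("calculator_model_test validation loss per epoch")
plt.show()
```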

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
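
The snippet below is a hedged loading sketch against the versions listed above; the Hub repository ID, the sequence-to-sequence architecture, and the input format are assumptions, not details confirmed by this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumptions: the checkpoint is pushed to the Hub under a repo ID like the
# one below, it is loadable as a seq2seq model, and it expects plain
# arithmetic strings as input. Replace these with the actual details.
repo_id = "your-username/calculator_model_test"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

inputs = tokenizer("12+7", return_tensors="pt")   # hypothetical input format
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```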