calculator_model_test_second_version

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1239

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of how they map onto code follows the list:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
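
As a rough guide, these settings correspond to a transformers.TrainingArguments configuration like the sketch below. This is a minimal reconstruction, not the card author's actual training script: output_dir and evaluation_strategy are assumptions, and the Adam betas/epsilon listed above are the library defaults.

```python
from transformers import TrainingArguments

# A minimal sketch, assuming the standard Trainer API was used.
training_args = TrainingArguments(
    output_dir="calculator_model_test_second_version",  # hypothetical output path
    learning_rate=0.001,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # assumption: the results table logs one eval per epoch
)
```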

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4193 | 1.0 | 6 | 2.7303 |
| 2.3495 | 2.0 | 12 | 1.9121 |
| 1.7677 | 3.0 | 18 | 1.6227 |
| 1.6093 | 4.0 | 24 | 1.5830 |
| 1.5528 | 5.0 | 30 | 1.5309 |
| 1.5114 | 6.0 | 36 | 1.4526 |
| 1.4513 | 7.0 | 42 | 1.3988 |
| 1.4022 | 8.0 | 48 | 1.3442 |
| 1.3473 | 9.0 | 54 | 1.2809 |
| 1.2985 | 10.0 | 60 | 1.2159 |
| 1.2173 | 11.0 | 66 | 1.1870 |
| 1.1373 | 12.0 | 72 | 1.0899 |
| 1.0855 | 13.0 | 78 | 1.0270 |
| 1.04 | 14.0 | 84 | 0.9607 |
| 1.0274 | 15.0 | 90 | 0.9749 |
| 0.9975 | 16.0 | 96 | 0.9045 |
| 0.9274 | 17.0 | 102 | 0.9247 |
| 0.8963 | 18.0 | 108 | 0.8161 |
| 0.8767 | 19.0 | 114 | 0.8131 |
| 0.8764 | 20.0 | 120 | 0.9056 |
| 0.8763 | 21.0 | 126 | 0.7668 |
| 0.8097 | 22.0 | 132 | 0.8305 |
| 0.8 | 23.0 | 138 | 0.7579 |
| 0.7483 | 24.0 | 144 | 0.7418 |
| 0.8242 | 25.0 | 150 | 0.7103 |
| 0.7375 | 26.0 | 156 | 0.6743 |
| 0.7078 | 27.0 | 162 | 0.6516 |
| 0.7112 | 28.0 | 168 | 0.7178 |
| 0.7518 | 29.0 | 174 | 0.7132 |
| 0.6874 | 30.0 | 180 | 0.6438 |
| 0.6671 | 31.0 | 186 | 0.6512 |
| 0.6595 | 32.0 | 192 | 0.6338 |
| 0.6375 | 33.0 | 198 | 0.5772 |
| 0.5933 | 34.0 | 204 | 0.5397 |
| 0.5938 | 35.0 | 210 | 0.5182 |
| 0.5818 | 36.0 | 216 | 0.5315 |
| 0.6946 | 37.0 | 222 | 0.9134 |
| 0.7946 | 38.0 | 228 | 0.7031 |
| 0.7079 | 39.0 | 234 | 0.6212 |
| 0.6055 | 40.0 | 240 | 0.5024 |
| 0.5524 | 41.0 | 246 | 0.5142 |
| 0.543 | 42.0 | 252 | 0.4946 |
| 0.5265 | 43.0 | 258 | 0.4820 |
| 0.5339 | 44.0 | 264 | 0.6029 |
| 0.5624 | 45.0 | 270 | 0.5800 |
| 0.5097 | 46.0 | 276 | 0.4858 |
| 0.5059 | 47.0 | 282 | 0.4554 |
| 0.4807 | 48.0 | 288 | 0.4538 |
| 0.4824 | 49.0 | 294 | 0.4248 |
| 0.4691 | 50.0 | 300 | 0.3919 |
| 0.5413 | 51.0 | 306 | 0.5179 |
| 0.5131 | 52.0 | 312 | 0.3809 |
| 0.4312 | 53.0 | 318 | 0.3955 |
| 0.4226 | 54.0 | 324 | 0.3597 |
| 0.4059 | 55.0 | 330 | 0.3501 |
| 0.3887 | 56.0 | 336 | 0.3281 |
| 0.3784 | 57.0 | 342 | 0.3294 |
| 0.3696 | 58.0 | 348 | 0.2937 |
| 0.3694 | 59.0 | 354 | 0.3153 |
| 0.3815 | 60.0 | 360 | 0.2878 |
| 0.3575 | 61.0 | 366 | 0.3236 |
| 0.3527 | 62.0 | 372 | 0.2940 |
| 0.3481 | 63.0 | 378 | 0.2703 |
| 0.3466 | 64.0 | 384 | 0.3331 |
| 0.4037 | 65.0 | 390 | 0.3615 |
| 0.363 | 66.0 | 396 | 0.3057 |
| 0.3374 | 67.0 | 402 | 0.2810 |
| 0.3256 | 68.0 | 408 | 0.2785 |
| 0.3206 | 69.0 | 414 | 0.2553 |
| 0.306 | 70.0 | 420 | 0.2336 |
| 0.2884 | 71.0 | 426 | 0.2361 |
| 0.2892 | 72.0 | 432 | 0.2257 |
| 0.275 | 73.0 | 438 | 0.2237 |
| 0.2968 | 74.0 | 444 | 0.2405 |
| 0.2879 | 75.0 | 450 | 0.2139 |
| 0.2832 | 76.0 | 456 | 0.2139 |
| 0.2726 | 77.0 | 462 | 0.2174 |
| 0.2687 | 78.0 | 468 | 0.2037 |
| 0.2609 | 79.0 | 474 | 0.1833 |
| 0.2518 | 80.0 | 480 | 0.1836 |
| 0.253 | 81.0 | 486 | 0.1861 |
| 0.2417 | 82.0 | 492 | 0.1650 |
| 0.2279 | 83.0 | 498 | 0.1706 |
| 0.2323 | 84.0 | 504 | 0.1785 |
| 0.225 | 85.0 | 510 | 0.1694 |
| 0.2194 | 86.0 | 516 | 0.1586 |
| 0.2217 | 87.0 | 522 | 0.1575 |
| 0.2093 | 88.0 | 528 | 0.1497 |
| 0.2109 | 89.0 | 534 | 0.1562 |
| 0.2081 | 90.0 | 540 | 0.1549 |
| 0.2027 | 91.0 | 546 | 0.1419 |
| 0.1982 | 92.0 | 552 | 0.1347 |
| 0.1951 | 93.0 | 558 | 0.1355 |
| 0.1893 | 94.0 | 564 | 0.1338 |
| 0.1881 | 95.0 | 570 | 0.1336 |
| 0.1911 | 96.0 | 576 | 0.1303 |
| 0.1862 | 97.0 | 582 | 0.1289 |
| 0.1882 | 98.0 | 588 | 0.1301 |
| 0.1792 | 99.0 | 594 | 0.1250 |
| 0.176 | 100.0 | 600 | 0.1239 |
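
To see the trend at a glance, here is a minimal plotting sketch using five representative rows copied from the table above. Matplotlib is an assumption here; it is not among the framework versions listed below.

```python
import matplotlib.pyplot as plt

# Five representative (epoch, training loss, validation loss) rows
# from the table above; extend with all 100 rows if desired.
epochs = [1, 25, 50, 75, 100]
train_loss = [3.4193, 0.8242, 0.4691, 0.2879, 0.176]
val_loss = [2.7303, 0.7103, 0.3919, 0.2139, 0.1239]

plt.plot(epochs, train_loss, label="training loss")
plt.plot(epochs, val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.title("calculator_model_test_second_version loss curves")
plt.show()
```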

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
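
To reproduce the results under the same stack, a quick check of the local install against the versions above (a convenience sketch, not part of the original card):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported by this card, compared against the local install.
expected = {
    "Transformers": (transformers.__version__, "4.38.2"),
    "Pytorch": (torch.__version__, "2.1.0+cu121"),
    "Datasets": (datasets.__version__, "2.18.0"),
    "Tokenizers": (tokenizers.__version__, "0.15.2"),
}
for name, (found, wanted) in expected.items():
    status = "OK" if found == wanted else f"differs (card used {wanted})"
    print(f"{name} {found}: {status}")
```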

Model details

  • Format: Safetensors
  • Model size: 7.8M params
  • Tensor type: F32