
calculator_model_test_third_version

This model is a fine-tuned version of an unspecified base model on the None dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1341

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
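With a linear scheduler and no warmup steps listed, the learning rate would decay from 0.001 at step 0 toward 0 at the final step. A minimal sketch of that decay, assuming the 600 total steps shown in the results table and no warmup (an assumption, since warmup is not stated above):

```python
def linear_lr(step, total_steps=600, base_lr=0.001):
    """Linear decay: base_lr at step 0, reaching 0 at total_steps.

    total_steps=600 is taken from the step column of the results table;
    no warmup phase is modeled, since none is listed in the hyperparameters.
    """
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # 0.001 at the start of training
print(linear_lr(300))  # 0.0005 halfway through
print(linear_lr(600))  # 0.0 at the end
```

This matches what `lr_scheduler_type: linear` does in the Transformers Trainer when `warmup_steps` is 0.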

Training results

Training Loss Epoch Step Validation Loss
3.4031 1.0 6 2.7477
2.3423 2.0 12 2.0656
1.7839 3.0 18 1.6402
1.6119 4.0 24 1.5371
1.544 5.0 30 1.4939
1.4631 6.0 36 1.4346
1.4052 7.0 42 1.3510
1.3137 8.0 48 1.2456
1.2297 9.0 54 1.2067
1.2454 10.0 60 1.2082
1.121 11.0 66 1.0614
1.0353 12.0 72 0.9978
1.0028 13.0 78 1.0926
0.993 14.0 84 0.9551
0.9267 15.0 90 0.9013
0.8728 16.0 96 0.9764
0.9072 17.0 102 0.8508
0.8457 18.0 108 0.8541
0.8278 19.0 114 0.7950
0.7903 20.0 120 0.7892
0.7726 21.0 126 0.7708
0.7789 22.0 132 0.7830
0.7515 23.0 138 0.8062
0.7691 24.0 144 0.7276
0.7203 25.0 150 0.7205
0.7119 26.0 156 0.7131
0.6776 27.0 162 0.6892
0.6926 28.0 168 0.7582
0.7128 29.0 174 0.9174
0.8055 30.0 180 0.7222
0.7423 31.0 186 0.6740
0.6712 32.0 192 0.7917
0.6965 33.0 198 0.6726
0.652 34.0 204 0.7449
0.6963 35.0 210 0.6932
0.6652 36.0 216 0.6286
0.6164 37.0 222 0.5777
0.5848 38.0 228 0.5556
0.5657 39.0 234 0.5788
0.5631 40.0 240 0.5216
0.5315 41.0 246 0.5156
0.5277 42.0 252 0.5486
0.5498 43.0 258 0.4877
0.4836 44.0 264 0.5947
0.555 45.0 270 0.4725
0.4804 46.0 276 0.4367
0.4537 47.0 282 0.4729
0.4668 48.0 288 0.3988
0.4507 49.0 294 0.4808
0.5128 50.0 300 0.4311
0.4444 51.0 306 0.4709
0.4538 52.0 312 0.3786
0.4213 53.0 318 0.3962
0.4067 54.0 324 0.3765
0.3931 55.0 330 0.4016
0.3946 56.0 336 0.3674
0.4095 57.0 342 0.3445
0.3817 58.0 348 0.3252
0.3528 59.0 354 0.3171
0.3527 60.0 360 0.3465
0.3562 61.0 366 0.3992
0.4265 62.0 372 0.3743
0.3734 63.0 378 0.3598
0.3585 64.0 384 0.3008
0.3438 65.0 390 0.2719
0.3289 66.0 396 0.2876
0.3128 67.0 402 0.2764
0.3106 68.0 408 0.2986
0.3058 69.0 414 0.2567
0.286 70.0 420 0.2762
0.2857 71.0 426 0.2732
0.2921 72.0 432 0.2728
0.3118 73.0 438 0.2352
0.2701 74.0 444 0.2204
0.2622 75.0 450 0.2114
0.2449 76.0 456 0.2262
0.2542 77.0 462 0.2446
0.259 78.0 468 0.2187
0.2852 79.0 474 0.2329
0.2587 80.0 480 0.2101
0.2491 81.0 486 0.2165
0.2291 82.0 492 0.1921
0.2286 83.0 498 0.1815
0.2095 84.0 504 0.1700
0.2256 85.0 510 0.1640
0.2088 86.0 516 0.1848
0.2087 87.0 522 0.1745
0.2025 88.0 528 0.1655
0.2003 89.0 534 0.1717
0.2007 90.0 540 0.1682
0.1862 91.0 546 0.1629
0.2005 92.0 552 0.1482
0.2003 93.0 558 0.1600
0.1876 94.0 564 0.1498
0.1929 95.0 570 0.1405
0.1772 96.0 576 0.1404
0.1797 97.0 582 0.1366
0.1734 98.0 588 0.1352
0.1686 99.0 594 0.1345
0.177 100.0 600 0.1341
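The step column above advances by 6 per epoch, so with a train batch size of 512 the training set is roughly 6 × 512 ≈ 3,072 examples (an estimate only, since the final batch of each epoch may be partial). A quick check of that arithmetic:

```python
total_steps = 600        # final step count from the table above
num_epochs = 100         # from the hyperparameters
train_batch_size = 512   # from the hyperparameters

steps_per_epoch = total_steps // num_epochs
approx_train_examples = steps_per_epoch * train_batch_size

print(steps_per_epoch)        # 6
print(approx_train_examples)  # 3072 (upper bound if the last batch is partial)
```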

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
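To reproduce this environment, the versions above can be pinned directly. A sketch (the `+cu121` PyTorch build comes from the PyTorch CUDA wheel index rather than PyPI, so plain `torch==2.1.0` is used here as an approximation):

```shell
pip install transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2 torch==2.1.0
```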
Model size: 7.8M params (F32, Safetensors)