---
base_model: Kielak2/calculator_model_test
tags:
  - generated_from_trainer
model-index:
  - name: calculator_model_test
    results: []
---

calculator_model_test

This model is a fine-tuned version of Kielak2/calculator_model_test on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2680
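
The card does not document the intended task or architecture, but the checkpoint can presumably be loaded with the standard Transformers Auto classes. The sketch below is a hypothetical usage example, not documented behavior: the seq2seq model class and the arithmetic-style prompt are assumptions.

```python
# Hypothetical usage sketch. The card does not state the task or architecture,
# so AutoModelForSeq2SeqLM and the arithmetic-style prompt below are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Kielak2/calculator_model_test"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encode a toy arithmetic prompt and decode the generated answer.
inputs = tokenizer("2+2", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```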

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200
  • mixed_precision_training: Native AMP
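
As a rough guide, the hyperparameters above would correspond to a Trainer configuration along the following lines. This is a hedged sketch: only the numeric values come from this card, while the output directory and the evaluation/logging strategies are assumptions.

```python
# Hedged reconstruction of a Trainer configuration from the hyperparameters above.
# Only the listed values come from this card; output_dir and the evaluation/logging
# strategies are placeholders. Adam betas=(0.9, 0.999) and epsilon=1e-08 are the
# Transformers defaults, so they need no explicit arguments here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="calculator_model_test",   # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    fp16=True,                            # Native AMP mixed precision
    evaluation_strategy="epoch",          # assumption: one eval per epoch, as in the table below
    logging_strategy="epoch",
)
```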

Training results

Training Loss Epoch Step Validation Loss
0.5001 1.0 1 3.3999
3.3159 2.0 2 2.5866
2.3974 3.0 3 2.3937
2.0971 4.0 4 2.2596
2.1181 5.0 5 2.1882
2.0458 6.0 6 1.9196
1.8142 7.0 7 1.6198
1.5353 8.0 8 1.3971
1.3998 9.0 9 1.2959
1.2674 10.0 10 1.2443
1.2303 11.0 11 1.2443
1.242 12.0 12 1.1784
1.1823 13.0 13 1.1184
1.1237 14.0 14 1.1045
1.091 15.0 15 1.0504
1.0599 16.0 16 1.0647
1.0767 17.0 17 1.0645
1.0676 18.0 18 1.0431
1.035 19.0 19 0.9963
0.9796 20.0 20 1.0182
1.0002 21.0 21 1.0211
1.0016 22.0 22 1.0027
0.9601 23.0 23 0.9614
0.9323 24.0 24 0.9258
0.8823 25.0 25 0.9227
0.8984 26.0 26 0.9107
0.9054 27.0 27 0.8881
0.9126 28.0 28 0.9379
0.9166 29.0 29 0.9273
0.9053 30.0 30 0.8998
0.8671 31.0 31 0.8555
0.829 32.0 32 0.8551
0.8544 33.0 33 0.8491
0.8128 34.0 34 0.8184
0.7961 35.0 35 0.8312
0.7854 36.0 36 0.8337
0.808 37.0 37 0.8184
0.8211 38.0 38 0.8191
0.7993 39.0 39 0.7743
0.7789 40.0 40 0.7454
0.7924 41.0 41 0.7314
0.7243 42.0 42 0.8436
0.7537 43.0 43 0.8050
0.7622 44.0 44 0.7724
0.7694 45.0 45 0.7963
0.7819 46.0 46 0.7872
0.739 47.0 47 0.8100
0.7456 48.0 48 0.7989
0.7214 49.0 49 0.7234
0.6545 50.0 50 0.6993
0.6834 51.0 51 0.6556
0.6664 52.0 52 0.6544
0.6141 53.0 53 0.6489
0.5929 54.0 54 0.6268
0.566 55.0 55 0.6311
0.6577 56.0 56 0.5828
0.598 57.0 57 0.6526
0.6056 58.0 58 0.7250
0.6204 59.0 59 0.6612
0.5968 60.0 60 0.5759
0.5823 61.0 61 0.5836
0.5986 62.0 62 0.5375
0.5247 63.0 63 0.5993
0.5891 64.0 64 0.6175
0.6142 65.0 65 0.5691
0.5602 66.0 66 0.5180
0.5017 67.0 67 0.5726
0.5304 68.0 68 0.5362
0.4935 69.0 69 0.5311
0.5167 70.0 70 0.5698
0.526 71.0 71 0.5837
0.5538 72.0 72 0.5436
0.4825 73.0 73 0.5253
0.4596 74.0 74 0.5132
0.4722 75.0 75 0.4970
0.4662 76.0 76 0.4983
0.4991 77.0 77 0.4886
0.4613 78.0 78 0.4791
0.4589 79.0 79 0.4654
0.4617 80.0 80 0.4532
0.4491 81.0 81 0.4617
0.4471 82.0 82 0.4416
0.4216 83.0 83 0.4841
0.4516 84.0 84 0.4817
0.4372 85.0 85 0.4570
0.4385 86.0 86 0.4801
0.4546 87.0 87 0.4929
0.4381 88.0 88 0.4646
0.4314 89.0 89 0.4338
0.3989 90.0 90 0.4458
0.4442 91.0 91 0.4365
0.4316 92.0 92 0.4116
0.4012 93.0 93 0.4157
0.4116 94.0 94 0.4185
0.4101 95.0 95 0.4026
0.3975 96.0 96 0.4144
0.3985 97.0 97 0.4438
0.424 98.0 98 0.4383
0.3901 99.0 99 0.4320
0.4301 100.0 100 0.4259
0.428 101.0 101 0.3934
0.3836 102.0 102 0.4109
0.3887 103.0 103 0.4203
0.423 104.0 104 0.3942
0.3722 105.0 105 0.3682
0.3909 106.0 106 0.3681
0.3776 107.0 107 0.3945
0.392 108.0 108 0.3728
0.3536 109.0 109 0.3862
0.4197 110.0 110 0.4024
0.3988 111.0 111 0.3919
0.4064 112.0 112 0.4617
0.4446 113.0 113 0.5055
0.4482 114.0 114 0.4476
0.3832 115.0 115 0.3900
0.3675 116.0 116 0.4018
0.3782 117.0 117 0.3880
0.352 118.0 118 0.3790
0.3458 119.0 119 0.3794
0.3427 120.0 120 0.3671
0.3223 121.0 121 0.3703
0.3161 122.0 122 0.3888
0.3211 123.0 123 0.4134
0.3247 124.0 124 0.4017
0.333 125.0 125 0.3822
0.3227 126.0 126 0.3792
0.3264 127.0 127 0.3783
0.3284 128.0 128 0.3735
0.3199 129.0 129 0.3614
0.3344 130.0 130 0.3755
0.3148 131.0 131 0.3901
0.3592 132.0 132 0.3819
0.3358 133.0 133 0.3764
0.3047 134.0 134 0.3779
0.3538 135.0 135 0.3580
0.3257 136.0 136 0.3419
0.2865 137.0 137 0.3402
0.3037 138.0 138 0.3470
0.3098 139.0 139 0.3432
0.3087 140.0 140 0.3354
0.2926 141.0 141 0.3469
0.2811 142.0 142 0.3526
0.3072 143.0 143 0.3465
0.3092 144.0 144 0.3487
0.3048 145.0 145 0.3465
0.2961 146.0 146 0.3384
0.3149 147.0 147 0.3383
0.3147 148.0 148 0.3326
0.2927 149.0 149 0.3306
0.2765 150.0 150 0.3331
0.2755 151.0 151 0.3255
0.304 152.0 152 0.3027
0.3011 153.0 153 0.3018
0.2842 154.0 154 0.3092
0.2936 155.0 155 0.3037
0.2852 156.0 156 0.3044
0.2726 157.0 157 0.3143
0.2577 158.0 158 0.3330
0.2904 159.0 159 0.3436
0.2619 160.0 160 0.3452
0.276 161.0 161 0.3475
0.2608 162.0 162 0.3454
0.2529 163.0 163 0.3336
0.2685 164.0 164 0.3183
0.2571 165.0 165 0.3048
0.2641 166.0 166 0.2957
0.2549 167.0 167 0.2926
0.243 168.0 168 0.2904
0.2574 169.0 169 0.2917
0.2597 170.0 170 0.2987
0.2512 171.0 171 0.2979
0.247 172.0 172 0.2906
0.2485 173.0 173 0.2851
0.2512 174.0 174 0.2869
0.2481 175.0 175 0.2838
0.268 176.0 176 0.2866
0.2477 177.0 177 0.2902
0.2498 178.0 178 0.2963
0.2535 179.0 179 0.2963
0.2658 180.0 180 0.2939
0.2506 181.0 181 0.2943
0.251 182.0 182 0.2894
0.2491 183.0 183 0.2818
0.2484 184.0 184 0.2767
0.2373 185.0 185 0.2740
0.2481 186.0 186 0.2718
0.2438 187.0 187 0.2690
0.2168 188.0 188 0.2658
0.237 189.0 189 0.2639
0.2505 190.0 190 0.2625
0.2448 191.0 191 0.2622
0.2366 192.0 192 0.2639
0.2394 193.0 193 0.2681
0.2537 194.0 194 0.2727
0.2259 195.0 195 0.2753
0.2314 196.0 196 0.2750
0.2398 197.0 197 0.2730
0.2515 198.0 198 0.2707
0.2244 199.0 199 0.2690
0.2458 200.0 200 0.2680

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
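
To approximate this environment, the listed versions could be pinned as below (an illustrative requirements file; the `+cu121` PyTorch build would additionally need the PyTorch CUDA 12.1 wheel index rather than plain PyPI).

```
transformers==4.38.1
torch==2.1.0
datasets==2.18.0
tokenizers==0.15.2
```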