results_model5

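This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card's base-model field is empty and its dataset field reads None). It achieves the following results on the evaluation set: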

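  • Loss: 7.6601 (final evaluation, at step 890000; the lowest validation loss in the training log is 6.1985, at step 110000)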

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • num_epochs: 50
  • mixed_precision_training: Native AMP
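
To make the schedule settings above concrete, here is a minimal sketch of a linear schedule with 300 warmup steps and a peak learning rate of 0.0001. It follows the general shape of a linear warmup followed by linear decay to zero; it is not taken from this model's training code. The function name `linear_warmup_lr` and the constant `TOTAL_STEPS` are hypothetical, and the total step count is only an estimate extrapolated from the last logged row (step 890000 at epoch 49.5766) to 50 epochs.

```python
# Assumption: total steps estimated from the last logged row of the training log
# (step 890000 at epoch 49.5766), extrapolated to num_epochs = 50.
TOTAL_STEPS = round(890000 / 49.5766 * 50)


def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=300, total_steps=TOTAL_STEPS):
    """Learning rate at `step`: linear warmup to base_lr, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr over warmup_steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 150, 300, 450000, TOTAL_STEPS):
        print(s, linear_warmup_lr(s))
```

With only 300 warmup steps out of several hundred thousand, the warmup phase is a negligible fraction of training under this estimate, so the rate is effectively a straight-line decay from 0.0001 to zero.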

Training results

Training Loss  Epoch    Step    Validation Loss
7.2381         0.5570   10000   7.1552
6.9121         1.1141   20000   6.8341
6.7123         1.6711   30000   6.6405
6.4926         2.2282   40000   6.4267
6.3666         2.7852   50000   6.3720
6.2655         3.3422   60000   6.3113
6.1958         3.8993   70000   6.2503
6.0908         4.4563   80000   6.2392
6.0127         5.0134   90000   6.2291
6.0189         5.5704   100000  6.2311
5.9728         6.1275   110000  6.1985
5.9594         6.6845   120000  6.2354
5.873          7.2415   130000  6.2018
5.8756         7.7986   140000  6.2386
5.7656         8.3556   150000  6.2222
5.813          8.9127   160000  6.2550
5.8192         9.4697   170000  6.2894
5.7091         10.0267  180000  6.3533
5.703          10.5838  190000  6.3356
5.6674         11.1408  200000  6.4957
5.6302         11.6979  210000  6.4069
5.6517         12.2549  220000  6.5459
5.6388         12.8119  230000  6.5383
5.5918         13.3690  240000  6.5509
5.622          13.9260  250000  6.4959
5.5546         14.4831  260000  6.5783
5.5205         15.0401  270000  6.5076
5.5209         15.5971  280000  6.5128
5.4692         16.1542  290000  6.3658
5.4871         16.7112  300000  6.3450
5.444          17.2683  310000  6.3261
5.4781         17.8253  320000  6.3246
5.4131         18.3824  330000  6.3904
5.4128         18.9394  340000  6.5145
5.4063         19.4964  350000  6.4409
5.3473         20.0535  360000  6.6570
5.4103         20.6105  370000  6.5708
5.3782         21.1676  380000  6.7661
5.4002         21.7246  390000  6.7968
5.3759         22.2816  400000  6.7145
5.3636         22.8387  410000  6.8896
5.3629         23.3957  420000  6.7899
5.3251         23.9528  430000  6.7925
5.3415         24.5098  440000  6.5798
5.3247         25.0668  450000  6.7255
5.3172         25.6239  460000  6.7998
5.2915         26.1809  470000  6.9089
5.2899         26.7380  480000  6.7261
5.3112         27.2950  490000  6.7184
5.3173         27.8520  500000  6.8470
5.325          28.4091  510000  6.9112
5.2632         28.9661  520000  6.7319
5.2486         29.5232  530000  6.9459
5.2513         30.0802  540000  6.9476
5.2666         30.6373  550000  7.1228
5.2209         31.1943  560000  7.1333
5.2951         31.7513  570000  7.0138
5.2281         32.3084  580000  7.1338
5.275          32.8654  590000  7.0661
5.2248         33.4225  600000  7.1180
5.243          33.9795  610000  7.2631
5.1808         34.5365  620000  7.2399
5.2124         35.0936  630000  7.3326
5.2298         35.6506  640000  7.3016
5.1622         36.2077  650000  7.2848
5.1914         36.7647  660000  7.2105
5.2101         37.3217  670000  7.3469
5.2145         37.8788  680000  7.2929
5.1965         38.4358  690000  7.4581
5.1829         38.9929  700000  7.3079
5.1948         39.5499  710000  7.4294
5.1887         40.1070  720000  7.4563
5.1636         40.6640  730000  7.3479
5.1674         41.2210  740000  7.4878
5.2115         41.7781  750000  7.5378
5.1818         42.3351  760000  7.6372
5.1997         42.8922  770000  7.6155
5.1652         43.4492  780000  7.5538
5.1446         44.0062  790000  7.5399
5.1693         44.5633  800000  7.6295
5.1336         45.1203  810000  7.6689
5.1358         45.6774  820000  7.5853
5.1233         46.2344  830000  7.6833
5.1395         46.7914  840000  7.6448
5.125          47.3485  850000  7.6463
5.161          47.9055  860000  7.6284
5.1301         48.4626  870000  7.6313
5.1448         49.0196  880000  7.6512
5.1284         49.5766  890000  7.6601
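
In the log above, training loss keeps falling while validation loss bottoms out early and then climbs, a pattern commonly associated with overfitting. The following is a minimal sketch of how one might locate the best-scoring evaluation step from such a log. It uses only a handful of rows copied from the table; the names `LOG`, `parse_log`, `best`, and `last` are hypothetical and not part of the original training setup.

```python
# A few rows copied from the training log above, in the same column order:
# training loss, epoch, step, validation loss.
LOG = """\
7.2381 0.5570 10000 7.1552
6.0127 5.0134 90000 6.2291
5.9728 6.1275 110000 6.1985
5.873 7.2415 130000 6.2018
5.4692 16.1542 290000 6.3658
5.2513 30.0802 540000 6.9476
5.1284 49.5766 890000 7.6601
"""


def parse_log(text):
    """Parse whitespace-separated log rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        train_loss, epoch, step, val_loss = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


rows = parse_log(LOG)
best = min(rows, key=lambda r: r["val_loss"])  # row with the lowest validation loss
last = rows[-1]                                # final logged row

print(f"best step: {best['step']} (epoch {best['epoch']}, val loss {best['val_loss']})")
print(f"final step: {last['step']} (val loss {last['val_loss']})")
print(f"gap between final and best val loss: {last['val_loss'] - best['val_loss']:.4f}")
```

If intermediate checkpoints were saved, the one nearest the best step would likely be a better candidate for evaluation than the final weights; whether such checkpoints exist for this model is not stated in the card.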

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size

  • Format: Safetensors
  • Parameters: 62.8M
  • Tensor type: F32