runs

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. At the final training step (100000), it achieves the following results on the evaluation set:

  • Loss: 24.0950
  • Accuracy: 0.0013

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 48
  • seed: 444
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 48
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_ratio: 0.3
  • training_steps: 100000
  • mixed_precision_training: Native AMP
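Some of the values listed above are derived from others, and the following minimal sketch shows how they relate. It is plain Python rather than the actual training code. It assumes a single device and a single cosine cycle (neither is stated in this card), and `lr_at` is a hypothetical helper that approximates a warmup followed by a cosine schedule with hard restarts.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 3
WARMUP_RATIO = 0.3
TRAINING_STEPS = 100_000

# Assumptions, not stated in the card: one device, one cosine cycle.
NUM_DEVICES = 1
NUM_CYCLES = 1

# Effective batch size per optimizer update: 16 * 3 * 1 = 48,
# which matches total_train_batch_size in the list.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Number of optimizer steps spent on linear warmup.
warmup_steps = math.ceil(WARMUP_RATIO * TRAINING_STEPS)


def lr_at(step: int) -> float:
    """Approximate learning rate at a given optimizer step (hypothetical helper)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Cosine decay; the modulo restarts the curve once per cycle.
    cosine = 0.5 * (1.0 + math.cos(math.pi * ((NUM_CYCLES * progress) % 1.0)))
    return LEARNING_RATE * max(0.0, cosine)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 15_000, 30_000, 65_000, 100_000):
        print(step, lr_at(step))
```

Under these assumptions, the learning rate peaks at step 30000 and decays toward zero by step 100000. With one cycle, the schedule behaves like a plain cosine decay, since no restart occurs before training ends.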

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 8.2359 | 6.04 | 1000 | 8.2170 | 0.0070 |
| 7.7137 | 12.07 | 2000 | 7.7007 | 0.0064 |
| 6.5277 | 18.11 | 3000 | 6.5254 | 0.0000 |
| 6.0375 | 24.14 | 4000 | 6.0532 | 0.0000 |
| 5.6908 | 30.18 | 5000 | 5.7100 | 0.0001 |
| 5.4294 | 36.22 | 6000 | 5.4758 | 0.0002 |
| 5.2161 | 42.25 | 7000 | 5.2891 | 0.0006 |
| 5.0151 | 48.29 | 8000 | 5.1152 | 0.0021 |
| 4.8349 | 54.33 | 9000 | 4.9847 | 0.0020 |
| 4.6358 | 60.36 | 10000 | 4.8754 | 0.0022 |
| 4.4326 | 66.4 | 11000 | 4.7809 | 0.0021 |
| 4.2632 | 72.43 | 12000 | 4.7416 | 0.0017 |
| 4.0415 | 78.47 | 13000 | 4.7503 | 0.0016 |
| 3.8196 | 84.51 | 14000 | 4.8472 | 0.0014 |
| 3.6207 | 90.54 | 15000 | 5.0215 | 0.0014 |
| 3.3163 | 96.58 | 16000 | 5.2939 | 0.0014 |
| 3.0377 | 102.62 | 17000 | 5.6685 | 0.0014 |
| 2.7272 | 108.65 | 18000 | 6.1649 | 0.0013 |
| 2.4319 | 114.69 | 19000 | 6.7556 | 0.0013 |
| 2.1647 | 120.72 | 20000 | 7.3951 | 0.0013 |
| 1.9001 | 126.76 | 21000 | 8.0823 | 0.0013 |
| 1.6708 | 132.8 | 22000 | 8.8230 | 0.0013 |
| 1.4762 | 138.83 | 23000 | 9.5335 | 0.0013 |
| 1.2833 | 144.87 | 24000 | 10.1973 | 0.0013 |
| 1.1451 | 150.91 | 25000 | 10.8213 | 0.0013 |
| 1.0251 | 156.94 | 26000 | 11.4402 | 0.0013 |
| 0.9164 | 162.98 | 27000 | 11.9995 | 0.0013 |
| 0.8174 | 169.01 | 28000 | 12.5680 | 0.0013 |
| 0.6862 | 175.05 | 29000 | 13.0050 | 0.0013 |
| 0.5738 | 181.09 | 30000 | 13.4692 | 0.0013 |
| 0.4524 | 187.12 | 31000 | 13.9220 | 0.0013 |
| 0.4252 | 193.16 | 32000 | 14.3340 | 0.0013 |
| 0.3952 | 199.2 | 33000 | 14.7961 | 0.0013 |
| 0.3684 | 205.23 | 34000 | 15.2421 | 0.0013 |
| 0.3338 | 211.27 | 35000 | 15.6433 | 0.0013 |
| 0.307 | 217.3 | 36000 | 16.0182 | 0.0013 |
| 0.2951 | 223.34 | 37000 | 16.3087 | 0.0013 |
| 0.28 | 229.38 | 38000 | 16.6556 | 0.0013 |
| 0.2688 | 235.41 | 39000 | 16.9303 | 0.0013 |
| 0.2582 | 241.45 | 40000 | 17.2209 | 0.0013 |
| 0.238 | 247.48 | 41000 | 17.5311 | 0.0013 |
| 0.2261 | 253.52 | 42000 | 17.7731 | 0.0013 |
| 0.21 | 259.56 | 43000 | 18.0205 | 0.0013 |
| 0.2073 | 265.59 | 44000 | 18.2693 | 0.0013 |
| 0.1976 | 271.63 | 45000 | 18.4634 | 0.0013 |
| 0.1865 | 277.67 | 46000 | 18.7215 | 0.0012 |
| 0.1769 | 283.7 | 47000 | 18.9467 | 0.0013 |
| 0.1649 | 289.74 | 48000 | 19.1423 | 0.0013 |
| 0.1517 | 295.77 | 49000 | 19.3638 | 0.0013 |
| 0.1491 | 301.81 | 50000 | 19.5879 | 0.0013 |
| 0.1387 | 307.85 | 51000 | 19.7823 | 0.0013 |
| 0.1332 | 313.88 | 52000 | 19.9663 | 0.0013 |
| 0.1256 | 319.92 | 53000 | 20.1907 | 0.0013 |
| 0.1154 | 325.96 | 54000 | 20.3939 | 0.0013 |
| 0.1091 | 331.99 | 55000 | 20.5926 | 0.0013 |
| 0.0928 | 338.03 | 56000 | 20.8044 | 0.0013 |
| 0.0812 | 344.06 | 57000 | 20.9873 | 0.0013 |
| 0.0677 | 350.1 | 58000 | 21.1931 | 0.0013 |
| 0.0609 | 356.14 | 59000 | 21.3650 | 0.0013 |
| 0.058 | 362.17 | 60000 | 21.5868 | 0.0013 |
| 0.0532 | 368.21 | 61000 | 21.7740 | 0.0013 |
| 0.0481 | 374.25 | 62000 | 21.9339 | 0.0013 |
| 0.0358 | 380.28 | 63000 | 22.1660 | 0.0012 |
| 0.0117 | 386.32 | 64000 | 22.4226 | 0.0013 |
| 0.0768 | 392.35 | 65000 | 22.2193 | 0.0013 |
| 0.0339 | 398.39 | 66000 | 22.3833 | 0.0013 |
| 0.0191 | 404.43 | 67000 | 22.5927 | 0.0013 |
| 0.0493 | 410.46 | 68000 | 22.6069 | 0.0013 |
| 0.0115 | 416.5 | 69000 | 22.8652 | 0.0012 |
| 0.0111 | 422.54 | 70000 | 22.9982 | 0.0012 |
| 0.1182 | 428.57 | 71000 | 22.6628 | 0.0013 |
| 0.0118 | 434.61 | 72000 | 22.9036 | 0.0013 |
| 0.0111 | 440.64 | 73000 | 23.0692 | 0.0013 |
| 0.011 | 446.68 | 74000 | 23.1857 | 0.0013 |
| 0.0386 | 452.72 | 75000 | 22.9263 | 0.0013 |
| 0.0109 | 458.75 | 76000 | 23.1548 | 0.0013 |
| 0.0109 | 464.79 | 77000 | 23.2761 | 0.0012 |
| 0.0108 | 470.82 | 78000 | 23.3763 | 0.0013 |
| 0.0131 | 476.86 | 79000 | 23.2048 | 0.0013 |
| 0.0108 | 482.9 | 80000 | 23.3772 | 0.0013 |
| 0.0106 | 488.93 | 81000 | 23.4733 | 0.0013 |
| 0.0106 | 494.97 | 82000 | 23.5654 | 0.0013 |
| 0.0242 | 501.01 | 83000 | 23.5459 | 0.0013 |
| 0.0104 | 507.04 | 84000 | 23.5695 | 0.0013 |
| 0.01 | 513.08 | 85000 | 23.6659 | 0.0013 |
| 0.0098 | 519.11 | 86000 | 23.7337 | 0.0013 |
| 0.0097 | 525.15 | 87000 | 23.7961 | 0.0013 |
| 0.0097 | 531.19 | 88000 | 23.8573 | 0.0013 |
| 0.0097 | 537.22 | 89000 | 23.9052 | 0.0013 |
| 0.0097 | 543.26 | 90000 | 23.9524 | 0.0013 |
| 0.0096 | 549.3 | 91000 | 23.9823 | 0.0013 |
| 0.0096 | 555.33 | 92000 | 24.0084 | 0.0013 |
| 0.0095 | 561.37 | 93000 | 24.0364 | 0.0013 |
| 0.0095 | 567.4 | 94000 | 24.0545 | 0.0013 |
| 0.0094 | 573.44 | 95000 | 24.0701 | 0.0013 |
| 0.0094 | 579.48 | 96000 | 24.0826 | 0.0013 |
| 0.0093 | 585.51 | 97000 | 24.0898 | 0.0013 |
| 0.0093 | 591.55 | 98000 | 24.0935 | 0.0013 |
| 0.0093 | 597.59 | 99000 | 24.0944 | 0.0013 |
| 0.0092 | 603.62 | 100000 | 24.0950 | 0.0013 |
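
In the results above, validation loss reaches its minimum of 4.7416 at step 12000 and rises afterwards while training loss keeps falling, a pattern typically associated with overfitting. The sketch below shows one way to locate that point from the logged values. It uses only a hand-picked subset of rows copied from the table, and the variable names are illustrative rather than part of any training script.

```python
# (step, training_loss, validation_loss) rows copied from the results table.
# Only a subset is included here to keep the example short.
rows = [
    (1000, 8.2359, 8.2170),
    (5000, 5.6908, 5.7100),
    (10000, 4.6358, 4.8754),
    (11000, 4.4326, 4.7809),
    (12000, 4.2632, 4.7416),
    (13000, 4.0415, 4.7503),
    (14000, 3.8196, 4.8472),
    (20000, 2.1647, 7.3951),
    (50000, 0.1491, 19.5879),
    (100000, 0.0092, 24.0950),
]

# Evaluation step with the lowest validation loss.
best_step, best_train, best_val = min(rows, key=lambda row: row[2])

# Gap between validation and training loss at the end of training.
final_step, final_train, final_val = rows[-1]
final_gap = final_val - final_train

if __name__ == "__main__":
    print(f"lowest validation loss {best_val} at step {best_step}")
    print(f"validation minus training loss at step {final_step}: {final_gap:.4f}")
```

If intermediate checkpoints were saved, the one closest to step 12000 would be the natural candidate for evaluation by validation loss. Whether such checkpoints exist is not stated in this card.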

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.1
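
To check a local environment against the versions listed above, a small standard-library script along the following lines may help. `compare_versions` is a hypothetical helper, and the distribution names (`torch` for Pytorch, for example) are assumed to match the usual PyPI package names.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the "Framework versions" list.
# Keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.37.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.0",
    "tokenizers": "0.15.1",
}


def compare_versions(expected):
    """Return {package: (wanted, installed_or_None, matches)} (hypothetical helper)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            # Package is not installed in this environment.
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load. Local build suffixes such as `+cu121` in particular can differ between installations.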

Model size

  • Parameters: 41.7M
  • Tensor type: F32
  • Weights format: Safetensors