100M__8397

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.4643
  • Accuracy: 0.3759
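
The card does not name the task or base architecture, but the token-level accuracy metric and cross-entropy-scale loss are consistent with a causal language model. A minimal usage sketch, assuming a hypothetical checkpoint location ("path/to/100M__8397" is a placeholder, not a real repo id):

```python
# Minimal usage sketch. The base architecture and tokenizer are not named in
# this card, so the checkpoint path below is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/100M__8397"  # hypothetical local path or Hub id
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```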

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 8397
  • optimizer: adamw_torch with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50
  • mixed_precision_training: Native AMP
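
These settings map directly onto Transformers' TrainingArguments. A minimal sketch reproducing them (the output directory is a placeholder, and fp16=True is an assumption standing in for "Native AMP"):

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# output_dir is hypothetical; the card does not name the training script or data.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="100M__8397",           # hypothetical output directory
    learning_rate=6e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=8397,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50,
    fp16=True,                         # assumed equivalent of "Native AMP"
)
```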

Training results

| Training Loss | Epoch   | Step  | Accuracy | Validation Loss |
|:-------------:|:-------:|:-----:|:--------:|:---------------:|
| 5.0934        | 0.1078  | 1000  | 0.2275   | 5.0225          |
| 4.5937        | 0.2156  | 2000  | 0.2702   | 4.5165          |
| 4.3092        | 0.3235  | 3000  | 0.2978   | 4.2461          |
| 4.1586        | 0.4313  | 4000  | 0.3127   | 4.0885          |
| 4.0492        | 0.5391  | 5000  | 0.3212   | 3.9914          |
| 3.9962        | 0.6469  | 6000  | 0.3279   | 3.9193          |
| 3.9219        | 0.7547  | 7000  | 0.3335   | 3.8632          |
| 3.8566        | 0.8625  | 8000  | 0.3377   | 3.8155          |
| 3.856         | 0.9704  | 9000  | 0.3408   | 3.7813          |
| 3.7649        | 1.0782  | 10000 | 0.3444   | 3.7518          |
| 3.769         | 1.1860  | 11000 | 0.3474   | 3.7232          |
| 3.7288        | 1.2938  | 12000 | 0.3493   | 3.6971          |
| 3.7086        | 1.4016  | 13000 | 0.3515   | 3.6748          |
| 3.703         | 1.5094  | 14000 | 0.3535   | 3.6584          |
| 3.664         | 1.6173  | 15000 | 0.3555   | 3.6374          |
| 3.6639        | 1.7251  | 16000 | 0.3572   | 3.6206          |
| 3.6383        | 1.8329  | 17000 | 0.3587   | 3.6051          |
| 3.6321        | 1.9407  | 18000 | 0.3600   | 3.5912          |
| 3.5672        | 2.0485  | 19000 | 0.3615   | 3.5821          |
| 3.5754        | 2.1563  | 20000 | 0.3624   | 3.5733          |
| 3.5637        | 2.2642  | 21000 | 0.3637   | 3.5613          |
| 3.5706        | 2.3720  | 22000 | 0.3646   | 3.5504          |
| 3.5408        | 2.4798  | 23000 | 0.3660   | 3.5402          |
| 3.532         | 2.5876  | 24000 | 0.3669   | 3.5312          |
| 3.527         | 2.6954  | 25000 | 0.3673   | 3.5218          |
| 3.5333        | 2.8032  | 26000 | 0.3685   | 3.5120          |
| 3.5442        | 2.9111  | 27000 | 0.3696   | 3.5054          |
| 3.4451        | 3.0189  | 28000 | 0.3700   | 3.5012          |
| 3.4295        | 3.1267  | 29000 | 0.3709   | 3.4969          |
| 3.4506        | 3.2345  | 30000 | 0.3713   | 3.4902          |
| 3.4582        | 3.3423  | 31000 | 0.3719   | 3.4849          |
| 3.4597        | 3.4501  | 32000 | 0.3733   | 3.4757          |
| 3.4639        | 3.5580  | 33000 | 0.3736   | 3.4722          |
| 3.4602        | 3.6658  | 34000 | 0.3735   | 3.4653          |
| 3.4613        | 3.7736  | 35000 | 0.3749   | 3.4591          |
| 3.4517        | 3.8814  | 36000 | 0.3755   | 3.4511          |
| 3.4488        | 3.9892  | 37000 | 0.3761   | 3.4457          |
| 3.3644        | 4.0970  | 38000 | 0.3764   | 3.4490          |
| 3.3636        | 4.2049  | 39000 | 0.3773   | 3.4452          |
| 3.3899        | 4.3127  | 40000 | 0.3775   | 3.4419          |
| 3.3969        | 4.4205  | 41000 | 0.3782   | 3.4341          |
| 3.3848        | 4.5283  | 42000 | 0.3786   | 3.4308          |
| 3.3952        | 4.6361  | 43000 | 0.3788   | 3.4244          |
| 3.3875        | 4.7439  | 44000 | 0.3793   | 3.4211          |
| 3.4071        | 4.8518  | 45000 | 0.3799   | 3.4152          |
| 3.3742        | 4.9596  | 46000 | 0.3803   | 3.4095          |
| 3.3158        | 5.0674  | 47000 | 0.3807   | 3.4110          |
| 3.3119        | 5.1752  | 48000 | 0.3812   | 3.4111          |
| 3.3434        | 5.2830  | 49000 | 0.3813   | 3.4074          |
| 3.3422        | 5.3908  | 50000 | 0.3820   | 3.4001          |
| 3.3445        | 5.4987  | 51000 | 0.3820   | 3.3994          |
| 3.3169        | 5.6065  | 52000 | 0.3826   | 3.3929          |
| 3.333         | 5.7143  | 53000 | 0.3832   | 3.3878          |
| 3.331         | 5.8221  | 54000 | 0.3835   | 3.3844          |
| 3.3342        | 5.9299  | 55000 | 0.3840   | 3.3803          |
| 3.2353        | 6.0377  | 56000 | 0.3840   | 3.3844          |
| 3.2567        | 6.1456  | 57000 | 0.3843   | 3.3837          |
| 3.2751        | 6.2534  | 58000 | 0.3847   | 3.3811          |
| 3.2877        | 6.3612  | 59000 | 0.3850   | 3.3768          |
| 3.2882        | 6.4690  | 60000 | 0.3854   | 3.3732          |
| 3.2801        | 6.5768  | 61000 | 0.3860   | 3.3683          |
| 3.2991        | 6.6846  | 62000 | 0.3863   | 3.3624          |
| 3.2791        | 6.7925  | 63000 | 0.3866   | 3.3601          |
| 3.288         | 6.9003  | 64000 | 0.3871   | 3.3546          |
| 3.1829        | 7.0081  | 65000 | 0.3871   | 3.3573          |
| 3.2234        | 7.1159  | 66000 | 0.3873   | 3.3607          |
| 3.2282        | 7.2237  | 67000 | 0.3876   | 3.3572          |
| 3.2267        | 7.3315  | 68000 | 0.3878   | 3.3537          |
| 3.218         | 7.4394  | 69000 | 0.3883   | 3.3486          |
| 3.2235        | 7.5472  | 70000 | 0.3885   | 3.3458          |
| 3.2512        | 7.6550  | 71000 | 0.3889   | 3.3421          |
| 3.2503        | 7.7628  | 72000 | 0.3895   | 3.3384          |
| 3.2303        | 7.8706  | 73000 | 0.3901   | 3.3343          |
| 3.2506        | 7.9784  | 74000 | 0.3901   | 3.3303          |
| 3.155         | 8.0863  | 75000 | 0.3902   | 3.3376          |
| 3.1552        | 8.1941  | 76000 | 0.3902   | 3.3350          |
| 3.1737        | 8.3019  | 77000 | 0.3907   | 3.3308          |
| 3.167         | 8.4097  | 78000 | 0.3908   | 3.3294          |
| 3.1798        | 8.5175  | 79000 | 0.3912   | 3.3247          |
| 3.1957        | 8.6253  | 80000 | 0.3918   | 3.3219          |
| 3.1876        | 8.7332  | 81000 | 0.3921   | 3.3189          |
| 3.1825        | 8.8410  | 82000 | 0.3922   | 3.3150          |
| 3.1641        | 8.9488  | 83000 | 0.3927   | 3.3115          |
| 3.1278        | 9.0566  | 84000 | 0.3927   | 3.3139          |
| 3.1192        | 9.1644  | 85000 | 0.3928   | 3.3138          |
| 3.1443        | 9.2722  | 86000 | 0.3932   | 3.3118          |
| 3.1211        | 9.3801  | 87000 | 0.3934   | 3.3095          |
| 3.1296        | 9.4879  | 88000 | 0.3936   | 3.3073          |
| 3.1073        | 9.5957  | 89000 | 0.3939   | 3.3058          |
| 3.1362        | 9.7035  | 90000 | 0.3942   | 3.3025          |
| 3.1261        | 9.8113  | 91000 | 0.3944   | 3.3008          |
| 3.1183        | 9.9191  | 92000 | 0.3945   | 3.2993          |
| 3.3293        | 10.0270 | 93000 | 0.3782   | 3.4597          |
| 3.364         | 10.1348 | 94000 | 0.3759   | 3.4676          |
| 3.3881        | 10.2426 | 95000 | 0.3759   | 3.4643          |
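
Assuming the reported losses are mean token-level cross-entropy in nats (the usual Trainer convention for language modeling), the final evaluation loss corresponds to a perplexity of exp(3.4643) ≈ 31.9:

```python
import math

# Perplexity implied by the final evaluation loss, assuming the loss is
# mean token-level cross-entropy in nats.
eval_loss = 3.4643
print(f"perplexity ≈ {math.exp(eval_loss):.2f}")  # ≈ 31.95
```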

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.2
  • Tokenizers 0.20.1

Model weights

  • Format: Safetensors
  • Size: 126M parameters
  • Tensor type: F32