swiftformer-xs-dmae-va-U-80B

This model is a fine-tuned version of MBZUAI/swiftformer-xs on an unknown dataset. It achieves the following results on the evaluation set (these figures match the step 496, epoch 64.0 row of the training results):

  • Loss: 0.5127
  • Accuracy: 0.8532

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 80
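
The relationship between these values can be checked with a short sketch. It assumes a single training device (so that 32 × 4 gives the listed total of 128) and 560 total optimizer steps, the last step shown in the training results; neither is stated explicitly in this card. The function `linear_lr` is a hypothetical helper that follows the usual linear warmup then linear decay formula, not code taken from the training run.

```python
# Values copied from the hyperparameter list above.
learning_rate = 5e-05
train_batch_size = 32
gradient_accumulation_steps = 4
warmup_ratio = 0.2

# Assumption: 560 total optimizer steps (last step in the training results).
total_steps = 560

# Assumption: one device, so the effective batch is batch size x accumulation.
effective_batch_size = train_batch_size * gradient_accumulation_steps
warmup_steps = round(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Hypothetical helper: linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        scale = step / max(1, warmup_steps)
    else:
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return learning_rate * scale


print("effective batch size:", effective_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, 56, 112, 336, 560):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 5e-05 at step 112 and falls back to zero at step 560.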

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 0.9 | 7 | 1.5142 | 0.2477 |
| 1.4484 | 1.94 | 15 | 1.4144 | 0.2936 |
| 1.4394 | 2.97 | 23 | 1.4008 | 0.3119 |
| 1.4217 | 4.0 | 31 | 1.3817 | 0.3303 |
| 1.4025 | 4.9 | 38 | 1.3572 | 0.3761 |
| 1.3816 | 5.94 | 46 | 1.3374 | 0.4128 |
| 1.355 | 6.97 | 54 | 1.3131 | 0.4220 |
| 1.3123 | 8.0 | 62 | 1.2983 | 0.4220 |
| 1.2909 | 8.9 | 69 | 1.2640 | 0.4587 |
| 1.253 | 9.94 | 77 | 1.2272 | 0.5046 |
| 1.2244 | 10.97 | 85 | 1.1942 | 0.5321 |
| 1.1934 | 12.0 | 93 | 1.1909 | 0.5321 |
| 1.156 | 12.9 | 100 | 1.1293 | 0.5688 |
| 1.1127 | 13.94 | 108 | 1.0994 | 0.5413 |
| 1.0505 | 14.97 | 116 | 1.0621 | 0.5780 |
| 1.0323 | 16.0 | 124 | 1.0657 | 0.6147 |
| 0.9802 | 16.9 | 131 | 1.0402 | 0.6147 |
| 0.981 | 17.94 | 139 | 1.0034 | 0.6514 |
| 0.9461 | 18.97 | 147 | 0.9867 | 0.6330 |
| 0.9538 | 20.0 | 155 | 0.9721 | 0.6514 |
| 0.8932 | 20.9 | 162 | 0.9433 | 0.6881 |
| 0.8865 | 21.94 | 170 | 0.9072 | 0.7248 |
| 0.8673 | 22.97 | 178 | 0.9035 | 0.6881 |
| 0.8533 | 24.0 | 186 | 0.8879 | 0.7064 |
| 0.8474 | 24.9 | 193 | 0.8569 | 0.7339 |
| 0.794 | 25.94 | 201 | 0.8465 | 0.7339 |
| 0.814 | 26.97 | 209 | 0.8138 | 0.7339 |
| 0.7915 | 28.0 | 217 | 0.8251 | 0.7706 |
| 0.7437 | 28.9 | 224 | 0.8197 | 0.7431 |
| 0.7584 | 29.94 | 232 | 0.8035 | 0.7615 |
| 0.7256 | 30.97 | 240 | 0.7614 | 0.7523 |
| 0.707 | 32.0 | 248 | 0.7498 | 0.7523 |
| 0.707 | 32.9 | 255 | 0.7515 | 0.7706 |
| 0.6825 | 33.94 | 263 | 0.7327 | 0.7615 |
| 0.6892 | 34.97 | 271 | 0.7706 | 0.7431 |
| 0.6702 | 36.0 | 279 | 0.7571 | 0.7523 |
| 0.6691 | 36.9 | 286 | 0.7104 | 0.7523 |
| 0.6335 | 37.94 | 294 | 0.6955 | 0.7615 |
| 0.6294 | 38.97 | 302 | 0.6769 | 0.7706 |
| 0.6099 | 40.0 | 310 | 0.6560 | 0.7706 |
| 0.6137 | 40.9 | 317 | 0.6429 | 0.7798 |
| 0.5822 | 41.94 | 325 | 0.6249 | 0.7706 |
| 0.571 | 42.97 | 333 | 0.6305 | 0.7706 |
| 0.5842 | 44.0 | 341 | 0.6467 | 0.7706 |
| 0.5849 | 44.9 | 348 | 0.5982 | 0.7890 |
| 0.5811 | 45.94 | 356 | 0.6022 | 0.7798 |
| 0.5497 | 46.97 | 364 | 0.5964 | 0.7798 |
| 0.5323 | 48.0 | 372 | 0.5784 | 0.7890 |
| 0.514 | 48.9 | 379 | 0.5897 | 0.8073 |
| 0.5112 | 49.94 | 387 | 0.6092 | 0.7798 |
| 0.5577 | 50.97 | 395 | 0.5812 | 0.7982 |
| 0.4808 | 52.0 | 403 | 0.5518 | 0.8257 |
| 0.5132 | 52.9 | 410 | 0.5672 | 0.8073 |
| 0.4954 | 53.94 | 418 | 0.5592 | 0.7982 |
| 0.5005 | 54.97 | 426 | 0.5815 | 0.7890 |
| 0.4798 | 56.0 | 434 | 0.5670 | 0.8073 |
| 0.4998 | 56.9 | 441 | 0.5458 | 0.7982 |
| 0.4685 | 57.94 | 449 | 0.5760 | 0.7890 |
| 0.5153 | 58.97 | 457 | 0.5413 | 0.8165 |
| 0.4725 | 60.0 | 465 | 0.5340 | 0.8257 |
| 0.4982 | 60.9 | 472 | 0.5230 | 0.8257 |
| 0.4801 | 61.94 | 480 | 0.5306 | 0.8257 |
| 0.4897 | 62.97 | 488 | 0.5318 | 0.8165 |
| 0.4277 | 64.0 | 496 | 0.5127 | 0.8532 |
| 0.4277 | 64.9 | 503 | 0.5070 | 0.8349 |
| 0.5024 | 65.94 | 511 | 0.4977 | 0.8440 |
| 0.5074 | 66.97 | 519 | 0.5297 | 0.8165 |
| 0.439 | 68.0 | 527 | 0.5005 | 0.8440 |
| 0.4805 | 68.9 | 534 | 0.5269 | 0.8165 |
| 0.4729 | 69.94 | 542 | 0.5227 | 0.8073 |
| 0.4376 | 70.97 | 550 | 0.5062 | 0.8165 |
| 0.4606 | 72.0 | 558 | 0.5254 | 0.8165 |
| 0.4146 | 72.26 | 560 | 0.5001 | 0.8349 |
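
The headline results can be cross-checked against this log. The sketch below copies only the last ten evaluation rows (steps 496 to 560) and picks the best row by accuracy and by validation loss. The tuple layout and the variable names are illustrative choices, not part of any training script.

```python
# Subset of the evaluation log above (steps 496-560), copied verbatim.
# Tuple layout (illustrative): (step, epoch, validation_loss, accuracy).
eval_log = [
    (496, 64.0, 0.5127, 0.8532),
    (503, 64.9, 0.5070, 0.8349),
    (511, 65.94, 0.4977, 0.8440),
    (519, 66.97, 0.5297, 0.8165),
    (527, 68.0, 0.5005, 0.8440),
    (534, 68.9, 0.5269, 0.8165),
    (542, 69.94, 0.5227, 0.8073),
    (550, 70.97, 0.5062, 0.8165),
    (558, 72.0, 0.5254, 0.8165),
    (560, 72.26, 0.5001, 0.8349),
]

# Row with the highest accuracy and row with the lowest validation loss.
best_by_accuracy = max(eval_log, key=lambda row: row[3])
best_by_loss = min(eval_log, key=lambda row: row[2])

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

The highest accuracy (0.8532, step 496) and the lowest validation loss (0.4977, step 511) fall on different rows. The reported evaluation figures match the highest-accuracy row, which suggests, but does not confirm, that the published checkpoint was selected by accuracy.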

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Safetensors

  • Model size: 3.04M params
  • Tensor type: F32