abte-restaurants-transformer

This model is a fine-tuned version of an unspecified base model on an unknown dataset. After the final training epoch (epoch 100) it achieves the following results on the evaluation set:

  • Loss: 0.4235
  • F1-score: 0.5421

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
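
The dataset itself is not documented, but the training log further down gives a rough bound on the training-set size: each epoch takes 8 optimizer steps at a train batch size of 512. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; the card confirms none of these, and the helper name `train_set_size_bounds` is made up for illustration.

```python
def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple:
    """Return (min, max) example counts n with ceil(n / batch_size) == steps_per_epoch."""
    # Smallest n: all earlier batches are full and the last batch holds one example.
    smallest = (steps_per_epoch - 1) * batch_size + 1
    # Largest n: every batch, including the last one, is full.
    largest = steps_per_epoch * batch_size
    return smallest, largest


# Values taken from this card: 8 steps per epoch, train_batch_size 512.
low, high = train_set_size_bounds(steps_per_epoch=8, batch_size=512)
print(f"training set size is between {low} and {high} examples")
```

Under those assumptions the training set holds a few thousand examples, which is small relative to the 100 epochs used.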

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
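
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would evolve over the run. It assumes zero warmup steps (the card lists no warmup setting) and a total of 800 optimizer steps, which is the final step count in the training results table; `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 2e-05     # learning_rate from the list above
TOTAL_STEPS = 800   # last step in the training results table (100 epochs x 8 steps)
WARMUP_STEPS = 0    # assumption: no warmup is listed on the card


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 400, 800):
    print(step, linear_lr(step))
```

With these assumptions the rate starts at 2e-05, is halved by the midpoint of training, and reaches zero at the final step, which is consistent with the very small changes in validation loss over the last epochs.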

Training results

Training Loss   Epoch   Step   Validation Loss   F1-score
0.7388 1.0 8 0.6483 0.0
0.5755 2.0 16 0.6447 0.0
0.558 3.0 24 0.6260 0.0
0.5373 4.0 32 0.6089 0.0
0.5429 5.0 40 0.6016 0.0
0.5312 6.0 48 0.5928 0.0
0.5186 7.0 56 0.5816 0.0007
0.4968 8.0 64 0.5678 0.0241
0.5044 9.0 72 0.5551 0.1569
0.4782 10.0 80 0.5470 0.1896
0.4593 11.0 88 0.5316 0.2667
0.4484 12.0 96 0.5223 0.3096
0.4292 13.0 104 0.5255 0.3096
0.4232 14.0 112 0.5056 0.3716
0.416 15.0 120 0.5070 0.3669
0.4142 16.0 128 0.5082 0.3830
0.4018 17.0 136 0.4959 0.4371
0.3924 18.0 144 0.4946 0.4415
0.3683 19.0 152 0.4900 0.4552
0.3854 20.0 160 0.4850 0.4703
0.4049 21.0 168 0.4875 0.4669
0.381 22.0 176 0.4785 0.4831
0.3789 23.0 184 0.4740 0.4917
0.3683 24.0 192 0.4786 0.4856
0.3689 25.0 200 0.4694 0.4983
0.3646 26.0 208 0.4741 0.4937
0.3585 27.0 216 0.4643 0.5025
0.3602 28.0 224 0.4617 0.5050
0.3685 29.0 232 0.4683 0.5004
0.352 30.0 240 0.4591 0.5036
0.3542 31.0 248 0.4551 0.5086
0.3536 32.0 256 0.4600 0.5045
0.344 33.0 264 0.4589 0.5061
0.3453 34.0 272 0.4500 0.5122
0.354 35.0 280 0.4594 0.5075
0.3528 36.0 288 0.4536 0.5199
0.3316 37.0 296 0.4535 0.5190
0.3397 38.0 304 0.4469 0.5209
0.3292 39.0 312 0.4493 0.5211
0.3276 40.0 320 0.4477 0.5217
0.3308 41.0 328 0.4519 0.5208
0.3301 42.0 336 0.4392 0.5208
0.3272 43.0 344 0.4492 0.5199
0.3273 44.0 352 0.4484 0.5210
0.3193 45.0 360 0.4406 0.5264
0.3268 46.0 368 0.4444 0.5268
0.3184 47.0 376 0.4399 0.5278
0.3229 48.0 384 0.4374 0.5271
0.3061 49.0 392 0.4439 0.5288
0.3176 50.0 400 0.4358 0.5257
0.3133 51.0 408 0.4346 0.5255
0.317 52.0 416 0.4392 0.5278
0.3025 53.0 424 0.4336 0.5261
0.2933 54.0 432 0.4340 0.5261
0.2991 55.0 440 0.4391 0.5274
0.2989 56.0 448 0.4323 0.5291
0.2984 57.0 456 0.4304 0.5321
0.2961 58.0 464 0.4390 0.5296
0.3014 59.0 472 0.4296 0.5298
0.2992 60.0 480 0.4302 0.5299
0.305 61.0 488 0.4315 0.5327
0.2959 62.0 496 0.4353 0.5343
0.2901 63.0 504 0.4292 0.5335
0.2977 64.0 512 0.4323 0.5341
0.2979 65.0 520 0.4257 0.5343
0.2887 66.0 528 0.4309 0.5357
0.2922 67.0 536 0.4282 0.5361
0.287 68.0 544 0.4300 0.5374
0.2866 69.0 552 0.4269 0.5374
0.2904 70.0 560 0.4266 0.5375
0.293 71.0 568 0.4274 0.5368
0.2974 72.0 576 0.4263 0.5351
0.2822 73.0 584 0.4295 0.5383
0.2865 74.0 592 0.4252 0.5369
0.284 75.0 600 0.4292 0.5384
0.2889 76.0 608 0.4245 0.5374
0.3004 77.0 616 0.4256 0.5387
0.2854 78.0 624 0.4252 0.5405
0.3023 79.0 632 0.4241 0.5412
0.2856 80.0 640 0.4251 0.5407
0.283 81.0 648 0.4258 0.5405
0.2882 82.0 656 0.4224 0.5393
0.281 83.0 664 0.4263 0.5403
0.2873 84.0 672 0.4267 0.5401
0.2788 85.0 680 0.4226 0.5398
0.2805 86.0 688 0.4261 0.5400
0.2854 87.0 696 0.4242 0.5408
0.296 88.0 704 0.4217 0.5406
0.2804 89.0 712 0.4248 0.5410
0.2815 90.0 720 0.4230 0.5426
0.2852 91.0 728 0.4221 0.5431
0.2874 92.0 736 0.4225 0.5413
0.2825 93.0 744 0.4238 0.5411
0.2833 94.0 752 0.4247 0.5414
0.2756 95.0 760 0.4234 0.5416
0.2789 96.0 768 0.4227 0.5420
0.2775 97.0 776 0.4232 0.5419
0.2787 98.0 784 0.4234 0.5419
0.2778 99.0 792 0.4236 0.5416
0.2805 100.0 800 0.4235 0.5421
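
The headline metrics on this card match the final row (epoch 100), but the table shows that the last epoch is not the best one by either metric: F1 peaks at 0.5431 in epoch 91, and validation loss is lowest at 0.4217 in epoch 88. The differences are small, and the card does not say whether the published weights are the last or the best checkpoint. A minimal sketch of picking the best rows from such a log, using only an excerpt of the table above (the helper `parse_rows` is hypothetical):

```python
# Excerpt of the results table: training loss, epoch, step, validation loss, F1.
LOG_EXCERPT = """\
0.296 88.0 704 0.4217 0.5406
0.2804 89.0 712 0.4248 0.5410
0.2815 90.0 720 0.4230 0.5426
0.2852 91.0 728 0.4221 0.5431
0.2874 92.0 736 0.4225 0.5413
0.2805 100.0 800 0.4235 0.5421
"""


def parse_rows(text: str) -> list:
    """Parse whitespace-separated log rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        train_loss, epoch, step, val_loss, f1 = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": int(float(epoch)),
            "step": int(step),
            "val_loss": float(val_loss),
            "f1": float(f1),
        })
    return rows


rows = parse_rows(LOG_EXCERPT)
best_f1 = max(rows, key=lambda r: r["f1"])
best_loss = min(rows, key=lambda r: r["val_loss"])
print("best F1:", best_f1["f1"], "at epoch", best_f1["epoch"])
print("lowest validation loss:", best_loss["val_loss"], "at epoch", best_loss["epoch"])
```

The same functions can be applied to the full table by pasting all 100 rows into `LOG_EXCERPT`.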

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 5.81M
  • Tensor type: F32
  • Format: Safetensors