
dit-base_tobacco

This model is a fine-tuned version of microsoft/dit-base, presumably on a Tobacco document-image classification dataset (the dataset name was not recorded in the training metadata). It achieves the following results on the evaluation set:

  • Loss: 0.3120
  • Accuracy: 0.95
  • Brier Loss: 0.0965
  • NLL: 0.6372
  • F1 Micro: 0.9500
  • F1 Macro: 0.9545
  • ECE: 0.0560
  • AURC: 0.0092
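The exact metric implementations used during evaluation are not documented in this card; the following is a minimal NumPy sketch, under the assumed names `probs` (softmax probabilities for the evaluation set) and `labels` (integer class labels), of how Brier score, NLL, and ECE are conventionally computed:

```python
import numpy as np

def brier_score(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared error between predicted probabilities and one-hot labels."""
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def nll(probs: np.ndarray, labels: np.ndarray, eps: float = 1e-12) -> float:
    """Average negative log-likelihood of the true class."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))

def ece(probs: np.ndarray, labels: np.ndarray, n_bins: int = 10) -> float:
    """Expected calibration error over equal-width confidence bins."""
    conf = probs.max(axis=1)          # confidence = max predicted probability
    correct = (probs.argmax(axis=1) == labels).astype(float)
    total = 0.0
    for lo in np.linspace(0.0, 1.0, n_bins, endpoint=False):
        mask = (conf > lo) & (conf <= lo + 1.0 / n_bins)
        if mask.any():
            # bin weight x |accuracy - mean confidence| within the bin
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(total)
```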

Model description

More information needed

Intended uses & limitations

More information needed
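As a starting point, the checkpoint can presumably be used for document-image classification with the standard Transformers classes. A minimal inference sketch, assuming a hypothetical Hub id `<user>/dit-base_tobacco` (substitute the actual repository id):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

repo_id = "<user>/dit-base_tobacco"  # hypothetical id; replace with the real one
processor = AutoImageProcessor.from_pretrained(repo_id)
model = AutoModelForImageClassification.from_pretrained(repo_id)

image = Image.open("document.png").convert("RGB")  # any document image
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```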

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 100
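These settings map directly onto transformers.TrainingArguments. A sketch reconstructing the configuration (output_dir is an assumption, and the model/dataset wiring for the Trainer is omitted):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dit-base_tobacco",   # assumed output directory
    learning_rate=2e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=16,  # 8 x 16 = 128 total train batch size
    adam_beta1=0.9,                  # Adam betas/epsilon as reported above
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=100,
)
```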

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC
No log 0.96 6 2.4454 0.175 0.9193 8.6626 0.175 0.0676 0.2489 0.8592
No log 1.96 12 2.3287 0.175 0.9034 7.2049 0.175 0.0674 0.2590 0.8557
No log 2.96 18 2.0836 0.23 0.8528 3.3114 0.23 0.1544 0.2652 0.7357
No log 3.96 24 2.0456 0.315 0.8435 3.8932 0.315 0.1785 0.3010 0.6372
No log 4.96 30 1.8778 0.3 0.7820 3.0975 0.3 0.1659 0.2985 0.5174
No log 5.96 36 1.7247 0.365 0.7305 2.7808 0.3650 0.2235 0.2507 0.4036
No log 6.96 42 1.6610 0.38 0.7183 2.6958 0.38 0.2449 0.2538 0.4119
No log 7.96 48 1.4667 0.505 0.6417 2.4078 0.505 0.3653 0.2881 0.2656
No log 8.96 54 1.3427 0.58 0.6031 2.0381 0.58 0.5304 0.2885 0.2470
No log 9.96 60 1.1586 0.635 0.5217 1.8792 0.635 0.5496 0.2831 0.1697
No log 10.96 66 1.0108 0.71 0.4578 1.6886 0.7100 0.6273 0.2851 0.1340
No log 11.96 72 0.8648 0.75 0.3849 1.5408 0.75 0.6788 0.2530 0.0801
No log 12.96 78 0.7342 0.79 0.3327 1.3588 0.79 0.7264 0.2152 0.0575
No log 13.96 84 0.6024 0.835 0.2734 1.2694 0.835 0.7937 0.1876 0.0429
No log 14.96 90 0.5143 0.85 0.2386 1.1756 0.85 0.8175 0.1714 0.0363
No log 15.96 96 0.4429 0.865 0.2044 1.1080 0.865 0.8435 0.1380 0.0277
No log 16.96 102 0.3999 0.885 0.1854 1.0748 0.885 0.8673 0.1407 0.0274
No log 17.96 108 0.3635 0.88 0.1732 1.0361 0.88 0.8594 0.1117 0.0247
No log 18.96 114 0.3166 0.89 0.1454 1.0855 0.89 0.8682 0.0971 0.0196
No log 19.96 120 0.3137 0.905 0.1418 1.1614 0.905 0.8934 0.1041 0.0195
No log 20.96 126 0.3207 0.91 0.1408 1.1941 0.91 0.9002 0.0856 0.0198
No log 21.96 132 0.2753 0.925 0.1224 1.0928 0.925 0.9209 0.0858 0.0145
No log 22.96 138 0.2538 0.925 0.1169 1.0895 0.925 0.9187 0.0863 0.0111
No log 23.96 144 0.2691 0.935 0.1138 1.0767 0.935 0.9279 0.0730 0.0149
No log 24.96 150 0.2775 0.935 0.1131 1.0538 0.935 0.9292 0.0676 0.0157
No log 25.96 156 0.2544 0.94 0.1011 1.0266 0.94 0.9292 0.0643 0.0131
No log 26.96 162 0.2637 0.945 0.1013 1.0337 0.945 0.9384 0.0648 0.0150
No log 27.96 168 0.2787 0.94 0.1089 1.0202 0.94 0.9348 0.0685 0.0161
No log 28.96 174 0.2794 0.935 0.1091 1.0099 0.935 0.9306 0.0671 0.0143
No log 29.96 180 0.2631 0.935 0.1025 0.9815 0.935 0.9306 0.0575 0.0129
No log 30.96 186 0.2616 0.945 0.1009 0.9683 0.945 0.9401 0.0674 0.0120
No log 31.96 192 0.2726 0.935 0.1074 0.9598 0.935 0.9346 0.0641 0.0100
No log 32.96 198 0.2765 0.935 0.1058 0.9067 0.935 0.9321 0.0696 0.0101
No log 33.96 204 0.2662 0.95 0.0965 0.8891 0.9500 0.9522 0.0672 0.0120
No log 34.96 210 0.2761 0.935 0.1019 0.8893 0.935 0.9338 0.0597 0.0134
No log 35.96 216 0.2729 0.945 0.0961 0.8807 0.945 0.9419 0.0552 0.0119
No log 36.96 222 0.2741 0.94 0.1037 0.8782 0.94 0.9356 0.0645 0.0086
No log 37.96 228 0.2686 0.94 0.0994 0.8423 0.94 0.9356 0.0592 0.0085
No log 38.96 234 0.2712 0.95 0.0906 0.8179 0.9500 0.9545 0.0610 0.0105
No log 39.96 240 0.2644 0.95 0.0870 0.8240 0.9500 0.9443 0.0510 0.0110
No log 40.96 246 0.2653 0.95 0.0932 0.8386 0.9500 0.9525 0.0572 0.0118
No log 41.96 252 0.2724 0.955 0.0939 0.8369 0.955 0.9573 0.0602 0.0104
No log 42.96 258 0.2552 0.95 0.0868 0.8079 0.9500 0.9522 0.0539 0.0079
No log 43.96 264 0.2629 0.95 0.0879 0.7800 0.9500 0.9545 0.0526 0.0080
No log 44.96 270 0.2664 0.955 0.0864 0.7660 0.955 0.9575 0.0515 0.0086
No log 45.96 276 0.2777 0.945 0.0948 0.7670 0.945 0.9513 0.0524 0.0096
No log 46.96 282 0.2824 0.94 0.1014 0.7799 0.94 0.9436 0.0570 0.0093
No log 47.96 288 0.2699 0.95 0.0896 0.7706 0.9500 0.9546 0.0528 0.0087
No log 48.96 294 0.2809 0.945 0.0950 0.7691 0.945 0.9480 0.0475 0.0087
No log 49.96 300 0.2827 0.945 0.0940 0.7635 0.945 0.9447 0.0571 0.0091
No log 50.96 306 0.2781 0.945 0.0921 0.7591 0.945 0.9478 0.0552 0.0090
No log 51.96 312 0.2834 0.95 0.0946 0.7572 0.9500 0.9484 0.0549 0.0089
No log 52.96 318 0.2986 0.94 0.0994 0.7541 0.94 0.9363 0.0605 0.0091
No log 53.96 324 0.2957 0.94 0.1016 0.7447 0.94 0.9385 0.0562 0.0086
No log 54.96 330 0.2991 0.94 0.1047 0.7392 0.94 0.9377 0.0592 0.0102
No log 55.96 336 0.3027 0.94 0.1031 0.7235 0.94 0.9377 0.0572 0.0113
No log 56.96 342 0.2945 0.945 0.0968 0.7143 0.945 0.9470 0.0581 0.0104
No log 57.96 348 0.2935 0.94 0.0955 0.7046 0.94 0.9459 0.0569 0.0097
No log 58.96 354 0.2909 0.94 0.0934 0.6969 0.94 0.9459 0.0544 0.0092
No log 59.96 360 0.2973 0.95 0.0939 0.6964 0.9500 0.9545 0.0524 0.0082
No log 60.96 366 0.3222 0.93 0.1108 0.7078 0.93 0.9266 0.0586 0.0088
No log 61.96 372 0.3247 0.935 0.1093 0.7743 0.935 0.9353 0.0622 0.0091
No log 62.96 378 0.3125 0.945 0.1003 0.7651 0.945 0.9453 0.0559 0.0089
No log 63.96 384 0.3035 0.945 0.0993 0.7515 0.945 0.9476 0.0545 0.0088
No log 64.96 390 0.3002 0.945 0.0973 0.7408 0.945 0.9476 0.0537 0.0091
No log 65.96 396 0.3023 0.95 0.0965 0.7321 0.9500 0.9545 0.0523 0.0095
No log 66.96 402 0.3075 0.945 0.1007 0.7323 0.945 0.9477 0.0540 0.0096
No log 67.96 408 0.3062 0.945 0.0999 0.6682 0.945 0.9514 0.0525 0.0098
No log 68.96 414 0.3182 0.945 0.0968 0.6809 0.945 0.9432 0.0485 0.0115
No log 69.96 420 0.3272 0.945 0.0972 0.6879 0.945 0.9432 0.0513 0.0132
No log 70.96 426 0.3210 0.945 0.0973 0.7545 0.945 0.9488 0.0522 0.0124
No log 71.96 432 0.3194 0.945 0.1027 0.7464 0.945 0.9514 0.0546 0.0108
No log 72.96 438 0.3236 0.94 0.1067 0.7486 0.94 0.9427 0.0587 0.0097
No log 73.96 444 0.3166 0.94 0.1049 0.6751 0.94 0.9427 0.0597 0.0096
No log 74.96 450 0.3062 0.945 0.0982 0.6702 0.945 0.9514 0.0526 0.0100
No log 75.96 456 0.3018 0.95 0.0948 0.6823 0.9500 0.9545 0.0523 0.0102
No log 76.96 462 0.3062 0.95 0.0951 0.7444 0.9500 0.9545 0.0522 0.0109
No log 77.96 468 0.3072 0.95 0.0933 0.7437 0.9500 0.9545 0.0501 0.0118
No log 78.96 474 0.3095 0.95 0.0943 0.6749 0.9500 0.9545 0.0512 0.0121
No log 79.96 480 0.3097 0.945 0.0968 0.6654 0.945 0.9514 0.0576 0.0116
No log 80.96 486 0.3094 0.95 0.0967 0.6581 0.9500 0.9545 0.0526 0.0112
No log 81.96 492 0.3109 0.95 0.0954 0.6549 0.9500 0.9545 0.0507 0.0115
No log 82.96 498 0.3104 0.95 0.0949 0.7168 0.9500 0.9545 0.0521 0.0113
0.3747 83.96 504 0.3122 0.95 0.0949 0.7130 0.9500 0.9545 0.0513 0.0111
0.3747 84.96 510 0.3140 0.95 0.0944 0.7116 0.9500 0.9545 0.0534 0.0113
0.3747 85.96 516 0.3175 0.95 0.0949 0.7100 0.9500 0.9545 0.0544 0.0113
0.3747 86.96 522 0.3187 0.95 0.0958 0.7072 0.9500 0.9545 0.0537 0.0111
0.3747 87.96 528 0.3191 0.95 0.0967 0.6428 0.9500 0.9545 0.0536 0.0103
0.3747 88.96 534 0.3168 0.95 0.0963 0.6438 0.9500 0.9545 0.0542 0.0102
0.3747 89.96 540 0.3136 0.95 0.0963 0.6418 0.9500 0.9545 0.0554 0.0099
0.3747 90.96 546 0.3117 0.95 0.0963 0.6407 0.9500 0.9545 0.0533 0.0097
0.3747 91.96 552 0.3113 0.95 0.0964 0.6403 0.9500 0.9545 0.0528 0.0091
0.3747 92.96 558 0.3112 0.95 0.0968 0.6401 0.9500 0.9545 0.0517 0.0091
0.3747 93.96 564 0.3109 0.95 0.0967 0.6393 0.9500 0.9545 0.0563 0.0091
0.3747 94.96 570 0.3112 0.95 0.0969 0.6370 0.9500 0.9545 0.0567 0.0092
0.3747 95.96 576 0.3118 0.95 0.0971 0.6364 0.9500 0.9545 0.0568 0.0091
0.3747 96.96 582 0.3120 0.95 0.0969 0.6377 0.9500 0.9545 0.0564 0.0092
0.3747 97.96 588 0.3121 0.95 0.0966 0.6379 0.9500 0.9545 0.0560 0.0092
0.3747 98.96 594 0.3121 0.95 0.0965 0.6374 0.9500 0.9545 0.0560 0.0092
0.3747 99.96 600 0.3120 0.95 0.0965 0.6372 0.9500 0.9545 0.0560 0.0092

Framework versions

  • Transformers 4.26.1
  • PyTorch 1.13.1.post200
  • Datasets 2.9.0
  • Tokenizers 0.13.2