dit-base_tobacco
This model is a fine-tuned version of microsoft/dit-base. The training dataset was not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set:
- Loss: 0.3120
- Accuracy: 0.95
- Brier Loss: 0.0965
- Nll: 0.6372
- F1 Micro: 0.9500
- F1 Macro: 0.9545
- Ece: 0.0560
- Aurc: 0.0092
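The card does not define its calibration metrics, so the following is only a minimal sketch of common textbook definitions of Brier loss, negative log-likelihood (NLL), and expected calibration error (ECE). The function names, the 10-bin ECE, and the toy probabilities are assumptions made for illustration. The evaluation code that produced the numbers above may use different variants.

```python
import math

def brier_loss(probs, labels):
    """Mean over samples of the squared error between probabilities and the one-hot label."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)

def negative_log_likelihood(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between confidence and accuracy over equal-width confidence bins."""
    n = len(labels)
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins
    for p, y in zip(probs, labels):
        conf = max(p)                      # confidence of the predicted class
        pred = p.index(conf)               # predicted class index
        b = min(int(conf * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += conf
        hit_sums[b] += int(pred == y)
    ece = 0.0
    for c, cs, hs in zip(counts, conf_sums, hit_sums):
        if c:
            ece += (c / n) * abs(hs / c - cs / c)
    return ece

# Toy two-class predictions (hypothetical, not taken from this model).
example_probs = [[0.85, 0.15], [0.65, 0.35], [0.25, 0.75], [0.95, 0.05]]
example_labels = [0, 1, 1, 0]

print("Brier:", brier_loss(example_probs, example_labels))
print("NLL:  ", negative_log_likelihood(example_probs, example_labels))
print("ECE:  ", expected_calibration_error(example_probs, example_labels))
```

The reported Nll (0.6372) differs from the reported Loss (0.3120), so at least one of the two is computed differently from the plain mean cross-entropy sketched here.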
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 100
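The hyperparameters above are related by simple arithmetic, which the sketch below works through: the effective batch size, the number of warmup steps, and the learning rate at a given optimizer step. It assumes a single device, takes the total of 600 optimizer steps from the last row of the results table, and uses a hand-written linear warmup and decay function (`linear_lr` is a hypothetical helper, not a library call).

```python
# Values copied from the hyperparameter list above.
learning_rate = 2e-05
train_batch_size = 8
gradient_accumulation_steps = 16
lr_scheduler_warmup_ratio = 0.1

# Assumption: one device, so effective batch = per-device batch x accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: 600 optimizer steps in total, as in the final row of the results table.
total_steps = 600
warmup_steps = round(total_steps * lr_scheduler_warmup_ratio)

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)

print("effective batch size:", total_train_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, 30, 60, 330, 600):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 60, which falls just after epoch 9.96 in the results table, and reaches zero at step 600.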
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | Nll | F1 Micro | F1 Macro | Ece | Aurc |
---|---|---|---|---|---|---|---|---|---|---|
No log | 0.96 | 6 | 2.4454 | 0.175 | 0.9193 | 8.6626 | 0.175 | 0.0676 | 0.2489 | 0.8592 |
No log | 1.96 | 12 | 2.3287 | 0.175 | 0.9034 | 7.2049 | 0.175 | 0.0674 | 0.2590 | 0.8557 |
No log | 2.96 | 18 | 2.0836 | 0.23 | 0.8528 | 3.3114 | 0.23 | 0.1544 | 0.2652 | 0.7357 |
No log | 3.96 | 24 | 2.0456 | 0.315 | 0.8435 | 3.8932 | 0.315 | 0.1785 | 0.3010 | 0.6372 |
No log | 4.96 | 30 | 1.8778 | 0.3 | 0.7820 | 3.0975 | 0.3 | 0.1659 | 0.2985 | 0.5174 |
No log | 5.96 | 36 | 1.7247 | 0.365 | 0.7305 | 2.7808 | 0.3650 | 0.2235 | 0.2507 | 0.4036 |
No log | 6.96 | 42 | 1.6610 | 0.38 | 0.7183 | 2.6958 | 0.38 | 0.2449 | 0.2538 | 0.4119 |
No log | 7.96 | 48 | 1.4667 | 0.505 | 0.6417 | 2.4078 | 0.505 | 0.3653 | 0.2881 | 0.2656 |
No log | 8.96 | 54 | 1.3427 | 0.58 | 0.6031 | 2.0381 | 0.58 | 0.5304 | 0.2885 | 0.2470 |
No log | 9.96 | 60 | 1.1586 | 0.635 | 0.5217 | 1.8792 | 0.635 | 0.5496 | 0.2831 | 0.1697 |
No log | 10.96 | 66 | 1.0108 | 0.71 | 0.4578 | 1.6886 | 0.7100 | 0.6273 | 0.2851 | 0.1340 |
No log | 11.96 | 72 | 0.8648 | 0.75 | 0.3849 | 1.5408 | 0.75 | 0.6788 | 0.2530 | 0.0801 |
No log | 12.96 | 78 | 0.7342 | 0.79 | 0.3327 | 1.3588 | 0.79 | 0.7264 | 0.2152 | 0.0575 |
No log | 13.96 | 84 | 0.6024 | 0.835 | 0.2734 | 1.2694 | 0.835 | 0.7937 | 0.1876 | 0.0429 |
No log | 14.96 | 90 | 0.5143 | 0.85 | 0.2386 | 1.1756 | 0.85 | 0.8175 | 0.1714 | 0.0363 |
No log | 15.96 | 96 | 0.4429 | 0.865 | 0.2044 | 1.1080 | 0.865 | 0.8435 | 0.1380 | 0.0277 |
No log | 16.96 | 102 | 0.3999 | 0.885 | 0.1854 | 1.0748 | 0.885 | 0.8673 | 0.1407 | 0.0274 |
No log | 17.96 | 108 | 0.3635 | 0.88 | 0.1732 | 1.0361 | 0.88 | 0.8594 | 0.1117 | 0.0247 |
No log | 18.96 | 114 | 0.3166 | 0.89 | 0.1454 | 1.0855 | 0.89 | 0.8682 | 0.0971 | 0.0196 |
No log | 19.96 | 120 | 0.3137 | 0.905 | 0.1418 | 1.1614 | 0.905 | 0.8934 | 0.1041 | 0.0195 |
No log | 20.96 | 126 | 0.3207 | 0.91 | 0.1408 | 1.1941 | 0.91 | 0.9002 | 0.0856 | 0.0198 |
No log | 21.96 | 132 | 0.2753 | 0.925 | 0.1224 | 1.0928 | 0.925 | 0.9209 | 0.0858 | 0.0145 |
No log | 22.96 | 138 | 0.2538 | 0.925 | 0.1169 | 1.0895 | 0.925 | 0.9187 | 0.0863 | 0.0111 |
No log | 23.96 | 144 | 0.2691 | 0.935 | 0.1138 | 1.0767 | 0.935 | 0.9279 | 0.0730 | 0.0149 |
No log | 24.96 | 150 | 0.2775 | 0.935 | 0.1131 | 1.0538 | 0.935 | 0.9292 | 0.0676 | 0.0157 |
No log | 25.96 | 156 | 0.2544 | 0.94 | 0.1011 | 1.0266 | 0.94 | 0.9292 | 0.0643 | 0.0131 |
No log | 26.96 | 162 | 0.2637 | 0.945 | 0.1013 | 1.0337 | 0.945 | 0.9384 | 0.0648 | 0.0150 |
No log | 27.96 | 168 | 0.2787 | 0.94 | 0.1089 | 1.0202 | 0.94 | 0.9348 | 0.0685 | 0.0161 |
No log | 28.96 | 174 | 0.2794 | 0.935 | 0.1091 | 1.0099 | 0.935 | 0.9306 | 0.0671 | 0.0143 |
No log | 29.96 | 180 | 0.2631 | 0.935 | 0.1025 | 0.9815 | 0.935 | 0.9306 | 0.0575 | 0.0129 |
No log | 30.96 | 186 | 0.2616 | 0.945 | 0.1009 | 0.9683 | 0.945 | 0.9401 | 0.0674 | 0.0120 |
No log | 31.96 | 192 | 0.2726 | 0.935 | 0.1074 | 0.9598 | 0.935 | 0.9346 | 0.0641 | 0.0100 |
No log | 32.96 | 198 | 0.2765 | 0.935 | 0.1058 | 0.9067 | 0.935 | 0.9321 | 0.0696 | 0.0101 |
No log | 33.96 | 204 | 0.2662 | 0.95 | 0.0965 | 0.8891 | 0.9500 | 0.9522 | 0.0672 | 0.0120 |
No log | 34.96 | 210 | 0.2761 | 0.935 | 0.1019 | 0.8893 | 0.935 | 0.9338 | 0.0597 | 0.0134 |
No log | 35.96 | 216 | 0.2729 | 0.945 | 0.0961 | 0.8807 | 0.945 | 0.9419 | 0.0552 | 0.0119 |
No log | 36.96 | 222 | 0.2741 | 0.94 | 0.1037 | 0.8782 | 0.94 | 0.9356 | 0.0645 | 0.0086 |
No log | 37.96 | 228 | 0.2686 | 0.94 | 0.0994 | 0.8423 | 0.94 | 0.9356 | 0.0592 | 0.0085 |
No log | 38.96 | 234 | 0.2712 | 0.95 | 0.0906 | 0.8179 | 0.9500 | 0.9545 | 0.0610 | 0.0105 |
No log | 39.96 | 240 | 0.2644 | 0.95 | 0.0870 | 0.8240 | 0.9500 | 0.9443 | 0.0510 | 0.0110 |
No log | 40.96 | 246 | 0.2653 | 0.95 | 0.0932 | 0.8386 | 0.9500 | 0.9525 | 0.0572 | 0.0118 |
No log | 41.96 | 252 | 0.2724 | 0.955 | 0.0939 | 0.8369 | 0.955 | 0.9573 | 0.0602 | 0.0104 |
No log | 42.96 | 258 | 0.2552 | 0.95 | 0.0868 | 0.8079 | 0.9500 | 0.9522 | 0.0539 | 0.0079 |
No log | 43.96 | 264 | 0.2629 | 0.95 | 0.0879 | 0.7800 | 0.9500 | 0.9545 | 0.0526 | 0.0080 |
No log | 44.96 | 270 | 0.2664 | 0.955 | 0.0864 | 0.7660 | 0.955 | 0.9575 | 0.0515 | 0.0086 |
No log | 45.96 | 276 | 0.2777 | 0.945 | 0.0948 | 0.7670 | 0.945 | 0.9513 | 0.0524 | 0.0096 |
No log | 46.96 | 282 | 0.2824 | 0.94 | 0.1014 | 0.7799 | 0.94 | 0.9436 | 0.0570 | 0.0093 |
No log | 47.96 | 288 | 0.2699 | 0.95 | 0.0896 | 0.7706 | 0.9500 | 0.9546 | 0.0528 | 0.0087 |
No log | 48.96 | 294 | 0.2809 | 0.945 | 0.0950 | 0.7691 | 0.945 | 0.9480 | 0.0475 | 0.0087 |
No log | 49.96 | 300 | 0.2827 | 0.945 | 0.0940 | 0.7635 | 0.945 | 0.9447 | 0.0571 | 0.0091 |
No log | 50.96 | 306 | 0.2781 | 0.945 | 0.0921 | 0.7591 | 0.945 | 0.9478 | 0.0552 | 0.0090 |
No log | 51.96 | 312 | 0.2834 | 0.95 | 0.0946 | 0.7572 | 0.9500 | 0.9484 | 0.0549 | 0.0089 |
No log | 52.96 | 318 | 0.2986 | 0.94 | 0.0994 | 0.7541 | 0.94 | 0.9363 | 0.0605 | 0.0091 |
No log | 53.96 | 324 | 0.2957 | 0.94 | 0.1016 | 0.7447 | 0.94 | 0.9385 | 0.0562 | 0.0086 |
No log | 54.96 | 330 | 0.2991 | 0.94 | 0.1047 | 0.7392 | 0.94 | 0.9377 | 0.0592 | 0.0102 |
No log | 55.96 | 336 | 0.3027 | 0.94 | 0.1031 | 0.7235 | 0.94 | 0.9377 | 0.0572 | 0.0113 |
No log | 56.96 | 342 | 0.2945 | 0.945 | 0.0968 | 0.7143 | 0.945 | 0.9470 | 0.0581 | 0.0104 |
No log | 57.96 | 348 | 0.2935 | 0.94 | 0.0955 | 0.7046 | 0.94 | 0.9459 | 0.0569 | 0.0097 |
No log | 58.96 | 354 | 0.2909 | 0.94 | 0.0934 | 0.6969 | 0.94 | 0.9459 | 0.0544 | 0.0092 |
No log | 59.96 | 360 | 0.2973 | 0.95 | 0.0939 | 0.6964 | 0.9500 | 0.9545 | 0.0524 | 0.0082 |
No log | 60.96 | 366 | 0.3222 | 0.93 | 0.1108 | 0.7078 | 0.93 | 0.9266 | 0.0586 | 0.0088 |
No log | 61.96 | 372 | 0.3247 | 0.935 | 0.1093 | 0.7743 | 0.935 | 0.9353 | 0.0622 | 0.0091 |
No log | 62.96 | 378 | 0.3125 | 0.945 | 0.1003 | 0.7651 | 0.945 | 0.9453 | 0.0559 | 0.0089 |
No log | 63.96 | 384 | 0.3035 | 0.945 | 0.0993 | 0.7515 | 0.945 | 0.9476 | 0.0545 | 0.0088 |
No log | 64.96 | 390 | 0.3002 | 0.945 | 0.0973 | 0.7408 | 0.945 | 0.9476 | 0.0537 | 0.0091 |
No log | 65.96 | 396 | 0.3023 | 0.95 | 0.0965 | 0.7321 | 0.9500 | 0.9545 | 0.0523 | 0.0095 |
No log | 66.96 | 402 | 0.3075 | 0.945 | 0.1007 | 0.7323 | 0.945 | 0.9477 | 0.0540 | 0.0096 |
No log | 67.96 | 408 | 0.3062 | 0.945 | 0.0999 | 0.6682 | 0.945 | 0.9514 | 0.0525 | 0.0098 |
No log | 68.96 | 414 | 0.3182 | 0.945 | 0.0968 | 0.6809 | 0.945 | 0.9432 | 0.0485 | 0.0115 |
No log | 69.96 | 420 | 0.3272 | 0.945 | 0.0972 | 0.6879 | 0.945 | 0.9432 | 0.0513 | 0.0132 |
No log | 70.96 | 426 | 0.3210 | 0.945 | 0.0973 | 0.7545 | 0.945 | 0.9488 | 0.0522 | 0.0124 |
No log | 71.96 | 432 | 0.3194 | 0.945 | 0.1027 | 0.7464 | 0.945 | 0.9514 | 0.0546 | 0.0108 |
No log | 72.96 | 438 | 0.3236 | 0.94 | 0.1067 | 0.7486 | 0.94 | 0.9427 | 0.0587 | 0.0097 |
No log | 73.96 | 444 | 0.3166 | 0.94 | 0.1049 | 0.6751 | 0.94 | 0.9427 | 0.0597 | 0.0096 |
No log | 74.96 | 450 | 0.3062 | 0.945 | 0.0982 | 0.6702 | 0.945 | 0.9514 | 0.0526 | 0.0100 |
No log | 75.96 | 456 | 0.3018 | 0.95 | 0.0948 | 0.6823 | 0.9500 | 0.9545 | 0.0523 | 0.0102 |
No log | 76.96 | 462 | 0.3062 | 0.95 | 0.0951 | 0.7444 | 0.9500 | 0.9545 | 0.0522 | 0.0109 |
No log | 77.96 | 468 | 0.3072 | 0.95 | 0.0933 | 0.7437 | 0.9500 | 0.9545 | 0.0501 | 0.0118 |
No log | 78.96 | 474 | 0.3095 | 0.95 | 0.0943 | 0.6749 | 0.9500 | 0.9545 | 0.0512 | 0.0121 |
No log | 79.96 | 480 | 0.3097 | 0.945 | 0.0968 | 0.6654 | 0.945 | 0.9514 | 0.0576 | 0.0116 |
No log | 80.96 | 486 | 0.3094 | 0.95 | 0.0967 | 0.6581 | 0.9500 | 0.9545 | 0.0526 | 0.0112 |
No log | 81.96 | 492 | 0.3109 | 0.95 | 0.0954 | 0.6549 | 0.9500 | 0.9545 | 0.0507 | 0.0115 |
No log | 82.96 | 498 | 0.3104 | 0.95 | 0.0949 | 0.7168 | 0.9500 | 0.9545 | 0.0521 | 0.0113 |
0.3747 | 83.96 | 504 | 0.3122 | 0.95 | 0.0949 | 0.7130 | 0.9500 | 0.9545 | 0.0513 | 0.0111 |
0.3747 | 84.96 | 510 | 0.3140 | 0.95 | 0.0944 | 0.7116 | 0.9500 | 0.9545 | 0.0534 | 0.0113 |
0.3747 | 85.96 | 516 | 0.3175 | 0.95 | 0.0949 | 0.7100 | 0.9500 | 0.9545 | 0.0544 | 0.0113 |
0.3747 | 86.96 | 522 | 0.3187 | 0.95 | 0.0958 | 0.7072 | 0.9500 | 0.9545 | 0.0537 | 0.0111 |
0.3747 | 87.96 | 528 | 0.3191 | 0.95 | 0.0967 | 0.6428 | 0.9500 | 0.9545 | 0.0536 | 0.0103 |
0.3747 | 88.96 | 534 | 0.3168 | 0.95 | 0.0963 | 0.6438 | 0.9500 | 0.9545 | 0.0542 | 0.0102 |
0.3747 | 89.96 | 540 | 0.3136 | 0.95 | 0.0963 | 0.6418 | 0.9500 | 0.9545 | 0.0554 | 0.0099 |
0.3747 | 90.96 | 546 | 0.3117 | 0.95 | 0.0963 | 0.6407 | 0.9500 | 0.9545 | 0.0533 | 0.0097 |
0.3747 | 91.96 | 552 | 0.3113 | 0.95 | 0.0964 | 0.6403 | 0.9500 | 0.9545 | 0.0528 | 0.0091 |
0.3747 | 92.96 | 558 | 0.3112 | 0.95 | 0.0968 | 0.6401 | 0.9500 | 0.9545 | 0.0517 | 0.0091 |
0.3747 | 93.96 | 564 | 0.3109 | 0.95 | 0.0967 | 0.6393 | 0.9500 | 0.9545 | 0.0563 | 0.0091 |
0.3747 | 94.96 | 570 | 0.3112 | 0.95 | 0.0969 | 0.6370 | 0.9500 | 0.9545 | 0.0567 | 0.0092 |
0.3747 | 95.96 | 576 | 0.3118 | 0.95 | 0.0971 | 0.6364 | 0.9500 | 0.9545 | 0.0568 | 0.0091 |
0.3747 | 96.96 | 582 | 0.3120 | 0.95 | 0.0969 | 0.6377 | 0.9500 | 0.9545 | 0.0564 | 0.0092 |
0.3747 | 97.96 | 588 | 0.3121 | 0.95 | 0.0966 | 0.6379 | 0.9500 | 0.9545 | 0.0560 | 0.0092 |
0.3747 | 98.96 | 594 | 0.3121 | 0.95 | 0.0965 | 0.6374 | 0.9500 | 0.9545 | 0.0560 | 0.0092 |
0.3747 | 99.96 | 600 | 0.3120 | 0.95 | 0.0965 | 0.6372 | 0.9500 | 0.9545 | 0.0560 | 0.0092 |
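To compare checkpoints, the table can be parsed programmatically. The sketch below is one way to do that, using three rows copied verbatim from the table (steps 138, 252, and 600). The helper names are hypothetical. "No log" is treated as a missing value, since it appears to mark evaluations that ran before any training loss had been logged.

```python
HEADER = ("Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | "
          "Nll | F1 Micro | F1 Macro | Ece | Aurc |")

# Three rows copied from the table above; paste the full table here to analyse every epoch.
ROWS = """\
No log | 22.96 | 138 | 0.2538 | 0.925 | 0.1169 | 1.0895 | 0.925 | 0.9187 | 0.0863 | 0.0111 |
No log | 41.96 | 252 | 0.2724 | 0.955 | 0.0939 | 0.8369 | 0.955 | 0.9573 | 0.0602 | 0.0104 |
0.3747 | 99.96 | 600 | 0.3120 | 0.95 | 0.0965 | 0.6372 | 0.9500 | 0.9545 | 0.0560 | 0.0092 |
"""

def split_cells(line):
    """Split one pipe-delimited line into stripped cells, ignoring the trailing pipe."""
    return [cell.strip() for cell in line.strip().rstrip("|").split("|")]

def parse_rows(header, rows):
    """Return one dict per row, mapping column name to a float (or None for 'No log')."""
    names = split_cells(header)
    records = []
    for line in rows.splitlines():
        if not line.strip():
            continue
        record = {}
        for name, cell in zip(names, split_cells(line)):
            record[name] = None if cell == "No log" else float(cell)
        records.append(record)
    return records

records = parse_rows(HEADER, ROWS)
best_loss = min(records, key=lambda r: r["Validation Loss"])
best_acc = max(records, key=lambda r: r["Accuracy"])
print("lowest validation loss at step", int(best_loss["Step"]))
print("highest accuracy at step", int(best_acc["Step"]))
```

In the full table, the lowest validation loss (0.2538) occurs at step 138 and the highest accuracy (0.955) at steps 252 and 270, well before the final checkpoint whose metrics are reported at the top of this card.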
Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2