# resnet101_rvl-cdip-cnn_rvl_cdip-NK1000_kd_NKD_t1.0_g1.5_rand
This model is a fine-tuned version of bdpc/resnet101_rvl-cdip. It achieves the following results on the evaluation set:
- Loss: 3.0688
- Accuracy: 0.768
- Brier Loss: 0.3296
- NLL: 1.9490
- F1 Micro: 0.768
- F1 Macro: 0.7688
- ECE: 0.0595
- AURC: 0.0737
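The calibration metrics above can be recomputed from the model's softmax outputs. A minimal sketch of the Brier loss and a 15-bin expected calibration error (ECE), using synthetic probabilities and labels for illustration:

```python
import numpy as np

def brier_loss(probs, labels):
    """Mean squared error between predicted probabilities and one-hot labels."""
    one_hot = np.eye(probs.shape[1])[labels]
    return np.mean(np.sum((probs - one_hot) ** 2, axis=1))

def expected_calibration_error(probs, labels, n_bins=15):
    """Weighted average gap between confidence and accuracy over equal-width bins."""
    confidences = probs.max(axis=1)
    accuracies = (probs.argmax(axis=1) == labels).astype(float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            # bin weight = fraction of samples falling in this bin
            ece += mask.mean() * abs(accuracies[mask].mean() - confidences[mask].mean())
    return ece

# Toy example: 3 samples over 4 classes
probs = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.2, 0.5, 0.2, 0.1],
                  [0.1, 0.1, 0.1, 0.7]])
labels = np.array([0, 1, 2])
print(round(brier_loss(probs, labels), 4))                 # → 0.5933
print(round(expected_calibration_error(probs, labels), 4))  # → 0.3
```

The exact binning scheme used during training is not stated in the card, so the 15-bin equal-width variant here is an assumption; results may differ slightly from the reported ECE.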
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50
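The optimizer and learning-rate schedule above can be reproduced in plain PyTorch. A minimal sketch, assuming a placeholder model and 250 optimization steps per epoch (matching the Step column in the results table); it mirrors what `transformers.get_linear_schedule_with_warmup` does:

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

# Placeholder module; in practice this would be the ResNet-101 being fine-tuned
model = torch.nn.Linear(10, 16)

num_epochs = 50
steps_per_epoch = 250                    # from the Step column in the results table
total_steps = num_epochs * steps_per_epoch
warmup_steps = int(0.1 * total_steps)    # lr_scheduler_warmup_ratio: 0.1

optimizer = Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999), eps=1e-8)

def linear_warmup_then_decay(step):
    # Linear warmup from 0 to the peak LR, then linear decay back to 0
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = LambdaLR(optimizer, lr_lambda=linear_warmup_then_decay)
```

After each `optimizer.step()`, calling `scheduler.step()` advances the schedule; the multiplier peaks at 1.0 after 1,250 warmup steps and reaches 0 at step 12,500.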
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
No log | 1.0 | 250 | 5.5798 | 0.1893 | 0.8757 | 3.8790 | 0.1893 | 0.1519 | 0.0656 | 0.6360 |
5.6792 | 2.0 | 500 | 5.5256 | 0.2642 | 0.8415 | 4.3054 | 0.2642 | 0.2235 | 0.0705 | 0.5478 |
5.6792 | 3.0 | 750 | 5.3875 | 0.3157 | 0.8213 | 3.4839 | 0.3157 | 0.2920 | 0.0835 | 0.5208 |
4.8939 | 4.0 | 1000 | 4.4870 | 0.473 | 0.6754 | 2.6509 | 0.473 | 0.4671 | 0.0728 | 0.3199 |
4.8939 | 5.0 | 1250 | 4.4320 | 0.5015 | 0.6345 | 2.7717 | 0.5015 | 0.4779 | 0.0577 | 0.2773 |
4.3006 | 6.0 | 1500 | 4.2399 | 0.5435 | 0.5938 | 2.6489 | 0.5435 | 0.5334 | 0.0527 | 0.2353 |
4.3006 | 7.0 | 1750 | 3.8189 | 0.6155 | 0.5111 | 2.4587 | 0.6155 | 0.6096 | 0.0725 | 0.1646 |
3.6984 | 8.0 | 2000 | 3.5238 | 0.6795 | 0.4476 | 2.2085 | 0.6795 | 0.6810 | 0.0712 | 0.1258 |
3.6984 | 9.0 | 2250 | 3.4654 | 0.6877 | 0.4287 | 2.2234 | 0.6877 | 0.6852 | 0.0511 | 0.1198 |
3.32 | 10.0 | 2500 | 3.4769 | 0.692 | 0.4253 | 2.2586 | 0.692 | 0.6868 | 0.0456 | 0.1164 |
3.32 | 11.0 | 2750 | 3.3473 | 0.7235 | 0.3967 | 2.1360 | 0.7235 | 0.7245 | 0.0518 | 0.1063 |
3.0488 | 12.0 | 3000 | 3.3891 | 0.712 | 0.3929 | 2.1461 | 0.712 | 0.7100 | 0.0491 | 0.1043 |
3.0488 | 13.0 | 3250 | 3.3123 | 0.7208 | 0.3846 | 2.1236 | 0.7208 | 0.7171 | 0.0444 | 0.0958 |
2.8727 | 14.0 | 3500 | 3.3357 | 0.7147 | 0.3877 | 2.0928 | 0.7147 | 0.7170 | 0.0489 | 0.1000 |
2.8727 | 15.0 | 3750 | 3.2576 | 0.7318 | 0.3703 | 2.1052 | 0.7318 | 0.7358 | 0.0473 | 0.0901 |
2.7552 | 16.0 | 4000 | 3.2528 | 0.738 | 0.3650 | 2.0968 | 0.738 | 0.7396 | 0.0477 | 0.0877 |
2.7552 | 17.0 | 4250 | 3.4241 | 0.7093 | 0.4120 | 2.1652 | 0.7093 | 0.7078 | 0.0540 | 0.1114 |
2.6718 | 18.0 | 4500 | 3.2877 | 0.7358 | 0.3686 | 2.0587 | 0.7358 | 0.7362 | 0.0517 | 0.0899 |
2.6718 | 19.0 | 4750 | 3.2690 | 0.7375 | 0.3650 | 2.0509 | 0.7375 | 0.7409 | 0.0527 | 0.0891 |
2.6158 | 20.0 | 5000 | 3.2650 | 0.7398 | 0.3629 | 2.0803 | 0.7398 | 0.7432 | 0.0491 | 0.0889 |
2.6158 | 21.0 | 5250 | 3.2507 | 0.7452 | 0.3599 | 2.0773 | 0.7452 | 0.7467 | 0.0577 | 0.0859 |
2.5766 | 22.0 | 5500 | 3.1737 | 0.7525 | 0.3481 | 2.0408 | 0.7525 | 0.7539 | 0.0557 | 0.0815 |
2.5766 | 23.0 | 5750 | 3.1625 | 0.7585 | 0.3478 | 2.0313 | 0.7585 | 0.7599 | 0.0587 | 0.0813 |
2.5388 | 24.0 | 6000 | 3.2357 | 0.746 | 0.3563 | 2.0384 | 0.746 | 0.7445 | 0.0592 | 0.0838 |
2.5388 | 25.0 | 6250 | 3.1653 | 0.761 | 0.3441 | 2.0389 | 0.761 | 0.7605 | 0.0423 | 0.0783 |
2.5129 | 26.0 | 6500 | 3.1662 | 0.7532 | 0.3468 | 2.0033 | 0.7532 | 0.7556 | 0.0594 | 0.0805 |
2.5129 | 27.0 | 6750 | 3.1224 | 0.7632 | 0.3384 | 1.9745 | 0.7632 | 0.7624 | 0.0537 | 0.0773 |
2.4881 | 28.0 | 7000 | 3.2460 | 0.7458 | 0.3618 | 2.0745 | 0.7458 | 0.7479 | 0.0521 | 0.0868 |
2.4881 | 29.0 | 7250 | 3.1299 | 0.7605 | 0.3414 | 1.9781 | 0.7605 | 0.7626 | 0.0600 | 0.0774 |
2.469 | 30.0 | 7500 | 3.1695 | 0.7555 | 0.3481 | 2.0246 | 0.7555 | 0.7563 | 0.0534 | 0.0811 |
2.469 | 31.0 | 7750 | 3.1766 | 0.7612 | 0.3474 | 1.9997 | 0.7612 | 0.7633 | 0.0541 | 0.0803 |
2.4524 | 32.0 | 8000 | 3.0937 | 0.7638 | 0.3351 | 1.9420 | 0.7638 | 0.7649 | 0.0592 | 0.0754 |
2.4524 | 33.0 | 8250 | 3.1293 | 0.7625 | 0.3409 | 1.9671 | 0.7625 | 0.7633 | 0.0580 | 0.0781 |
2.4382 | 34.0 | 8500 | 3.1129 | 0.7668 | 0.3370 | 2.0003 | 0.7668 | 0.7672 | 0.0623 | 0.0746 |
2.4382 | 35.0 | 8750 | 3.0795 | 0.767 | 0.3324 | 1.9772 | 0.767 | 0.7677 | 0.0536 | 0.0759 |
2.4259 | 36.0 | 9000 | 3.0927 | 0.7675 | 0.3332 | 1.9557 | 0.7675 | 0.7690 | 0.0588 | 0.0744 |
2.4259 | 37.0 | 9250 | 3.0856 | 0.7702 | 0.3327 | 1.9465 | 0.7702 | 0.7715 | 0.0554 | 0.0756 |
2.4107 | 38.0 | 9500 | 3.0915 | 0.7678 | 0.3319 | 1.9699 | 0.7678 | 0.7681 | 0.0556 | 0.0749 |
2.4107 | 39.0 | 9750 | 3.0885 | 0.763 | 0.3338 | 1.9478 | 0.763 | 0.7643 | 0.0575 | 0.0750 |
2.4002 | 40.0 | 10000 | 3.0921 | 0.771 | 0.3315 | 1.9563 | 0.771 | 0.7729 | 0.0557 | 0.0744 |
2.4002 | 41.0 | 10250 | 3.0727 | 0.767 | 0.3310 | 1.9530 | 0.767 | 0.7682 | 0.0567 | 0.0748 |
2.3905 | 42.0 | 10500 | 3.0793 | 0.7645 | 0.3320 | 1.9484 | 0.7645 | 0.7657 | 0.0598 | 0.0755 |
2.3905 | 43.0 | 10750 | 3.0771 | 0.7672 | 0.3308 | 1.9548 | 0.7672 | 0.7679 | 0.0566 | 0.0737 |
2.3798 | 44.0 | 11000 | 3.0794 | 0.7685 | 0.3309 | 1.9620 | 0.7685 | 0.7693 | 0.0631 | 0.0736 |
2.3798 | 45.0 | 11250 | 3.0736 | 0.7665 | 0.3320 | 1.9408 | 0.7665 | 0.7677 | 0.0589 | 0.0749 |
2.3731 | 46.0 | 11500 | 3.0746 | 0.7682 | 0.3312 | 1.9635 | 0.7682 | 0.7693 | 0.0576 | 0.0743 |
2.3731 | 47.0 | 11750 | 3.0711 | 0.768 | 0.3306 | 1.9576 | 0.768 | 0.7689 | 0.0572 | 0.0739 |
2.3671 | 48.0 | 12000 | 3.0785 | 0.7682 | 0.3317 | 1.9516 | 0.7682 | 0.7697 | 0.0574 | 0.0744 |
2.3671 | 49.0 | 12250 | 3.0678 | 0.7692 | 0.3298 | 1.9388 | 0.7692 | 0.7700 | 0.0606 | 0.0738 |
2.3628 | 50.0 | 12500 | 3.0688 | 0.768 | 0.3296 | 1.9490 | 0.768 | 0.7688 | 0.0595 | 0.0737 |
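AURC (area under the risk-coverage curve), reported in the last column, is less standard than the other metrics. One common definition, sketched here with max-softmax confidence as the selection score (the card does not state which variant was used):

```python
import numpy as np

def aurc(probs, labels):
    """Average selective error rate when samples are admitted
    in order of decreasing confidence (risk-coverage curve area)."""
    confidences = probs.max(axis=1)
    errors = (probs.argmax(axis=1) != labels).astype(float)
    order = np.argsort(-confidences)           # most confident first
    cumulative_errors = np.cumsum(errors[order])
    coverages = np.arange(1, len(labels) + 1)
    risks = cumulative_errors / coverages       # error rate at each coverage level
    return risks.mean()

# Toy example: the mid-confidence sample is the only misclassification
probs = np.array([[0.9, 0.1],
                  [0.6, 0.4],
                  [0.2, 0.8]])
labels = np.array([0, 1, 1])
print(round(aurc(probs, labels), 4))  # → 0.1111
```

Lower is better: a well-ranked model concentrates its errors among its least confident predictions, shrinking the area under the curve.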
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2