scenario-KD-SCR-MSV-D2_data-AmazonScience_massive_all_1_166sss
This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-AmazonScience_massive_all_1_1 on the AmazonScience/massive dataset. It achieves the following results on the evaluation set, which match the final logged evaluation at step 140000:
- Loss: 82.0859
- Accuracy: 0.5962
- F1: 0.3996
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 66
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
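The relationship between these values can be checked with a little arithmetic. The sketch below is illustrative only and is not the training script. It assumes a single device (which is what 32 x 4 = 128 implies), assumes no warmup because none is listed, and estimates the total step count from the first logged (step, epoch) pair in the results table. The function names are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
NUM_EPOCHS = 30

# Assumption: one device, which is what 32 * 4 = 128 implies.
NUM_DEVICES = 1


def effective_batch_size(per_device, accum_steps, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return per_device * accum_steps * num_devices


def estimate_total_steps(logged_step=5000, logged_epoch=1.0689, num_epochs=NUM_EPOCHS):
    """Estimate the run length from one logged (step, epoch) pair."""
    steps_per_epoch = logged_step / logged_epoch
    return round(steps_per_epoch * num_epochs)


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear warmup, then linear decay to zero. warmup_steps=0 is an assumption."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = estimate_total_steps()
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("estimated total steps:", total)
    for step in (0, 70000, 140000):
        print(step, linear_lr(step, total))
```

Under these assumptions the learning rate falls roughly linearly from 1e-05 toward zero over the run, so the last logged checkpoints are trained with a very small step size.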
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
126.4935 | 1.0689 | 5000 | 124.3451 | 0.0713 | 0.0044 |
110.1602 | 2.1378 | 10000 | 109.3465 | 0.0897 | 0.0078 |
103.4404 | 3.2067 | 15000 | 103.0728 | 0.1122 | 0.0105 |
99.5042 | 4.2756 | 20000 | 99.4316 | 0.1293 | 0.0130 |
96.8284 | 5.3444 | 25000 | 96.8304 | 0.1399 | 0.0144 |
94.6228 | 6.4133 | 30000 | 94.7923 | 0.1575 | 0.0202 |
92.6483 | 7.4822 | 35000 | 93.0098 | 0.2018 | 0.0394 |
91.1381 | 8.5511 | 40000 | 91.5109 | 0.2423 | 0.0563 |
89.6732 | 9.6200 | 45000 | 90.2175 | 0.2790 | 0.0798 |
88.4284 | 10.6889 | 50000 | 89.0928 | 0.3220 | 0.1084 |
87.3783 | 11.7578 | 55000 | 88.1065 | 0.3688 | 0.1482 |
86.3859 | 12.8267 | 60000 | 87.2331 | 0.4048 | 0.1815 |
85.6002 | 13.8956 | 65000 | 86.4654 | 0.4396 | 0.2121 |
84.8992 | 14.9645 | 70000 | 85.7826 | 0.4661 | 0.2358 |
84.1481 | 16.0333 | 75000 | 85.2149 | 0.4854 | 0.2554 |
83.2927 | 17.1022 | 80000 | 84.6774 | 0.5058 | 0.2754 |
82.8744 | 18.1711 | 85000 | 84.2360 | 0.5193 | 0.2921 |
82.3801 | 19.2400 | 90000 | 83.8295 | 0.5338 | 0.3106 |
81.9336 | 20.3089 | 95000 | 83.5018 | 0.5468 | 0.3298 |
81.7104 | 21.3778 | 100000 | 83.1787 | 0.5572 | 0.3438 |
81.45 | 22.4467 | 105000 | 82.9108 | 0.5686 | 0.3583 |
81.1051 | 23.5156 | 110000 | 82.7071 | 0.5758 | 0.3694 |
80.9291 | 24.5845 | 115000 | 82.5157 | 0.5813 | 0.3774 |
80.6694 | 25.6534 | 120000 | 82.3683 | 0.5856 | 0.3867 |
80.6166 | 26.7222 | 125000 | 82.2656 | 0.5901 | 0.3895 |
80.556 | 27.7911 | 130000 | 82.1561 | 0.5937 | 0.3966 |
80.4723 | 28.8600 | 135000 | 82.1248 | 0.5952 | 0.3979 |
80.5117 | 29.9289 | 140000 | 82.0859 | 0.5962 | 0.3996 |
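To work with this log programmatically, the pipe-delimited rows can be parsed into records. The following is a minimal sketch, with a few rows copied from the table above. The field names and helper functions are hypothetical and not part of any library.

```python
# Four rows copied verbatim from the results table (first, middle, last two).
ROWS = """\
126.4935 | 1.0689 | 5000 | 124.3451 | 0.0713 | 0.0044 |
84.8992 | 14.9645 | 70000 | 85.7826 | 0.4661 | 0.2358 |
80.4723 | 28.8600 | 135000 | 82.1248 | 0.5952 | 0.3979 |
80.5117 | 29.9289 | 140000 | 82.0859 | 0.5962 | 0.3996 |
"""

# Hypothetical field names, in the table's column order.
FIELDS = ("train_loss", "epoch", "step", "val_loss", "accuracy", "f1")


def parse_rows(text):
    """Turn pipe-delimited table rows into a list of dicts."""
    records = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        record = dict(zip(FIELDS, (float(c) for c in cells)))
        record["step"] = int(record["step"])
        records.append(record)
    return records


def deltas(records, key):
    """Change in a metric between consecutive logged evaluations."""
    return [round(b[key] - a[key], 4) for a, b in zip(records, records[1:])]


if __name__ == "__main__":
    records = parse_rows(ROWS)
    best = min(records, key=lambda r: r["val_loss"])
    print("lowest validation loss at step", best["step"])
    print("accuracy change over the last interval:", deltas(records[-2:], "accuracy"))
```

Applied to the full table, the same helpers show that validation loss is lowest at the final logged step and that the per-evaluation gains in accuracy and F1 shrink considerably toward the end of the run.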
Framework versions
- Transformers 4.44.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.19.1
Model tree for haryoaw/scenario-KD-SCR-MSV-D2_data-AmazonScience_massive_all_1_166sss
Base model
microsoft/mdeberta-v3-base
Finetuned
haryoaw/scenario-MDBT-TCR-MSV