scenario-KD-SCR-MSV-CL-D2_data-cl-massive_all_1_166
This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set, measured at the final logged checkpoint (step 265000, epoch 29.4592):
- Loss: 153.5196
- Accuracy: 0.2013
- F1: 0.2077
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 32
- seed: 66
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
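The relationship between the batch-size entries above can be checked with a short sketch. The dictionary keys follow the parameter names of `transformers.TrainingArguments`, which is an assumption about how the run was configured; the card itself only lists the values. The helper `effective_batch_size` and the single-device default are hypothetical and only illustrate the arithmetic.

```python
# Values copied from the hyperparameter list above. The key names mirror
# transformers.TrainingArguments; mapping train_batch_size to
# per_device_train_batch_size is an assumption, not something the card states.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 32,
    "seed": 66,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 30,
}


def effective_batch_size(kwargs, num_devices=1):
    """Number of examples consumed per optimizer update (hypothetical helper)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


# Assuming a single device, this reproduces total_train_batch_size.
total = effective_batch_size(training_kwargs)
print(total)
```

Under the single-device assumption, 8 examples per step accumulated over 4 steps gives the reported total_train_batch_size of 32.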
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
216.5718 | 0.5558 | 5000 | 278.3075 | 0.0831 | 0.0143 |
96.6856 | 1.1117 | 10000 | 166.4707 | 0.1225 | 0.0520 |
76.1691 | 1.6675 | 15000 | 148.1709 | 0.1620 | 0.1031 |
68.2425 | 2.2233 | 20000 | 143.5187 | 0.1716 | 0.1399 |
64.4004 | 2.7792 | 25000 | 142.7946 | 0.1949 | 0.1573 |
60.4123 | 3.3350 | 30000 | 143.0922 | 0.2043 | 0.1841 |
58.3257 | 3.8908 | 35000 | 139.9617 | 0.1968 | 0.1831 |
55.7519 | 4.4467 | 40000 | 142.7401 | 0.1894 | 0.1785 |
54.1464 | 5.0025 | 45000 | 142.8607 | 0.2036 | 0.1990 |
52.1675 | 5.5583 | 50000 | 144.5067 | 0.2011 | 0.2008 |
50.2095 | 6.1142 | 55000 | 143.3653 | 0.2015 | 0.2047 |
49.9481 | 6.6700 | 60000 | 146.0492 | 0.2070 | 0.2100 |
48.5061 | 7.2258 | 65000 | 145.7824 | 0.2060 | 0.2071 |
47.8933 | 7.7817 | 70000 | 144.2163 | 0.2035 | 0.2029 |
46.6586 | 8.3375 | 75000 | 145.5161 | 0.2011 | 0.1895 |
46.4338 | 8.8933 | 80000 | 145.8758 | 0.2069 | 0.2087 |
45.3242 | 9.4492 | 85000 | 148.9746 | 0.2062 | 0.2001 |
44.8473 | 10.0050 | 90000 | 146.6478 | 0.2048 | 0.2094 |
44.0707 | 10.5608 | 95000 | 147.1517 | 0.2069 | 0.2087 |
43.2148 | 11.1167 | 100000 | 147.0233 | 0.2000 | 0.2018 |
43.2003 | 11.6725 | 105000 | 150.5463 | 0.1995 | 0.2018 |
42.3747 | 12.2283 | 110000 | 149.6212 | 0.2031 | 0.2138 |
42.4209 | 12.7842 | 115000 | 150.6038 | 0.2070 | 0.2057 |
41.8157 | 13.3400 | 120000 | 149.7630 | 0.2098 | 0.2109 |
41.4871 | 13.8958 | 125000 | 148.6580 | 0.1977 | 0.2088 |
41.1541 | 14.4517 | 130000 | 149.6882 | 0.2003 | 0.2043 |
40.8411 | 15.0075 | 135000 | 150.0774 | 0.1982 | 0.2045 |
40.4155 | 15.5633 | 140000 | 149.2371 | 0.1960 | 0.1992 |
39.9237 | 16.1192 | 145000 | 154.0368 | 0.2009 | 0.2069 |
39.9569 | 16.6750 | 150000 | 150.4786 | 0.1917 | 0.2061 |
39.2982 | 17.2308 | 155000 | 150.7963 | 0.2006 | 0.2038 |
39.3455 | 17.7867 | 160000 | 149.2980 | 0.1968 | 0.2061 |
38.8963 | 18.3425 | 165000 | 152.4315 | 0.2014 | 0.2079 |
38.9814 | 18.8983 | 170000 | 149.7322 | 0.2008 | 0.2035 |
38.4953 | 19.4542 | 175000 | 150.7164 | 0.2019 | 0.1999 |
38.4956 | 20.0100 | 180000 | 151.1443 | 0.2032 | 0.2064 |
38.1569 | 20.5658 | 185000 | 151.4858 | 0.2004 | 0.2041 |
37.8998 | 21.1217 | 190000 | 150.7066 | 0.2019 | 0.2058 |
37.9294 | 21.6775 | 195000 | 153.0376 | 0.2020 | 0.2056 |
37.6479 | 22.2333 | 200000 | 153.4656 | 0.2067 | 0.2097 |
37.6083 | 22.7892 | 205000 | 153.5261 | 0.1996 | 0.2062 |
37.2665 | 23.3450 | 210000 | 152.2416 | 0.2014 | 0.2077 |
37.3541 | 23.9008 | 215000 | 152.2836 | 0.1960 | 0.2081 |
36.9886 | 24.4567 | 220000 | 152.6591 | 0.1998 | 0.2045 |
36.9563 | 25.0125 | 225000 | 152.0537 | 0.2015 | 0.2083 |
36.8296 | 25.5683 | 230000 | 153.0621 | 0.1974 | 0.2032 |
36.5423 | 26.1242 | 235000 | 153.1660 | 0.2043 | 0.2088 |
36.6304 | 26.6800 | 240000 | 152.7207 | 0.1997 | 0.2062 |
36.4970 | 27.2358 | 245000 | 152.6269 | 0.2006 | 0.2056 |
36.6385 | 27.7917 | 250000 | 152.8843 | 0.2002 | 0.2072 |
36.3263 | 28.3475 | 255000 | 153.2631 | 0.2029 | 0.2085 |
36.3559 | 28.9033 | 260000 | 152.8185 | 0.2019 | 0.2097 |
36.1499 | 29.4592 | 265000 | 153.5196 | 0.2013 | 0.2077 |
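One way to read the table is to compare checkpoints by each metric rather than relying on the final row alone. The sketch below uses a handful of rows copied from the table above (not the full log) and picks the row with the lowest validation loss and the rows with the highest accuracy and F1. The `EvalRow` type and variable names are hypothetical, and whether these intermediate checkpoints were saved or published is not stated in the card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    """One evaluation row from the training results table (hypothetical type)."""
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# A subset of rows copied verbatim from the table above.
rows = [
    EvalRow(0.5558, 5000, 278.3075, 0.0831, 0.0143),
    EvalRow(3.8908, 35000, 139.9617, 0.1968, 0.1831),
    EvalRow(12.2283, 110000, 149.6212, 0.2031, 0.2138),
    EvalRow(13.3400, 120000, 149.7630, 0.2098, 0.2109),
    EvalRow(29.4592, 265000, 153.5196, 0.2013, 0.2077),
]

# Select checkpoints by different criteria.
best_loss = min(rows, key=lambda r: r.val_loss)
best_acc = max(rows, key=lambda r: r.accuracy)
best_f1 = max(rows, key=lambda r: r.f1)

# Rough estimate of optimizer steps per epoch from the final row.
last = rows[-1]
steps_per_epoch = last.step / last.epoch

print("lowest validation loss at step", best_loss.step)
print("highest accuracy at step", best_acc.step)
print("highest F1 at step", best_f1.step)
print("approx. steps per epoch:", round(steps_per_epoch))
```

In this subset the validation loss is lowest early in training while training loss keeps falling afterwards, so the final checkpoint is not necessarily the best one by every metric.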
Framework versions
- Transformers 4.44.2
- Pytorch 2.1.1+cu121
- Datasets 2.14.5
- Tokenizers 0.19.1
Model tree for haryoaw/scenario-KD-SCR-MSV-CL-D2_data-cl-massive_all_1_166
Base model: microsoft/mdeberta-v3-base
Finetuned: haryoaw/scenario-MDBT-TCR-MSV-CL