scenario-KD-SCR-MSV-CL-D2_data-cl-massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set, which match the final logged checkpoint (step 265000):

  • Loss: 153.5196
  • Accuracy: 0.2013
  • F1: 0.2077

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 66
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
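As a rough illustration of how these settings relate, the sketch below recomputes the effective batch size and a linearly decaying learning rate in plain Python. It assumes a single training device and no warmup, neither of which the card states, and the helper names `effective_batch_size` and `linear_lr` are hypothetical. The value of `total_steps` in the usage lines is a placeholder, not the actual length of the run.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Examples consumed per optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear schedule: optional linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 8 examples per step * 4 accumulation steps = 32, matching total_train_batch_size.
total = effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)

# Placeholder horizon for illustration only.
total_steps = 1000
lr_start = linear_lr(0, total_steps, LEARNING_RATE)
lr_middle = linear_lr(total_steps // 2, total_steps, LEARNING_RATE)
lr_end = linear_lr(total_steps, total_steps, LEARNING_RATE)
```

Under these assumptions the product 8 × 4 reproduces the reported total_train_batch_size of 32, and the learning rate falls from 5e-05 at the first step to zero at the last.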

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
| 216.5718 | 0.5558 | 5000 | 278.3075 | 0.0831 | 0.0143 |
| 96.6856 | 1.1117 | 10000 | 166.4707 | 0.1225 | 0.0520 |
| 76.1691 | 1.6675 | 15000 | 148.1709 | 0.1620 | 0.1031 |
| 68.2425 | 2.2233 | 20000 | 143.5187 | 0.1716 | 0.1399 |
| 64.4004 | 2.7792 | 25000 | 142.7946 | 0.1949 | 0.1573 |
| 60.4123 | 3.3350 | 30000 | 143.0922 | 0.2043 | 0.1841 |
| 58.3257 | 3.8908 | 35000 | 139.9617 | 0.1968 | 0.1831 |
| 55.7519 | 4.4467 | 40000 | 142.7401 | 0.1894 | 0.1785 |
| 54.1464 | 5.0025 | 45000 | 142.8607 | 0.2036 | 0.1990 |
| 52.1675 | 5.5583 | 50000 | 144.5067 | 0.2011 | 0.2008 |
| 50.2095 | 6.1142 | 55000 | 143.3653 | 0.2015 | 0.2047 |
| 49.9481 | 6.6700 | 60000 | 146.0492 | 0.2070 | 0.2100 |
| 48.5061 | 7.2258 | 65000 | 145.7824 | 0.2060 | 0.2071 |
| 47.8933 | 7.7817 | 70000 | 144.2163 | 0.2035 | 0.2029 |
| 46.6586 | 8.3375 | 75000 | 145.5161 | 0.2011 | 0.1895 |
| 46.4338 | 8.8933 | 80000 | 145.8758 | 0.2069 | 0.2087 |
| 45.3242 | 9.4492 | 85000 | 148.9746 | 0.2062 | 0.2001 |
| 44.8473 | 10.0050 | 90000 | 146.6478 | 0.2048 | 0.2094 |
| 44.0707 | 10.5608 | 95000 | 147.1517 | 0.2069 | 0.2087 |
| 43.2148 | 11.1167 | 100000 | 147.0233 | 0.2000 | 0.2018 |
| 43.2003 | 11.6725 | 105000 | 150.5463 | 0.1995 | 0.2018 |
| 42.3747 | 12.2283 | 110000 | 149.6212 | 0.2031 | 0.2138 |
| 42.4209 | 12.7842 | 115000 | 150.6038 | 0.2070 | 0.2057 |
| 41.8157 | 13.3400 | 120000 | 149.7630 | 0.2098 | 0.2109 |
| 41.4871 | 13.8958 | 125000 | 148.6580 | 0.1977 | 0.2088 |
| 41.1541 | 14.4517 | 130000 | 149.6882 | 0.2003 | 0.2043 |
| 40.8411 | 15.0075 | 135000 | 150.0774 | 0.1982 | 0.2045 |
| 40.4155 | 15.5633 | 140000 | 149.2371 | 0.1960 | 0.1992 |
| 39.9237 | 16.1192 | 145000 | 154.0368 | 0.2009 | 0.2069 |
| 39.9569 | 16.6750 | 150000 | 150.4786 | 0.1917 | 0.2061 |
| 39.2982 | 17.2308 | 155000 | 150.7963 | 0.2006 | 0.2038 |
| 39.3455 | 17.7867 | 160000 | 149.2980 | 0.1968 | 0.2061 |
| 38.8963 | 18.3425 | 165000 | 152.4315 | 0.2014 | 0.2079 |
| 38.9814 | 18.8983 | 170000 | 149.7322 | 0.2008 | 0.2035 |
| 38.4953 | 19.4542 | 175000 | 150.7164 | 0.2019 | 0.1999 |
| 38.4956 | 20.0100 | 180000 | 151.1443 | 0.2032 | 0.2064 |
| 38.1569 | 20.5658 | 185000 | 151.4858 | 0.2004 | 0.2041 |
| 37.8998 | 21.1217 | 190000 | 150.7066 | 0.2019 | 0.2058 |
| 37.9294 | 21.6775 | 195000 | 153.0376 | 0.2020 | 0.2056 |
| 37.6479 | 22.2333 | 200000 | 153.4656 | 0.2067 | 0.2097 |
| 37.6083 | 22.7892 | 205000 | 153.5261 | 0.1996 | 0.2062 |
| 37.2665 | 23.3450 | 210000 | 152.2416 | 0.2014 | 0.2077 |
| 37.3541 | 23.9008 | 215000 | 152.2836 | 0.1960 | 0.2081 |
| 36.9886 | 24.4567 | 220000 | 152.6591 | 0.1998 | 0.2045 |
| 36.9563 | 25.0125 | 225000 | 152.0537 | 0.2015 | 0.2083 |
| 36.8296 | 25.5683 | 230000 | 153.0621 | 0.1974 | 0.2032 |
| 36.5423 | 26.1242 | 235000 | 153.1660 | 0.2043 | 0.2088 |
| 36.6304 | 26.6800 | 240000 | 152.7207 | 0.1997 | 0.2062 |
| 36.4970 | 27.2358 | 245000 | 152.6269 | 0.2006 | 0.2056 |
| 36.6385 | 27.7917 | 250000 | 152.8843 | 0.2002 | 0.2072 |
| 36.3263 | 28.3475 | 255000 | 153.2631 | 0.2029 | 0.2085 |
| 36.3559 | 28.9033 | 260000 | 152.8185 | 0.2019 | 0.2097 |
| 36.1499 | 29.4592 | 265000 | 153.5196 | 0.2013 | 0.2077 |
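
The log above can be scanned for the best checkpoint under each metric. The following is a minimal sketch in plain Python that does this for a handful of rows copied from the table; the `EvalRow` type and the `best_by` helper are hypothetical names introduced for illustration and are not part of the original training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# A subset of rows copied verbatim from the training results table.
rows = [
    EvalRow(216.5718, 0.5558, 5000, 278.3075, 0.0831, 0.0143),
    EvalRow(58.3257, 3.8908, 35000, 139.9617, 0.1968, 0.1831),
    EvalRow(42.3747, 12.2283, 110000, 149.6212, 0.2031, 0.2138),
    EvalRow(41.8157, 13.3400, 120000, 149.7630, 0.2098, 0.2109),
    EvalRow(36.1499, 29.4592, 265000, 153.5196, 0.2013, 0.2077),
]


def best_by(rows, metric, higher_is_better=True):
    """Return the row with the best value of the named metric."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: getattr(row, metric))


lowest_val_loss = best_by(rows, "val_loss", higher_is_better=False)
highest_f1 = best_by(rows, "f1")
highest_accuracy = best_by(rows, "accuracy")

print(lowest_val_loss.step, highest_f1.step, highest_accuracy.step)
```

Across the full table, the lowest validation loss (139.9617) occurs at step 35000, the highest F1 (0.2138) at step 110000, and the highest accuracy (0.2098) at step 120000. Training loss keeps falling after step 35000 while validation loss drifts upward, which may indicate overfitting, so an earlier checkpoint could be worth comparing against the final one.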

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.19.1