scenario-KD-SCR-MSV-CL-D2_data-cl-massive_all_1_166

This model is a fine-tuned version of haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1 on the massive dataset. It achieves the following results on the evaluation set (the values match the final logged evaluation at step 265000):

Loss: 153.5196
Accuracy: 0.2013
F1: 0.2077

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 32
seed: 66
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
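
The batch-size and scheduler settings above can be sanity-checked with a few lines of Python. This is a minimal sketch, not the author's training script: the function names `effective_batch_size` and `linear_lr` are hypothetical, the single-device default is an assumption (it is consistent with 8 x 4 = 32 but is not stated in the card), and the linear schedule assumes zero warmup steps because none are listed. The total number of training steps is left as a parameter since the card does not state it.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples consumed per optimizer update.

    num_devices=1 is an assumption; the card does not list a device count.
    """
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05) -> float:
    """Learning rate under linear decay from base_lr to 0.

    Assumes no warmup phase (an assumption; the card lists none).
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)


# train_batch_size=8 and gradient_accumulation_steps=4, as listed above
print(effective_batch_size(8, 4))

# Learning rate halfway through a hypothetical 100-step run
print(linear_lr(50, 100))
```

With the listed values, `effective_batch_size(8, 4)` reproduces the reported total_train_batch_size of 32.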

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
216.5718	0.5558	5000	278.3075	0.0831	0.0143
96.6856	1.1117	10000	166.4707	0.1225	0.0520
76.1691	1.6675	15000	148.1709	0.1620	0.1031
68.2425	2.2233	20000	143.5187	0.1716	0.1399
64.4004	2.7792	25000	142.7946	0.1949	0.1573
60.4123	3.3350	30000	143.0922	0.2043	0.1841
58.3257	3.8908	35000	139.9617	0.1968	0.1831
55.7519	4.4467	40000	142.7401	0.1894	0.1785
54.1464	5.0025	45000	142.8607	0.2036	0.1990
52.1675	5.5583	50000	144.5067	0.2011	0.2008
50.2095	6.1142	55000	143.3653	0.2015	0.2047
49.9481	6.6700	60000	146.0492	0.2070	0.2100
48.5061	7.2258	65000	145.7824	0.2060	0.2071
47.8933	7.7817	70000	144.2163	0.2035	0.2029
46.6586	8.3375	75000	145.5161	0.2011	0.1895
46.4338	8.8933	80000	145.8758	0.2069	0.2087
45.3242	9.4492	85000	148.9746	0.2062	0.2001
44.8473	10.0050	90000	146.6478	0.2048	0.2094
44.0707	10.5608	95000	147.1517	0.2069	0.2087
43.2148	11.1167	100000	147.0233	0.2000	0.2018
43.2003	11.6725	105000	150.5463	0.1995	0.2018
42.3747	12.2283	110000	149.6212	0.2031	0.2138
42.4209	12.7842	115000	150.6038	0.2070	0.2057
41.8157	13.3400	120000	149.7630	0.2098	0.2109
41.4871	13.8958	125000	148.6580	0.1977	0.2088
41.1541	14.4517	130000	149.6882	0.2003	0.2043
40.8411	15.0075	135000	150.0774	0.1982	0.2045
40.4155	15.5633	140000	149.2371	0.1960	0.1992
39.9237	16.1192	145000	154.0368	0.2009	0.2069
39.9569	16.6750	150000	150.4786	0.1917	0.2061
39.2982	17.2308	155000	150.7963	0.2006	0.2038
39.3455	17.7867	160000	149.2980	0.1968	0.2061
38.8963	18.3425	165000	152.4315	0.2014	0.2079
38.9814	18.8983	170000	149.7322	0.2008	0.2035
38.4953	19.4542	175000	150.7164	0.2019	0.1999
38.4956	20.0100	180000	151.1443	0.2032	0.2064
38.1569	20.5658	185000	151.4858	0.2004	0.2041
37.8998	21.1217	190000	150.7066	0.2019	0.2058
37.9294	21.6775	195000	153.0376	0.2020	0.2056
37.6479	22.2333	200000	153.4656	0.2067	0.2097
37.6083	22.7892	205000	153.5261	0.1996	0.2062
37.2665	23.3450	210000	152.2416	0.2014	0.2077
37.3541	23.9008	215000	152.2836	0.1960	0.2081
36.9886	24.4567	220000	152.6591	0.1998	0.2045
36.9563	25.0125	225000	152.0537	0.2015	0.2083
36.8296	25.5683	230000	153.0621	0.1974	0.2032
36.5423	26.1242	235000	153.1660	0.2043	0.2088
36.6304	26.6800	240000	152.7207	0.1997	0.2062
36.4970	27.2358	245000	152.6269	0.2006	0.2056
36.6385	27.7917	250000	152.8843	0.2002	0.2072
36.3263	28.3475	255000	153.2631	0.2029	0.2085
36.3559	28.9033	260000	152.8185	0.2019	0.2097
36.1499	29.4592	265000	153.5196	0.2013	0.2077
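
In the table above, training loss falls throughout while validation loss reaches its minimum early and then rises, so which checkpoint counts as "best" depends on the metric. The following sketch shows one way to compare checkpoints by metric. It uses only a handful of rows copied from the table, and the `Row` type and `best_by` helper are hypothetical names introduced for illustration, not part of any library.

```python
from typing import Callable, List, NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# A subset of rows copied from the training results table.
ROWS: List[Row] = [
    Row(216.5718, 0.5558, 5000, 278.3075, 0.0831, 0.0143),
    Row(58.3257, 3.8908, 35000, 139.9617, 0.1968, 0.1831),
    Row(42.3747, 12.2283, 110000, 149.6212, 0.2031, 0.2138),
    Row(41.8157, 13.3400, 120000, 149.7630, 0.2098, 0.2109),
    Row(36.1499, 29.4592, 265000, 153.5196, 0.2013, 0.2077),
]


def best_by(rows: List[Row], key: Callable[[Row], float], higher_is_better: bool = True) -> Row:
    """Return the row with the best value of the metric selected by `key`."""
    return max(rows, key=key) if higher_is_better else min(rows, key=key)


lowest_val = best_by(ROWS, lambda r: r.val_loss, higher_is_better=False)
best_f1 = best_by(ROWS, lambda r: r.f1)
best_acc = best_by(ROWS, lambda r: r.accuracy)

print("lowest validation loss at step", lowest_val.step)
print("highest F1 at step", best_f1.step)
print("highest accuracy at step", best_acc.step)
```

The three rows selected here (steps 35000, 110000, and 120000) are also the extremes of the full table: the lowest validation loss (139.9617), the highest F1 (0.2138), and the highest accuracy (0.2098). The reported final results come from step 265000, which is not the best row on any of the three metrics.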

Framework versions

Transformers 4.44.2
Pytorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1
