scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_1_d

This model is a fine-tuned version of xlm-roberta-base on the AmazonScience/massive dataset. The final checkpoint (step 560000) achieves the following results on the evaluation set:

Loss: 1.9591
Accuracy: 0.8040
F1: 0.7736
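
For readers who want to see how metrics of this kind are typically derived from predictions, here is a minimal sketch in plain Python. The card does not say which F1 averaging was used, so macro averaging is an assumption here, and the toy labels are hypothetical rather than taken from the evaluation set.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = set(y_true) | set(y_pred)
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not taken from the evaluation set.
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

If the original evaluation used a different averaging mode (micro or weighted), the F1 definition above would need to change accordingly.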

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 123444
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
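
To illustrate what the linear lr_scheduler_type implies for the learning rate over the run, the following is a minimal sketch in plain Python. It assumes zero warmup steps (none are listed above) and decay to zero at the end of training; `linear_lr` and `TOTAL_STEPS` are hypothetical names, and the total step count is a round figure chosen for illustration because the card does not state it.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical round figure; the card does not state the total number of steps.
TOTAL_STEPS = 561_000

for step in (0, 140_250, 280_500, 561_000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 5e-05, reaches half of that at the midpoint of training, and is zero at the last step.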

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.5309	0.27	5000	1.4613	0.6092	0.4846
1.0864	0.53	10000	1.0805	0.7105	0.6215
0.9145	0.8	15000	0.9580	0.7478	0.6813
0.6257	1.07	20000	0.8805	0.7659	0.7137
0.6305	1.34	25000	0.8449	0.7769	0.7324
0.6189	1.6	30000	0.8042	0.7871	0.7416
0.5597	1.87	35000	0.7852	0.7953	0.7531
0.383	2.14	40000	0.8654	0.7921	0.7489
0.3878	2.41	45000	0.8381	0.7969	0.7537
0.3987	2.67	50000	0.8371	0.8006	0.7624
0.4138	2.94	55000	0.7665	0.8083	0.7689
0.2473	3.21	60000	0.9196	0.8012	0.7611
0.2661	3.47	65000	0.8854	0.8059	0.7757
0.2831	3.74	70000	0.8755	0.8079	0.7781
0.2444	4.01	75000	0.9395	0.8028	0.7704
0.1749	4.28	80000	1.0154	0.8029	0.7678
0.1884	4.54	85000	1.0205	0.8045	0.7708
0.1989	4.81	90000	1.0140	0.8083	0.7767
0.1129	5.08	95000	1.1392	0.8059	0.7783
0.1381	5.34	100000	1.1485	0.8070	0.7772
0.1515	5.61	105000	1.1149	0.7984	0.7681
0.1508	5.88	110000	1.0453	0.8080	0.7796
0.1039	6.15	115000	1.2342	0.8023	0.7736
0.1116	6.41	120000	1.2505	0.7973	0.7667
0.1245	6.68	125000	1.2419	0.8018	0.7777
0.1292	6.95	130000	1.1729	0.8041	0.7735
0.0799	7.22	135000	1.3354	0.8031	0.7748
0.1024	7.48	140000	1.3675	0.8018	0.7724
0.1056	7.75	145000	1.2992	0.8047	0.7761
0.0774	8.02	150000	1.3784	0.8002	0.7704
0.0714	8.28	155000	1.4011	0.8060	0.7776
0.0814	8.55	160000	1.4297	0.8008	0.7751
0.0917	8.82	165000	1.4209	0.7960	0.7689
0.0595	9.09	170000	1.4649	0.8055	0.7780
0.0671	9.35	175000	1.4996	0.8026	0.7794
0.0765	9.62	180000	1.4661	0.8025	0.7733
0.0829	9.89	185000	1.4422	0.8048	0.7759
0.0589	10.15	190000	1.5282	0.7994	0.7727
0.0613	10.42	195000	1.5492	0.8029	0.7747
0.0596	10.69	200000	1.5336	0.8015	0.7722
0.0652	10.96	205000	1.5061	0.8033	0.7748
0.0497	11.22	210000	1.5938	0.7994	0.7743
0.0528	11.49	215000	1.5913	0.7993	0.7713
0.0543	11.76	220000	1.5478	0.8022	0.7764
0.0415	12.03	225000	1.6072	0.7993	0.7716
0.0385	12.29	230000	1.6604	0.8021	0.7728
0.0514	12.56	235000	1.6436	0.8001	0.7705
0.051	12.83	240000	1.6705	0.7992	0.7707
0.0369	13.09	245000	1.6312	0.8032	0.7707
0.0444	13.36	250000	1.6923	0.8006	0.7703
0.039	13.63	255000	1.6763	0.8021	0.7722
0.0488	13.9	260000	1.6276	0.8028	0.7756
0.0307	14.16	265000	1.7497	0.7961	0.7658
0.039	14.43	270000	1.7891	0.7995	0.7699
0.0405	14.7	275000	1.7142	0.7971	0.7658
0.0499	14.96	280000	1.7090	0.7972	0.7668
0.0354	15.23	285000	1.7538	0.7977	0.7670
0.0316	15.5	290000	1.7797	0.7954	0.7598
0.0354	15.77	295000	1.7481	0.8002	0.7726
0.0237	16.03	300000	1.7646	0.8021	0.7704
0.0285	16.3	305000	1.8245	0.7955	0.7620
0.0311	16.57	310000	1.7419	0.8001	0.7715
0.0347	16.84	315000	1.7404	0.8005	0.7713
0.0198	17.1	320000	1.7568	0.8004	0.7689
0.028	17.37	325000	1.8381	0.7979	0.7685
0.0234	17.64	330000	1.8297	0.7989	0.7670
0.0256	17.9	335000	1.8539	0.7982	0.7677
0.0223	18.17	340000	1.7779	0.7995	0.7683
0.0267	18.44	345000	1.7948	0.7976	0.7669
0.0306	18.71	350000	1.7912	0.8000	0.7681
0.0236	18.97	355000	1.8350	0.7994	0.7685
0.0174	19.24	360000	1.8375	0.7985	0.7698
0.0218	19.51	365000	1.8370	0.8007	0.7723
0.0257	19.77	370000	1.8556	0.7995	0.7706
0.0157	20.04	375000	1.8932	0.7988	0.7668
0.0207	20.31	380000	1.8580	0.8011	0.7665
0.0223	20.58	385000	1.8414	0.8012	0.7698
0.0223	20.84	390000	1.8548	0.7992	0.7682
0.0154	21.11	395000	1.8825	0.7991	0.7695
0.02	21.38	400000	1.8822	0.7975	0.7643
0.0201	21.65	405000	1.8783	0.8004	0.7660
0.0199	21.91	410000	1.8705	0.7985	0.7678
0.0143	22.18	415000	1.9152	0.7972	0.7685
0.0137	22.45	420000	1.9581	0.7997	0.7686
0.0126	22.71	425000	1.8464	0.8002	0.7679
0.0161	22.98	430000	1.8938	0.8002	0.7708
0.0131	23.25	435000	1.8836	0.8004	0.7701
0.0158	23.52	440000	1.8609	0.8017	0.7702
0.0161	23.78	445000	1.9091	0.7995	0.7692
0.0129	24.05	450000	1.9171	0.8009	0.7696
0.0092	24.32	455000	1.9403	0.8002	0.7714
0.0112	24.58	460000	1.8858	0.8014	0.7709
0.0104	24.85	465000	1.9880	0.7999	0.7670
0.01	25.12	470000	1.9668	0.7992	0.7653
0.0088	25.39	475000	1.9612	0.8003	0.7659
0.0106	25.65	480000	1.9177	0.8018	0.7714
0.0107	25.92	485000	1.8818	0.8018	0.7732
0.0078	26.19	490000	1.9768	0.8006	0.7673
0.0123	26.46	495000	1.9383	0.8026	0.7731
0.0095	26.72	500000	1.9156	0.8024	0.7704
0.0088	26.99	505000	1.9398	0.8014	0.7710
0.01	27.26	510000	1.9727	0.8010	0.7692
0.0078	27.52	515000	1.9469	0.8027	0.7724
0.0073	27.79	520000	1.9359	0.8012	0.7718
0.0069	28.06	525000	1.9325	0.8027	0.7723
0.0081	28.33	530000	1.9528	0.8027	0.7735
0.0096	28.59	535000	1.9615	0.8024	0.7718
0.0085	28.86	540000	1.9500	0.8023	0.7702
0.0075	29.13	545000	1.9682	0.8027	0.7722
0.007	29.39	550000	1.9601	0.8034	0.7733
0.0075	29.66	555000	1.9614	0.8039	0.7736
0.007	29.93	560000	1.9591	0.8040	0.7736
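
The table can be scanned programmatically to compare checkpoints. The sketch below parses a handful of rows copied from the table and picks the checkpoint with the lowest validation loss and the one with the highest F1. The `Checkpoint` type and `parse_rows` helper are hypothetical names introduced for illustration, and only an excerpt of the rows is embedded to keep the example short.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# Excerpt of rows copied from the table:
# training loss, epoch, step, validation loss, accuracy, F1
ROWS = """
1.5309 0.27 5000 1.4613 0.6092 0.4846
0.5597 1.87 35000 0.7852 0.7953 0.7531
0.4138 2.94 55000 0.7665 0.8083 0.7689
0.1508 5.88 110000 1.0453 0.8080 0.7796
0.007 29.93 560000 1.9591 0.8040 0.7736
"""


def parse_rows(text):
    """Parse whitespace-separated rows (tabs or spaces) into Checkpoint tuples."""
    rows = []
    for line in text.strip().splitlines():
        tl, ep, st, vl, acc, f1 = line.split()
        rows.append(
            Checkpoint(float(tl), float(ep), int(st), float(vl), float(acc), float(f1))
        )
    return rows


checkpoints = parse_rows(ROWS)
best_by_loss = min(checkpoints, key=lambda c: c.val_loss)
best_by_f1 = max(checkpoints, key=lambda c: c.f1)

# Rough steps-per-epoch estimate from the last logged row.
final = checkpoints[-1]
steps_per_epoch = final.step / final.epoch

print("lowest validation loss:", best_by_loss.step, best_by_loss.val_loss)
print("highest F1:", best_by_f1.step, best_by_f1.f1)
print("approx. steps per epoch:", round(steps_per_epoch))
```

Applied to the full table, the lowest validation loss (0.7665) occurs at step 55000 and the highest F1 (0.7796) at step 110000, while the final checkpoint has a validation loss of 1.9591 with a training loss of 0.007. That pattern is consistent with overfitting in the later epochs, so an earlier checkpoint may be worth considering if one is available.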

Framework versions

Transformers 4.33.3
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.13.3

Model repository: haryoaw/scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_1_d
