scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_1222

This model is a fine-tuned version of xlm-roberta-base on the MASSIVE (AmazonScience/massive) dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9311
  • Accuracy: 0.8053
  • F1: 0.7721
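
A minimal inference sketch is shown below. The hub repository id is assumed from this card's title, and the sequence-classification (intent) head is inferred from the accuracy/F1 metrics reported above; adjust both if the actual checkpoint differs.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical hub path, taken from the card title; substitute the real repo id.
model_id = "scenario-NON-KD-SCR-D2_data-AmazonScience_massive_all_1_1222"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("wake me up at nine am on friday", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring logit back to its label name.
print(model.config.id2label[logits.argmax(dim=-1).item()])
```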

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
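
The card title suggests training on the AmazonScience/massive dataset with the "all_1.1" configuration (all locales combined). A minimal loading sketch under that assumption:

```python
from datasets import load_dataset

# "all_1.1" is inferred from the card title ("massive_all_1_1");
# substitute a single-locale config such as "en-US" if needed.
dataset = load_dataset("AmazonScience/massive", "all_1.1")
print(dataset["train"][0])  # fields include "utt" (utterance) and "intent"
```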

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 222
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
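
For reference, these hyperparameters map onto Trainer arguments roughly as in the sketch below; output_dir and any unlisted settings are placeholders, not part of the original run.

```python
from transformers import TrainingArguments

# Sketch only: reproduces the listed hyperparameters; everything else is left at defaults.
training_args = TrainingArguments(
    output_dir="./output",        # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=222,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=30,
)
```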

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|:-:|:-:|:-:|:-:|:-:|:-:|
| 1.4964 | 0.27 | 5000 | 1.4500 | 0.6105 | 0.4803 |
| 1.0566 | 0.53 | 10000 | 1.0527 | 0.7148 | 0.6339 |
| 0.8765 | 0.8 | 15000 | 0.9618 | 0.7474 | 0.6821 |
| 0.6098 | 1.07 | 20000 | 0.8631 | 0.7740 | 0.7247 |
| 0.5901 | 1.34 | 25000 | 0.8582 | 0.7834 | 0.7357 |
| 0.5638 | 1.6 | 30000 | 0.7928 | 0.7943 | 0.7531 |
| 0.5512 | 1.87 | 35000 | 0.8000 | 0.7959 | 0.7611 |
| 0.3552 | 2.14 | 40000 | 0.8300 | 0.8024 | 0.7616 |
| 0.379 | 2.41 | 45000 | 0.8126 | 0.8016 | 0.7653 |
| 0.3881 | 2.67 | 50000 | 0.7908 | 0.8066 | 0.7659 |
| 0.3811 | 2.94 | 55000 | 0.7985 | 0.8106 | 0.7772 |
| 0.2352 | 3.21 | 60000 | 0.8810 | 0.8077 | 0.7747 |
| 0.2569 | 3.47 | 65000 | 0.8784 | 0.8040 | 0.7714 |
| 0.2718 | 3.74 | 70000 | 0.8711 | 0.8087 | 0.7760 |
| 0.2217 | 4.01 | 75000 | 0.9441 | 0.8076 | 0.7725 |
| 0.1726 | 4.28 | 80000 | 1.0024 | 0.8089 | 0.7783 |
| 0.1948 | 4.54 | 85000 | 0.9991 | 0.8087 | 0.7780 |
| 0.2053 | 4.81 | 90000 | 0.9625 | 0.8102 | 0.7810 |
| 0.1194 | 5.08 | 95000 | 1.1190 | 0.8085 | 0.7806 |
| 0.1351 | 5.34 | 100000 | 1.1087 | 0.8066 | 0.7778 |
| 0.1492 | 5.61 | 105000 | 1.1256 | 0.8070 | 0.7804 |
| 0.1617 | 5.88 | 110000 | 1.0722 | 0.8063 | 0.7795 |
| 0.0976 | 6.15 | 115000 | 1.2638 | 0.8059 | 0.7781 |
| 0.1163 | 6.41 | 120000 | 1.2244 | 0.8054 | 0.7757 |
| 0.1232 | 6.68 | 125000 | 1.2040 | 0.8065 | 0.7749 |
| 0.1285 | 6.95 | 130000 | 1.2076 | 0.8065 | 0.7735 |
| 0.0918 | 7.22 | 135000 | 1.3107 | 0.8032 | 0.7697 |
| 0.0912 | 7.48 | 140000 | 1.3389 | 0.8009 | 0.7678 |
| 0.0959 | 7.75 | 145000 | 1.2997 | 0.8072 | 0.7774 |
| 0.077 | 8.02 | 150000 | 1.3184 | 0.8057 | 0.7753 |
| 0.0741 | 8.28 | 155000 | 1.4162 | 0.8053 | 0.7766 |
| 0.0825 | 8.55 | 160000 | 1.4341 | 0.8020 | 0.7649 |
| 0.0936 | 8.82 | 165000 | 1.4053 | 0.8022 | 0.7763 |
| 0.0608 | 9.09 | 170000 | 1.4842 | 0.8029 | 0.7733 |
| 0.0641 | 9.35 | 175000 | 1.4781 | 0.8002 | 0.7737 |
| 0.072 | 9.62 | 180000 | 1.5047 | 0.8026 | 0.7731 |
| 0.0706 | 9.89 | 185000 | 1.4310 | 0.8037 | 0.7755 |
| 0.0521 | 10.15 | 190000 | 1.5146 | 0.8050 | 0.7757 |
| 0.0586 | 10.42 | 195000 | 1.5707 | 0.8010 | 0.7719 |
| 0.0631 | 10.69 | 200000 | 1.5185 | 0.8046 | 0.7725 |
| 0.0698 | 10.96 | 205000 | 1.5440 | 0.8061 | 0.7758 |
| 0.0465 | 11.22 | 210000 | 1.5470 | 0.8018 | 0.7716 |
| 0.0537 | 11.49 | 215000 | 1.5595 | 0.8040 | 0.7744 |
| 0.0481 | 11.76 | 220000 | 1.6320 | 0.7988 | 0.7681 |
| 0.0357 | 12.03 | 225000 | 1.6105 | 0.8020 | 0.7686 |
| 0.0461 | 12.29 | 230000 | 1.6597 | 0.8031 | 0.7727 |
| 0.0464 | 12.56 | 235000 | 1.6191 | 0.8032 | 0.7730 |
| 0.0577 | 12.83 | 240000 | 1.6038 | 0.8009 | 0.7694 |
| 0.035 | 13.09 | 245000 | 1.6705 | 0.7996 | 0.7694 |
| 0.0394 | 13.36 | 250000 | 1.6780 | 0.8009 | 0.7683 |
| 0.0424 | 13.63 | 255000 | 1.6732 | 0.7981 | 0.7729 |
| 0.0423 | 13.9 | 260000 | 1.6766 | 0.7991 | 0.7713 |
| 0.0352 | 14.16 | 265000 | 1.7255 | 0.8001 | 0.7709 |
| 0.0393 | 14.43 | 270000 | 1.6708 | 0.8009 | 0.7688 |
| 0.0319 | 14.7 | 275000 | 1.7312 | 0.8005 | 0.7695 |
| 0.0488 | 14.96 | 280000 | 1.6715 | 0.8044 | 0.7751 |
| 0.0331 | 15.23 | 285000 | 1.7184 | 0.8041 | 0.7713 |
| 0.0302 | 15.5 | 290000 | 1.7106 | 0.8045 | 0.7732 |
| 0.0362 | 15.77 | 295000 | 1.6744 | 0.8011 | 0.7714 |
| 0.0241 | 16.03 | 300000 | 1.7239 | 0.8042 | 0.7749 |
| 0.0292 | 16.3 | 305000 | 1.7661 | 0.8037 | 0.7712 |
| 0.0354 | 16.57 | 310000 | 1.7505 | 0.7997 | 0.7696 |
| 0.0275 | 16.84 | 315000 | 1.7971 | 0.7994 | 0.7667 |
| 0.0205 | 17.1 | 320000 | 1.7622 | 0.8005 | 0.7683 |
| 0.0281 | 17.37 | 325000 | 1.7888 | 0.8017 | 0.7687 |
| 0.0327 | 17.64 | 330000 | 1.7603 | 0.8031 | 0.7708 |
| 0.0289 | 17.9 | 335000 | 1.7648 | 0.8013 | 0.7691 |
| 0.0203 | 18.17 | 340000 | 1.8123 | 0.8014 | 0.7698 |
| 0.0226 | 18.44 | 345000 | 1.7910 | 0.8054 | 0.7750 |
| 0.0272 | 18.71 | 350000 | 1.8106 | 0.7996 | 0.7678 |
| 0.0269 | 18.97 | 355000 | 1.7764 | 0.8003 | 0.7670 |
| 0.0213 | 19.24 | 360000 | 1.8076 | 0.8042 | 0.7716 |
| 0.0221 | 19.51 | 365000 | 1.8362 | 0.8004 | 0.7673 |
| 0.0209 | 19.77 | 370000 | 1.8254 | 0.8036 | 0.7724 |
| 0.0133 | 20.04 | 375000 | 1.8579 | 0.8019 | 0.7685 |
| 0.0214 | 20.31 | 380000 | 1.8559 | 0.8016 | 0.7681 |
| 0.0201 | 20.58 | 385000 | 1.8591 | 0.7982 | 0.7668 |
| 0.0208 | 20.84 | 390000 | 1.8606 | 0.8009 | 0.7715 |
| 0.0152 | 21.11 | 395000 | 1.8695 | 0.8024 | 0.7663 |
| 0.0158 | 21.38 | 400000 | 1.8735 | 0.8022 | 0.7713 |
| 0.0175 | 21.65 | 405000 | 1.8899 | 0.8006 | 0.7681 |
| 0.016 | 21.91 | 410000 | 1.8801 | 0.8025 | 0.7718 |
| 0.0127 | 22.18 | 415000 | 1.8973 | 0.8016 | 0.7669 |
| 0.0135 | 22.45 | 420000 | 1.8957 | 0.8018 | 0.7681 |
| 0.0173 | 22.71 | 425000 | 1.8894 | 0.8035 | 0.7700 |
| 0.0144 | 22.98 | 430000 | 1.9020 | 0.8010 | 0.7690 |
| 0.0129 | 23.25 | 435000 | 1.8761 | 0.8022 | 0.7699 |
| 0.0137 | 23.52 | 440000 | 1.9048 | 0.8022 | 0.7675 |
| 0.0156 | 23.78 | 445000 | 1.9235 | 0.8020 | 0.7674 |
| 0.0105 | 24.05 | 450000 | 1.9552 | 0.8013 | 0.7686 |
| 0.0101 | 24.32 | 455000 | 1.9140 | 0.8014 | 0.7693 |
| 0.011 | 24.58 | 460000 | 1.9314 | 0.8029 | 0.7675 |
| 0.0128 | 24.85 | 465000 | 1.9024 | 0.8040 | 0.7700 |
| 0.0105 | 25.12 | 470000 | 1.9492 | 0.8023 | 0.7706 |
| 0.0076 | 25.39 | 475000 | 1.9255 | 0.8038 | 0.7723 |
| 0.0088 | 25.65 | 480000 | 1.8891 | 0.8027 | 0.7685 |
| 0.012 | 25.92 | 485000 | 1.9045 | 0.8053 | 0.7736 |
| 0.0093 | 26.19 | 490000 | 1.9281 | 0.8039 | 0.7703 |
| 0.0109 | 26.46 | 495000 | 1.9403 | 0.8026 | 0.7717 |
| 0.013 | 26.72 | 500000 | 1.9246 | 0.8033 | 0.7721 |
| 0.0101 | 26.99 | 505000 | 1.8926 | 0.8050 | 0.7722 |
| 0.0069 | 27.26 | 510000 | 1.9344 | 0.8050 | 0.7721 |
| 0.008 | 27.52 | 515000 | 1.9446 | 0.8041 | 0.7709 |
| 0.0082 | 27.79 | 520000 | 1.9142 | 0.8038 | 0.7717 |
| 0.0077 | 28.06 | 525000 | 1.9238 | 0.8052 | 0.7732 |
| 0.0082 | 28.33 | 530000 | 1.9385 | 0.8052 | 0.7729 |
| 0.0077 | 28.59 | 535000 | 1.9101 | 0.8046 | 0.7729 |
| 0.0084 | 28.86 | 540000 | 1.9058 | 0.8053 | 0.7734 |
| 0.007 | 29.13 | 545000 | 1.9396 | 0.8050 | 0.7714 |
| 0.0071 | 29.39 | 550000 | 1.9404 | 0.8047 | 0.7716 |
| 0.0071 | 29.66 | 555000 | 1.9322 | 0.8052 | 0.7721 |
| 0.0076 | 29.93 | 560000 | 1.9311 | 0.8053 | 0.7721 |
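
The accuracy and F1 columns above can be produced with a compute_metrics hook such as the sketch below. The card does not state the F1 averaging scheme, so macro averaging is an assumption.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # Turn Trainer eval predictions into the accuracy/F1 metrics reported above.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="macro"),  # averaging scheme assumed
    }
```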

Framework versions

  • Transformers 4.33.3
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.13.3