1_5e-3_5_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0161
  • Accuracy: 0.7254
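
Since the card does not yet document usage, here is a minimal, hedged loading sketch. It assumes the checkpoint carries a sequence-classification head; the exact SuperGLUE sub-task is not stated in this card, so the paired input strings below are placeholders only.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_5e-3_5_0.9"  # repo id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Placeholder text pair; the real field layout depends on the SuperGLUE sub-task.
inputs = tokenizer(
    "Is the sky blue?",                      # e.g. question
    "On a clear day the sky appears blue.",  # e.g. passage
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```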

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
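
For reference, these values map directly onto `transformers.TrainingArguments` fields, as in the sketch below. The `output_dir` is an assumption (taken from the repo name), and any field not listed above is left at its library default.

```python
from transformers import TrainingArguments

# Hedged mapping of the hyperparameters above onto TrainingArguments.
# output_dir is assumed; unlisted fields keep their defaults.
training_args = TrainingArguments(
    output_dir="1_5e-3_5_0.9",   # assumption: repo name as output directory
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,              # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,           # and epsilon=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
)
```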

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 3.6534 | 1.0 | 590 | 2.9136 | 0.6217 |
| 3.1534 | 2.0 | 1180 | 2.7899 | 0.5896 |
| 3.1737 | 3.0 | 1770 | 4.1075 | 0.4003 |
| 3.108 | 4.0 | 2360 | 2.7570 | 0.6263 |
| 2.796 | 5.0 | 2950 | 2.8853 | 0.6122 |
| 2.6961 | 6.0 | 3540 | 2.6092 | 0.6083 |
| 2.7012 | 7.0 | 4130 | 5.4272 | 0.3899 |
| 2.5904 | 8.0 | 4720 | 2.6163 | 0.6110 |
| 2.6187 | 9.0 | 5310 | 2.4947 | 0.6440 |
| 2.4748 | 10.0 | 5900 | 3.1599 | 0.6343 |
| 2.4977 | 11.0 | 6490 | 2.4600 | 0.6358 |
| 2.4255 | 12.0 | 7080 | 2.3595 | 0.6165 |
| 2.371 | 13.0 | 7670 | 2.2762 | 0.6505 |
| 2.3482 | 14.0 | 8260 | 2.3764 | 0.6572 |
| 2.3162 | 15.0 | 8850 | 2.1363 | 0.6489 |
| 2.1908 | 16.0 | 9440 | 3.3056 | 0.6407 |
| 2.0964 | 17.0 | 10030 | 2.3744 | 0.6489 |
| 2.063 | 18.0 | 10620 | 2.3019 | 0.6021 |
| 2.0119 | 19.0 | 11210 | 2.0892 | 0.6734 |
| 2.0429 | 20.0 | 11800 | 2.5552 | 0.6596 |
| 1.9324 | 21.0 | 12390 | 2.0537 | 0.6694 |
| 1.9379 | 22.0 | 12980 | 1.9183 | 0.6801 |
| 1.9294 | 23.0 | 13570 | 1.8407 | 0.6774 |
| 1.8366 | 24.0 | 14160 | 1.8770 | 0.6872 |
| 1.809 | 25.0 | 14750 | 2.0356 | 0.6761 |
| 1.804 | 26.0 | 15340 | 1.6646 | 0.6801 |
| 1.8059 | 27.0 | 15930 | 1.6864 | 0.6780 |
| 1.7665 | 28.0 | 16520 | 1.6191 | 0.6813 |
| 1.7034 | 29.0 | 17110 | 1.8237 | 0.6477 |
| 1.663 | 30.0 | 17700 | 1.5530 | 0.6911 |
| 1.619 | 31.0 | 18290 | 1.5786 | 0.6884 |
| 1.5861 | 32.0 | 18880 | 2.2685 | 0.6746 |
| 1.5504 | 33.0 | 19470 | 1.6077 | 0.6624 |
| 1.5419 | 34.0 | 20060 | 1.4337 | 0.6976 |
| 1.5614 | 35.0 | 20650 | 1.5165 | 0.6969 |
| 1.5039 | 36.0 | 21240 | 1.8150 | 0.6972 |
| 1.4848 | 37.0 | 21830 | 1.3947 | 0.7006 |
| 1.4697 | 38.0 | 22420 | 1.5730 | 0.6709 |
| 1.3728 | 39.0 | 23010 | 1.5815 | 0.7021 |
| 1.4163 | 40.0 | 23600 | 1.3688 | 0.7125 |
| 1.3908 | 41.0 | 24190 | 1.5884 | 0.7006 |
| 1.3566 | 42.0 | 24780 | 1.3154 | 0.7180 |
| 1.3155 | 43.0 | 25370 | 1.2954 | 0.7138 |
| 1.3059 | 44.0 | 25960 | 1.2546 | 0.7116 |
| 1.2942 | 45.0 | 26550 | 1.4254 | 0.7092 |
| 1.2492 | 46.0 | 27140 | 1.2366 | 0.7180 |
| 1.2493 | 47.0 | 27730 | 1.2187 | 0.7095 |
| 1.202 | 48.0 | 28320 | 1.2318 | 0.7183 |
| 1.2327 | 49.0 | 28910 | 1.4508 | 0.7083 |
| 1.215 | 50.0 | 29500 | 1.2490 | 0.7205 |
| 1.1485 | 51.0 | 30090 | 1.3040 | 0.7147 |
| 1.157 | 52.0 | 30680 | 1.1436 | 0.7180 |
| 1.1302 | 53.0 | 31270 | 1.1814 | 0.7147 |
| 1.1111 | 54.0 | 31860 | 1.3464 | 0.7150 |
| 1.1422 | 55.0 | 32450 | 1.3631 | 0.7144 |
| 1.0891 | 56.0 | 33040 | 1.1418 | 0.7214 |
| 1.0652 | 57.0 | 33630 | 1.2196 | 0.7202 |
| 1.0556 | 58.0 | 34220 | 1.2335 | 0.7235 |
| 1.0672 | 59.0 | 34810 | 1.1583 | 0.7128 |
| 1.0613 | 60.0 | 35400 | 1.1927 | 0.7061 |
| 1.0069 | 61.0 | 35990 | 1.0860 | 0.7226 |
| 1.0483 | 62.0 | 36580 | 1.1060 | 0.7245 |
| 1.0051 | 63.0 | 37170 | 1.1095 | 0.7150 |
| 0.9834 | 64.0 | 37760 | 1.0793 | 0.7196 |
| 0.9801 | 65.0 | 38350 | 1.1033 | 0.7196 |
| 0.9647 | 66.0 | 38940 | 1.0704 | 0.7214 |
| 0.9384 | 67.0 | 39530 | 1.0795 | 0.7196 |
| 0.9791 | 68.0 | 40120 | 1.1657 | 0.7245 |
| 0.9309 | 69.0 | 40710 | 1.1983 | 0.7263 |
| 0.9602 | 70.0 | 41300 | 1.1575 | 0.7284 |
| 0.9462 | 71.0 | 41890 | 1.0949 | 0.7165 |
| 0.9473 | 72.0 | 42480 | 1.1855 | 0.7266 |
| 0.9047 | 73.0 | 43070 | 1.1378 | 0.7266 |
| 0.8996 | 74.0 | 43660 | 1.0339 | 0.7226 |
| 0.9248 | 75.0 | 44250 | 1.1656 | 0.7309 |
| 0.9075 | 76.0 | 44840 | 1.0272 | 0.7208 |
| 0.9062 | 77.0 | 45430 | 1.1646 | 0.7327 |
| 0.8987 | 78.0 | 46020 | 1.0606 | 0.7202 |
| 0.8831 | 79.0 | 46610 | 1.0543 | 0.7291 |
| 0.8655 | 80.0 | 47200 | 1.0785 | 0.7312 |
| 0.8629 | 81.0 | 47790 | 1.0745 | 0.7284 |
| 0.8733 | 82.0 | 48380 | 1.0734 | 0.7242 |
| 0.8796 | 83.0 | 48970 | 1.0343 | 0.7266 |
| 0.8313 | 84.0 | 49560 | 1.0709 | 0.7294 |
| 0.835 | 85.0 | 50150 | 1.0230 | 0.7266 |
| 0.8425 | 86.0 | 50740 | 1.0049 | 0.7235 |
| 0.8486 | 87.0 | 51330 | 1.0971 | 0.7278 |
| 0.8361 | 88.0 | 51920 | 1.0212 | 0.7226 |
| 0.8171 | 89.0 | 52510 | 1.1451 | 0.7287 |
| 0.7994 | 90.0 | 53100 | 1.0329 | 0.7315 |
| 0.8268 | 91.0 | 53690 | 1.0968 | 0.7346 |
| 0.8289 | 92.0 | 54280 | 1.0031 | 0.7223 |
| 0.8082 | 93.0 | 54870 | 1.0499 | 0.7278 |
| 0.8188 | 94.0 | 55460 | 1.0121 | 0.7235 |
| 0.82 | 95.0 | 56050 | 1.0232 | 0.7242 |
| 0.8028 | 96.0 | 56640 | 1.0279 | 0.7229 |
| 0.7891 | 97.0 | 57230 | 1.0091 | 0.7260 |
| 0.7771 | 98.0 | 57820 | 1.0230 | 0.7248 |
| 0.7652 | 99.0 | 58410 | 1.0248 | 0.7257 |
| 0.7874 | 100.0 | 59000 | 1.0161 | 0.7254 |
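
The final row of the table matches the headline numbers (loss 1.0161, accuracy 0.7254). As a rough way to re-run that evaluation, here is a hedged sketch using the Trainer API. The SuperGLUE sub-task is not named in this card, so the "boolq" configuration and its question/passage fields below are assumptions; substitute the configuration actually used.

```python
import numpy as np
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer

model_id = "Onutoa/1_5e-3_5_0.9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Assumption: the BoolQ sub-task; swap in the correct configuration.
dataset = load_dataset("super_glue", "boolq", split="validation")

def preprocess(batch):
    # BoolQ pairs a question with a passage; other sub-tasks use other fields.
    return tokenizer(batch["question"], batch["passage"], truncation=True)

encoded = dataset.map(preprocess, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

trainer = Trainer(model=model, tokenizer=tokenizer, compute_metrics=compute_metrics)
print(trainer.evaluate(encoded))  # reports eval_loss and eval_accuracy
```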

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3