1_1e-2_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 100, step 59000), it achieves the following results on the evaluation set:

Loss: 0.8567
Accuracy: 0.7480
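
A minimal sketch of how this checkpoint might be loaded with the Transformers library. The repository id Onutoa/1_1e-2_5_0.1 is taken from this page; the presence of a sequence-classification head is an assumption inferred from the reported accuracy metric, and load_classifier is a hypothetical helper name. The card does not say which super_glue task was used, so input formatting is deliberately left out.

```python
# Repository id as shown on the model page.
MODEL_ID = "Onutoa/1_1e-2_5_0.1"


def load_classifier(model_id: str = MODEL_ID):
    """Return (tokenizer, model) for the checkpoint.

    Assumption: the checkpoint carries a sequence-classification head.
    The import is done lazily so that this module can be imported
    without transformers installed.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return tokenizer, model
```

Calling load_classifier() downloads the weights, so it needs network access and the transformers and torch packages.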

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.01
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
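
The list above maps onto Trainer-style arguments roughly as follows. This is a sketch under stated assumptions, not the author's training script: mapping train_batch_size to per_device_train_batch_size assumes a single device with no gradient accumulation, the total of 59000 steps is read from the last row of the training results, and zero warmup is assumed because the card lists none. linear_lr is a hypothetical helper that illustrates a linear decay schedule.

```python
# Values copied from the hyperparameter list; the keys follow the
# parameter names of transformers.TrainingArguments.
TRAINING_KWARGS = {
    "learning_rate": 0.01,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 8,    # assumes a single device
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100.0,
}


def linear_lr(step: int,
              base_lr: float = 0.01,
              total_steps: int = 59_000,   # 100 epochs x 590 steps
              warmup_steps: int = 0) -> float:  # assumption: no warmup
    """Learning rate at `step` under linear warmup then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 29_500, 59_000):
        print(s, linear_lr(s))
```

Under these assumptions the rate starts at 0.01, is halved by step 29500 (epoch 50), and reaches zero at step 59000.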

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.682	1.0	590	2.1411	0.6208
1.4095	2.0	1180	1.3977	0.3817
1.1425	3.0	1770	0.8850	0.5963
1.1284	4.0	2360	0.8549	0.6333
0.9827	5.0	2950	0.8314	0.6511
1.4181	6.0	3540	1.6014	0.3835
1.0353	7.0	4130	1.5568	0.4235
0.8632	8.0	4720	0.9442	0.6394
0.8723	9.0	5310	0.7750	0.6905
0.8161	10.0	5900	0.7561	0.6957
0.7785	11.0	6490	0.7662	0.6752
0.7497	12.0	7080	0.7282	0.6966
0.7437	13.0	7670	0.7389	0.6798
0.7156	14.0	8260	0.7087	0.7043
0.6893	15.0	8850	0.7195	0.7034
0.6787	16.0	9440	0.6835	0.7174
0.6392	17.0	10030	0.6839	0.7162
0.6287	18.0	10620	0.8835	0.6587
0.6247	19.0	11210	0.6814	0.7248
0.5969	20.0	11800	0.7200	0.7119
0.5621	21.0	12390	0.6906	0.7284
0.5461	22.0	12980	0.7080	0.7202
0.5147	23.0	13570	0.7483	0.7281
0.5098	24.0	14160	0.7129	0.7177
0.4893	25.0	14750	0.7235	0.7346
0.4723	26.0	15340	1.1308	0.6437
0.4619	27.0	15930	0.7328	0.7254
0.438	28.0	16520	0.8303	0.7422
0.4216	29.0	17110	0.7223	0.7410
0.4079	30.0	17700	0.7778	0.7315
0.3803	31.0	18290	0.7576	0.7318
0.3871	32.0	18880	0.8276	0.7382
0.3846	33.0	19470	0.8631	0.7110
0.3561	34.0	20060	0.8310	0.7211
0.344	35.0	20650	0.7655	0.7364
0.3333	36.0	21240	0.7666	0.7404
0.3287	37.0	21830	0.8005	0.7315
0.3193	38.0	22420	0.8775	0.7443
0.3051	39.0	23010	0.8466	0.7428
0.3019	40.0	23600	0.8328	0.7394
0.2922	41.0	24190	0.8150	0.7382
0.3064	42.0	24780	0.8742	0.7376
0.2841	43.0	25370	0.7898	0.7361
0.2841	44.0	25960	0.8226	0.7401
0.2679	45.0	26550	0.8297	0.7318
0.2651	46.0	27140	0.8316	0.7388
0.2654	47.0	27730	0.8553	0.7364
0.2457	48.0	28320	0.8647	0.7327
0.2558	49.0	28910	0.8399	0.7376
0.2467	50.0	29500	0.8517	0.7391
0.2278	51.0	30090	0.8409	0.7275
0.2343	52.0	30680	0.9442	0.7214
0.2372	53.0	31270	0.8661	0.7300
0.2194	54.0	31860	0.8430	0.7407
0.2222	55.0	32450	0.9235	0.7242
0.2328	56.0	33040	0.8637	0.7367
0.2162	57.0	33630	0.9162	0.7211
0.215	58.0	34220	0.8886	0.7281
0.206	59.0	34810	0.9033	0.7193
0.2099	60.0	35400	0.8829	0.7361
0.2081	61.0	35990	0.8874	0.7367
0.2105	62.0	36580	0.8902	0.7361
0.1899	63.0	37170	0.8541	0.7376
0.1972	64.0	37760	0.8740	0.7437
0.191	65.0	38350	0.8897	0.7413
0.1908	66.0	38940	0.8672	0.7437
0.1894	67.0	39530	0.8892	0.7364
0.1887	68.0	40120	0.8750	0.7407
0.1757	69.0	40710	0.8887	0.7379
0.1791	70.0	41300	0.8757	0.7413
0.1848	71.0	41890	0.8498	0.7437
0.1878	72.0	42480	0.8647	0.7413
0.1811	73.0	43070	0.8715	0.7391
0.1681	74.0	43660	0.9104	0.7416
0.1693	75.0	44250	0.9140	0.7434
0.1778	76.0	44840	0.8656	0.7437
0.1671	77.0	45430	0.8830	0.7413
0.1698	78.0	46020	0.8819	0.7431
0.1641	79.0	46610	0.8667	0.7391
0.1572	80.0	47200	0.8677	0.7419
0.1552	81.0	47790	0.8704	0.7404
0.1543	82.0	48380	0.8640	0.7489
0.1576	83.0	48970	0.8897	0.7459
0.153	84.0	49560	0.8649	0.7465
0.1536	85.0	50150	0.8864	0.7437
0.1548	86.0	50740	0.9050	0.7468
0.144	87.0	51330	0.8696	0.7401
0.151	88.0	51920	0.8987	0.7446
0.1493	89.0	52510	0.8938	0.7431
0.1455	90.0	53100	0.8726	0.7431
0.1414	91.0	53690	0.8814	0.7416
0.1422	92.0	54280	0.8838	0.7419
0.1421	93.0	54870	0.8648	0.7465
0.1477	94.0	55460	0.8532	0.7450
0.1431	95.0	56050	0.8613	0.7465
0.1412	96.0	56640	0.8708	0.7471
0.1413	97.0	57230	0.8656	0.7468
0.1375	98.0	57820	0.8647	0.7468
0.1389	99.0	58410	0.8590	0.7483
0.1389	100.0	59000	0.8567	0.7480
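
In the table above, validation loss is lowest at epoch 19 (0.6814) and accuracy is highest at epoch 82 (0.7489), while training loss keeps falling to 0.1389. The sketch below shows one way to pull such checkpoints out of the log. It embeds only four rows copied from the table, not the full log, and the names EpochLog and parse_rows are hypothetical.

```python
from typing import List, NamedTuple


class EpochLog(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Four rows copied from the training results table (excerpt only).
ROWS = """
1.682   1.0    590    2.1411  0.6208
0.6247  19.0   11210  0.6814  0.7248
0.1543  82.0   48380  0.8640  0.7489
0.1389  100.0  59000  0.8567  0.7480
"""


def parse_rows(text: str) -> List[EpochLog]:
    """Parse whitespace- or tab-separated rows into EpochLog records."""
    logs = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, acc = line.split()
        logs.append(EpochLog(float(train_loss), float(epoch), int(step),
                             float(val_loss), float(acc)))
    return logs


logs = parse_rows(ROWS)
best_by_accuracy = max(logs, key=lambda r: r.accuracy)
best_by_val_loss = min(logs, key=lambda r: r.val_loss)

if __name__ == "__main__":
    print("highest accuracy:", best_by_accuracy)
    print("lowest validation loss:", best_by_val_loss)
```

Pasting the full table into ROWS would apply the same selection to all 100 epochs. Which criterion to prefer is a judgment call; the card itself reports only the final-epoch numbers.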

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
