
1_9e-3_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9096
  • Accuracy: 0.7495

Model description

More information needed

Intended uses & limitations

More information needed
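
Pending fuller documentation, here is a minimal inference sketch. It assumes the checkpoint exposes a sequence-classification head, which the reported accuracy metric suggests but the card does not confirm; the input string is a placeholder, since the expected input format depends on the (unstated) SuperGLUE subset.

```python
# A minimal inference sketch, assuming a sequence-classification head.
# The input text is a placeholder; the SuperGLUE subset (and hence the
# expected input format) is not stated in the card.
from transformers import pipeline

classifier = pipeline("text-classification", model="Onutoa/1_9e-3_5_0.1")
print(classifier("Replace this with an example formatted for the task."))
```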

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
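
The bullet points above map onto Hugging Face TrainingArguments roughly as in the sketch below. This is a hedged reconstruction, not the original training script: the output directory and per-epoch evaluation cadence are assumptions, and Trainer's default optimizer is AdamW rather than plain Adam, though the beta and epsilon values carry over directly.

```python
# A minimal sketch mapping the listed hyperparameters onto Hugging Face
# TrainingArguments. output_dir and evaluation_strategy are assumptions;
# everything else mirrors the values stated in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_9e-3_5_0.1",   # assumed name; not stated in the card
    learning_rate=9e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,              # "Adam with betas=(0.9, 0.999)"
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # the results table logs one eval per epoch
)
```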

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.6689 | 1.0 | 590 | 1.8930 | 0.3792 |
| 1.4177 | 2.0 | 1180 | 1.1713 | 0.6217 |
| 1.4671 | 3.0 | 1770 | 0.9910 | 0.4239 |
| 1.2704 | 4.0 | 2360 | 1.0000 | 0.4969 |
| 1.1101 | 5.0 | 2950 | 0.8316 | 0.6459 |
| 1.0767 | 6.0 | 3540 | 0.9325 | 0.6428 |
| 1.0047 | 7.0 | 4130 | 1.4778 | 0.4725 |
| 0.9251 | 8.0 | 4720 | 0.7582 | 0.6801 |
| 0.8846 | 9.0 | 5310 | 0.8984 | 0.6737 |
| 0.8439 | 10.0 | 5900 | 0.8034 | 0.7018 |
| 0.8068 | 11.0 | 6490 | 0.8305 | 0.6624 |
| 0.7643 | 12.0 | 7080 | 1.0910 | 0.5859 |
| 0.7306 | 13.0 | 7670 | 0.7682 | 0.6908 |
| 0.6488 | 14.0 | 8260 | 0.7171 | 0.7226 |
| 0.6521 | 15.0 | 8850 | 0.6864 | 0.7202 |
| 0.6048 | 16.0 | 9440 | 0.7442 | 0.7260 |
| 0.5536 | 17.0 | 10030 | 1.0092 | 0.6532 |
| 0.5654 | 18.0 | 10620 | 0.7884 | 0.7052 |
| 0.5349 | 19.0 | 11210 | 0.7640 | 0.7073 |
| 0.4958 | 20.0 | 11800 | 0.7724 | 0.7343 |
| 0.4706 | 21.0 | 12390 | 0.7728 | 0.7183 |
| 0.459 | 22.0 | 12980 | 0.7394 | 0.7254 |
| 0.4362 | 23.0 | 13570 | 0.7550 | 0.7196 |
| 0.4176 | 24.0 | 14160 | 0.7744 | 0.7248 |
| 0.4012 | 25.0 | 14750 | 0.8998 | 0.7364 |
| 0.388 | 26.0 | 15340 | 0.9046 | 0.7104 |
| 0.3852 | 27.0 | 15930 | 0.7894 | 0.7278 |
| 0.3737 | 28.0 | 16520 | 0.8274 | 0.7391 |
| 0.3456 | 29.0 | 17110 | 0.7725 | 0.7471 |
| 0.34 | 30.0 | 17700 | 0.9009 | 0.7260 |
| 0.3247 | 31.0 | 18290 | 0.7733 | 0.7398 |
| 0.3197 | 32.0 | 18880 | 0.8370 | 0.7385 |
| 0.3109 | 33.0 | 19470 | 0.8705 | 0.7269 |
| 0.3047 | 34.0 | 20060 | 0.8475 | 0.7373 |
| 0.2815 | 35.0 | 20650 | 0.9676 | 0.7407 |
| 0.2782 | 36.0 | 21240 | 0.8183 | 0.7450 |
| 0.2808 | 37.0 | 21830 | 0.8551 | 0.7394 |
| 0.2639 | 38.0 | 22420 | 0.9552 | 0.7440 |
| 0.2599 | 39.0 | 23010 | 0.8785 | 0.7422 |
| 0.2563 | 40.0 | 23600 | 1.0538 | 0.7364 |
| 0.2471 | 41.0 | 24190 | 0.9479 | 0.7502 |
| 0.2524 | 42.0 | 24780 | 0.9348 | 0.7398 |
| 0.2419 | 43.0 | 25370 | 0.9101 | 0.7401 |
| 0.2338 | 44.0 | 25960 | 0.8726 | 0.7394 |
| 0.2218 | 45.0 | 26550 | 0.8953 | 0.7416 |
| 0.2115 | 46.0 | 27140 | 0.8966 | 0.7291 |
| 0.2234 | 47.0 | 27730 | 0.9359 | 0.7416 |
| 0.2047 | 48.0 | 28320 | 0.9434 | 0.7284 |
| 0.2218 | 49.0 | 28910 | 0.9202 | 0.7465 |
| 0.2075 | 50.0 | 29500 | 0.8866 | 0.7394 |
| 0.1982 | 51.0 | 30090 | 0.9081 | 0.7358 |
| 0.2064 | 52.0 | 30680 | 0.9691 | 0.7321 |
| 0.1955 | 53.0 | 31270 | 0.9527 | 0.7275 |
| 0.2006 | 54.0 | 31860 | 0.8744 | 0.7456 |
| 0.2021 | 55.0 | 32450 | 0.9529 | 0.7419 |
| 0.1932 | 56.0 | 33040 | 0.9040 | 0.7391 |
| 0.1823 | 57.0 | 33630 | 0.9188 | 0.7382 |
| 0.1726 | 58.0 | 34220 | 0.8715 | 0.7385 |
| 0.1867 | 59.0 | 34810 | 0.9165 | 0.7410 |
| 0.1831 | 60.0 | 35400 | 0.9393 | 0.7431 |
| 0.1741 | 61.0 | 35990 | 0.9843 | 0.7502 |
| 0.1687 | 62.0 | 36580 | 0.9161 | 0.7419 |
| 0.1712 | 63.0 | 37170 | 0.9630 | 0.7431 |
| 0.1742 | 64.0 | 37760 | 0.9306 | 0.7443 |
| 0.1721 | 65.0 | 38350 | 0.9384 | 0.7446 |
| 0.1614 | 66.0 | 38940 | 0.9237 | 0.7401 |
| 0.1631 | 67.0 | 39530 | 0.9315 | 0.7404 |
| 0.1626 | 68.0 | 40120 | 0.8884 | 0.7434 |
| 0.1547 | 69.0 | 40710 | 0.9163 | 0.7483 |
| 0.1609 | 70.0 | 41300 | 0.9340 | 0.7422 |
| 0.1592 | 71.0 | 41890 | 0.9292 | 0.7352 |
| 0.1588 | 72.0 | 42480 | 0.8887 | 0.7495 |
| 0.1504 | 73.0 | 43070 | 0.9228 | 0.7480 |
| 0.1422 | 74.0 | 43660 | 0.9570 | 0.7361 |
| 0.1535 | 75.0 | 44250 | 0.9705 | 0.7446 |
| 0.1486 | 76.0 | 44840 | 0.9364 | 0.7477 |
| 0.146 | 77.0 | 45430 | 0.9385 | 0.7517 |
| 0.1519 | 78.0 | 46020 | 0.8991 | 0.7495 |
| 0.148 | 79.0 | 46610 | 0.9516 | 0.7483 |
| 0.1388 | 80.0 | 47200 | 0.9189 | 0.7462 |
| 0.1392 | 81.0 | 47790 | 0.8985 | 0.7474 |
| 0.1426 | 82.0 | 48380 | 0.9112 | 0.7459 |
| 0.1388 | 83.0 | 48970 | 0.9468 | 0.7456 |
| 0.1396 | 84.0 | 49560 | 0.9185 | 0.7474 |
| 0.1316 | 85.0 | 50150 | 0.9230 | 0.7434 |
| 0.1332 | 86.0 | 50740 | 0.9365 | 0.7388 |
| 0.1245 | 87.0 | 51330 | 0.9405 | 0.7502 |
| 0.1283 | 88.0 | 51920 | 0.9384 | 0.7453 |
| 0.1309 | 89.0 | 52510 | 0.9250 | 0.7483 |
| 0.127 | 90.0 | 53100 | 0.9176 | 0.7434 |
| 0.124 | 91.0 | 53690 | 0.9207 | 0.7446 |
| 0.1294 | 92.0 | 54280 | 0.8949 | 0.7489 |
| 0.1322 | 93.0 | 54870 | 0.9154 | 0.7495 |
| 0.1242 | 94.0 | 55460 | 0.9033 | 0.7508 |
| 0.1251 | 95.0 | 56050 | 0.9201 | 0.7502 |
| 0.1174 | 96.0 | 56640 | 0.9043 | 0.7480 |
| 0.1284 | 97.0 | 57230 | 0.9111 | 0.7489 |
| 0.1188 | 98.0 | 57820 | 0.9175 | 0.7489 |
| 0.1201 | 99.0 | 58410 | 0.9150 | 0.7498 |
| 0.1229 | 100.0 | 59000 | 0.9096 | 0.7495 |
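
Validation accuracy plateaus around 0.74–0.75 from roughly epoch 40 onward while validation loss drifts upward, so the final-epoch checkpoint reported in the headline numbers is close to, but not exactly, the best epoch (0.7517 at epoch 77). To recompute the accuracy column from the published checkpoint, something like the sketch below would work. Note that the card never names the SuperGLUE subset, so "boolq" and its question/passage fields here are placeholders only.

```python
# A hedged sketch of recomputing the accuracy column from the checkpoint.
# Assumption (not stated in the card): the SuperGLUE subset is BoolQ, so
# the "question"/"passage" fields and max_length below are placeholders.
import numpy as np
import evaluate
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer

model_id = "Onutoa/1_9e-3_5_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Validation split of the (assumed) subset; swap "boolq" for the actual task.
dataset = load_dataset("super_glue", "boolq", split="validation")

def tokenize(batch):
    return tokenizer(batch["question"], batch["passage"],
                     truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

# Trainer.predict handles batching and device placement.
predictions = Trainer(model=model).predict(dataset)
accuracy = evaluate.load("accuracy")
print(accuracy.compute(
    predictions=np.argmax(predictions.predictions, axis=-1),
    references=predictions.label_ids,
))
```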

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3