1_6e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9853
  • Accuracy: 0.7416
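
The card does not state which SuperGLUE task the checkpoint was fine-tuned on, so any loading code is necessarily a guess about the head and input format. A minimal sketch, assuming the checkpoint carries a sequence-classification head over sentence pairs (the most common SuperGLUE setup); the example texts are placeholders:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Assumption: the checkpoint exposes a sequence-classification head and
# expects a text pair, as in most SuperGLUE tasks. Adjust to the task used.
model_id = "Onutoa/1_6e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "Is the sky blue?",               # placeholder first segment
    "The sky appears blue by day.",   # placeholder second segment
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())   # predicted class id
```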

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
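
For reference, these values map onto transformers `TrainingArguments` roughly as follows. This is a reconstruction, not the author's training script: `output_dir` is a placeholder, the reported batch sizes are assumed to be per-device values, and the listed Adam settings are the Trainer defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

# Reconstruction of the reported configuration (not the original script).
training_args = TrainingArguments(
    output_dir="1_6e-3_10_0.1",        # placeholder
    learning_rate=6e-3,
    per_device_train_batch_size=16,    # assumes a single device
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
)
```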

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.4161 | 1.0 | 590 | 1.9327 | 0.6217 |
| 1.4964 | 2.0 | 1180 | 1.4733 | 0.6217 |
| 1.4294 | 3.0 | 1770 | 1.3770 | 0.6217 |
| 1.3196 | 4.0 | 2360 | 1.1956 | 0.4070 |
| 1.1661 | 5.0 | 2950 | 0.9866 | 0.6333 |
| 1.1565 | 6.0 | 3540 | 0.9164 | 0.6453 |
| 1.0435 | 7.0 | 4130 | 1.0146 | 0.5786 |
| 1.0861 | 8.0 | 4720 | 0.8707 | 0.6541 |
| 1.0246 | 9.0 | 5310 | 0.9747 | 0.6728 |
| 0.9761 | 10.0 | 5900 | 1.0055 | 0.6560 |
| 0.9672 | 11.0 | 6490 | 0.7808 | 0.6869 |
| 0.8746 | 12.0 | 7080 | 0.8158 | 0.6768 |
| 0.8883 | 13.0 | 7670 | 0.7982 | 0.6917 |
| 0.8257 | 14.0 | 8260 | 0.9875 | 0.6869 |
| 0.8053 | 15.0 | 8850 | 0.9210 | 0.7171 |
| 0.7995 | 16.0 | 9440 | 0.7910 | 0.7168 |
| 0.7376 | 17.0 | 10030 | 0.8382 | 0.7122 |
| 0.6743 | 18.0 | 10620 | 1.0620 | 0.6141 |
| 0.6343 | 19.0 | 11210 | 0.7421 | 0.7245 |
| 0.6499 | 20.0 | 11800 | 0.7841 | 0.7187 |
| 0.5897 | 21.0 | 12390 | 0.9551 | 0.6713 |
| 0.6163 | 22.0 | 12980 | 1.0281 | 0.7135 |
| 0.5617 | 23.0 | 13570 | 0.9252 | 0.7245 |
| 0.5282 | 24.0 | 14160 | 0.8599 | 0.7080 |
| 0.5402 | 25.0 | 14750 | 0.8381 | 0.7254 |
| 0.493 | 26.0 | 15340 | 1.0387 | 0.6657 |
| 0.474 | 27.0 | 15930 | 0.7978 | 0.7266 |
| 0.4658 | 28.0 | 16520 | 0.8697 | 0.7306 |
| 0.4624 | 29.0 | 17110 | 0.8746 | 0.7287 |
| 0.4333 | 30.0 | 17700 | 0.9256 | 0.7254 |
| 0.4324 | 31.0 | 18290 | 0.8635 | 0.7336 |
| 0.4352 | 32.0 | 18880 | 1.0482 | 0.7232 |
| 0.4144 | 33.0 | 19470 | 1.2383 | 0.6872 |
| 0.3822 | 34.0 | 20060 | 0.9361 | 0.7324 |
| 0.3549 | 35.0 | 20650 | 0.9758 | 0.7180 |
| 0.3597 | 36.0 | 21240 | 1.1784 | 0.7239 |
| 0.3598 | 37.0 | 21830 | 0.9757 | 0.7336 |
| 0.3421 | 38.0 | 22420 | 1.3951 | 0.7245 |
| 0.3309 | 39.0 | 23010 | 1.1202 | 0.7401 |
| 0.3209 | 40.0 | 23600 | 0.9882 | 0.7358 |
| 0.3214 | 41.0 | 24190 | 0.9997 | 0.7343 |
| 0.3101 | 42.0 | 24780 | 0.8871 | 0.7376 |
| 0.2913 | 43.0 | 25370 | 1.0116 | 0.7401 |
| 0.2884 | 44.0 | 25960 | 1.1248 | 0.7291 |
| 0.2761 | 45.0 | 26550 | 0.8363 | 0.7291 |
| 0.2761 | 46.0 | 27140 | 1.0666 | 0.7202 |
| 0.2674 | 47.0 | 27730 | 1.0285 | 0.7416 |
| 0.2647 | 48.0 | 28320 | 0.9575 | 0.7300 |
| 0.2662 | 49.0 | 28910 | 0.9258 | 0.7373 |
| 0.2726 | 50.0 | 29500 | 1.0936 | 0.7346 |
| 0.2461 | 51.0 | 30090 | 1.0192 | 0.7196 |
| 0.2485 | 52.0 | 30680 | 1.0543 | 0.7382 |
| 0.245 | 53.0 | 31270 | 0.9507 | 0.7336 |
| 0.2377 | 54.0 | 31860 | 0.8907 | 0.7361 |
| 0.2379 | 55.0 | 32450 | 0.9788 | 0.7327 |
| 0.2335 | 56.0 | 33040 | 1.0168 | 0.7413 |
| 0.2251 | 57.0 | 33630 | 1.0117 | 0.7346 |
| 0.2293 | 58.0 | 34220 | 0.9280 | 0.7336 |
| 0.2211 | 59.0 | 34810 | 0.9735 | 0.7401 |
| 0.2236 | 60.0 | 35400 | 0.9822 | 0.7404 |
| 0.2123 | 61.0 | 35990 | 1.0189 | 0.7346 |
| 0.207 | 62.0 | 36580 | 1.0436 | 0.7401 |
| 0.2059 | 63.0 | 37170 | 0.9571 | 0.7410 |
| 0.2052 | 64.0 | 37760 | 1.0027 | 0.7419 |
| 0.193 | 65.0 | 38350 | 0.9395 | 0.7413 |
| 0.2099 | 66.0 | 38940 | 1.0325 | 0.7358 |
| 0.1968 | 67.0 | 39530 | 1.0441 | 0.7398 |
| 0.1887 | 68.0 | 40120 | 1.1337 | 0.7413 |
| 0.1911 | 69.0 | 40710 | 1.0438 | 0.7382 |
| 0.1955 | 70.0 | 41300 | 1.0361 | 0.7394 |
| 0.1998 | 71.0 | 41890 | 1.0202 | 0.7349 |
| 0.1944 | 72.0 | 42480 | 1.0261 | 0.7407 |
| 0.1755 | 73.0 | 43070 | 1.0091 | 0.7422 |
| 0.1836 | 74.0 | 43660 | 0.9986 | 0.7425 |
| 0.1856 | 75.0 | 44250 | 0.9461 | 0.7404 |
| 0.187 | 76.0 | 44840 | 0.9383 | 0.7385 |
| 0.1873 | 77.0 | 45430 | 1.0445 | 0.7416 |
| 0.1763 | 78.0 | 46020 | 1.0263 | 0.7410 |
| 0.1749 | 79.0 | 46610 | 0.9650 | 0.7370 |
| 0.1728 | 80.0 | 47200 | 0.9903 | 0.7343 |
| 0.1668 | 81.0 | 47790 | 1.0391 | 0.7382 |
| 0.1693 | 82.0 | 48380 | 0.9794 | 0.7346 |
| 0.1665 | 83.0 | 48970 | 1.0463 | 0.7355 |
| 0.1609 | 84.0 | 49560 | 0.9976 | 0.7373 |
| 0.165 | 85.0 | 50150 | 1.0040 | 0.7404 |
| 0.1622 | 86.0 | 50740 | 1.0184 | 0.7419 |
| 0.1615 | 87.0 | 51330 | 0.9825 | 0.7336 |
| 0.1624 | 88.0 | 51920 | 0.9889 | 0.7394 |
| 0.1557 | 89.0 | 52510 | 0.9938 | 0.7370 |
| 0.1515 | 90.0 | 53100 | 1.0207 | 0.7385 |
| 0.1565 | 91.0 | 53690 | 1.0081 | 0.7401 |
| 0.1582 | 92.0 | 54280 | 0.9308 | 0.7364 |
| 0.1513 | 93.0 | 54870 | 0.9795 | 0.7398 |
| 0.1572 | 94.0 | 55460 | 0.9688 | 0.7382 |
| 0.1514 | 95.0 | 56050 | 1.0002 | 0.7410 |
| 0.1546 | 96.0 | 56640 | 0.9869 | 0.7401 |
| 0.1534 | 97.0 | 57230 | 0.9694 | 0.7370 |
| 0.1405 | 98.0 | 57820 | 0.9705 | 0.7404 |
| 0.149 | 99.0 | 58410 | 0.9859 | 0.7413 |
| 0.1456 | 100.0 | 59000 | 0.9853 | 0.7416 |
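
Validation accuracy peaks at 0.7425 around epoch 74 and drifts near 0.74 thereafter, while the final epoch ends at 0.7416. If retraining, one could retain the best checkpoint rather than the last one. A sketch using standard transformers options, not something the original run did:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Sketch, not part of the original run: keep the checkpoint with the best
# eval accuracy and stop once it fails to improve for 10 evaluations.
args = TrainingArguments(
    output_dir="1_6e-3_10_0.1",        # placeholder
    evaluation_strategy="epoch",
    save_strategy="epoch",             # must match evaluation_strategy
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",  # compute_metrics must report "accuracy"
    greater_is_better=True,
)
early_stopping = EarlyStoppingCallback(early_stopping_patience=10)
# Pass `args` and `callbacks=[early_stopping]` to a transformers.Trainer.
```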

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3