1_5e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

Loss: 0.9323
Accuracy: 0.7440

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.005
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
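
As a hedged sketch, the list above can be mapped onto keyword arguments for `transformers.TrainingArguments`. The argument names are the library's own; `output_dir` is a hypothetical path, and the mapping assumes a single device, so that `train_batch_size` equals the per-device batch size. The helper `linear_lr` is illustrative only: it assumes zero warmup steps (the card lists none) and takes the total of 59,000 steps from the final row of the training results.

```python
# Keyword arguments mirroring the hyperparameters listed on this card.
# Intended for transformers.TrainingArguments(**training_kwargs); the import
# is left out so the snippet runs without the library installed.
training_kwargs = {
    "output_dir": "./1_5e-3_10_0.1",  # hypothetical path, not from the card
    "learning_rate": 5e-3,
    "per_device_train_batch_size": 16,  # assumes one device
    "per_device_eval_batch_size": 8,
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
}

TOTAL_STEPS = 59_000  # final step in the training results table


def linear_lr(step: int, base_lr: float = 5e-3, total_steps: int = TOTAL_STEPS) -> float:
    """Linear decay from base_lr to 0, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Learning rate at the start, midpoint, and end of training.
for step in (0, 29_500, 59_000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.005, is halved by step 29,500 (epoch 50), and reaches zero at step 59,000.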

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.4938	1.0	590	2.3680	0.6217
1.4628	2.0	1180	1.4346	0.6217
1.3133	3.0	1770	0.9694	0.6260
1.2680	4.0	2360	1.1126	0.6217
1.0913	5.0	2950	0.9254	0.6587
1.0518	6.0	3540	0.8635	0.6593
1.0309	7.0	4130	1.3201	0.5049
0.9539	8.0	4720	0.8164	0.6801
0.9364	9.0	5310	1.3605	0.6254
0.9419	10.0	5900	0.7974	0.6844
0.9314	11.0	6490	1.3755	0.5486
0.8216	12.0	7080	0.7721	0.7012
0.8582	13.0	7670	0.7902	0.6902
0.7695	14.0	8260	0.7552	0.6945
0.7901	15.0	8850	0.8217	0.7144
0.7382	16.0	9440	0.8028	0.6844
0.7009	17.0	10030	0.7778	0.6994
0.6922	18.0	10620	0.9600	0.6688
0.6409	19.0	11210	0.8214	0.7104
0.6419	20.0	11800	1.1320	0.7031
0.6205	21.0	12390	0.7671	0.7232
0.6242	22.0	12980	0.8438	0.7208
0.5749	23.0	13570	1.0312	0.7214
0.5669	24.0	14160	0.7602	0.7242
0.5499	25.0	14750	0.8538	0.7294
0.5258	26.0	15340	1.5849	0.5807
0.5256	27.0	15930	0.8285	0.7306
0.4910	28.0	16520	0.8039	0.7180
0.4844	29.0	17110	0.7899	0.7382
0.4584	30.0	17700	0.8144	0.7309
0.4602	31.0	18290	1.0077	0.7196
0.4467	32.0	18880	0.9234	0.7306
0.4267	33.0	19470	0.8644	0.7174
0.4031	34.0	20060	0.8536	0.7226
0.3862	35.0	20650	0.8552	0.7385
0.3811	36.0	21240	0.9266	0.7373
0.3814	37.0	21830	0.9688	0.7147
0.3613	38.0	22420	0.8678	0.7434
0.3528	39.0	23010	0.8885	0.7309
0.3563	40.0	23600	0.9239	0.7446
0.3507	41.0	24190	0.9006	0.7450
0.3437	42.0	24780	1.0086	0.7281
0.3138	43.0	25370	0.9287	0.7361
0.3208	44.0	25960	0.9420	0.7318
0.3214	45.0	26550	0.9205	0.7339
0.3013	46.0	27140	0.9259	0.7248
0.3066	47.0	27730	0.8718	0.7388
0.2987	48.0	28320	0.9665	0.7214
0.3116	49.0	28910	0.9426	0.7410
0.2766	50.0	29500	0.8971	0.7428
0.2683	51.0	30090	1.0176	0.7437
0.2700	52.0	30680	0.9311	0.7382
0.2653	53.0	31270	0.9399	0.7336
0.2583	54.0	31860	0.8990	0.7281
0.2582	55.0	32450	0.9761	0.7419
0.2616	56.0	33040	0.8687	0.7480
0.2401	57.0	33630	0.9587	0.7266
0.2426	58.0	34220	0.9359	0.7474
0.2466	59.0	34810	0.9008	0.7385
0.2351	60.0	35400	0.9119	0.7462
0.2370	61.0	35990	0.9495	0.7425
0.2329	62.0	36580	0.9731	0.7446
0.2235	63.0	37170	0.9495	0.7379
0.2251	64.0	37760	0.9236	0.7343
0.2235	65.0	38350	0.9289	0.7483
0.2237	66.0	38940	0.9300	0.7364
0.2159	67.0	39530	0.9430	0.7434
0.2201	68.0	40120	0.9144	0.7453
0.2075	69.0	40710	0.9126	0.7477
0.2195	70.0	41300	0.9387	0.7529
0.2036	71.0	41890	0.9798	0.7349
0.2116	72.0	42480	1.0175	0.7492
0.1953	73.0	43070	0.9082	0.7498
0.2003	74.0	43660	0.9919	0.7443
0.2016	75.0	44250	0.9649	0.7453
0.1997	76.0	44840	0.9454	0.7398
0.2065	77.0	45430	0.9424	0.7440
0.1983	78.0	46020	0.9516	0.7361
0.1937	79.0	46610	0.9370	0.7404
0.1826	80.0	47200	0.9395	0.7468
0.1816	81.0	47790	0.9566	0.7446
0.1931	82.0	48380	0.9800	0.7508
0.1866	83.0	48970	0.9390	0.7459
0.1831	84.0	49560	0.9383	0.7440
0.1830	85.0	50150	0.9607	0.7422
0.1825	86.0	50740	0.9608	0.7468
0.1780	87.0	51330	0.9820	0.7456
0.1787	88.0	51920	0.9598	0.7468
0.1745	89.0	52510	0.9419	0.7483
0.1739	90.0	53100	0.9796	0.7495
0.1752	91.0	53690	0.9373	0.7477
0.1773	92.0	54280	0.9310	0.7370
0.1699	93.0	54870	0.9641	0.7456
0.1738	94.0	55460	0.9418	0.7468
0.1698	95.0	56050	0.9586	0.7450
0.1658	96.0	56640	0.9488	0.7468
0.1658	97.0	57230	0.9450	0.7471
0.1641	98.0	57820	0.9307	0.7459
0.1695	99.0	58410	0.9487	0.7443
0.1653	100.0	59000	0.9323	0.7440
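
The table can be summarized with a few lines of Python. The sketch below is illustrative rather than the author's evaluation code: `rows` holds a hand-picked subset of the rows above (copied verbatim), and `summarize` is a hypothetical helper. The subset includes the epochs with the lowest validation loss and the highest accuracy in the full table, so the result matches what the full table gives. The last lines bound the training set size from 590 steps per epoch at batch size 16, assuming one device, no gradient accumulation, and that the final partial batch is kept.

```python
import math

# (epoch, training loss, validation loss, accuracy), copied from the table above.
rows = [
    (1, 1.4938, 2.3680, 0.6217),
    (14, 0.7695, 0.7552, 0.6945),
    (41, 0.3507, 0.9006, 0.7450),
    (70, 0.2195, 0.9387, 0.7529),
    (100, 0.1653, 0.9323, 0.7440),
]


def summarize(rows):
    """Return the rows with the lowest validation loss and highest accuracy, plus the last row."""
    best_loss = min(rows, key=lambda r: r[2])
    best_acc = max(rows, key=lambda r: r[3])
    return {"best_loss": best_loss, "best_acc": best_acc, "final": rows[-1]}


summary = summarize(rows)
print("lowest validation loss: epoch", summary["best_loss"][0])  # epoch 14
print("highest accuracy: epoch", summary["best_acc"][0])         # epoch 70
print("final epoch accuracy:", summary["final"][3])              # 0.744

# ceil(n / batch_size) == steps_per_epoch holds for every n in this inclusive range.
steps_per_epoch, batch_size = 590, 16
n_min = (steps_per_epoch - 1) * batch_size + 1
n_max = steps_per_epoch * batch_size
assert math.ceil(n_min / batch_size) == math.ceil(n_max / batch_size) == steps_per_epoch
print("training examples:", n_min, "to", n_max)  # 9425 to 9440
```

The evaluation figures at the top of the card (loss 0.9323, accuracy 0.7440) match the final epoch rather than the best one: accuracy peaks at 0.7529 in epoch 70, and validation loss is lowest at 0.7552 in epoch 14. The range of 9,425 to 9,440 training examples would be consistent with the BoolQ subset of SuperGLUE (9,427 training examples), though the card does not name the subset.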

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
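
To check a local environment against the versions listed above, a minimal sketch using the standard library's `importlib.metadata` could look like this. `CARD_VERSIONS`, `installed_version`, and `version_mismatches` are hypothetical names introduced for illustration, and the keys assume the usual PyPI distribution names (`torch` for PyTorch).

```python
from importlib import metadata

# Versions listed on this card, keyed by PyPI distribution name.
# "torch" is the distribution name for PyTorch.
CARD_VERSIONS = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected):
    """Map each package whose installed version differs to (expected, installed)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Report every package that is missing or at a different version.
for name, (wanted, found) in version_mismatches(CARD_VERSIONS).items():
    print(f"{name}: card lists {wanted}, installed {found}")
```

The comparison is an exact string match, so a build of PyTorch with a different local tag (for example a CPU-only build) is reported as a mismatch even when the release number agrees.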

Model repository: Onutoa/1_5e-3_10_0.1
