
1_6e-3_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (final-epoch checkpoint; see the training results table below):

  • Loss: 0.4885
  • Accuracy: 0.7401
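
The checkpoint can be loaded with the transformers Auto classes. A minimal inference sketch, assuming the repository id Onutoa/1_6e-3_1_0.5 and a sequence-classification head; the card does not document the expected input format or label mapping:

```python
# Minimal inference sketch. The question/passage input format is an
# assumption; the card does not document usage.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "Onutoa/1_6e-3_1_0.5"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer(
    "is the sky blue during the day",  # hypothetical question
    "The sky appears blue in daytime because of Rayleigh scattering.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```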

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
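
The card does not state which SuperGLUE task was used. The step counts are consistent with BoolQ: 590 optimizer steps per epoch at train_batch_size 16 implies roughly 9,400 training examples, matching BoolQ's 9,427-example training split, and the accuracy plateau at 0.6217 in the early epochs (see the training results below) matches BoolQ's majority-class validation baseline. Treating BoolQ as an assumption, the data could be loaded as follows:

```python
# Sketch of loading the presumed training data. The "boolq" configuration
# is inferred from the step counts, not documented in this card.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset["train"].num_rows)       # 9427 -> ~590 steps/epoch at batch size 16
print(dataset["validation"].num_rows)  # 3270
```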

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
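
A minimal sketch of how these values map onto transformers' TrainingArguments; the actual training script is not included in the card, and evaluation_strategy="epoch" is an assumption based on the per-epoch metrics reported below:

```python
# Hypothetical reconstruction of the configuration listed above; the
# original training script is not part of this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_6e-3_1_0.5",
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumed from the per-epoch results table
)
```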

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.9248        | 1.0   | 590   | 0.7400          | 0.3786   |
| 0.8836        | 2.0   | 1180  | 0.7971          | 0.3914   |
| 0.8513        | 3.0   | 1770  | 0.6664          | 0.6217   |
| 0.7488        | 4.0   | 2360  | 0.7384          | 0.6217   |
| 0.729         | 5.0   | 2950  | 1.0125          | 0.6217   |
| 0.7097        | 6.0   | 3540  | 0.7106          | 0.5046   |
| 0.6521        | 7.0   | 4130  | 0.5533          | 0.6098   |
| 0.6704        | 8.0   | 4720  | 0.4852          | 0.6587   |
| 0.6271        | 9.0   | 5310  | 0.5153          | 0.6850   |
| 0.6134        | 10.0  | 5900  | 0.4555          | 0.6948   |
| 0.5702        | 11.0  | 6490  | 0.4732          | 0.6716   |
| 0.5428        | 12.0  | 7080  | 0.4548          | 0.6963   |
| 0.5681        | 13.0  | 7670  | 0.4534          | 0.6859   |
| 0.5238        | 14.0  | 8260  | 0.6556          | 0.6725   |
| 0.5103        | 15.0  | 8850  | 0.5050          | 0.7110   |
| 0.5004        | 16.0  | 9440  | 0.4638          | 0.6813   |
| 0.4614        | 17.0  | 10030 | 0.4935          | 0.7113   |
| 0.4702        | 18.0  | 10620 | 0.4570          | 0.7040   |
| 0.4305        | 19.0  | 11210 | 0.4871          | 0.7190   |
| 0.4402        | 20.0  | 11800 | 0.5026          | 0.6722   |
| 0.4035        | 21.0  | 12390 | 0.4476          | 0.7208   |
| 0.3907        | 22.0  | 12980 | 0.6030          | 0.6367   |
| 0.3686        | 23.0  | 13570 | 0.4396          | 0.7131   |
| 0.3765        | 24.0  | 14160 | 0.4589          | 0.7180   |
| 0.3709        | 25.0  | 14750 | 0.4440          | 0.7107   |
| 0.3446        | 26.0  | 15340 | 1.0145          | 0.5728   |
| 0.3433        | 27.0  | 15930 | 0.6213          | 0.6627   |
| 0.331         | 28.0  | 16520 | 0.4566          | 0.7144   |
| 0.3373        | 29.0  | 17110 | 0.5484          | 0.7284   |
| 0.3117        | 30.0  | 17700 | 0.6371          | 0.6648   |
| 0.2988        | 31.0  | 18290 | 0.7013          | 0.7089   |
| 0.2928        | 32.0  | 18880 | 0.4553          | 0.7281   |
| 0.297         | 33.0  | 19470 | 0.5225          | 0.6976   |
| 0.2808        | 34.0  | 20060 | 0.4951          | 0.7343   |
| 0.2735        | 35.0  | 20650 | 0.5188          | 0.7095   |
| 0.2624        | 36.0  | 21240 | 0.4961          | 0.7367   |
| 0.2642        | 37.0  | 21830 | 0.4731          | 0.7254   |
| 0.2548        | 38.0  | 22420 | 0.4635          | 0.7260   |
| 0.2575        | 39.0  | 23010 | 0.4896          | 0.7073   |
| 0.244         | 40.0  | 23600 | 0.5605          | 0.7358   |
| 0.2472        | 41.0  | 24190 | 0.6450          | 0.7266   |
| 0.2433        | 42.0  | 24780 | 0.4922          | 0.7367   |
| 0.2312        | 43.0  | 25370 | 0.5115          | 0.7269   |
| 0.2355        | 44.0  | 25960 | 0.4879          | 0.7388   |
| 0.2204        | 45.0  | 26550 | 0.5023          | 0.7355   |
| 0.2223        | 46.0  | 27140 | 0.4976          | 0.7355   |
| 0.22          | 47.0  | 27730 | 0.5051          | 0.7364   |
| 0.2056        | 48.0  | 28320 | 0.4973          | 0.7205   |
| 0.2166        | 49.0  | 28910 | 0.5008          | 0.7180   |
| 0.2129        | 50.0  | 29500 | 0.5323          | 0.7382   |
| 0.1973        | 51.0  | 30090 | 0.5689          | 0.6908   |
| 0.2025        | 52.0  | 30680 | 0.4855          | 0.7367   |
| 0.1977        | 53.0  | 31270 | 0.5230          | 0.7211   |
| 0.1946        | 54.0  | 31860 | 0.5969          | 0.7333   |
| 0.2063        | 55.0  | 32450 | 0.5340          | 0.7098   |
| 0.1967        | 56.0  | 33040 | 0.5589          | 0.7361   |
| 0.1793        | 57.0  | 33630 | 0.5207          | 0.7358   |
| 0.1872        | 58.0  | 34220 | 0.4926          | 0.7394   |
| 0.1831        | 59.0  | 34810 | 0.5265          | 0.7434   |
| 0.1808        | 60.0  | 35400 | 0.5113          | 0.7407   |
| 0.1892        | 61.0  | 35990 | 0.4972          | 0.7416   |
| 0.1795        | 62.0  | 36580 | 0.5121          | 0.7391   |
| 0.172         | 63.0  | 37170 | 0.4857          | 0.7321   |
| 0.176         | 64.0  | 37760 | 0.5014          | 0.7232   |
| 0.1763        | 65.0  | 38350 | 0.5061          | 0.7370   |
| 0.1753        | 66.0  | 38940 | 0.4840          | 0.7358   |
| 0.1716        | 67.0  | 39530 | 0.5262          | 0.7361   |
| 0.1675        | 68.0  | 40120 | 0.4844          | 0.7324   |
| 0.1647        | 69.0  | 40710 | 0.5357          | 0.7440   |
| 0.1702        | 70.0  | 41300 | 0.4852          | 0.7394   |
| 0.1666        | 71.0  | 41890 | 0.4749          | 0.7391   |
| 0.162         | 72.0  | 42480 | 0.5616          | 0.7385   |
| 0.1546        | 73.0  | 43070 | 0.5089          | 0.7352   |
| 0.1525        | 74.0  | 43660 | 0.5315          | 0.7382   |
| 0.1595        | 75.0  | 44250 | 0.5300          | 0.7419   |
| 0.1555        | 76.0  | 44840 | 0.5664          | 0.7407   |
| 0.1604        | 77.0  | 45430 | 0.5057          | 0.7416   |
| 0.1584        | 78.0  | 46020 | 0.5008          | 0.7355   |
| 0.1574        | 79.0  | 46610 | 0.5206          | 0.7398   |
| 0.1552        | 80.0  | 47200 | 0.5176          | 0.7361   |
| 0.1501        | 81.0  | 47790 | 0.4955          | 0.7376   |
| 0.1492        | 82.0  | 48380 | 0.5001          | 0.7391   |
| 0.1508        | 83.0  | 48970 | 0.4963          | 0.7379   |
| 0.1463        | 84.0  | 49560 | 0.5148          | 0.7413   |
| 0.1449        | 85.0  | 50150 | 0.4868          | 0.7349   |
| 0.1489        | 86.0  | 50740 | 0.5012          | 0.7419   |
| 0.1415        | 87.0  | 51330 | 0.4963          | 0.7321   |
| 0.145         | 88.0  | 51920 | 0.5046          | 0.7291   |
| 0.1375        | 89.0  | 52510 | 0.5011          | 0.7416   |
| 0.1387        | 90.0  | 53100 | 0.5041          | 0.7440   |
| 0.1428        | 91.0  | 53690 | 0.4940          | 0.7425   |
| 0.1442        | 92.0  | 54280 | 0.4912          | 0.7401   |
| 0.139         | 93.0  | 54870 | 0.5014          | 0.7428   |
| 0.1406        | 94.0  | 55460 | 0.4919          | 0.7391   |
| 0.1387        | 95.0  | 56050 | 0.5063          | 0.7446   |
| 0.1368        | 96.0  | 56640 | 0.4902          | 0.7410   |
| 0.1391        | 97.0  | 57230 | 0.4947          | 0.7407   |
| 0.136         | 98.0  | 57820 | 0.4922          | 0.7413   |
| 0.133         | 99.0  | 58410 | 0.4926          | 0.7394   |
| 0.1379        | 100.0 | 59000 | 0.4885          | 0.7401   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
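
A quick way to check that a local environment matches these versions (a sketch; CUDA builds of PyTorch carry a local suffix such as +cu117 in their version string):

```python
# Environment check against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.30.0"
assert torch.__version__.startswith("2.0.1")  # e.g. "2.0.1+cu117"
assert datasets.__version__ == "2.14.4"
assert tokenizers.__version__ == "0.13.3"
```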