1_1e-2_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 100, step 59000), it achieves the following results on the evaluation set:

Loss: 0.4701
Accuracy: 0.7431

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.01
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
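
To make the combination of learning_rate and lr_scheduler_type concrete, here is a minimal sketch of a linear decay schedule in plain Python. It assumes no warmup (none is listed above), a decay to zero at the last of the 59000 steps shown in the training results, and 590 steps per epoch as in that table. The function name linear_lr and its parameters are illustrative and not part of the original training script.

```python
def linear_lr(step: int, base_lr: float = 0.01, total_steps: int = 59_000,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under a linear schedule.

    Assumption: zero warmup and linear decay to 0 at total_steps.
    """
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr (unused when warmup_steps == 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    steps_per_epoch = 590  # taken from the Step column of the results table
    for epoch in (0, 1, 50, 100):
        lr = linear_lr(epoch * steps_per_epoch)
        print(f"epoch {epoch:>3}: lr = {lr:.6f}")
```

Under these assumptions the rate falls to half its initial value by epoch 50 and reaches zero at epoch 100.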

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.2311	1.0	590	1.5093	0.6217
1.0444	2.0	1180	0.5788	0.6196
0.9287	3.0	1770	1.3468	0.6217
0.8066	4.0	2360	0.7094	0.6217
0.6756	5.0	2950	0.5829	0.6486
0.5869	6.0	3540	0.5398	0.6670
0.5733	7.0	4130	0.6279	0.5716
0.5229	8.0	4720	0.4543	0.7061
0.4998	9.0	5310	0.4906	0.6685
0.476	10.0	5900	0.5972	0.6927
0.4498	11.0	6490	0.4602	0.7049
0.4082	12.0	7080	0.4432	0.7012
0.4072	13.0	7670	0.4585	0.6963
0.3746	14.0	8260	0.4281	0.7312
0.3652	15.0	8850	0.4691	0.7294
0.3505	16.0	9440	0.4156	0.7303
0.3375	17.0	10030	0.4299	0.7275
0.3298	18.0	10620	0.4948	0.7
0.3056	19.0	11210	0.4208	0.7275
0.2956	20.0	11800	0.4474	0.7324
0.2859	21.0	12390	0.5893	0.6746
0.2807	22.0	12980	0.4613	0.7291
0.2566	23.0	13570	0.4610	0.7235
0.249	24.0	14160	0.5434	0.7413
0.2391	25.0	14750	0.5110	0.7333
0.2421	26.0	15340	0.6915	0.6465
0.2556	27.0	15930	0.4759	0.7306
0.2271	28.0	16520	0.4690	0.7321
0.2295	29.0	17110	0.5012	0.7376
0.2283	30.0	17700	0.5150	0.7128
0.2054	31.0	18290	0.4737	0.7343
0.2157	32.0	18880	0.6032	0.7327
0.215	33.0	19470	0.4818	0.7297
0.196	34.0	20060	0.4894	0.7147
0.2001	35.0	20650	0.5326	0.7193
0.1955	36.0	21240	0.4826	0.7413
0.1947	37.0	21830	0.4625	0.7385
0.1912	38.0	22420	0.4764	0.7492
0.1946	39.0	23010	0.5615	0.7443
0.1898	40.0	23600	0.4870	0.7413
0.1789	41.0	24190	0.5526	0.7462
0.1803	42.0	24780	0.5021	0.7217
0.1708	43.0	25370	0.4751	0.7379
0.1835	44.0	25960	0.4738	0.7355
0.1738	45.0	26550	0.4759	0.7336
0.1726	46.0	27140	0.4928	0.7367
0.1756	47.0	27730	0.5380	0.7193
0.1617	48.0	28320	0.5119	0.7327
0.1725	49.0	28910	0.4884	0.7431
0.1643	50.0	29500	0.4968	0.7382
0.1593	51.0	30090	0.4708	0.7281
0.1645	52.0	30680	0.4943	0.7364
0.1566	53.0	31270	0.4820	0.7446
0.1555	54.0	31860	0.5117	0.7376
0.1584	55.0	32450	0.5269	0.7410
0.1587	56.0	33040	0.4650	0.7394
0.1527	57.0	33630	0.5007	0.7431
0.157	58.0	34220	0.4689	0.7413
0.1527	59.0	34810	0.4960	0.7306
0.1461	60.0	35400	0.5033	0.7416
0.1506	61.0	35990	0.4817	0.7459
0.153	62.0	36580	0.4782	0.7422
0.1417	63.0	37170	0.4808	0.7410
0.1477	64.0	37760	0.5090	0.7358
0.1467	65.0	38350	0.5180	0.7419
0.1416	66.0	38940	0.5055	0.7483
0.1407	67.0	39530	0.4779	0.7416
0.1407	68.0	40120	0.4661	0.7401
0.1379	69.0	40710	0.5172	0.7450
0.1432	70.0	41300	0.4883	0.7422
0.1455	71.0	41890	0.4853	0.7382
0.1348	72.0	42480	0.4934	0.7465
0.134	73.0	43070	0.4773	0.7462
0.1323	74.0	43660	0.5033	0.7428
0.1356	75.0	44250	0.5184	0.7483
0.1321	76.0	44840	0.4860	0.7382
0.1328	77.0	45430	0.4800	0.7422
0.1334	78.0	46020	0.4668	0.7489
0.128	79.0	46610	0.4930	0.7498
0.1315	80.0	47200	0.4808	0.7410
0.1236	81.0	47790	0.4718	0.7456
0.1286	82.0	48380	0.4723	0.7413
0.1264	83.0	48970	0.4987	0.7480
0.1273	84.0	49560	0.4582	0.7492
0.1243	85.0	50150	0.4713	0.7471
0.1286	86.0	50740	0.4913	0.7437
0.1186	87.0	51330	0.4953	0.7495
0.1194	88.0	51920	0.4805	0.7486
0.118	89.0	52510	0.4799	0.7474
0.1236	90.0	53100	0.4829	0.7471
0.1201	91.0	53690	0.4736	0.7474
0.1235	92.0	54280	0.4695	0.7431
0.1214	93.0	54870	0.4781	0.7446
0.1188	94.0	55460	0.4701	0.7456
0.1191	95.0	56050	0.4681	0.7456
0.1144	96.0	56640	0.4737	0.7453
0.1212	97.0	57230	0.4736	0.7446
0.1152	98.0	57820	0.4668	0.7410
0.1153	99.0	58410	0.4743	0.7437
0.1194	100.0	59000	0.4701	0.7431
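
The reported metrics come from the final epoch, which is not the best row in the table: the highest accuracy is 0.7498 at epoch 79 and the lowest validation loss is 0.4156 at epoch 16. The sketch below shows one way to find such rows from a tab-separated copy of the table. Only a handful of rows are included for brevity, and the variable names are illustrative.

```python
import csv
import io

# A few rows copied from the training results table (tab-separated).
ROWS = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\n"
    "1.2311\t1.0\t590\t1.5093\t0.6217\n"
    "0.3505\t16.0\t9440\t0.4156\t0.7303\n"
    "0.128\t79.0\t46610\t0.4930\t0.7498\n"
    "0.1194\t100.0\t59000\t0.4701\t0.7431\n"
)

# Parse every field as a float so rows can be compared numerically.
rows = [
    {name: float(value) for name, value in record.items()}
    for record in csv.DictReader(io.StringIO(ROWS), delimiter="\t")
]

best_acc = max(rows, key=lambda r: r["Accuracy"])
best_loss = min(rows, key=lambda r: r["Validation Loss"])

print(f"best accuracy: {best_acc['Accuracy']:.4f} at epoch {best_acc['Epoch']:.0f}")
print(f"lowest validation loss: {best_loss['Validation Loss']:.4f} "
      f"at epoch {best_loss['Epoch']:.0f}")
```

Pasting the full table into ROWS gives the same two epochs, since the rows above include the best value of each metric.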

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Model repository: Onutoa/1_1e-2_1_0.5
