1_7e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (these match the final epoch, epoch 100, in the training results below):

Loss: 0.9382
Accuracy: 0.7557

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.007
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
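
For reference, the linear schedule listed above can be sketched in plain Python. This is a minimal sketch, not the training code: it assumes no warmup steps (the card does not list any) and takes the total of 59,000 optimizer steps from the last row of the training results. `HPARAMS`, `TOTAL_STEPS`, and `linear_lr` are hypothetical names chosen for illustration, not part of any library.

```python
# Hyperparameters copied from the list above.
HPARAMS = {
    "learning_rate": 0.007,
    "train_batch_size": 16,
    "eval_batch_size": 8,
    "seed": 11,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 100,
}

# Assumption: 59,000 total steps, as reported in the final row of the
# training results (100 epochs x 590 steps per epoch).
STEPS_PER_EPOCH = 590
TOTAL_STEPS = STEPS_PER_EPOCH * HPARAMS["num_epochs"]


def linear_lr(step, base_lr=HPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 50, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate starts at 0.007, is halved by the midpoint of training, and reaches zero at the final step.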

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.7912	1.0	590	2.5545	0.3872
3.233	2.0	1180	2.8480	0.6217
2.7249	3.0	1770	2.7584	0.4037
2.5026	4.0	2360	1.8755	0.6113
2.235	5.0	2950	1.6668	0.6661
1.9303	6.0	3540	1.6441	0.6346
1.9491	7.0	4130	2.1352	0.5789
1.6294	8.0	4720	2.2811	0.6572
1.6591	9.0	5310	1.5834	0.6896
1.5251	10.0	5900	1.7600	0.6716
1.5112	11.0	6490	1.2400	0.6905
1.3972	12.0	7080	1.2023	0.7165
1.3804	13.0	7670	1.1972	0.7009
1.3085	14.0	8260	1.6154	0.7101
1.2559	15.0	8850	1.1741	0.7
1.2292	16.0	9440	1.1551	0.7028
1.1711	17.0	10030	1.9400	0.6242
1.1356	18.0	10620	1.1234	0.7165
1.0466	19.0	11210	1.0939	0.7312
1.1043	20.0	11800	1.2564	0.7183
0.9875	21.0	12390	1.1273	0.7135
0.9788	22.0	12980	1.0513	0.7187
0.9086	23.0	13570	1.0497	0.7312
0.9327	24.0	14160	1.1127	0.7046
0.8835	25.0	14750	1.3732	0.7235
0.8652	26.0	15340	1.6447	0.6511
0.843	27.0	15930	1.1686	0.7425
0.8072	28.0	16520	1.0110	0.7446
0.7735	29.0	17110	1.1610	0.7401
0.7717	30.0	17700	0.9851	0.7352
0.7746	31.0	18290	1.4960	0.7223
0.7439	32.0	18880	0.9772	0.7358
0.7534	33.0	19470	1.0034	0.7456
0.6874	34.0	20060	0.9894	0.7407
0.6877	35.0	20650	1.4460	0.6771
0.6816	36.0	21240	1.0221	0.7489
0.7158	37.0	21830	1.3579	0.7425
0.6694	38.0	22420	1.1472	0.7517
0.6586	39.0	23010	1.0499	0.7523
0.6418	40.0	23600	1.0344	0.7459
0.6366	41.0	24190	1.2582	0.7422
0.6289	42.0	24780	0.9833	0.7370
0.6065	43.0	25370	1.0209	0.7529
0.6053	44.0	25960	1.0147	0.7287
0.5958	45.0	26550	0.9454	0.7456
0.5637	46.0	27140	0.9789	0.7535
0.5818	47.0	27730	1.0014	0.7529
0.5743	48.0	28320	0.9380	0.7526
0.592	49.0	28910	0.9494	0.7385
0.5591	50.0	29500	0.9728	0.7523
0.5431	51.0	30090	0.9528	0.7502
0.5537	52.0	30680	0.9995	0.7410
0.5444	53.0	31270	0.9815	0.7538
0.5372	54.0	31860	0.9556	0.7517
0.5491	55.0	32450	0.9824	0.7459
0.5294	56.0	33040	0.9625	0.7391
0.5074	57.0	33630	0.9761	0.7538
0.5127	58.0	34220	1.1065	0.7587
0.5095	59.0	34810	0.9373	0.7434
0.5079	60.0	35400	0.9822	0.7532
0.4886	61.0	35990	1.0654	0.7627
0.5143	62.0	36580	0.9688	0.7520
0.4822	63.0	37170	0.9816	0.7373
0.4956	64.0	37760	0.9746	0.7477
0.4953	65.0	38350	0.9493	0.7544
0.4794	66.0	38940	1.0795	0.7532
0.4794	67.0	39530	0.9915	0.7575
0.48	68.0	40120	0.9385	0.7498
0.4633	69.0	40710	1.0949	0.7526
0.4749	70.0	41300	1.0207	0.7557
0.4657	71.0	41890	0.9383	0.7428
0.465	72.0	42480	1.0948	0.7581
0.4558	73.0	43070	0.9506	0.7492
0.4516	74.0	43660	1.0518	0.7606
0.4577	75.0	44250	1.0124	0.7575
0.4642	76.0	44840	0.9293	0.7526
0.4497	77.0	45430	0.9862	0.7541
0.4614	78.0	46020	0.9403	0.7566
0.4442	79.0	46610	0.9599	0.7581
0.4483	80.0	47200	0.9766	0.7593
0.4223	81.0	47790	0.9297	0.7547
0.4416	82.0	48380	0.9614	0.7587
0.4279	83.0	48970	0.9403	0.7587
0.4159	84.0	49560	1.0827	0.7569
0.4319	85.0	50150	0.9250	0.7505
0.427	86.0	50740	0.9475	0.7517
0.427	87.0	51330	0.9429	0.7523
0.4233	88.0	51920	0.9721	0.7581
0.4167	89.0	52510	0.9387	0.7557
0.4162	90.0	53100	0.9282	0.7544
0.4163	91.0	53690	0.9785	0.7566
0.4214	92.0	54280	0.9217	0.7517
0.4038	93.0	54870	0.9470	0.7584
0.4258	94.0	55460	0.9254	0.7550
0.4206	95.0	56050	0.9380	0.7569
0.4086	96.0	56640	0.9379	0.7578
0.3973	97.0	57230	0.9425	0.7557
0.3971	98.0	57820	0.9461	0.7572
0.3899	99.0	58410	0.9388	0.7557
0.4033	100.0	59000	0.9382	0.7557
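
The table above can be scanned programmatically to find the strongest checkpoints. The following is a minimal sketch that assumes the table is available as tab-separated text; only four rows are included here as an excerpt, and `TABLE` and `parse_rows` are hypothetical names. In the full table, the highest accuracy is 0.7627 at epoch 61 and the lowest validation loss is 0.9217 at epoch 92, while the headline results correspond to epoch 100.

```python
import csv
import io

# Excerpt of the table above (tab-separated); paste the full table
# here to analyze every epoch.
TABLE = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\n"
    "0.4886\t61.0\t35990\t1.0654\t0.7627\n"
    "0.4516\t74.0\t43660\t1.0518\t0.7606\n"
    "0.4214\t92.0\t54280\t0.9217\t0.7517\n"
    "0.4033\t100.0\t59000\t0.9382\t0.7557\n"
)


def parse_rows(text):
    """Parse the tab-separated table into a list of dicts with float values."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{name: float(value) for name, value in row.items()} for row in reader]


rows = parse_rows(TABLE)
best_acc = max(rows, key=lambda r: r["Accuracy"])
best_loss = min(rows, key=lambda r: r["Validation Loss"])

print(f"best accuracy:        {best_acc['Accuracy']:.4f} at epoch {best_acc['Epoch']:.0f}")
print(f"lowest validation loss: {best_loss['Validation Loss']:.4f} at epoch {best_loss['Epoch']:.0f}")
```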

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
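
To compare a local environment against the versions listed above, something like the following could be used. It is a sketch using only the standard library; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are an assumption about how the packages were installed, and `EXPECTED` and `compare_versions` are hypothetical names.

```python
from importlib import metadata

# Versions listed in the card. Keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(expected):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions(EXPECTED).items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```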

Dataset used to train Onutoa/1_7e-3_10_0.5: super_glue