# 1_6e-3_5_0.5
This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.8860
- Accuracy: 0.7462
## Model description

More information needed

## Intended uses & limitations

More information needed
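That said, a minimal inference sketch follows. The repository id is a placeholder assumption (this card does not state the hub path), and since the specific SuperGLUE task is not documented here, the sentence-pair input is purely illustrative:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical repository id -- replace with the actual path of this checkpoint.
repo_id = "your-username/1_6e-3_5_0.5"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# SuperGLUE tasks are largely sentence- or sentence-pair classification;
# this input pairing is illustrative only.
inputs = tokenizer("The sky is blue.", "Is the sky blue?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```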
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.006
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
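For reference, here is a minimal sketch of how these values map onto `TrainingArguments` in Transformers 4.30. The training script is not included in this card, so everything beyond the listed values (output directory, evaluation strategy, dataset wiring) is an assumption:

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

# Hyperparameters as listed above, using Transformers 4.30 argument names.
training_args = TrainingArguments(
    output_dir="1_6e-3_5_0.5",    # assumed; matches the model name
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumed from the per-epoch results below
)

# Base model and tokenizer as stated in the card; num_labels depends on the
# (unstated) SuperGLUE task and is left at its default here.
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")
```

These would then be passed to a `Trainer` together with the tokenized super_glue splits.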
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.5324 | 1.0 | 590 | 2.6875 | 0.6217 |
| 2.4802 | 2.0 | 1180 | 3.4068 | 0.6214 |
| 2.6163 | 3.0 | 1770 | 3.8107 | 0.3841 |
| 2.2085 | 4.0 | 2360 | 2.0912 | 0.5021 |
| 2.1045 | 5.0 | 2950 | 1.6305 | 0.6394 |
| 1.7984 | 6.0 | 3540 | 1.8421 | 0.6352 |
| 1.7236 | 7.0 | 4130 | 1.3822 | 0.6550 |
| 1.6613 | 8.0 | 4720 | 1.3880 | 0.6939 |
| 1.5506 | 9.0 | 5310 | 2.7376 | 0.6498 |
| 1.6032 | 10.0 | 5900 | 1.9660 | 0.5471 |
| 1.4851 | 11.0 | 6490 | 1.2698 | 0.7015 |
| 1.3779 | 12.0 | 7080 | 1.1481 | 0.7070 |
| 1.315 | 13.0 | 7670 | 1.1203 | 0.6963 |
| 1.3238 | 14.0 | 8260 | 1.1089 | 0.7040 |
| 1.2662 | 15.0 | 8850 | 1.0526 | 0.7211 |
| 1.2489 | 16.0 | 9440 | 1.0878 | 0.6905 |
| 1.1504 | 17.0 | 10030 | 1.1004 | 0.7232 |
| 1.1289 | 18.0 | 10620 | 1.2881 | 0.6615 |
| 1.0159 | 19.0 | 11210 | 0.9890 | 0.7196 |
| 1.1298 | 20.0 | 11800 | 1.0623 | 0.7070 |
| 0.9891 | 21.0 | 12390 | 1.2508 | 0.7211 |
| 0.9865 | 22.0 | 12980 | 1.3142 | 0.6630 |
| 0.996 | 23.0 | 13570 | 1.0147 | 0.7125 |
| 0.9373 | 24.0 | 14160 | 1.0033 | 0.7281 |
| 0.9647 | 25.0 | 14750 | 2.0608 | 0.6920 |
| 0.8803 | 26.0 | 15340 | 0.9517 | 0.7312 |
| 0.8541 | 27.0 | 15930 | 0.9624 | 0.7266 |
| 0.8476 | 28.0 | 16520 | 0.9491 | 0.7239 |
| 0.8058 | 29.0 | 17110 | 0.9725 | 0.7385 |
| 0.8055 | 30.0 | 17700 | 0.9748 | 0.7248 |
| 0.788 | 31.0 | 18290 | 1.0021 | 0.7333 |
| 0.7576 | 32.0 | 18880 | 0.9257 | 0.7358 |
| 0.7698 | 33.0 | 19470 | 1.1881 | 0.6872 |
| 0.7371 | 34.0 | 20060 | 0.9496 | 0.7303 |
| 0.7355 | 35.0 | 20650 | 0.9241 | 0.7306 |
| 0.7062 | 36.0 | 21240 | 0.9682 | 0.7336 |
| 0.6691 | 37.0 | 21830 | 0.9349 | 0.7358 |
| 0.6613 | 38.0 | 22420 | 0.9785 | 0.7437 |
| 0.7068 | 39.0 | 23010 | 0.9227 | 0.7416 |
| 0.6189 | 40.0 | 23600 | 1.1750 | 0.7419 |
| 0.6352 | 41.0 | 24190 | 1.1787 | 0.7394 |
| 0.63 | 42.0 | 24780 | 0.9740 | 0.7422 |
| 0.6166 | 43.0 | 25370 | 1.2322 | 0.7376 |
| 0.6076 | 44.0 | 25960 | 0.9889 | 0.7260 |
| 0.6081 | 45.0 | 26550 | 1.2527 | 0.6783 |
| 0.5942 | 46.0 | 27140 | 0.9813 | 0.7214 |
| 0.5892 | 47.0 | 27730 | 0.9268 | 0.7391 |
| 0.5552 | 48.0 | 28320 | 0.9250 | 0.7425 |
| 0.5875 | 49.0 | 28910 | 0.9149 | 0.7306 |
| 0.5532 | 50.0 | 29500 | 0.9487 | 0.7272 |
| 0.5467 | 51.0 | 30090 | 0.9219 | 0.7355 |
| 0.5536 | 52.0 | 30680 | 0.9884 | 0.7431 |
| 0.5306 | 53.0 | 31270 | 1.0661 | 0.7165 |
| 0.5382 | 54.0 | 31860 | 0.9046 | 0.7379 |
| 0.5506 | 55.0 | 32450 | 1.0618 | 0.7150 |
| 0.5427 | 56.0 | 33040 | 0.9165 | 0.7434 |
| 0.513 | 57.0 | 33630 | 1.2612 | 0.7358 |
| 0.5008 | 58.0 | 34220 | 0.9674 | 0.7388 |
| 0.4962 | 59.0 | 34810 | 0.9219 | 0.7346 |
| 0.5079 | 60.0 | 35400 | 0.9093 | 0.7413 |
| 0.4973 | 61.0 | 35990 | 0.9088 | 0.7343 |
| 0.4938 | 62.0 | 36580 | 0.8926 | 0.7404 |
| 0.4984 | 63.0 | 37170 | 1.0869 | 0.7080 |
| 0.4907 | 64.0 | 37760 | 0.9026 | 0.7343 |
| 0.4727 | 65.0 | 38350 | 0.8803 | 0.7410 |
| 0.4667 | 66.0 | 38940 | 0.9391 | 0.7404 |
| 0.4706 | 67.0 | 39530 | 0.9321 | 0.7343 |
| 0.4696 | 68.0 | 40120 | 0.9011 | 0.7446 |
| 0.4471 | 69.0 | 40710 | 0.9192 | 0.7450 |
| 0.4535 | 70.0 | 41300 | 1.1121 | 0.7483 |
| 0.4664 | 71.0 | 41890 | 0.8832 | 0.7346 |
| 0.4462 | 72.0 | 42480 | 0.8937 | 0.7413 |
| 0.4247 | 73.0 | 43070 | 0.9067 | 0.7419 |
| 0.4218 | 74.0 | 43660 | 0.9289 | 0.7416 |
| 0.4553 | 75.0 | 44250 | 0.9095 | 0.7453 |
| 0.4485 | 76.0 | 44840 | 0.9062 | 0.7477 |
| 0.432 | 77.0 | 45430 | 0.8999 | 0.7394 |
| 0.4325 | 78.0 | 46020 | 0.8833 | 0.7523 |
| 0.4293 | 79.0 | 46610 | 0.9077 | 0.7495 |
| 0.4259 | 80.0 | 47200 | 0.9243 | 0.7440 |
| 0.4056 | 81.0 | 47790 | 0.9145 | 0.7431 |
| 0.424 | 82.0 | 48380 | 0.9100 | 0.7450 |
| 0.418 | 83.0 | 48970 | 0.9334 | 0.7532 |
| 0.4122 | 84.0 | 49560 | 0.9404 | 0.7511 |
| 0.4023 | 85.0 | 50150 | 0.9007 | 0.7443 |
| 0.4066 | 86.0 | 50740 | 0.9115 | 0.7474 |
| 0.4065 | 87.0 | 51330 | 0.9344 | 0.7443 |
| 0.4098 | 88.0 | 51920 | 0.9139 | 0.7453 |
| 0.3902 | 89.0 | 52510 | 0.9120 | 0.7398 |
| 0.3926 | 90.0 | 53100 | 0.9105 | 0.7425 |
| 0.3994 | 91.0 | 53690 | 0.9182 | 0.7394 |
| 0.3998 | 92.0 | 54280 | 0.8989 | 0.7446 |
| 0.3961 | 93.0 | 54870 | 0.9133 | 0.7446 |
| 0.3982 | 94.0 | 55460 | 0.8877 | 0.7428 |
| 0.3855 | 95.0 | 56050 | 0.9050 | 0.7480 |
| 0.3785 | 96.0 | 56640 | 0.8889 | 0.7456 |
| 0.3816 | 97.0 | 57230 | 0.8830 | 0.7431 |
| 0.377 | 98.0 | 57820 | 0.8847 | 0.7440 |
| 0.367 | 99.0 | 58410 | 0.8872 | 0.7456 |
| 0.3799 | 100.0 | 59000 | 0.8860 | 0.7462 |
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3