
1_8e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9097
  • Accuracy: 0.7502
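
The Accuracy figure is plain classification accuracy: the fraction of evaluation examples whose predicted class matches the gold label. A minimal sketch of that computation, using illustrative two-class logits and labels (not actual outputs of this model):

```python
def accuracy(logits, labels):
    """Fraction of examples where the argmax class matches the gold label."""
    correct = 0
    for row, label in zip(logits, labels):
        pred = max(range(len(row)), key=lambda i: row[i])  # argmax over classes
        correct += int(pred == label)
    return correct / len(labels)

# Illustrative logits for 4 examples (hypothetical values, 2 classes each):
logits = [[0.2, 1.3], [2.0, -0.5], [0.1, 0.4], [1.0, 0.9]]
labels = [1, 0, 1, 1]
print(accuracy(logits, labels))  # 3 of 4 predictions correct -> 0.75
```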

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
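
With `lr_scheduler_type: linear`, the learning rate decays linearly from 8e-3 toward zero over the full run (59,000 optimizer steps, per the results table). A sketch of that schedule, assuming zero warmup steps (the card does not list a warmup setting):

```python
BASE_LR = 8e-3        # learning_rate from the hyperparameter list
TOTAL_STEPS = 59_000  # final step count in the training results table

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linear decay to zero; assumes no warmup (not listed on this card)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))       # 0.008 at the first step
print(linear_lr(29_500))  # 0.004 at the halfway point
print(linear_lr(59_000))  # 0.0 at the end of training
```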

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.7895        | 1.0   | 590   | 1.8785          | 0.6150   |
| 2.562         | 2.0   | 1180  | 2.8327          | 0.4046   |
| 2.4023        | 3.0   | 1770  | 2.0853          | 0.5217   |
| 2.3167        | 4.0   | 2360  | 1.5879          | 0.6505   |
| 2.161         | 5.0   | 2950  | 1.9917          | 0.4914   |
| 1.794         | 6.0   | 3540  | 2.5834          | 0.5110   |
| 1.9698        | 7.0   | 4130  | 3.1462          | 0.4927   |
| 1.5971        | 8.0   | 4720  | 1.6865          | 0.5966   |
| 1.5201        | 9.0   | 5310  | 3.4553          | 0.6413   |
| 1.5841        | 10.0  | 5900  | 3.1799          | 0.6327   |
| 1.5231        | 11.0  | 6490  | 1.1451          | 0.6933   |
| 1.3941        | 12.0  | 7080  | 1.1390          | 0.6884   |
| 1.3679        | 13.0  | 7670  | 1.4767          | 0.6902   |
| 1.2653        | 14.0  | 8260  | 1.5274          | 0.7028   |
| 1.2451        | 15.0  | 8850  | 1.6725          | 0.7073   |
| 1.255         | 16.0  | 9440  | 1.5284          | 0.7012   |
| 1.184         | 17.0  | 10030 | 1.0831          | 0.6979   |
| 1.1215        | 18.0  | 10620 | 2.0515          | 0.5755   |
| 1.0766        | 19.0  | 11210 | 1.1808          | 0.7263   |
| 1.1108        | 20.0  | 11800 | 1.0647          | 0.7190   |
| 1.0272        | 21.0  | 12390 | 1.2527          | 0.6654   |
| 1.036         | 22.0  | 12980 | 1.1910          | 0.6783   |
| 0.9735        | 23.0  | 13570 | 1.0311          | 0.7037   |
| 0.9167        | 24.0  | 14160 | 0.9997          | 0.7021   |
| 0.8494        | 25.0  | 14750 | 1.0338          | 0.7284   |
| 0.8461        | 26.0  | 15340 | 1.4642          | 0.6495   |
| 0.8466        | 27.0  | 15930 | 0.9877          | 0.7370   |
| 0.8498        | 28.0  | 16520 | 0.9401          | 0.7287   |
| 0.7851        | 29.0  | 17110 | 1.0208          | 0.7336   |
| 0.7796        | 30.0  | 17700 | 0.9350          | 0.7232   |
| 0.7725        | 31.0  | 18290 | 1.4097          | 0.7162   |
| 0.7599        | 32.0  | 18880 | 1.1313          | 0.7333   |
| 0.768         | 33.0  | 19470 | 1.0272          | 0.7379   |
| 0.7007        | 34.0  | 20060 | 0.9294          | 0.7364   |
| 0.6718        | 35.0  | 20650 | 0.9347          | 0.7330   |
| 0.6786        | 36.0  | 21240 | 1.0231          | 0.7416   |
| 0.6822        | 37.0  | 21830 | 0.9767          | 0.7413   |
| 0.6667        | 38.0  | 22420 | 0.9351          | 0.7272   |
| 0.6497        | 39.0  | 23010 | 0.9574          | 0.7355   |
| 0.638         | 40.0  | 23600 | 1.0610          | 0.7437   |
| 0.6468        | 41.0  | 24190 | 1.1462          | 0.7434   |
| 0.6046        | 42.0  | 24780 | 0.9750          | 0.7211   |
| 0.6079        | 43.0  | 25370 | 1.2040          | 0.7419   |
| 0.5806        | 44.0  | 25960 | 1.1603          | 0.7018   |
| 0.5753        | 45.0  | 26550 | 1.0639          | 0.7110   |
| 0.5693        | 46.0  | 27140 | 1.0966          | 0.7422   |
| 0.5757        | 47.0  | 27730 | 1.0137          | 0.7468   |
| 0.5692        | 48.0  | 28320 | 0.9476          | 0.7382   |
| 0.5732        | 49.0  | 28910 | 1.0004          | 0.7291   |
| 0.5563        | 50.0  | 29500 | 0.9870          | 0.7394   |
| 0.5217        | 51.0  | 30090 | 0.9681          | 0.7312   |
| 0.5239        | 52.0  | 30680 | 0.9812          | 0.7456   |
| 0.525         | 53.0  | 31270 | 1.0355          | 0.7196   |
| 0.5136        | 54.0  | 31860 | 0.9161          | 0.7385   |
| 0.5249        | 55.0  | 32450 | 1.0093          | 0.7382   |
| 0.5092        | 56.0  | 33040 | 1.0072          | 0.7428   |
| 0.4754        | 57.0  | 33630 | 1.0560          | 0.7425   |
| 0.4716        | 58.0  | 34220 | 0.9922          | 0.7425   |
| 0.4913        | 59.0  | 34810 | 1.0014          | 0.7480   |
| 0.4773        | 60.0  | 35400 | 0.9148          | 0.7352   |
| 0.4725        | 61.0  | 35990 | 0.9691          | 0.7474   |
| 0.4656        | 62.0  | 36580 | 0.9459          | 0.7453   |
| 0.4565        | 63.0  | 37170 | 0.9521          | 0.7388   |
| 0.4502        | 64.0  | 37760 | 1.0172          | 0.7474   |
| 0.4765        | 65.0  | 38350 | 0.9504          | 0.7327   |
| 0.4439        | 66.0  | 38940 | 0.9998          | 0.7443   |
| 0.4424        | 67.0  | 39530 | 1.0985          | 0.7498   |
| 0.4541        | 68.0  | 40120 | 0.9088          | 0.7446   |
| 0.4321        | 69.0  | 40710 | 0.9322          | 0.7379   |
| 0.4346        | 70.0  | 41300 | 1.0028          | 0.7495   |
| 0.4329        | 71.0  | 41890 | 0.8949          | 0.7385   |
| 0.4344        | 72.0  | 42480 | 0.9631          | 0.7544   |
| 0.4111        | 73.0  | 43070 | 0.9800          | 0.7272   |
| 0.4183        | 74.0  | 43660 | 1.1350          | 0.7541   |
| 0.4234        | 75.0  | 44250 | 0.9444          | 0.7511   |
| 0.4297        | 76.0  | 44840 | 0.9584          | 0.7526   |
| 0.4172        | 77.0  | 45430 | 0.9165          | 0.7413   |
| 0.4083        | 78.0  | 46020 | 0.9103          | 0.7401   |
| 0.4078        | 79.0  | 46610 | 0.9100          | 0.7468   |
| 0.3977        | 80.0  | 47200 | 0.9172          | 0.7480   |
| 0.3885        | 81.0  | 47790 | 0.9714          | 0.7523   |
| 0.4012        | 82.0  | 48380 | 1.0683          | 0.7547   |
| 0.3831        | 83.0  | 48970 | 0.9867          | 0.7575   |
| 0.3878        | 84.0  | 49560 | 0.9245          | 0.7541   |
| 0.3841        | 85.0  | 50150 | 0.9662          | 0.7327   |
| 0.3835        | 86.0  | 50740 | 0.9532          | 0.7505   |
| 0.3755        | 87.0  | 51330 | 0.9645          | 0.7492   |
| 0.379         | 88.0  | 51920 | 0.9183          | 0.7483   |
| 0.38          | 89.0  | 52510 | 0.9787          | 0.7523   |
| 0.37          | 90.0  | 53100 | 0.9205          | 0.7443   |
| 0.368         | 91.0  | 53690 | 0.9236          | 0.7446   |
| 0.3737        | 92.0  | 54280 | 0.9023          | 0.7419   |
| 0.3663        | 93.0  | 54870 | 0.9200          | 0.7514   |
| 0.3763        | 94.0  | 55460 | 0.9496          | 0.7517   |
| 0.3635        | 95.0  | 56050 | 0.9487          | 0.7508   |
| 0.3656        | 96.0  | 56640 | 0.9122          | 0.7502   |
| 0.3604        | 97.0  | 57230 | 0.9036          | 0.7498   |
| 0.3475        | 98.0  | 57820 | 0.9054          | 0.7474   |
| 0.3552        | 99.0  | 58410 | 0.9078          | 0.7471   |
| 0.3564        | 100.0 | 59000 | 0.9097          | 0.7502   |
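
The step counts above advance by 590 per epoch, which together with the train batch size bounds the size of the training set. A quick sanity check of that arithmetic, assuming one optimizer step per batch (no gradient accumulation is listed on the card):

```python
TOTAL_STEPS = 59_000  # final step in the table
EPOCHS = 100          # num_epochs from the hyperparameter list
BATCH_SIZE = 16       # train_batch_size from the hyperparameter list

steps_per_epoch = TOTAL_STEPS // EPOCHS
print(steps_per_epoch)  # 590, matching the per-epoch step deltas above

# With a partial final batch allowed, 590 batches per epoch means the
# training set holds between 589*16 + 1 and 590*16 examples.
low = (steps_per_epoch - 1) * BATCH_SIZE + 1
high = steps_per_epoch * BATCH_SIZE
print(low, high)  # 9425 9440
```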

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3