# 1_6e-3_10_0.5
This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the results):
- Loss: 0.9536
- Accuracy: 0.7596
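For reference, a checkpoint like this is typically loaded through the Transformers sequence-classification API. A minimal sketch, assuming a placeholder repository id (the card does not state the actual model path, nor which SuperGLUE task the classification head was trained for):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo id -- substitute the actual model path.
model_id = "path/to/1_6e-3_10_0.5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# SuperGLUE tasks are (mostly sentence-pair) classification tasks; the exact
# input format depends on which task this checkpoint was fine-tuned on.
inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```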
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 0.006
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
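These values follow the format the Hugging Face `Trainer` writes into auto-generated cards. A minimal sketch of equivalent `TrainingArguments`, assuming the standard `Trainer` setup (`output_dir` and the per-epoch evaluation/logging strategies are assumptions, inferred from the per-epoch validation rows below):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_6e-3_10_0.5",   # assumption: illustrative name only
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch eval rows
    logging_strategy="epoch",     # assumption: matches the per-epoch loss rows
)
```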
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
2.948 | 1.0 | 590 | 2.2396 | 0.6214 |
2.5635 | 2.0 | 1180 | 2.2693 | 0.6275 |
2.5246 | 3.0 | 1770 | 1.9556 | 0.6141 |
2.329 | 4.0 | 2360 | 2.3951 | 0.4801 |
2.1726 | 5.0 | 2950 | 1.7234 | 0.6618 |
2.0265 | 6.0 | 3540 | 1.5347 | 0.6679 |
2.0227 | 7.0 | 4130 | 1.8508 | 0.6064 |
1.8725 | 8.0 | 4720 | 2.0863 | 0.6584 |
1.8575 | 9.0 | 5310 | 4.0052 | 0.4639 |
1.8071 | 10.0 | 5900 | 3.1552 | 0.6468 |
1.6655 | 11.0 | 6490 | 1.3147 | 0.7104 |
1.501 | 12.0 | 7080 | 1.3005 | 0.6844 |
1.538 | 13.0 | 7670 | 1.7051 | 0.6948 |
1.4114 | 14.0 | 8260 | 1.4922 | 0.7028 |
1.3916 | 15.0 | 8850 | 1.6514 | 0.7034 |
1.3373 | 16.0 | 9440 | 1.9420 | 0.5896 |
1.271 | 17.0 | 10030 | 2.9731 | 0.6624 |
1.3123 | 18.0 | 10620 | 1.4756 | 0.6609 |
1.2775 | 19.0 | 11210 | 1.4888 | 0.6612 |
1.2341 | 20.0 | 11800 | 1.4493 | 0.7159 |
1.1907 | 21.0 | 12390 | 1.7638 | 0.7110 |
1.2035 | 22.0 | 12980 | 1.0716 | 0.7291 |
1.0365 | 23.0 | 13570 | 1.2975 | 0.6853 |
1.1041 | 24.0 | 14160 | 1.0275 | 0.7220 |
1.1326 | 25.0 | 14750 | 1.0228 | 0.7385 |
1.0261 | 26.0 | 15340 | 1.1473 | 0.7076 |
1.0168 | 27.0 | 15930 | 1.0435 | 0.7205 |
1.0653 | 28.0 | 16520 | 1.0105 | 0.7358 |
0.9418 | 29.0 | 17110 | 1.0397 | 0.7232 |
1.0591 | 30.0 | 17700 | 1.3640 | 0.6917 |
0.9186 | 31.0 | 18290 | 0.9679 | 0.7459 |
0.8665 | 32.0 | 18880 | 1.0310 | 0.7303 |
0.9005 | 33.0 | 19470 | 1.0498 | 0.7235 |
0.8494 | 34.0 | 20060 | 0.9766 | 0.7358 |
0.8474 | 35.0 | 20650 | 1.0077 | 0.7465 |
0.7973 | 36.0 | 21240 | 1.0674 | 0.7428 |
0.8049 | 37.0 | 21830 | 1.0074 | 0.7398 |
0.8241 | 38.0 | 22420 | 0.9613 | 0.7453 |
0.7793 | 39.0 | 23010 | 0.9864 | 0.7398 |
0.7781 | 40.0 | 23600 | 1.0741 | 0.7456 |
0.7539 | 41.0 | 24190 | 0.9809 | 0.7550 |
0.7403 | 42.0 | 24780 | 0.9993 | 0.7339 |
0.7494 | 43.0 | 25370 | 0.9887 | 0.7477 |
0.7091 | 44.0 | 25960 | 1.1792 | 0.7125 |
0.7236 | 45.0 | 26550 | 0.9549 | 0.7443 |
0.6947 | 46.0 | 27140 | 1.3568 | 0.7440 |
0.6928 | 47.0 | 27730 | 1.0682 | 0.7517 |
0.6578 | 48.0 | 28320 | 1.0993 | 0.7486 |
0.7723 | 49.0 | 28910 | 1.0381 | 0.7260 |
0.7169 | 50.0 | 29500 | 0.9510 | 0.7486 |
0.6424 | 51.0 | 30090 | 1.0781 | 0.7281 |
0.6652 | 52.0 | 30680 | 0.9623 | 0.7541 |
0.6274 | 53.0 | 31270 | 0.9476 | 0.7498 |
0.6295 | 54.0 | 31860 | 0.9461 | 0.7474 |
0.6252 | 55.0 | 32450 | 1.0873 | 0.7278 |
0.632 | 56.0 | 33040 | 0.9470 | 0.7492 |
0.5865 | 57.0 | 33630 | 1.4737 | 0.7355 |
0.6029 | 58.0 | 34220 | 1.0871 | 0.7477 |
0.5935 | 59.0 | 34810 | 1.0781 | 0.7514 |
0.6023 | 60.0 | 35400 | 0.9968 | 0.7581 |
0.5849 | 61.0 | 35990 | 1.0700 | 0.7547 |
0.5813 | 62.0 | 36580 | 1.2525 | 0.7425 |
0.5557 | 63.0 | 37170 | 0.9643 | 0.7541 |
0.541 | 64.0 | 37760 | 1.0179 | 0.7547 |
0.5693 | 65.0 | 38350 | 1.0064 | 0.7401 |
0.5562 | 66.0 | 38940 | 1.2333 | 0.7367 |
0.5677 | 67.0 | 39530 | 0.9976 | 0.7388 |
0.5357 | 68.0 | 40120 | 0.9795 | 0.7413 |
0.5372 | 69.0 | 40710 | 1.1113 | 0.7462 |
0.5563 | 70.0 | 41300 | 1.1366 | 0.7492 |
0.5377 | 71.0 | 41890 | 0.9343 | 0.7502 |
0.5442 | 72.0 | 42480 | 1.1735 | 0.7465 |
0.5124 | 73.0 | 43070 | 0.9499 | 0.7514 |
0.5007 | 74.0 | 43660 | 1.2104 | 0.7456 |
0.5094 | 75.0 | 44250 | 0.9865 | 0.7474 |
0.5118 | 76.0 | 44840 | 1.0542 | 0.7474 |
0.5166 | 77.0 | 45430 | 0.9762 | 0.7615 |
0.5071 | 78.0 | 46020 | 0.9333 | 0.7581 |
0.4961 | 79.0 | 46610 | 1.0310 | 0.7535 |
0.4863 | 80.0 | 47200 | 1.0242 | 0.7492 |
0.4801 | 81.0 | 47790 | 1.0528 | 0.7535 |
0.4975 | 82.0 | 48380 | 1.0188 | 0.7554 |
0.4868 | 83.0 | 48970 | 0.9455 | 0.7596 |
0.4661 | 84.0 | 49560 | 0.9841 | 0.7557 |
0.4765 | 85.0 | 50150 | 0.9570 | 0.7538 |
0.4732 | 86.0 | 50740 | 1.0383 | 0.7535 |
0.4846 | 87.0 | 51330 | 0.9560 | 0.7587 |
0.4641 | 88.0 | 51920 | 0.9716 | 0.7578 |
0.477 | 89.0 | 52510 | 0.9581 | 0.7606 |
0.4567 | 90.0 | 53100 | 0.9674 | 0.7569 |
0.4567 | 91.0 | 53690 | 0.9718 | 0.7587 |
0.4676 | 92.0 | 54280 | 0.9535 | 0.7520 |
0.4532 | 93.0 | 54870 | 0.9593 | 0.7563 |
0.4727 | 94.0 | 55460 | 0.9611 | 0.7584 |
0.4535 | 95.0 | 56050 | 0.9539 | 0.7602 |
0.4569 | 96.0 | 56640 | 0.9506 | 0.7587 |
0.4417 | 97.0 | 57230 | 0.9616 | 0.7584 |
0.4314 | 98.0 | 57820 | 0.9488 | 0.7593 |
0.4318 | 99.0 | 58410 | 0.9439 | 0.7587 |
0.4415 | 100.0 | 59000 | 0.9536 | 0.7596 |
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3
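A quick sanity check that a local environment matches these versions (a minimal sketch; version mismatches may still work but are untested here):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions listed in this card.
expected = {
    "transformers": (transformers.__version__, "4.30.0"),
    "torch": (torch.__version__, "2.0.1+cu117"),
    "datasets": (datasets.__version__, "2.14.4"),
    "tokenizers": (tokenizers.__version__, "0.13.3"),
}
for name, (found, wanted) in expected.items():
    status = "OK" if found == wanted else f"mismatch (found {found})"
    print(f"{name}=={wanted}: {status}")
```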