1_5e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.9119
  • Accuracy: 0.7446
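For a quick check of the checkpoint, a minimal loading-and-inference sketch is below. It assumes the model is published as Onutoa/1_5e-3_10_0.5 and carries a sequence-classification head; the card does not name the SuperGLUE subtask or label mapping, so the input pair is purely illustrative.

```python
# Minimal inference sketch. Assumptions (not stated in this card):
# the checkpoint has a sequence-classification head, and the SuperGLUE
# subtask/labels are unknown, so the inputs below are illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_5e-3_10_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Encode a question/passage pair and take the argmax over the logits.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue in daylight because of Rayleigh scattering.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```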

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
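A hedged sketch of how these values map onto transformers.TrainingArguments is below. Only the listed hyperparameters come from this card; the SuperGLUE subtask ("boolq" here), the question/passage preprocessing, and the per-epoch evaluation cadence are assumptions for illustration. The Adam betas and epsilon above match the Trainer defaults, so they need no explicit argument.

```python
# Sketch only: wires the hyperparameters listed above into Trainer.
# Assumptions (not stated in this card): SuperGLUE subtask "boolq"
# and the question/passage tokenization below.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")

raw = load_dataset("super_glue", "boolq")  # subtask assumed
encoded = raw.map(
    lambda ex: tokenizer(ex["question"], ex["passage"], truncation=True),
    batched=True,
)

args = TrainingArguments(
    output_dir="1_5e-3_10_0.5",
    learning_rate=5e-3,               # 0.005 above
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",      # the results table reports per-epoch eval
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
```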

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.6814 | 1.0 | 590 | 2.2524 | 0.6128 |
| 2.6474 | 2.0 | 1180 | 2.2889 | 0.6217 |
| 2.7373 | 3.0 | 1770 | 3.8911 | 0.4401 |
| 2.7048 | 4.0 | 2360 | 2.6859 | 0.6214 |
| 2.3193 | 5.0 | 2950 | 3.0408 | 0.6217 |
| 2.0191 | 6.0 | 3540 | 2.0926 | 0.5706 |
| 1.9595 | 7.0 | 4130 | 1.7082 | 0.6908 |
| 1.833 | 8.0 | 4720 | 1.7816 | 0.6092 |
| 1.7395 | 9.0 | 5310 | 1.6251 | 0.6281 |
| 1.7038 | 10.0 | 5900 | 2.6889 | 0.6554 |
| 1.7975 | 11.0 | 6490 | 1.5326 | 0.6994 |
| 1.5534 | 12.0 | 7080 | 2.6513 | 0.5554 |
| 1.5833 | 13.0 | 7670 | 1.5617 | 0.6410 |
| 1.4585 | 14.0 | 8260 | 1.8289 | 0.6171 |
| 1.4375 | 15.0 | 8850 | 1.6306 | 0.6517 |
| 1.3418 | 16.0 | 9440 | 1.2628 | 0.7153 |
| 1.2576 | 17.0 | 10030 | 1.4116 | 0.7098 |
| 1.2068 | 18.0 | 10620 | 1.1643 | 0.7089 |
| 1.1781 | 19.0 | 11210 | 1.4702 | 0.7083 |
| 1.1497 | 20.0 | 11800 | 1.1550 | 0.6988 |
| 1.0552 | 21.0 | 12390 | 1.0861 | 0.7284 |
| 1.047 | 22.0 | 12980 | 1.0821 | 0.7205 |
| 1.0036 | 23.0 | 13570 | 1.1193 | 0.7193 |
| 0.9589 | 24.0 | 14160 | 1.3591 | 0.7135 |
| 0.9604 | 25.0 | 14750 | 1.0030 | 0.7229 |
| 0.9283 | 26.0 | 15340 | 1.1469 | 0.7031 |
| 0.9242 | 27.0 | 15930 | 1.0466 | 0.7318 |
| 0.8703 | 28.0 | 16520 | 1.0736 | 0.7343 |
| 0.858 | 29.0 | 17110 | 1.0357 | 0.7183 |
| 0.8267 | 30.0 | 17700 | 0.9936 | 0.7339 |
| 0.8148 | 31.0 | 18290 | 0.9989 | 0.7321 |
| 0.7981 | 32.0 | 18880 | 1.0559 | 0.7404 |
| 0.7956 | 33.0 | 19470 | 1.0207 | 0.7217 |
| 0.7817 | 34.0 | 20060 | 0.9636 | 0.7361 |
| 0.7545 | 35.0 | 20650 | 0.9415 | 0.7324 |
| 0.7372 | 36.0 | 21240 | 1.0793 | 0.7413 |
| 0.7317 | 37.0 | 21830 | 1.2911 | 0.7315 |
| 0.7411 | 38.0 | 22420 | 0.9517 | 0.7364 |
| 0.7093 | 39.0 | 23010 | 1.0133 | 0.7382 |
| 0.6838 | 40.0 | 23600 | 1.1835 | 0.7401 |
| 0.6773 | 41.0 | 24190 | 0.9180 | 0.7379 |
| 0.6776 | 42.0 | 24780 | 0.9410 | 0.7367 |
| 0.6486 | 43.0 | 25370 | 0.9836 | 0.7419 |
| 0.6527 | 44.0 | 25960 | 0.9721 | 0.7309 |
| 0.6465 | 45.0 | 26550 | 0.9508 | 0.7388 |
| 0.6245 | 46.0 | 27140 | 0.9273 | 0.7434 |
| 0.6258 | 47.0 | 27730 | 0.9763 | 0.7330 |
| 0.6086 | 48.0 | 28320 | 0.9135 | 0.7388 |
| 0.6417 | 49.0 | 28910 | 1.0037 | 0.7446 |
| 0.6064 | 50.0 | 29500 | 0.9751 | 0.7398 |
| 0.5938 | 51.0 | 30090 | 0.9801 | 0.7453 |
| 0.5951 | 52.0 | 30680 | 0.9515 | 0.7370 |
| 0.5718 | 53.0 | 31270 | 0.9160 | 0.7419 |
| 0.5751 | 54.0 | 31860 | 0.9263 | 0.7462 |
| 0.5839 | 55.0 | 32450 | 0.9170 | 0.7376 |
| 0.5707 | 56.0 | 33040 | 0.9787 | 0.7431 |
| 0.564 | 57.0 | 33630 | 0.9822 | 0.7431 |
| 0.5539 | 58.0 | 34220 | 0.9335 | 0.7407 |
| 0.5567 | 59.0 | 34810 | 1.0004 | 0.7370 |
| 0.5555 | 60.0 | 35400 | 0.9554 | 0.7446 |
| 0.5344 | 61.0 | 35990 | 0.9199 | 0.7483 |
| 0.5494 | 62.0 | 36580 | 0.9970 | 0.7456 |
| 0.5226 | 63.0 | 37170 | 0.9454 | 0.7434 |
| 0.5275 | 64.0 | 37760 | 0.9771 | 0.7361 |
| 0.5186 | 65.0 | 38350 | 1.0032 | 0.7517 |
| 0.52 | 66.0 | 38940 | 0.9263 | 0.7440 |
| 0.5209 | 67.0 | 39530 | 1.0130 | 0.7443 |
| 0.528 | 68.0 | 40120 | 0.9466 | 0.7422 |
| 0.5146 | 69.0 | 40710 | 0.9790 | 0.7456 |
| 0.5026 | 70.0 | 41300 | 0.9880 | 0.7489 |
| 0.5204 | 71.0 | 41890 | 0.9132 | 0.7373 |
| 0.5049 | 72.0 | 42480 | 0.9589 | 0.7480 |
| 0.4969 | 73.0 | 43070 | 0.9564 | 0.7446 |
| 0.4911 | 74.0 | 43660 | 0.9255 | 0.7336 |
| 0.4961 | 75.0 | 44250 | 0.9983 | 0.7502 |
| 0.4986 | 76.0 | 44840 | 0.9003 | 0.7376 |
| 0.4979 | 77.0 | 45430 | 0.8937 | 0.7385 |
| 0.4941 | 78.0 | 46020 | 0.9082 | 0.7422 |
| 0.487 | 79.0 | 46610 | 0.9231 | 0.7471 |
| 0.4773 | 80.0 | 47200 | 0.9673 | 0.7437 |
| 0.4665 | 81.0 | 47790 | 0.9598 | 0.7462 |
| 0.4824 | 82.0 | 48380 | 0.9110 | 0.7410 |
| 0.4795 | 83.0 | 48970 | 0.9222 | 0.7425 |
| 0.4654 | 84.0 | 49560 | 0.9369 | 0.7459 |
| 0.4605 | 85.0 | 50150 | 0.9379 | 0.7502 |
| 0.477 | 86.0 | 50740 | 0.8911 | 0.7437 |
| 0.4644 | 87.0 | 51330 | 0.9287 | 0.7434 |
| 0.4539 | 88.0 | 51920 | 0.9421 | 0.7422 |
| 0.4582 | 89.0 | 52510 | 0.9248 | 0.7437 |
| 0.4488 | 90.0 | 53100 | 0.9152 | 0.7425 |
| 0.4554 | 91.0 | 53690 | 0.9511 | 0.7471 |
| 0.4547 | 92.0 | 54280 | 0.9064 | 0.7419 |
| 0.4534 | 93.0 | 54870 | 0.9404 | 0.7471 |
| 0.463 | 94.0 | 55460 | 0.9346 | 0.7453 |
| 0.4482 | 95.0 | 56050 | 0.9191 | 0.7437 |
| 0.4518 | 96.0 | 56640 | 0.9154 | 0.7431 |
| 0.4326 | 97.0 | 57230 | 0.9055 | 0.7440 |
| 0.4291 | 98.0 | 57820 | 0.9072 | 0.7437 |
| 0.4278 | 99.0 | 58410 | 0.9101 | 0.7437 |
| 0.4397 | 100.0 | 59000 | 0.9119 | 0.7446 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
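
If you want to match this environment before reproducing the run, a simple version check against the list above may help (a sketch; it only prints what is installed, and pinning or CUDA builds are left to your package manager):

```python
# Prints installed library versions; compare against the list above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expect 4.30.0
print("PyTorch:", torch.__version__)              # expect 2.0.1+cu117
print("Datasets:", datasets.__version__)          # expect 2.14.4
print("Tokenizers:", tokenizers.__version__)      # expect 0.13.3
```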
