
# 1_7e-3_5_0.1

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the [super_glue](https://huggingface.co/datasets/super_glue) dataset. It achieves the following results on the evaluation set:

- Loss: 0.9635
- Accuracy: 0.7382
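
The card does not say which SuperGLUE task the model was fine-tuned on or what its label mapping is, so the sketch below is only a hedged illustration of how a checkpoint like this is typically loaded; the sentence pair and the assumption of a sequence-classification head are placeholders, not documented facts about this model.

```python
# Minimal inference sketch. Assumes the checkpoint carries a sequence-
# classification head; the input pair and label semantics are hypothetical.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_7e-3_5_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "The cat sat on the mat.",   # first sequence
    "A cat is on a mat.",        # second sequence, if the task is pair classification
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```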

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.007
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
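
The training script itself is not published; the following is a sketch of a `TrainingArguments` configuration that mirrors the listed hyperparameters. The output directory and the per-epoch evaluation schedule (the results table reports one evaluation every 590 steps, i.e. once per epoch) are assumptions.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments
# (Transformers 4.30). Dataset, task, and metric wiring are not shown.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_7e-3_5_0.1",    # hypothetical output directory
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,               # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumed: the table shows one eval per epoch
)
```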

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.4151 | 1.0 | 590 | 1.1624 | 0.6217 |
| 1.4557 | 2.0 | 1180 | 0.9521 | 0.4489 |
| 1.2723 | 3.0 | 1770 | 3.3480 | 0.3795 |
| 1.1923 | 4.0 | 2360 | 1.0321 | 0.4761 |
| 1.2283 | 5.0 | 2950 | 1.7063 | 0.6217 |
| 1.0486 | 6.0 | 3540 | 0.8079 | 0.6566 |
| 0.983 | 7.0 | 4130 | 2.7141 | 0.4119 |
| 1.061 | 8.0 | 4720 | 1.2305 | 0.6407 |
| 0.9617 | 9.0 | 5310 | 0.9103 | 0.6654 |
| 0.9218 | 10.0 | 5900 | 1.0764 | 0.5728 |
| 0.8804 | 11.0 | 6490 | 0.7290 | 0.7034 |
| 0.8314 | 12.0 | 7080 | 0.7770 | 0.7080 |
| 0.7805 | 13.0 | 7670 | 0.7321 | 0.7165 |
| 0.7474 | 14.0 | 8260 | 0.7924 | 0.6667 |
| 0.7693 | 15.0 | 8850 | 0.8842 | 0.7150 |
| 0.7532 | 16.0 | 9440 | 0.6981 | 0.7174 |
| 0.6803 | 17.0 | 10030 | 1.2782 | 0.6064 |
| 0.6888 | 18.0 | 10620 | 0.9639 | 0.7061 |
| 0.6432 | 19.0 | 11210 | 0.8320 | 0.7174 |
| 0.6091 | 20.0 | 11800 | 0.8192 | 0.7144 |
| 0.5904 | 21.0 | 12390 | 1.0849 | 0.7089 |
| 0.5754 | 22.0 | 12980 | 0.8291 | 0.6823 |
| 0.539 | 23.0 | 13570 | 1.1292 | 0.7128 |
| 0.525 | 24.0 | 14160 | 0.8724 | 0.6942 |
| 0.5346 | 25.0 | 14750 | 0.8999 | 0.7067 |
| 0.5164 | 26.0 | 15340 | 1.5764 | 0.5832 |
| 0.4874 | 27.0 | 15930 | 1.1817 | 0.6581 |
| 0.439 | 28.0 | 16520 | 1.0572 | 0.6719 |
| 0.4388 | 29.0 | 17110 | 0.9059 | 0.7376 |
| 0.4096 | 30.0 | 17700 | 0.8708 | 0.7028 |
| 0.4117 | 31.0 | 18290 | 0.9059 | 0.7379 |
| 0.401 | 32.0 | 18880 | 0.8226 | 0.7303 |
| 0.3763 | 33.0 | 19470 | 0.8717 | 0.7248 |
| 0.3629 | 34.0 | 20060 | 0.9393 | 0.7046 |
| 0.33 | 35.0 | 20650 | 0.8766 | 0.7248 |
| 0.3598 | 36.0 | 21240 | 1.0561 | 0.7315 |
| 0.3211 | 37.0 | 21830 | 0.9181 | 0.7021 |
| 0.3146 | 38.0 | 22420 | 0.8177 | 0.7303 |
| 0.322 | 39.0 | 23010 | 0.9637 | 0.7336 |
| 0.2963 | 40.0 | 23600 | 1.0769 | 0.7128 |
| 0.3265 | 41.0 | 24190 | 1.0980 | 0.7330 |
| 0.276 | 42.0 | 24780 | 0.8939 | 0.7422 |
| 0.2953 | 43.0 | 25370 | 1.0178 | 0.7303 |
| 0.2669 | 44.0 | 25960 | 1.0061 | 0.7150 |
| 0.2613 | 45.0 | 26550 | 1.0087 | 0.7076 |
| 0.257 | 46.0 | 27140 | 0.8887 | 0.7122 |
| 0.2586 | 47.0 | 27730 | 1.0173 | 0.7327 |
| 0.2492 | 48.0 | 28320 | 1.0005 | 0.7324 |
| 0.2572 | 49.0 | 28910 | 0.9586 | 0.7226 |
| 0.2388 | 50.0 | 29500 | 0.9336 | 0.7318 |
| 0.218 | 51.0 | 30090 | 1.0072 | 0.7220 |
| 0.2353 | 52.0 | 30680 | 0.8747 | 0.7343 |
| 0.2252 | 53.0 | 31270 | 0.9927 | 0.7361 |
| 0.2239 | 54.0 | 31860 | 0.9873 | 0.7281 |
| 0.2289 | 55.0 | 32450 | 1.0668 | 0.7098 |
| 0.2108 | 56.0 | 33040 | 0.8821 | 0.7306 |
| 0.197 | 57.0 | 33630 | 0.9667 | 0.7287 |
| 0.2045 | 58.0 | 34220 | 0.8937 | 0.7294 |
| 0.2092 | 59.0 | 34810 | 1.1175 | 0.7110 |
| 0.2115 | 60.0 | 35400 | 1.0294 | 0.7330 |
| 0.2051 | 61.0 | 35990 | 0.9363 | 0.7349 |
| 0.1947 | 62.0 | 36580 | 0.9427 | 0.7278 |
| 0.1918 | 63.0 | 37170 | 1.0344 | 0.7226 |
| 0.1911 | 64.0 | 37760 | 0.9883 | 0.7324 |
| 0.1875 | 65.0 | 38350 | 0.9878 | 0.7281 |
| 0.181 | 66.0 | 38940 | 1.0037 | 0.7306 |
| 0.1844 | 67.0 | 39530 | 1.0300 | 0.7309 |
| 0.172 | 68.0 | 40120 | 0.9785 | 0.7275 |
| 0.1728 | 69.0 | 40710 | 1.0590 | 0.7413 |
| 0.1756 | 70.0 | 41300 | 0.9992 | 0.7248 |
| 0.1671 | 71.0 | 41890 | 1.0583 | 0.7061 |
| 0.1824 | 72.0 | 42480 | 1.0114 | 0.7361 |
| 0.1638 | 73.0 | 43070 | 0.9866 | 0.7266 |
| 0.159 | 74.0 | 43660 | 1.0436 | 0.7242 |
| 0.168 | 75.0 | 44250 | 1.0963 | 0.7364 |
| 0.1637 | 76.0 | 44840 | 0.9260 | 0.7300 |
| 0.1583 | 77.0 | 45430 | 0.9472 | 0.7309 |
| 0.161 | 78.0 | 46020 | 0.9540 | 0.7300 |
| 0.1485 | 79.0 | 46610 | 0.9537 | 0.7294 |
| 0.1566 | 80.0 | 47200 | 1.0064 | 0.7248 |
| 0.1499 | 81.0 | 47790 | 0.9961 | 0.7358 |
| 0.1529 | 82.0 | 48380 | 0.9872 | 0.7410 |
| 0.1545 | 83.0 | 48970 | 1.0003 | 0.7309 |
| 0.1481 | 84.0 | 49560 | 0.9471 | 0.7349 |
| 0.1492 | 85.0 | 50150 | 0.9946 | 0.7235 |
| 0.1402 | 86.0 | 50740 | 1.0070 | 0.7394 |
| 0.1437 | 87.0 | 51330 | 0.9976 | 0.7379 |
| 0.1368 | 88.0 | 51920 | 0.9900 | 0.7355 |
| 0.1394 | 89.0 | 52510 | 1.0081 | 0.7333 |
| 0.1376 | 90.0 | 53100 | 0.9910 | 0.7349 |
| 0.1402 | 91.0 | 53690 | 0.9569 | 0.7358 |
| 0.1397 | 92.0 | 54280 | 0.9660 | 0.7346 |
| 0.1311 | 93.0 | 54870 | 0.9787 | 0.7291 |
| 0.1389 | 94.0 | 55460 | 0.9653 | 0.7343 |
| 0.1315 | 95.0 | 56050 | 0.9494 | 0.7346 |
| 0.1301 | 96.0 | 56640 | 0.9705 | 0.7333 |
| 0.133 | 97.0 | 57230 | 0.9615 | 0.7355 |
| 0.1293 | 98.0 | 57820 | 0.9686 | 0.7312 |
| 0.1332 | 99.0 | 58410 | 0.9759 | 0.7346 |
| 0.1306 | 100.0 | 59000 | 0.9635 | 0.7382 |

### Framework versions

- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3