1_7e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.9819
  • Accuracy: 0.7303
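
The card does not state which SuperGLUE task this checkpoint targets, so the label meanings are unknown. Below is a minimal usage sketch, assuming a sentence-pair classification task and the checkpoint id shown on this page (Onutoa/1_7e-3_10_0.1); the example inputs and the interpretation of the predicted class index are illustrative only.

```python
# Minimal usage sketch. Assumption: the fine-tuning task is a SuperGLUE
# sentence-pair classification task; the card does not say which one, so the
# inputs below and the meaning of the predicted class are illustrative only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_7e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Most SuperGLUE tasks pair two text segments (e.g. question and passage).
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue because of Rayleigh scattering.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```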

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
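
The training script itself is not included in the card. As a hedged sketch, assuming the standard Hugging Face Trainer was used, the listed values map onto the Transformers 4.30.0 TrainingArguments roughly as follows; the per-epoch evaluation cadence is inferred from the results table below, and dataset loading, preprocessing, and Trainer wiring are omitted because the card does not specify them.

```python
# Hedged sketch: reconstructs only the hyperparameters reported above as
# TrainingArguments (Transformers 4.30.0 API). Output directory name is taken
# from the model id; evaluation/logging cadence is an assumption inferred from
# the per-epoch results table.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_7e-3_10_0.1",
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,               # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch table
    logging_strategy="epoch",     # assumption
)
```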

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.4686 | 1.0 | 590 | 2.1510 | 0.3798 |
| 1.4409 | 2.0 | 1180 | 1.6620 | 0.6214 |
| 1.3336 | 3.0 | 1770 | 2.9692 | 0.3789 |
| 1.3331 | 4.0 | 2360 | 0.9502 | 0.6306 |
| 1.1121 | 5.0 | 2950 | 1.0075 | 0.6294 |
| 1.1211 | 6.0 | 3540 | 0.8872 | 0.6612 |
| 1.0596 | 7.0 | 4130 | 2.2995 | 0.4128 |
| 0.9931 | 8.0 | 4720 | 0.9438 | 0.6810 |
| 0.9235 | 9.0 | 5310 | 0.8872 | 0.6581 |
| 0.9613 | 10.0 | 5900 | 1.2425 | 0.5847 |
| 0.9177 | 11.0 | 6490 | 0.8943 | 0.6862 |
| 0.7985 | 12.0 | 7080 | 0.8038 | 0.6884 |
| 0.7943 | 13.0 | 7670 | 0.8016 | 0.6924 |
| 0.7742 | 14.0 | 8260 | 0.7611 | 0.7162 |
| 0.7373 | 15.0 | 8850 | 0.8728 | 0.7128 |
| 0.7054 | 16.0 | 9440 | 0.7415 | 0.7116 |
| 0.6589 | 17.0 | 10030 | 0.7437 | 0.7070 |
| 0.6449 | 18.0 | 10620 | 1.1703 | 0.6303 |
| 0.5872 | 19.0 | 11210 | 0.7583 | 0.7217 |
| 0.6065 | 20.0 | 11800 | 0.8280 | 0.7196 |
| 0.5721 | 21.0 | 12390 | 0.8555 | 0.7012 |
| 0.5955 | 22.0 | 12980 | 0.8109 | 0.7147 |
| 0.5202 | 23.0 | 13570 | 0.7935 | 0.7245 |
| 0.5017 | 24.0 | 14160 | 0.8676 | 0.6976 |
| 0.4923 | 25.0 | 14750 | 0.9052 | 0.7346 |
| 0.4774 | 26.0 | 15340 | 1.5937 | 0.5976 |
| 0.4714 | 27.0 | 15930 | 0.8523 | 0.7220 |
| 0.4439 | 28.0 | 16520 | 0.8909 | 0.7278 |
| 0.4227 | 29.0 | 17110 | 0.9224 | 0.7321 |
| 0.4029 | 30.0 | 17700 | 0.8559 | 0.7245 |
| 0.4015 | 31.0 | 18290 | 0.9032 | 0.7309 |
| 0.3923 | 32.0 | 18880 | 0.9003 | 0.7327 |
| 0.3897 | 33.0 | 19470 | 0.9786 | 0.6966 |
| 0.354 | 34.0 | 20060 | 0.8606 | 0.7251 |
| 0.3508 | 35.0 | 20650 | 0.8788 | 0.7278 |
| 0.3293 | 36.0 | 21240 | 1.1236 | 0.7214 |
| 0.3336 | 37.0 | 21830 | 0.9196 | 0.7266 |
| 0.3407 | 38.0 | 22420 | 0.9319 | 0.7220 |
| 0.3338 | 39.0 | 23010 | 0.8982 | 0.7321 |
| 0.3065 | 40.0 | 23600 | 0.9969 | 0.7333 |
| 0.2972 | 41.0 | 24190 | 1.0879 | 0.7309 |
| 0.2904 | 42.0 | 24780 | 0.9547 | 0.7327 |
| 0.2883 | 43.0 | 25370 | 0.9553 | 0.7187 |
| 0.2889 | 44.0 | 25960 | 0.9805 | 0.7251 |
| 0.269 | 45.0 | 26550 | 0.9516 | 0.7321 |
| 0.2573 | 46.0 | 27140 | 0.9094 | 0.7242 |
| 0.2679 | 47.0 | 27730 | 0.9398 | 0.7217 |
| 0.2595 | 48.0 | 28320 | 1.0380 | 0.7064 |
| 0.2819 | 49.0 | 28910 | 0.9346 | 0.7324 |
| 0.247 | 50.0 | 29500 | 0.9272 | 0.7239 |
| 0.2482 | 51.0 | 30090 | 0.9673 | 0.7254 |
| 0.242 | 52.0 | 30680 | 1.0115 | 0.7217 |
| 0.2343 | 53.0 | 31270 | 0.9958 | 0.7226 |
| 0.2381 | 54.0 | 31860 | 0.9392 | 0.7263 |
| 0.2279 | 55.0 | 32450 | 0.9564 | 0.7284 |
| 0.2256 | 56.0 | 33040 | 1.0298 | 0.7239 |
| 0.2267 | 57.0 | 33630 | 1.0001 | 0.7263 |
| 0.2161 | 58.0 | 34220 | 0.9867 | 0.7248 |
| 0.214 | 59.0 | 34810 | 0.9574 | 0.7226 |
| 0.2148 | 60.0 | 35400 | 1.0306 | 0.7229 |
| 0.2128 | 61.0 | 35990 | 1.0751 | 0.7346 |
| 0.2081 | 62.0 | 36580 | 0.9656 | 0.7263 |
| 0.203 | 63.0 | 37170 | 1.0100 | 0.7263 |
| 0.204 | 64.0 | 37760 | 0.9536 | 0.7297 |
| 0.1988 | 65.0 | 38350 | 0.9686 | 0.7269 |
| 0.1976 | 66.0 | 38940 | 0.9927 | 0.7297 |
| 0.1943 | 67.0 | 39530 | 0.9987 | 0.7309 |
| 0.1941 | 68.0 | 40120 | 0.9876 | 0.7309 |
| 0.1862 | 69.0 | 40710 | 0.9646 | 0.7321 |
| 0.1986 | 70.0 | 41300 | 1.0332 | 0.7324 |
| 0.1872 | 71.0 | 41890 | 0.9861 | 0.7324 |
| 0.1898 | 72.0 | 42480 | 0.9831 | 0.7346 |
| 0.1793 | 73.0 | 43070 | 0.9901 | 0.7303 |
| 0.1843 | 74.0 | 43660 | 1.0411 | 0.7294 |
| 0.1757 | 75.0 | 44250 | 1.0355 | 0.7312 |
| 0.1814 | 76.0 | 44840 | 1.0320 | 0.7239 |
| 0.1764 | 77.0 | 45430 | 0.9895 | 0.7333 |
| 0.1779 | 78.0 | 46020 | 0.9944 | 0.7367 |
| 0.1752 | 79.0 | 46610 | 0.9581 | 0.7263 |
| 0.1734 | 80.0 | 47200 | 0.9525 | 0.7297 |
| 0.1718 | 81.0 | 47790 | 0.9693 | 0.7275 |
| 0.1722 | 82.0 | 48380 | 0.9876 | 0.7297 |
| 0.1719 | 83.0 | 48970 | 0.9838 | 0.7306 |
| 0.161 | 84.0 | 49560 | 0.9996 | 0.7281 |
| 0.1711 | 85.0 | 50150 | 0.9880 | 0.7291 |
| 0.1634 | 86.0 | 50740 | 1.0062 | 0.7306 |
| 0.1587 | 87.0 | 51330 | 1.0071 | 0.7318 |
| 0.156 | 88.0 | 51920 | 1.0271 | 0.7297 |
| 0.1574 | 89.0 | 52510 | 1.0062 | 0.7321 |
| 0.151 | 90.0 | 53100 | 0.9889 | 0.7263 |
| 0.1553 | 91.0 | 53690 | 0.9676 | 0.7324 |
| 0.1584 | 92.0 | 54280 | 0.9721 | 0.7321 |
| 0.1491 | 93.0 | 54870 | 0.9824 | 0.7349 |
| 0.1523 | 94.0 | 55460 | 0.9880 | 0.7306 |
| 0.1509 | 95.0 | 56050 | 0.9993 | 0.7327 |
| 0.1496 | 96.0 | 56640 | 0.9892 | 0.7318 |
| 0.1518 | 97.0 | 57230 | 0.9925 | 0.7339 |
| 0.149 | 98.0 | 57820 | 0.9845 | 0.7333 |
| 0.1449 | 99.0 | 58410 | 0.9832 | 0.7312 |
| 0.15 | 100.0 | 59000 | 0.9819 | 0.7303 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3