1_6e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

Loss: 0.9552
Accuracy: 0.7294

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.006
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
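
As a rough illustration of how these settings fit together, the sketch below restates them as a plain dictionary and models the linear learning-rate schedule. The key names follow the keyword arguments of `transformers.TrainingArguments`. Mapping `train_batch_size` to a per-device value assumes a single device with no gradient accumulation, and `warmup_steps=0` is an assumption, since the card lists no warmup setting. The step counts come from the training results table. `linear_lr` is a hypothetical helper written for this example, not part of any library.

```python
# Hyperparameters copied from the list above. Key names follow the keyword
# arguments of transformers.TrainingArguments; treating train_batch_size as a
# per-device value assumes one device and no gradient accumulation.
TRAINING_KWARGS = {
    "learning_rate": 6e-3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100.0,
}

STEPS_PER_EPOCH = 590   # step count after epoch 1 in the results table
TOTAL_STEPS = 59000     # step count after epoch 100 in the results table


def linear_lr(step, base_lr=6e-3, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption: the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.006, is halved by step 29500 (epoch 50), and reaches zero at step 59000.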

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.1794	1.0	590	0.8903	0.6217
1.096	2.0	1180	0.6682	0.5771
0.8877	3.0	1770	1.0585	0.3792
0.8825	4.0	2360	0.6340	0.6229
0.891	5.0	2950	0.8424	0.6217
0.7749	6.0	3540	0.6586	0.5752
0.8351	7.0	4130	0.6083	0.6373
0.7693	8.0	4720	0.6969	0.5813
0.869	9.0	5310	0.5918	0.6777
0.7739	10.0	5900	0.6373	0.6416
0.741	11.0	6490	0.7306	0.6306
0.6366	12.0	7080	0.6535	0.6951
0.6503	13.0	7670	0.5655	0.7021
0.7297	14.0	8260	0.8470	0.5847
0.5637	15.0	8850	0.6914	0.6278
0.6233	16.0	9440	0.7041	0.6862
0.5812	17.0	10030	0.6282	0.7049
0.5423	18.0	10620	1.1433	0.5612
0.5366	19.0	11210	0.6643	0.7168
0.5369	20.0	11800	0.9787	0.6832
0.4828	21.0	12390	0.8036	0.7049
0.5085	22.0	12980	0.8132	0.6425
0.4488	23.0	13570	0.7755	0.6651
0.4184	24.0	14160	0.6817	0.7104
0.448	25.0	14750	0.6490	0.7193
0.4123	26.0	15340	0.7854	0.6728
0.4196	27.0	15930	0.7012	0.7138
0.4119	28.0	16520	0.7525	0.7116
0.3811	29.0	17110	0.7333	0.7012
0.3698	30.0	17700	1.1169	0.6480
0.3382	31.0	18290	0.6635	0.7232
0.338	32.0	18880	0.7444	0.7266
0.3359	33.0	19470	1.0398	0.6621
0.3071	34.0	20060	0.8387	0.7291
0.3001	35.0	20650	0.7648	0.7281
0.3221	36.0	21240	0.7485	0.7266
0.2973	37.0	21830	0.7841	0.7260
0.2801	38.0	22420	0.8797	0.7242
0.2666	39.0	23010	0.9504	0.7028
0.2575	40.0	23600	0.8444	0.7217
0.2796	41.0	24190	1.1635	0.7067
0.2596	42.0	24780	0.8979	0.7217
0.2465	43.0	25370	0.8439	0.7177
0.2475	44.0	25960	0.9628	0.7028
0.2394	45.0	26550	0.9549	0.7156
0.2192	46.0	27140	0.8422	0.7251
0.2253	47.0	27730	0.9386	0.7245
0.2063	48.0	28320	0.9686	0.7028
0.2258	49.0	28910	0.8843	0.7165
0.2114	50.0	29500	0.9566	0.7324
0.2039	51.0	30090	1.0167	0.7073
0.182	52.0	30680	0.9182	0.7303
0.1825	53.0	31270	0.9879	0.7147
0.1827	54.0	31860	0.9542	0.7199
0.1727	55.0	32450	0.9540	0.7245
0.1857	56.0	33040	0.9222	0.7294
0.182	57.0	33630	1.1263	0.7021
0.1716	58.0	34220	0.9947	0.7239
0.1659	59.0	34810	0.9969	0.7220
0.1596	60.0	35400	0.9764	0.7193
0.1656	61.0	35990	1.0089	0.7281
0.1545	62.0	36580	0.9712	0.7193
0.1429	63.0	37170	0.9785	0.7245
0.1567	64.0	37760	1.0706	0.7076
0.1493	65.0	38350	0.9546	0.7287
0.1453	66.0	38940	0.9959	0.7245
0.1384	67.0	39530	0.9687	0.7300
0.1409	68.0	40120	0.9739	0.7202
0.1388	69.0	40710	1.1173	0.7232
0.1366	70.0	41300	0.9598	0.7254
0.1429	71.0	41890	1.0048	0.7070
0.1384	72.0	42480	0.9816	0.7205
0.1221	73.0	43070	1.0827	0.7232
0.131	74.0	43660	1.0217	0.7294
0.1282	75.0	44250	0.9694	0.7287
0.1308	76.0	44840	1.0198	0.7208
0.1252	77.0	45430	1.0261	0.7278
0.1252	78.0	46020	0.9709	0.7272
0.117	79.0	46610	1.0140	0.7257
0.1171	80.0	47200	1.0226	0.7321
0.1132	81.0	47790	1.0880	0.7199
0.116	82.0	48380	0.9087	0.7254
0.1156	83.0	48970	0.9973	0.7257
0.103	84.0	49560	1.0078	0.7287
0.1096	85.0	50150	1.0122	0.7263
0.1097	86.0	50740	1.0316	0.7312
0.098	87.0	51330	1.0030	0.7275
0.1035	88.0	51920	0.9551	0.7214
0.0978	89.0	52510	1.0217	0.7287
0.1001	90.0	53100	0.9817	0.7291
0.1011	91.0	53690	0.9693	0.7281
0.0957	92.0	54280	1.0017	0.7199
0.0946	93.0	54870	0.9992	0.7278
0.0976	94.0	55460	0.9660	0.7291
0.0961	95.0	56050	0.9572	0.7278
0.0944	96.0	56640	0.9801	0.7269
0.0944	97.0	57230	0.9527	0.7272
0.0936	98.0	57820	0.9543	0.7266
0.0939	99.0	58410	0.9540	0.7281
0.0915	100.0	59000	0.9552	0.7294
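
In the table above, training loss keeps falling through epoch 100, while validation loss is lowest at epoch 13 (0.5655) and accuracy is highest at epoch 50 (0.7324). The headline metrics at the top of this card match the epoch 100 row. The sketch below shows one way to pick an epoch by either criterion. It uses only four rows copied from the table to stay short, and `best_epoch` is a hypothetical helper, not something the training run is known to have used.

```python
# (epoch, training_loss, validation_loss, accuracy), copied from the table.
# Only a few rows are included here for brevity.
ROWS = [
    (13, 0.6503, 0.5655, 0.7021),
    (50, 0.2114, 0.9566, 0.7324),
    (80, 0.1171, 1.0226, 0.7321),
    (100, 0.0915, 0.9552, 0.7294),
]

VAL_LOSS, ACCURACY = 2, 3  # column indices within each row


def best_epoch(rows, column, mode):
    """Return the epoch whose value in `column` is smallest or largest."""
    pick = min if mode == "min" else max
    return pick(rows, key=lambda row: row[column])[0]


lowest_val_loss_epoch = best_epoch(ROWS, VAL_LOSS, "min")
highest_accuracy_epoch = best_epoch(ROWS, ACCURACY, "max")
print(lowest_val_loss_epoch, highest_accuracy_epoch)
```

The two criteria disagree here, so which checkpoint counts as best depends on whether validation loss or accuracy matters more for the intended use.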

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
