bert-base-uncased-8-50-0.01

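This model is a fine-tuned version of bert-base-uncased on the GLUE dataset. On the evaluation set it reaches a validation loss of 0.4748 at the final checkpoint (epoch 50, step 20000); the lowest validation loss in the run is 0.4717 at epoch 43.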

Model description

More information needed

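The hyperparameters used during training are not listed in this card. The table below shows the training and validation loss at the end of each of the 50 epochs (400 steps per epoch):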

Training Loss	Epoch	Step	Validation Loss
No log	1.0	400	0.9219
1.2047	2.0	800	1.8168
1.0707	3.0	1200	1.4474
1.0538	4.0	1600	1.5223
1.316	5.0	2000	0.8467
1.316	6.0	2400	1.0906
1.2739	7.0	2800	0.6851
1.1342	8.0	3200	1.3170
1.2572	9.0	3600	0.8870
1.0237	10.0	4000	1.3236
1.0237	11.0	4400	0.9025
0.9597	12.0	4800	0.7757
1.0946	13.0	5200	1.2551
1.0011	14.0	5600	1.1606
1.1111	15.0	6000	0.6040
1.1111	16.0	6400	1.4347
1.0098	17.0	6800	0.6218
1.0829	18.0	7200	0.4979
0.9131	19.0	7600	1.3040
0.879	20.0	8000	2.0309
0.879	21.0	8400	0.5150
0.9646	22.0	8800	0.4850
0.9625	23.0	9200	0.5076
0.9129	24.0	9600	1.1277
0.8839	25.0	10000	0.9403
0.8839	26.0	10400	1.6226
0.9264	27.0	10800	0.6049
0.7999	28.0	11200	0.9549
0.752	29.0	11600	0.6757
0.7675	30.0	12000	0.7320
0.7675	31.0	12400	0.8393
0.6887	32.0	12800	0.5977
0.7563	33.0	13200	0.4815
0.7671	34.0	13600	0.5457
0.7227	35.0	14000	0.7384
0.7227	36.0	14400	0.7749
0.7308	37.0	14800	0.4726
0.7191	38.0	15200	0.5069
0.6846	39.0	15600	0.4762
0.6151	40.0	16000	0.4738
0.6151	41.0	16400	0.5114
0.5982	42.0	16800	0.4866
0.6199	43.0	17200	0.4717
0.5737	44.0	17600	0.7651
0.5703	45.0	18000	0.8008
0.5703	46.0	18400	0.5391
0.5748	47.0	18800	0.5097
0.5297	48.0	19200	0.4731
0.4902	49.0	19600	0.4720
0.4955	50.0	20000	0.4748
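
Because the validation loss fluctuates heavily between epochs, it can help to pick a checkpoint programmatically instead of reading the table by eye. The following is a minimal sketch, assuming the table is available as tab-separated text with the header row shown above. The names `LOG`, `parse_log`, and `best_checkpoint` are hypothetical helpers introduced for illustration, and only a few rows from the table are copied in.

```python
import csv
import io

# A few rows copied from the results table, tab-separated as in this card.
# (Illustrative subset; paste the full table to analyze the whole run.)
LOG = """Training Loss\tEpoch\tStep\tValidation Loss
No log\t1.0\t400\t0.9219
1.2047\t2.0\t800\t1.8168
0.6199\t43.0\t17200\t0.4717
0.4902\t49.0\t19600\t0.4720
0.4955\t50.0\t20000\t0.4748
"""


def parse_log(text):
    """Parse the tab-separated results table into a list of dicts."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        train = rec["Training Loss"]
        rows.append({
            # "No log" means no training loss had been logged yet.
            "train_loss": None if train == "No log" else float(train),
            "epoch": float(rec["Epoch"]),
            "step": int(rec["Step"]),
            "val_loss": float(rec["Validation Loss"]),
        })
    return rows


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


rows = parse_log(LOG)
best = best_checkpoint(rows)
print(best["epoch"], best["step"], best["val_loss"])  # 43.0 17200 0.4717
```

On the full table this selects the same row (epoch 43, step 20000 being only slightly worse at 0.4748), which suggests the final checkpoint is close to the best one seen during training.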