tiny-mlm-glue-qnli-custom-tokenizer

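This model is a fine-tuned version of google/bert_uncased_L-2_H-128_A-2. The fine-tuning dataset was not recorded (it is listed as None). On the evaluation set, the training log reports a validation loss of 6.2339 at the last logged step (step 20500, epoch 16.4).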

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation loss were logged every 500 steps (0.4 epochs):

Training Loss	Epoch	Step	Validation Loss
7.9951	0.4	500	7.3315
7.1282	0.8	1000	7.2457
7.0402	1.2	1500	7.2104
6.9634	1.6	2000	7.1415
6.9383	2.0	2500	7.0838
6.8365	2.4	3000	7.0031
6.7812	2.8	3500	6.9679
6.6959	3.2	4000	6.9121
6.6423	3.6	4500	6.8421
6.5766	4.0	5000	6.8474
6.5676	4.4	5500	6.8089
6.4728	4.8	6000	6.7246
6.5008	5.2	6500	6.7049
6.4367	5.6	7000	6.6539
6.4016	6.0	7500	6.6268
6.4063	6.4	8000	6.6038
6.3836	6.8	8500	6.5452
6.3576	7.2	9000	6.5932
6.2768	7.6	9500	6.5443
6.3002	8.0	10000	6.5018
6.3040	8.4	10500	6.5263
6.2123	8.8	11000	6.4739
6.2015	9.2	11500	6.4407
6.1809	9.6	12000	6.4371
6.1624	10.0	12500	6.4379
6.1831	10.4	13000	6.3897
6.1630	10.8	13500	6.4086
6.0881	11.2	14000	6.3902
6.0474	11.6	14500	6.3229
6.0454	12.0	15000	6.2995
6.0491	12.4	15500	6.3559
6.0045	12.8	16000	6.2820
6.0430	13.2	16500	6.3260
5.9485	13.6	17000	6.2554
5.9513	14.0	17500	6.2668
5.9501	14.4	18000	6.2396
5.9882	14.8	18500	6.2655
5.9311	15.2	19000	6.1839
5.9662	15.6	19500	6.1942
5.9328	16.0	20000	6.2425
5.8984	16.4	20500	6.2339
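
The loss values above can be easier to interpret as perplexities. The following is a minimal sketch, not part of the original card, that assumes the reported loss is a mean cross-entropy in nats (the usual convention for masked language modeling with the Hugging Face Trainer, but not confirmed here). The rows are copied from the log above; the names `log`, `perplexity`, and `steps_per_epoch` are illustrative only.

```python
import math

# A few (step, epoch, validation_loss) rows copied from the training log above.
log = [
    (500, 0.4, 7.3315),
    (10000, 8.0, 6.5018),
    (19000, 15.2, 6.1839),
    (20500, 16.4, 6.2339),
]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


# Steps per epoch implied by the first row: 500 steps cover 0.4 epochs.
steps_per_epoch = round(log[0][0] / log[0][1])

# Row with the lowest validation loss among the sampled rows.
best_step, best_epoch, best_loss = min(log, key=lambda row: row[2])

for step, epoch, loss in log:
    print(f"step {step:>5}  epoch {epoch:>4}  "
          f"val loss {loss:.4f}  perplexity {perplexity(loss):.1f}")

print(f"steps per epoch: {steps_per_epoch}")
print(f"lowest validation loss: {best_loss} at step {best_step}")
```

Under that assumption, the lowest validation loss in the log (6.1839 at step 19000) is slightly better than the value at the last logged step (6.2339 at step 20500), so the final checkpoint is not necessarily the best one.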