mobilebert_sa_GLUE_Experiment_logit_kd_pretrain_sst2

This model is a fine-tuned version of gokuls/mobilebert_sa_pre-training-complete on the GLUE SST2 dataset. On the evaluation set, the best accuracy in the per-epoch results below is 0.9266, reached at epoch 18 (step 9486) with a validation loss of 0.2364.

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.4176	1.0	527	0.2978	0.9197
0.1807	2.0	1054	0.2951	0.9174
0.1163	3.0	1581	0.2749	0.9186
0.0862	4.0	2108	0.2988	0.9083
0.0695	5.0	2635	0.2760	0.9174
0.0598	6.0	3162	0.2695	0.9151
0.0525	7.0	3689	0.2723	0.9255
0.0464	8.0	4216	0.2430	0.9243
0.0422	9.0	4743	0.2814	0.9243
0.0395	10.0	5270	0.2464	0.9163
0.0357	11.0	5797	0.2390	0.9197
0.0341	12.0	6324	0.2713	0.9197
0.0328	13.0	6851	0.2685	0.9220
0.0315	14.0	7378	0.2585	0.9186
0.0296	15.0	7905	0.2367	0.9220
0.0283	16.0	8432	0.2560	0.9186
0.0277	17.0	8959	0.2635	0.9174
0.0269	18.0	9486	0.2364	0.9266
0.026	19.0	10013	0.2749	0.9209
0.0252	20.0	10540	0.2507	0.9174
0.0248	21.0	11067	0.2769	0.9163
0.0248	22.0	11594	0.2543	0.9220
0.024	23.0	12121	0.2677	0.9209
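
The table above can be scanned for the strongest checkpoint with a few lines of Python. This is a minimal sketch, not part of the original training code: the helper name `parse_rows` and the dictionary keys are illustrative assumptions, and the rows are copied verbatim from the table.

```python
# Rows copied from the results table above, in column order:
# training loss, epoch, step, validation loss, accuracy
ROWS = """\
0.4176 1.0 527 0.2978 0.9197
0.1807 2.0 1054 0.2951 0.9174
0.1163 3.0 1581 0.2749 0.9186
0.0862 4.0 2108 0.2988 0.9083
0.0695 5.0 2635 0.2760 0.9174
0.0598 6.0 3162 0.2695 0.9151
0.0525 7.0 3689 0.2723 0.9255
0.0464 8.0 4216 0.2430 0.9243
0.0422 9.0 4743 0.2814 0.9243
0.0395 10.0 5270 0.2464 0.9163
0.0357 11.0 5797 0.2390 0.9197
0.0341 12.0 6324 0.2713 0.9197
0.0328 13.0 6851 0.2685 0.9220
0.0315 14.0 7378 0.2585 0.9186
0.0296 15.0 7905 0.2367 0.9220
0.0283 16.0 8432 0.2560 0.9186
0.0277 17.0 8959 0.2635 0.9174
0.0269 18.0 9486 0.2364 0.9266
0.026 19.0 10013 0.2749 0.9209
0.0252 20.0 10540 0.2507 0.9174
0.0248 21.0 11067 0.2769 0.9163
0.0248 22.0 11594 0.2543 0.9220
0.024 23.0 12121 0.2677 0.9209
"""


def parse_rows(text):
    """Hypothetical helper: turn whitespace-separated rows into dicts."""
    records = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split()
        records.append(
            {
                "train_loss": float(train_loss),
                "epoch": int(float(epoch)),
                "step": int(step),
                "val_loss": float(val_loss),
                "accuracy": float(accuracy),
            }
        )
    return records


records = parse_rows(ROWS)

# Highest evaluation accuracy and lowest validation loss across all epochs.
best_by_accuracy = max(records, key=lambda r: r["accuracy"])
best_by_loss = min(records, key=lambda r: r["val_loss"])

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

Both criteria select the same row here (epoch 18), while training loss keeps falling through epoch 23, so the later epochs do not improve on that checkpoint by either measure.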