hbertv1-massive-logit_KD-mini

This model is a fine-tuned version of gokuls/model_v1_complete_training_wt_init_48_mini_freeze_new on the massive dataset. Its per-epoch results on the evaluation set (validation loss and accuracy) are listed in the training results table below.

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
3.5547	1.0	180	2.3028	0.4481
1.9374	2.0	360	1.2686	0.6513
1.2845	3.0	540	0.9328	0.7324
0.9981	4.0	720	0.7684	0.7836
0.8273	5.0	900	0.6834	0.7998
0.7068	6.0	1080	0.6369	0.8062
0.6043	7.0	1260	0.5804	0.8205
0.5350	8.0	1440	0.5475	0.8396
0.4763	9.0	1620	0.5247	0.8396
0.4245	10.0	1800	0.5122	0.8470
0.3794	11.0	1980	0.5038	0.8460
0.3424	12.0	2160	0.5057	0.8465
0.3194	13.0	2340	0.4977	0.8485
0.2897	14.0	2520	0.4973	0.8534
0.2688	15.0	2700	0.4714	0.8574
0.2550	16.0	2880	0.4763	0.8480
0.2401	17.0	3060	0.4856	0.8510
0.2286	18.0	3240	0.4713	0.8578
0.2138	19.0	3420	0.4753	0.8500
0.2022	20.0	3600	0.4641	0.8544
0.1937	21.0	3780	0.4640	0.8598
0.1802	22.0	3960	0.4788	0.8505
0.1719	23.0	4140	0.4520	0.8593
0.1700	24.0	4320	0.4703	0.8564
0.159	25.0	4500	0.4620	0.8554
0.1566	26.0	4680	0.4825	0.8549
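
The log above can be summarized programmatically. The following is a minimal sketch, not part of the original training code: it copies the rows of the table into a string, parses them, and reports the epochs with the lowest validation loss and the highest accuracy, plus the number of optimizer steps per epoch. The names `LOG`, `parse_log`, `best_loss`, `best_acc`, and `steps_per_epoch` are hypothetical and chosen for illustration only.

```python
# Rows copied from the training results table above.
# Columns: training loss, epoch, step, validation loss, accuracy.
LOG = """\
3.5547 1.0 180 2.3028 0.4481
1.9374 2.0 360 1.2686 0.6513
1.2845 3.0 540 0.9328 0.7324
0.9981 4.0 720 0.7684 0.7836
0.8273 5.0 900 0.6834 0.7998
0.7068 6.0 1080 0.6369 0.8062
0.6043 7.0 1260 0.5804 0.8205
0.535 8.0 1440 0.5475 0.8396
0.4763 9.0 1620 0.5247 0.8396
0.4245 10.0 1800 0.5122 0.8470
0.3794 11.0 1980 0.5038 0.8460
0.3424 12.0 2160 0.5057 0.8465
0.3194 13.0 2340 0.4977 0.8485
0.2897 14.0 2520 0.4973 0.8534
0.2688 15.0 2700 0.4714 0.8574
0.255 16.0 2880 0.4763 0.8480
0.2401 17.0 3060 0.4856 0.8510
0.2286 18.0 3240 0.4713 0.8578
0.2138 19.0 3420 0.4753 0.8500
0.2022 20.0 3600 0.4641 0.8544
0.1937 21.0 3780 0.4640 0.8598
0.1802 22.0 3960 0.4788 0.8505
0.1719 23.0 4140 0.4520 0.8593
0.17 24.0 4320 0.4703 0.8564
0.159 25.0 4500 0.4620 0.8554
0.1566 26.0 4680 0.4825 0.8549
"""


def parse_log(text):
    """Turn whitespace-separated log rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split()
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": int(float(epoch)),
                "step": int(step),
                "val_loss": float(val_loss),
                "accuracy": float(accuracy),
            }
        )
    return rows


rows = parse_log(LOG)

# Epoch with the lowest validation loss and epoch with the highest accuracy.
best_loss = min(rows, key=lambda r: r["val_loss"])
best_acc = max(rows, key=lambda r: r["accuracy"])

# Steps per epoch, taken from the first logged row.
steps_per_epoch = rows[0]["step"] // rows[0]["epoch"]

print(f"lowest validation loss: {best_loss['val_loss']} at epoch {best_loss['epoch']}")
print(f"highest accuracy: {best_acc['accuracy']} at epoch {best_acc['epoch']}")
print(f"steps per epoch: {steps_per_epoch}")
```

In this log, validation loss is lowest at epoch 23 (0.4520) and accuracy peaks at epoch 21 (0.8598), while training loss keeps decreasing through epoch 26. That pattern suggests the later epochs add little on the evaluation set, although the card does not state which checkpoint was kept.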