HBERTv1_48_L8_H64_A2_massive

This model is a fine-tuned version of gokuls/HBERTv1_48_L8_H64_A2 on the massive dataset. It achieves the following results on the evaluation set (values taken from the final epoch of the training results below):

Loss: 2.0023
Accuracy: 0.4506

Model description

More information needed

The hyperparameters used during training are not listed here. Training ran for 15 epochs at 180 steps per epoch, with the following results:

Training Loss	Epoch	Step	Validation Loss	Accuracy
3.9694	1.0	180	3.7614	0.0949
3.5709	2.0	360	3.3717	0.1323
3.2745	3.0	540	3.1313	0.1968
3.0491	4.0	720	2.9227	0.2086
2.8486	5.0	900	2.7431	0.2238
2.6671	6.0	1080	2.5865	0.2848
2.514	7.0	1260	2.4468	0.3212
2.3707	8.0	1440	2.3252	0.3620
2.262	9.0	1620	2.2383	0.3866
2.1746	10.0	1800	2.1570	0.4171
2.0999	11.0	1980	2.1083	0.4309
2.0442	12.0	2160	2.0581	0.4383
1.9992	13.0	2340	2.0297	0.4432
1.9728	14.0	2520	2.0104	0.4461
1.9461	15.0	2700	2.0023	0.4506
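The table above can be sanity-checked with a short script. The following is a minimal sketch, not part of the original training code: the `LOG` list simply copies the rows of the table, and the helper names (`steps_per_epoch`, `best_epoch`, `accuracy_gains`) are hypothetical, introduced here only for illustration.

```python
# Rows copied from the training results table:
# (epoch, step, training_loss, validation_loss, accuracy)
LOG = [
    (1, 180, 3.9694, 3.7614, 0.0949),
    (2, 360, 3.5709, 3.3717, 0.1323),
    (3, 540, 3.2745, 3.1313, 0.1968),
    (4, 720, 3.0491, 2.9227, 0.2086),
    (5, 900, 2.8486, 2.7431, 0.2238),
    (6, 1080, 2.6671, 2.5865, 0.2848),
    (7, 1260, 2.514, 2.4468, 0.3212),
    (8, 1440, 2.3707, 2.3252, 0.3620),
    (9, 1620, 2.262, 2.2383, 0.3866),
    (10, 1800, 2.1746, 2.1570, 0.4171),
    (11, 1980, 2.0999, 2.1083, 0.4309),
    (12, 2160, 2.0442, 2.0581, 0.4383),
    (13, 2340, 1.9992, 2.0297, 0.4432),
    (14, 2520, 1.9728, 2.0104, 0.4461),
    (15, 2700, 1.9461, 2.0023, 0.4506),
]


def steps_per_epoch(log):
    """Return the constant step increment per epoch, or None if it varies."""
    increments = {log[0][1]} | {b[1] - a[1] for a, b in zip(log, log[1:])}
    return increments.pop() if len(increments) == 1 else None


def best_epoch(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[3])


def accuracy_gains(log):
    """Return the epoch-over-epoch change in accuracy, rounded to 4 places."""
    return [round(b[4] - a[4], 4) for a, b in zip(log, log[1:])]


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch(LOG))
    print("best epoch by validation loss:", best_epoch(LOG)[0])
    print("accuracy gain in the last epoch:", accuracy_gains(LOG)[-1])
```

On this log, validation loss falls and accuracy rises at every epoch, so the best checkpoint by validation loss is the last one. The per-epoch accuracy gain shrinks to under half a percentage point by the end, which suggests the curve is flattening, although the table alone does not show whether further training would help.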