hbertv1-massive-intermediate_KD_new_2

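This model is a fine-tuned version of gokuls/HBERTv1_48_L10_H768_A12 on the massive dataset. In the training log below, the best evaluation result is at epoch 49 (validation loss 1.3102, accuracy 0.8342); the final epoch (50) ends at validation loss 1.3133 and accuracy 0.8328.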

Model description

More information needed

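Training results

The hyperparameter list is not included here. The table below reports training loss, validation loss, and accuracy on the evaluation set after each of the 50 epochs (180 steps per epoch).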

Training Loss	Epoch	Step	Validation Loss	Accuracy
4.5836	1.0	180	3.4660	0.2710
3.38	2.0	360	2.7802	0.4324
2.7571	3.0	540	2.3906	0.5991
2.3743	4.0	720	2.1148	0.7029
2.1481	5.0	900	2.0007	0.7245
1.9762	6.0	1080	1.9660	0.7467
1.8702	7.0	1260	1.8680	0.7619
1.759	8.0	1440	1.8192	0.7806
1.6949	9.0	1620	1.7677	0.7949
1.6253	10.0	1800	1.7452	0.7885
1.5849	11.0	1980	1.7075	0.8023
1.5239	12.0	2160	1.6915	0.7939
1.4768	13.0	2340	1.6821	0.8067
1.4474	14.0	2520	1.7201	0.7944
1.424	15.0	2700	1.6538	0.8096
1.3839	16.0	2880	1.5979	0.8141
1.3537	17.0	3060	1.6254	0.8062
1.3422	18.0	3240	1.6386	0.8077
1.3166	19.0	3420	1.6048	0.8141
1.2923	20.0	3600	1.5927	0.8146
1.2722	21.0	3780	1.5544	0.8180
1.2513	22.0	3960	1.5904	0.8077
1.2286	23.0	4140	1.5506	0.8195
1.2056	24.0	4320	1.5547	0.8146
1.1941	25.0	4500	1.5258	0.8224
1.1701	26.0	4680	1.4975	0.8224
1.1582	27.0	4860	1.4945	0.8200
1.1367	28.0	5040	1.4888	0.8219
1.127	29.0	5220	1.4596	0.8254
1.1126	30.0	5400	1.4686	0.8175
1.0922	31.0	5580	1.4934	0.8200
1.0809	32.0	5760	1.4370	0.8249
1.0715	33.0	5940	1.4305	0.8234
1.0572	34.0	6120	1.4255	0.8273
1.0429	35.0	6300	1.4042	0.8249
1.0375	36.0	6480	1.4004	0.8190
1.0242	37.0	6660	1.3849	0.8269
1.0132	38.0	6840	1.3777	0.8288
1.0085	39.0	7020	1.3731	0.8273
0.9964	40.0	7200	1.3647	0.8278
0.9867	41.0	7380	1.3655	0.8239
0.9787	42.0	7560	1.3542	0.8293
0.9692	43.0	7740	1.3449	0.8278
0.9646	44.0	7920	1.3402	0.8283
0.959	45.0	8100	1.3360	0.8288
0.9482	46.0	8280	1.3289	0.8303
0.9503	47.0	8460	1.3173	0.8328
0.9428	48.0	8640	1.3152	0.8333
0.9416	49.0	8820	1.3102	0.8342
0.9348	50.0	9000	1.3133	0.8328
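
The log above can be checked programmatically. The following is a minimal sketch, not part of the original training code: it parses a tab-separated excerpt of the table (the last five epochs, copied verbatim) and reports the epoch with the highest accuracy, the final-epoch metrics, and the number of steps per epoch. The names `LOG` and `parse_log` are hypothetical helpers introduced only for this illustration; paste the full table into `LOG` to analyse the whole run.

```python
import csv
import io

# Excerpt of the training log above (epochs 46-50), tab-separated.
# LOG and parse_log are illustrative names, not part of the model's code.
LOG = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\n"
    "0.9482\t46.0\t8280\t1.3289\t0.8303\n"
    "0.9503\t47.0\t8460\t1.3173\t0.8328\n"
    "0.9428\t48.0\t8640\t1.3152\t0.8333\n"
    "0.9416\t49.0\t8820\t1.3102\t0.8342\n"
    "0.9348\t50.0\t9000\t1.3133\t0.8328\n"
)


def parse_log(text):
    """Parse the tab-separated log into a list of dicts with float values."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{key: float(value) for key, value in row.items()} for row in reader]


rows = parse_log(LOG)

# Row with the highest evaluation accuracy.
best = max(rows, key=lambda row: row["Accuracy"])

# Last logged row, i.e. the end of training.
final = rows[-1]

# Cumulative step count divided by epoch number gives steps per epoch.
steps_per_epoch = final["Step"] / final["Epoch"]

print(f"best epoch: {int(best['Epoch'])}, accuracy {best['Accuracy']:.4f}")
print(f"final epoch: {int(final['Epoch'])}, accuracy {final['Accuracy']:.4f}")
print(f"steps per epoch: {steps_per_epoch:.0f}")
```

Selecting by accuracy rather than validation loss is a choice made for this sketch; in this excerpt both criteria point to the same epoch.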