opt-babylm2-clean-spacy-32k-earlystop_seed-42_1e-3

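This model was trained from scratch on the kanishka/babylm2-clean-spacy dataset. At the final logged evaluation (step 38840), it achieves the following results on the evaluation set: a validation loss of 2.9103 and an accuracy of 0.4305.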

Model description

More information needed

The hyperparameters used during training are not recorded here. Training results by epoch:

Training Loss	Epoch	Step	Validation Loss	Accuracy
5.9107	0.9995	1942	3.9887	0.3269
3.7896	1.9996	3885	3.5236	0.3657
3.3813	2.9997	5828	3.3040	0.3859
3.174	3.9997	7771	3.1921	0.3962
3.0533	4.9998	9714	3.1266	0.4026
2.9768	5.9999	11657	3.0838	0.4071
2.9232	6.9999	13600	3.0550	0.4101
2.8863	8.0	15543	3.0363	0.4122
2.8563	8.9995	17485	3.0208	0.4139
2.8356	9.9996	19428	3.0117	0.4151
2.816	10.9997	21371	3.0030	0.4162
2.8069	11.9997	23314	2.9951	0.4170
2.7941	12.9998	25257	2.9923	0.4175
2.7889	13.9999	27200	2.9888	0.4182
2.7802	14.9999	29143	2.9839	0.4186
2.7802	16.0	31086	2.9821	0.4190
2.7665	16.9995	33028	2.9626	0.4212
2.6908	17.9996	34971	2.9378	0.4247
2.6058	18.9997	36914	2.9145	0.4284
2.505	19.9910	38840	2.9103	0.4305
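
The validation losses above are easier to compare when converted to perplexity. The following is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats (the usual convention for language-model training, but not confirmed by this card). The `rows` list and the `perplexity` helper are illustrative names, and only a few rows of the table are copied in.

```python
import math

# (step, validation_loss) pairs copied from a few rows of the table above.
rows = [
    (1942, 3.9887),
    (19428, 3.0117),
    (33028, 2.9626),
    (38840, 2.9103),
]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats per token) to perplexity."""
    return math.exp(loss)


# Print the perplexity implied by each validation loss.
for step, val_loss in rows:
    print(f"step {step:>5}: val loss {val_loss:.4f} -> perplexity {perplexity(val_loss):.2f}")

# Pick the checkpoint with the lowest validation loss among the sampled rows.
best_step, best_loss = min(rows, key=lambda r: r[1])
print(f"lowest validation loss in this sample: {best_loss:.4f} at step {best_step}")
```

Among the rows sampled here, the lowest validation loss is at the last logged step, which matches the table as a whole: validation loss decreases at every evaluation.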