fresh-12-layer-medmcqa-distill-of-fresh-12-layer-gpqa

This model is a fine-tuned version of an unspecified base model on an unknown dataset. Its per-epoch results on the evaluation set are listed in the training log below.

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	63	14.7499	0.2929
No log	2.0	126	13.0612	0.3636
No log	3.0	189	13.1660	0.4293
No log	4.0	252	13.4796	0.4848
No log	5.0	315	11.9863	0.5101
No log	6.0	378	11.3380	0.5253
No log	7.0	441	11.5841	0.4242
4.5481	8.0	504	15.3570	0.3485
4.5481	9.0	567	14.1857	0.1465
4.5481	10.0	630	13.5387	0.1263
4.5481	11.0	693	13.4757	0.1566
4.5481	12.0	756	14.4836	0.0657
4.5481	13.0	819	13.8175	0.0707
4.5481	14.0	882	14.0705	0.1313
4.5481	15.0	945	14.3308	0.0
7.3037	16.0	1008	14.2806	0.1263
7.3037	17.0	1071	14.2719	0.0101
7.3037	18.0	1134	13.7977	0.2727
7.3037	19.0	1197	14.2746	0.0657
7.3037	20.0	1260	14.0949	0.0
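
In the log above, accuracy rises through epoch 6 and then degrades, so the final checkpoint is not the strongest one. The following is a minimal sketch of how the best epoch could be picked out of these rows. The names `LOG` and `best_epoch` are hypothetical helpers introduced for illustration; they are not part of the original training code.

```python
# Rows copied from the training log above:
# (epoch, step, validation_loss, accuracy)
LOG = [
    (1, 63, 14.7499, 0.2929),
    (2, 126, 13.0612, 0.3636),
    (3, 189, 13.1660, 0.4293),
    (4, 252, 13.4796, 0.4848),
    (5, 315, 11.9863, 0.5101),
    (6, 378, 11.3380, 0.5253),
    (7, 441, 11.5841, 0.4242),
    (8, 504, 15.3570, 0.3485),
    (9, 567, 14.1857, 0.1465),
    (10, 630, 13.5387, 0.1263),
    (11, 693, 13.4757, 0.1566),
    (12, 756, 14.4836, 0.0657),
    (13, 819, 13.8175, 0.0707),
    (14, 882, 14.0705, 0.1313),
    (15, 945, 14.3308, 0.0),
    (16, 1008, 14.2806, 0.1263),
    (17, 1071, 14.2719, 0.0101),
    (18, 1134, 13.7977, 0.2727),
    (19, 1197, 14.2746, 0.0657),
    (20, 1260, 14.0949, 0.0),
]


def best_epoch(rows, column=3, maximize=True):
    """Return the row with the best value in `column` (hypothetical helper).

    column=3 selects accuracy (higher is better);
    column=2 selects validation loss (lower is better, so pass maximize=False).
    """
    pick = max if maximize else min
    return pick(rows, key=lambda row: row[column])


best_by_accuracy = best_epoch(LOG)
best_by_loss = best_epoch(LOG, column=2, maximize=False)

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

Both criteria select epoch 6 (step 378), with a validation loss of 11.3380 and an accuracy of 0.5253. Anyone reusing this model may therefore prefer a checkpoint from around that step over the final one, if such a checkpoint was saved.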