fresh-2-layer-medmcqa-distill-of-fresh-2-layer-gpqa

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 11.4170
  • Accuracy: 0.5404

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 321
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 20
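The linear scheduler with warmup ramps the learning rate from 0 up to `learning_rate` over the first 500 steps, then decays it linearly to 0 by the final step. A minimal stdlib-only sketch of that schedule, assuming the total step count of 1260 inferred from the training results below (63 steps/epoch × 20 epochs); this mirrors the behavior of `transformers.get_linear_schedule_with_warmup` but is not the exact training code:

```python
def linear_warmup_lr(step, base_lr=5e-4, warmup_steps=500, total_steps=1260):
    """Learning rate at a given optimizer step under linear warmup + linear decay.

    Assumed values: base_lr and warmup_steps come from the hyperparameters
    above; total_steps is inferred from the training-results table.
    """
    if step < warmup_steps:
        # Linear warmup: 0 -> base_lr over the first warmup_steps steps.
        return base_lr * step / warmup_steps
    # Linear decay: base_lr -> 0 between warmup_steps and total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(linear_warmup_lr(0))     # start of warmup
print(linear_warmup_lr(500))   # peak learning rate, 5e-4
print(linear_warmup_lr(1260))  # end of training, decayed to 0
```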

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 1.0   | 63   | 14.3363         | 0.2929   |
| No log        | 2.0   | 126  | 13.8007         | 0.4040   |
| No log        | 3.0   | 189  | 13.1932         | 0.4697   |
| No log        | 4.0   | 252  | 12.4231         | 0.4899   |
| No log        | 5.0   | 315  | 11.6190         | 0.5101   |
| No log        | 6.0   | 378  | 11.4170         | 0.5404   |
| No log        | 7.0   | 441  | 12.2002         | 0.4899   |
| 3.3802        | 8.0   | 504  | 11.9545         | 0.4646   |
| 3.3802        | 9.0   | 567  | 13.2518         | 0.5202   |
| 3.3802        | 10.0  | 630  | 11.9140         | 0.5000   |
| 3.3802        | 11.0  | 693  | 11.4793         | 0.4545   |
| 3.3802        | 12.0  | 756  | 11.6963         | 0.4798   |
| 3.3802        | 13.0  | 819  | 11.2862         | 0.4848   |
| 3.3802        | 14.0  | 882  | 11.1868         | 0.4949   |
| 3.3802        | 15.0  | 945  | 10.9490         | 0.4646   |
| 0.479         | 16.0  | 1008 | 11.0089         | 0.4899   |
| 0.479         | 17.0  | 1071 | 11.1883         | 0.4798   |
| 0.479         | 18.0  | 1134 | 11.2915         | 0.4697   |
| 0.479         | 19.0  | 1197 | 11.1116         | 0.4747   |
| 0.479         | 20.0  | 1260 | 11.0499         | 0.4747   |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.0