fresh-4-layer-medmcqa-distill-of-fresh-4-layer-gpqa

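This model is a fine-tuned version of an unspecified base model on an unknown dataset. At the final epoch (epoch 20, step 1260) it achieves the following results on the evaluation set, as recorded in the last row of the training results: validation loss 8.9045, accuracy 0.5808.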

Model description

More information needed

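The hyperparameters used during training are not listed. The results below cover 20 epochs at 63 steps per epoch (1,260 steps in total). Training loss appears to have been logged every 500 steps, so rows before step 500 show "No log".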

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	63	12.5127	0.2677
No log	2.0	126	12.1212	0.4646
No log	3.0	189	11.4053	0.5
No log	4.0	252	9.7231	0.5404
No log	5.0	315	10.5580	0.5101
No log	6.0	378	9.6734	0.5556
No log	7.0	441	11.0424	0.5
3.7476	8.0	504	9.7632	0.5455
3.7476	9.0	567	9.2853	0.5404
3.7476	10.0	630	9.6583	0.5152
3.7476	11.0	693	9.7091	0.5101
3.7476	12.0	756	9.8989	0.5303
3.7476	13.0	819	9.1235	0.5808
3.7476	14.0	882	8.9247	0.5354
3.7476	15.0	945	9.0337	0.5859
0.5369	16.0	1008	9.0346	0.5505
0.5369	17.0	1071	9.0031	0.5606
0.5369	18.0	1134	8.9774	0.5657
0.5369	19.0	1197	8.9290	0.5808
0.5369	20.0	1260	8.9045	0.5808
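
The table can be easier to read with a small script that picks out the best epochs. The following is a minimal sketch, not part of the original training code: the rows are copied by hand from the table above, and the names `ROWS`, `best_by_accuracy`, and `best_by_val_loss` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from the training results table above:
# (epoch, step, validation_loss, accuracy)
ROWS = [
    (1, 63, 12.5127, 0.2677),
    (2, 126, 12.1212, 0.4646),
    (3, 189, 11.4053, 0.5),
    (4, 252, 9.7231, 0.5404),
    (5, 315, 10.5580, 0.5101),
    (6, 378, 9.6734, 0.5556),
    (7, 441, 11.0424, 0.5),
    (8, 504, 9.7632, 0.5455),
    (9, 567, 9.2853, 0.5404),
    (10, 630, 9.6583, 0.5152),
    (11, 693, 9.7091, 0.5101),
    (12, 756, 9.8989, 0.5303),
    (13, 819, 9.1235, 0.5808),
    (14, 882, 8.9247, 0.5354),
    (15, 945, 9.0337, 0.5859),
    (16, 1008, 9.0346, 0.5505),
    (17, 1071, 9.0031, 0.5606),
    (18, 1134, 8.9774, 0.5657),
    (19, 1197, 8.9290, 0.5808),
    (20, 1260, 8.9045, 0.5808),
]


def best_by_accuracy(rows):
    """Return the row with the highest accuracy (earliest epoch wins ties)."""
    return max(rows, key=lambda row: row[3])


def best_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    epoch, step, loss, acc = best_by_accuracy(ROWS)
    print(f"highest accuracy: epoch {epoch} (step {step}), accuracy {acc}, val loss {loss}")

    epoch, step, loss, acc = best_by_val_loss(ROWS)
    print(f"lowest val loss:  epoch {epoch} (step {step}), val loss {loss}, accuracy {acc}")
```

On these rows the highest accuracy (0.5859) falls at epoch 15, while the lowest validation loss (8.9045) falls at epoch 20, so the two criteria select different checkpoints. Which checkpoint was actually saved is not stated in this card.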