fresh-2-layer-medmcqa-distill-of-bert-gpqa

This model is a fine-tuned version of on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	63	16.6292	0.2424
No log	2.0	126	16.2288	0.3737
No log	3.0	189	16.1398	0.3939
No log	4.0	252	14.0247	0.4444
No log	5.0	315	13.9443	0.4495
No log	6.0	378	13.9826	0.4444
No log	7.0	441	15.5288	0.4495
5.606	8.0	504	13.7123	0.4596
5.606	9.0	567	13.6056	0.4646
5.606	10.0	630	13.2762	0.4899
5.606	11.0	693	13.7919	0.4596
5.606	12.0	756	13.6602	0.4646
5.606	13.0	819	13.5119	0.4646
5.606	14.0	882	13.1687	0.4747
5.606	15.0	945	13.4347	0.4646
0.781	16.0	1008	13.2637	0.4495
0.781	17.0	1071	13.2955	0.4545
0.781	18.0	1134	13.5991	0.4394
0.781	19.0	1197	13.5485	0.4444
0.781	20.0	1260	13.4956	0.4444