fresh-2-layer-arc-distill-of-fresh-2-layer-gpqa

This model is a fine-tuned version of an unspecified base model on an unknown dataset. Its per-epoch results on the evaluation set are listed in the table at the end of this card.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	63	15.3758	0.2980
No log	2.0	126	17.7481	0.3131
No log	3.0	189	17.2496	0.3485
No log	4.0	252	21.2791	0.3889
No log	5.0	315	15.8656	0.4091
No log	6.0	378	17.6969	0.4697
No log	7.0	441	19.0015	0.4091
2.8992	8.0	504	18.0056	0.3939
2.8992	9.0	567	18.0133	0.3889
2.8992	10.0	630	16.3063	0.4242
2.8992	11.0	693	17.3745	0.4343
2.8992	12.0	756	17.3269	0.3838
2.8992	13.0	819	16.1809	0.3889
2.8992	14.0	882	17.2396	0.3939
2.8992	15.0	945	17.6566	0.3990
0.4269	16.0	1008	17.4774	0.4192
0.4269	17.0	1071	16.4508	0.3889
0.4269	18.0	1134	16.7112	0.3737
0.4269	19.0	1197	17.0893	0.3889
0.4269	20.0	1260	16.8349	0.3889
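
The table above can be summarized with a few lines of Python. This is a minimal sketch, not part of the original training code: the rows are copied by hand from the table, and the helper names `best_epoch` and `steps_per_epoch` are hypothetical. It finds the epoch with the highest evaluation accuracy and checks that the step count grows by a constant amount per epoch.

```python
# Rows copied from the results table: (epoch, step, validation_loss, accuracy).
ROWS = [
    (1, 63, 15.3758, 0.2980),
    (2, 126, 17.7481, 0.3131),
    (3, 189, 17.2496, 0.3485),
    (4, 252, 21.2791, 0.3889),
    (5, 315, 15.8656, 0.4091),
    (6, 378, 17.6969, 0.4697),
    (7, 441, 19.0015, 0.4091),
    (8, 504, 18.0056, 0.3939),
    (9, 567, 18.0133, 0.3889),
    (10, 630, 16.3063, 0.4242),
    (11, 693, 17.3745, 0.4343),
    (12, 756, 17.3269, 0.3838),
    (13, 819, 16.1809, 0.3889),
    (14, 882, 17.2396, 0.3939),
    (15, 945, 17.6566, 0.3990),
    (16, 1008, 17.4774, 0.4192),
    (17, 1071, 16.4508, 0.3889),
    (18, 1134, 16.7112, 0.3737),
    (19, 1197, 17.0893, 0.3889),
    (20, 1260, 16.8349, 0.3889),
]


def best_epoch(rows):
    """Return the row with the highest accuracy; ties go to the earliest epoch."""
    return max(rows, key=lambda r: (r[3], -r[0]))


def steps_per_epoch(rows):
    """Return the constant step increment between epochs, or raise if it varies."""
    diffs = {later[1] - earlier[1] for earlier, later in zip(rows, rows[1:])}
    if len(diffs) != 1:
        raise ValueError(f"step increment is not constant: {sorted(diffs)}")
    return diffs.pop()


if __name__ == "__main__":
    epoch, step, loss, acc = best_epoch(ROWS)
    print(f"best accuracy {acc:.4f} at epoch {epoch} (step {step}, val loss {loss:.4f})")
    print(f"final accuracy {ROWS[-1][3]:.4f} at epoch {ROWS[-1][0]}")
    print(f"steps per epoch: {steps_per_epoch(ROWS)}")
```

On these rows the highest accuracy (0.4697) occurs at epoch 6, while the final epoch reports 0.3889, so an earlier checkpoint may be preferable to the last one if intermediate checkpoints were saved. Whether they were saved is not stated in this card.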