fresh-2-layer-qasc-distill-of-fresh-2-layer-gpqa

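This model is a fine-tuned version of an unspecified base model on an unknown dataset. Its evaluation-set results are listed per epoch in the table below; at the final epoch (20) it reaches a validation loss of 15.5974 and an accuracy of 0.3737.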

Model description

More information needed

Training ran for 20 epochs at 63 steps per epoch (1260 steps in total); the remaining hyperparameters are not listed. Per-epoch training results:

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	63	16.9688	0.2778
No log	2.0	126	15.2240	0.3737
No log	3.0	189	17.7866	0.3485
No log	4.0	252	17.3558	0.3535
No log	5.0	315	14.5362	0.3232
No log	6.0	378	15.8903	0.3687
No log	7.0	441	15.9517	0.3939
1.9038	8.0	504	18.0730	0.3939
1.9038	9.0	567	15.1385	0.3586
1.9038	10.0	630	16.3576	0.3737
1.9038	11.0	693	16.6174	0.3586
1.9038	12.0	756	15.8650	0.4040
1.9038	13.0	819	15.8556	0.3636
1.9038	14.0	882	15.6212	0.3788
1.9038	15.0	945	15.3199	0.3586
0.2405	16.0	1008	15.4362	0.3737
0.2405	17.0	1071	15.7245	0.3737
0.2405	18.0	1134	15.4229	0.3687
0.2405	19.0	1197	15.5387	0.3889
0.2405	20.0	1260	15.5974	0.3737
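
The table is easier to compare across epochs once it is loaded into code. The following is a minimal sketch, not part of the original training run: it copies the rows above into a list (the names ROWS, best_by_accuracy and best_by_loss are illustrative) and picks out the epoch with the highest accuracy and the epoch with the lowest validation loss.

```python
# Rows copied from the training results table above:
# (epoch, step, validation_loss, accuracy)
ROWS = [
    (1, 63, 16.9688, 0.2778),
    (2, 126, 15.2240, 0.3737),
    (3, 189, 17.7866, 0.3485),
    (4, 252, 17.3558, 0.3535),
    (5, 315, 14.5362, 0.3232),
    (6, 378, 15.8903, 0.3687),
    (7, 441, 15.9517, 0.3939),
    (8, 504, 18.0730, 0.3939),
    (9, 567, 15.1385, 0.3586),
    (10, 630, 16.3576, 0.3737),
    (11, 693, 16.6174, 0.3586),
    (12, 756, 15.8650, 0.4040),
    (13, 819, 15.8556, 0.3636),
    (14, 882, 15.6212, 0.3788),
    (15, 945, 15.3199, 0.3586),
    (16, 1008, 15.4362, 0.3737),
    (17, 1071, 15.7245, 0.3737),
    (18, 1134, 15.4229, 0.3687),
    (19, 1197, 15.5387, 0.3889),
    (20, 1260, 15.5974, 0.3737),
]

# Epoch with the highest evaluation accuracy (first one wins on ties).
best_by_accuracy = max(ROWS, key=lambda row: row[3])

# Epoch with the lowest validation loss.
best_by_loss = min(ROWS, key=lambda row: row[2])

print(f"best accuracy: epoch {best_by_accuracy[0]}, accuracy {best_by_accuracy[3]:.4f}")
print(f"lowest validation loss: epoch {best_by_loss[0]}, loss {best_by_loss[2]:.4f}")
```

On these rows the highest accuracy (0.4040) falls at epoch 12 and the lowest validation loss (14.5362) at epoch 5, so the two criteria select different checkpoints, and neither is the final epoch.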