fresh-2-layer-copa-distill-of-fresh-2-layer-gpqa

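This model is a fine-tuned version of a base model that is not named here, trained on an unknown dataset. At the final epoch (step 1260) it reaches a validation loss of 18.4719 and an accuracy of 0.2929 on the evaluation set.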

Model description

More information needed

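The hyperparameters used during training are not listed here. The results below cover 20 epochs of 63 steps each (1260 steps in total), with one evaluation per epoch. "No log" means that no training loss had been logged yet at that evaluation.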

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	63	22.9294	0.2121
No log	2.0	126	30.4497	0.2828
No log	3.0	189	25.8786	0.2576
No log	4.0	252	21.7507	0.2828
No log	5.0	315	20.2591	0.2929
No log	6.0	378	20.1541	0.3030
No log	7.0	441	22.8909	0.3283
2.0303	8.0	504	22.0673	0.2626
2.0303	9.0	567	22.4491	0.2980
2.0303	10.0	630	22.6646	0.3384
2.0303	11.0	693	17.0362	0.3182
2.0303	12.0	756	18.8820	0.3131
2.0303	13.0	819	17.8284	0.2980
2.0303	14.0	882	19.4531	0.2828
2.0303	15.0	945	18.5961	0.2727
0.31	16.0	1008	18.4590	0.2626
0.31	17.0	1071	18.7574	0.2828
0.31	18.0	1134	18.9932	0.2828
0.31	19.0	1197	18.0842	0.2980
0.31	20.0	1260	18.4719	0.2929
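
The table is easier to read once the best rows are picked out. The following is a minimal sketch, not part of the original training code: it copies the epoch, step, validation loss, and accuracy columns from the table above into a string and finds the epochs with the highest accuracy and the lowest validation loss. The names `RESULTS` and `parse_results` are hypothetical and introduced only for this illustration.

```python
# Rows copied from the results table above: epoch, step, validation loss, accuracy.
# The training-loss column is omitted because it is only logged intermittently.
RESULTS = """\
1 63 22.9294 0.2121
2 126 30.4497 0.2828
3 189 25.8786 0.2576
4 252 21.7507 0.2828
5 315 20.2591 0.2929
6 378 20.1541 0.3030
7 441 22.8909 0.3283
8 504 22.0673 0.2626
9 567 22.4491 0.2980
10 630 22.6646 0.3384
11 693 17.0362 0.3182
12 756 18.8820 0.3131
13 819 17.8284 0.2980
14 882 19.4531 0.2828
15 945 18.5961 0.2727
16 1008 18.4590 0.2626
17 1071 18.7574 0.2828
18 1134 18.9932 0.2828
19 1197 18.0842 0.2980
20 1260 18.4719 0.2929
"""


def parse_results(text):
    """Parse whitespace-separated rows into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.strip().splitlines():
        epoch, step, val_loss, accuracy = line.split()
        rows.append({
            "epoch": int(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(accuracy),
        })
    return rows


rows = parse_results(RESULTS)

# Row with the highest evaluation accuracy.
best_acc = max(rows, key=lambda r: r["accuracy"])
# Row with the lowest validation loss.
best_loss = min(rows, key=lambda r: r["val_loss"])

print(f"best accuracy: {best_acc['accuracy']:.4f} at epoch {best_acc['epoch']} (step {best_acc['step']})")
print(f"lowest validation loss: {best_loss['val_loss']:.4f} at epoch {best_loss['epoch']} (step {best_loss['step']})")
```

On these rows, accuracy peaks at epoch 10 (0.3384) and validation loss is lowest at epoch 11 (17.0362), while the final epoch scores lower on both measures. If checkpoints from those epochs were saved, which this card does not state, one of them may be a better choice than the final weights.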