fresh-2-layer-qasc8134-distill-of-fresh-2-layer-mmlu_EVAL_mmlu

This model is a fine-tuned version of an unspecified base model on an unknown dataset. Its validation loss and accuracy on the evaluation set are reported per checkpoint in the training results table below.

Model description

More information needed

The hyperparameters used during training are not listed.

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0.39	100	201.4073	0.222
No log	0.78	200	207.1151	0.33
No log	1.18	300	203.8212	0.328
No log	1.57	400	204.8179	0.39
104.9916	1.96	500	224.3691	0.402
104.9916	2.35	600	190.8359	0.404
104.9916	2.75	700	199.6778	0.414
104.9916	3.14	800	196.7796	0.43
104.9916	3.53	900	190.6462	0.386
28.8659	3.92	1000	187.5610	0.402
28.8659	4.31	1100	186.6063	0.426
28.8659	4.71	1200	195.5366	0.416
28.8659	5.1	1300	192.5193	0.412
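
The table can be hard to scan because validation loss and accuracy do not peak at the same step. The following is a minimal sketch, not part of the original training code, that copies the table rows into a list and picks out the checkpoint with the highest accuracy and the one with the lowest validation loss. The names `ROWS`, `best_by_accuracy`, and `best_by_loss` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from the training results table above:
# (epoch, step, validation_loss, accuracy)
ROWS = [
    (0.39, 100, 201.4073, 0.222),
    (0.78, 200, 207.1151, 0.33),
    (1.18, 300, 203.8212, 0.328),
    (1.57, 400, 204.8179, 0.39),
    (1.96, 500, 224.3691, 0.402),
    (2.35, 600, 190.8359, 0.404),
    (2.75, 700, 199.6778, 0.414),
    (3.14, 800, 196.7796, 0.43),
    (3.53, 900, 190.6462, 0.386),
    (3.92, 1000, 187.5610, 0.402),
    (4.31, 1100, 186.6063, 0.426),
    (4.71, 1200, 195.5366, 0.416),
    (5.1, 1300, 192.5193, 0.412),
]


def best_by_accuracy(rows):
    """Return the row with the highest accuracy (hypothetical helper)."""
    return max(rows, key=lambda r: r[3])


def best_by_loss(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda r: r[2])


if __name__ == "__main__":
    epoch, step, loss, acc = best_by_accuracy(ROWS)
    print(f"highest accuracy: {acc} at step {step} (epoch {epoch}, val loss {loss})")
    epoch, step, loss, acc = best_by_loss(ROWS)
    print(f"lowest val loss: {loss} at step {step} (epoch {epoch}, accuracy {acc})")
```

On these rows the highest accuracy (0.43) is at step 800 and the lowest validation loss (186.6063) is at step 1100, so which checkpoint counts as "best" depends on the metric chosen.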