
fresh-2-layer-medmcqa20000-distill-of-fresh-2-layer-mmlu_EVAL_mmlu

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 172.8253
  • Accuracy: 0.4615
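The card does not include the evaluation code, so the following is only a minimal sketch of how the accuracy figure above is typically produced with a Trainer-style `compute_metrics` hook; the `evaluate` metric name and the shape of `eval_pred` are assumptions based on the standard Transformers workflow, and the reported loss is computed separately by the training loop.

```python
import numpy as np
import evaluate  # Hugging Face evaluate library

# Standard accuracy metric; assumes logits over answer choices and integer labels.
accuracy_metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # pick the highest-scoring class per example
    return accuracy_metric.compute(predictions=predictions, references=labels)
```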

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 321
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
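A minimal sketch of `TrainingArguments` matching the hyperparameters listed above; the actual training script is not part of this card, so the output directory is a placeholder, and `eval_steps=100` / `logging_steps=500` are inferred from the training results table below rather than stated explicitly.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fresh-2-layer-medmcqa20000-distill-of-fresh-2-layer-mmlu_EVAL_mmlu",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=321,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=5000,
    evaluation_strategy="steps",
    eval_steps=100,      # assumed: validation rows appear every 100 steps
    logging_steps=500,   # assumed: training loss is first logged at step 500
)
```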

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log        | 0.16  | 100  | 200.5686        | 0.248    |
| No log        | 0.32  | 200  | 201.9314        | 0.324    |
| No log        | 0.48  | 300  | 181.0271        | 0.372    |
| No log        | 0.64  | 400  | 206.5410        | 0.364    |
| 143.7199      | 0.8   | 500  | 184.4744        | 0.42     |
| 143.7199      | 0.96  | 600  | 181.4189        | 0.402    |
| 143.7199      | 1.12  | 700  | 186.8587        | 0.414    |
| 143.7199      | 1.28  | 800  | 195.9331        | 0.396    |
| 143.7199      | 1.44  | 900  | 182.1619        | 0.426    |
| 88.7183       | 1.6   | 1000 | 178.5117        | 0.428    |
| 88.7183       | 1.76  | 1100 | 180.1005        | 0.432    |
| 88.7183       | 1.92  | 1200 | 177.7711        | 0.418    |
| 88.7183       | 2.08  | 1300 | 184.9631        | 0.426    |
| 88.7183       | 2.24  | 1400 | 170.4556        | 0.41     |
| 71.5399       | 2.4   | 1500 | 180.7118        | 0.446    |
| 71.5399       | 2.56  | 1600 | 171.3761        | 0.438    |
| 71.5399       | 2.72  | 1700 | 165.6044        | 0.432    |
| 71.5399       | 2.88  | 1800 | 168.3776        | 0.456    |
| 71.5399       | 3.04  | 1900 | 165.8044        | 0.428    |
| 59.1947       | 3.2   | 2000 | 181.0893        | 0.44     |
| 59.1947       | 3.36  | 2100 | 174.6589        | 0.454    |
| 59.1947       | 3.52  | 2200 | 174.3077        | 0.448    |
| 59.1947       | 3.68  | 2300 | 169.3694        | 0.464    |
| 59.1947       | 3.84  | 2400 | 172.2202        | 0.47     |
| 48.8665       | 4.0   | 2500 | 161.0428        | 0.468    |
| 48.8665       | 4.16  | 2600 | 174.7397        | 0.468    |
| 48.8665       | 4.32  | 2700 | 167.8463        | 0.462    |
| 48.8665       | 4.48  | 2800 | 176.5635        | 0.47     |
| 48.8665       | 4.64  | 2900 | 168.6186        | 0.464    |

Framework versions

  • Transformers 4.34.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.14.0