mistral_darulm_20_05_24_part1-2_32000_bpe_full_lr1e4_bs256

This model is a fine-tuned version of RefalMachine/mistral_darulm_20_05_24_part1-2_32000_bpe_mean_init_03_07_24 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0198
  • Accuracy: 0.5685
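
The card ships no usage snippet. Assuming the Hugging Face repository id matches the card title, a minimal loading sketch with transformers might look like the following; the prompt string is a placeholder, and bfloat16 matches the BF16 tensors of the 7.24B-parameter checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, taken from the card title.
model_id = "RefalMachine/mistral_darulm_20_05_24_part1-2_32000_bpe_full_lr1e4_bs256"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The checkpoint is stored in BF16, so load it in bfloat16 to match.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Placeholder prompt; the card does not document an intended prompt format.
inputs = tokenizer("Example prompt:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```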

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 64
  • total_train_batch_size: 256
  • total_eval_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 1.0
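
These values map one-to-one onto transformers.TrainingArguments. A minimal sketch, assuming a standard Trainer setup; output_dir is hypothetical and the original training script is not published:

```python
from transformers import TrainingArguments

# Hedged reconstruction of the reported hyperparameters; names mirror the list above.
args = TrainingArguments(
    output_dir="mistral_darulm_full_lr1e4_bs256",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=4,   # x 64 GPUs = total train batch size 256
    per_device_eval_batch_size=4,    # x 64 GPUs = total eval batch size 256
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-5,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=1.0,
    bf16=True,  # the published checkpoint tensors are BF16
)
```

Note that 4 examples per device across 64 GPUs already yields the reported total batch size of 256, so no gradient accumulation is implied.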

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.3529        | 0.04  | 2000  | 2.1464          | 0.5505   |
| 2.3262        | 0.09  | 4000  | 2.1167          | 0.5540   |
| 2.2945        | 0.13  | 6000  | 2.1000          | 0.5563   |
| 2.2961        | 0.18  | 8000  | 2.0909          | 0.5571   |
| 2.2943        | 0.22  | 10000 | 2.0807          | 0.5588   |
| 2.2748        | 0.26  | 12000 | 2.0766          | 0.5595   |
| 2.2741        | 0.31  | 14000 | 2.0678          | 0.5607   |
| 2.2538        | 0.35  | 16000 | 2.0620          | 0.5620   |
| 2.2802        | 0.39  | 18000 | 2.0558          | 0.5627   |
| 2.2613        | 0.44  | 20000 | 2.0485          | 0.5638   |
| 2.243         | 0.48  | 22000 | 2.0431          | 0.5646   |
| 2.2438        | 0.53  | 24000 | 2.0381          | 0.5654   |
| 2.2478        | 0.57  | 26000 | 2.0327          | 0.5664   |
| 2.2143        | 0.61  | 28000 | 2.0288          | 0.5669   |
| 2.2207        | 0.66  | 30000 | 2.0255          | 0.5674   |
| 2.2236        | 0.7   | 32000 | 2.0233          | 0.5679   |
| 2.2279        | 0.74  | 34000 | 2.0216          | 0.5682   |
| 2.227         | 0.79  | 36000 | 2.0207          | 0.5684   |
| 2.2343        | 0.83  | 38000 | 2.0202          | 0.5684   |
| 2.2226        | 0.88  | 40000 | 2.0199          | 0.5685   |
| 2.2162        | 0.92  | 42000 | 2.0199          | 0.5685   |
| 2.2351        | 0.96  | 44000 | 2.0198          | 0.5685   |
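
Since the validation loss for a causal LM is the mean per-token cross-entropy in nats, the final value implies a validation perplexity of exp(2.0198) ≈ 7.54:

```python
import math

# Perplexity implied by the final validation loss (mean token cross-entropy, nats).
print(round(math.exp(2.0198), 2))  # 7.54
```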

Framework versions

  • Transformers 4.37.2
  • PyTorch 2.3.0a0+6ddf5cf85e.nv24.04
  • Datasets 2.18.0
  • Tokenizers 0.15.2