pruned-mt5-small

This model is a fine-tuned version of X-Wang/pruned-mt5-small; the fine-tuning dataset is not specified. Its results on the evaluation set at each logged checkpoint are listed in the training results table below.

Model description

More information needed

The hyperparameters used during training are not recorded here. Training results:

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
3.3446	0.07	2000	2.9103	10.3957	16.0567
2.8425	0.14	4000	2.8570	10.5695	16.1895
3.186	0.21	6000	2.8137	10.5958	16.1523
2.788	0.28	8000	2.7593	10.7553	16.0138
2.9075	0.35	10000	2.7266	10.9199	16.2016
3.0579	0.42	12000	2.7030	10.6	16.0496
2.3618	0.49	14000	2.6547	10.8026	16.0412
3.079	0.56	16000	2.6441	10.7945	16.1148
2.7597	0.63	18000	2.6244	10.5877	16.0507
2.8533	0.7	20000	2.6049	10.9986	16.1145
2.843	0.77	22000	2.5836	10.9173	16.0826
2.8268	0.84	24000	2.5685	10.8136	16.0516
2.7021	0.91	26000	2.5509	11.326	16.0554
3.338	0.98	28000	2.5289	11.1485	16.0333
2.7374	1.05	30000	2.5220	11.0166	16.0998
2.7996	1.12	32000	2.5077	11.1316	16.131
2.6897	1.19	34000	2.4994	11.0811	16.1139
2.4107	1.26	36000	2.4877	11.2641	16.142
2.7695	1.33	38000	2.4756	11.2135	16.0977
3.3271	1.41	40000	2.4658	11.3328	16.0953
2.2641	1.48	42000	2.4612	11.3065	16.0549
2.6594	1.55	44000	2.4556	11.2684	16.1371
2.7322	1.62	46000	2.4520	11.3739	16.1058
2.6824	1.69	48000	2.4462	11.3335	16.1043
2.3369	1.76	50000	2.4455	11.3851	16.1239
2.9537	1.83	52000	2.4430	11.4026	16.0858
2.3928	1.9	54000	2.4433	11.301	16.1129
2.4714	1.97	56000	2.4431	11.4084	16.1053
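
When choosing a checkpoint from a log like the one above, it can help to compare the lowest validation loss with the highest BLEU, since the two do not always fall on the same step. The following is a minimal sketch in plain Python, not part of the original training code. It hard-codes a few rows copied from the table (the first row, one mid-run row, and the last four), and the helper names `best_by_loss`, `best_by_bleu`, and `steps_per_epoch` are hypothetical.

```python
# Selected rows copied from the training results table:
# (step, epoch, validation loss, BLEU)
ROWS = [
    (2000, 0.07, 2.9103, 10.3957),
    (26000, 0.91, 2.5509, 11.3260),
    (50000, 1.76, 2.4455, 11.3851),
    (52000, 1.83, 2.4430, 11.4026),
    (54000, 1.90, 2.4433, 11.3010),
    (56000, 1.97, 2.4431, 11.4084),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def best_by_bleu(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda r: r[3])


def steps_per_epoch(rows):
    """Rough estimate from the last row; the log rounds epochs to two decimals."""
    step, epoch, _, _ = rows[-1]
    return step / epoch


print(best_by_loss(ROWS)[0])         # 52000
print(best_by_bleu(ROWS)[0])         # 56000
print(round(steps_per_epoch(ROWS)))  # 28426
```

In this log the lowest validation loss (2.4430) is at step 52000 and the highest BLEU (11.4084) is at step 56000. The gap between the last three checkpoints is small on both metrics, so either choice is defensible.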