
mt5-small

This model was trained from scratch on the TEC-JL Japanese learner error corpus. It achieves the following results on the evaluation set:

  • Loss: 0.0758
  • Bleu: 67.2605
  • Gen Len: 13.051

Model description

More information needed

Intended uses & limitations

More information needed
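Although the card does not yet document intended uses, the training data (a Japanese learner error corpus) suggests grammatical error correction for learner-written Japanese. A minimal usage sketch follows; the checkpoint path and the example sentence are illustrative, not taken from the card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical checkpoint location; substitute the actual repo id or local path.
model_name = "path/to/mt5-small-tec-jl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Illustrative learner sentence with a conjugation error ("見るました").
source = "私は昨日映画を見るました。"
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```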

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 12

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 1.0485        | 1.0   | 3125  | 0.7139          | 0.0061  | 13.051  |
| 0.2413        | 2.0   | 6250  | 0.1114          | 53.3974 | 13.056  |
| 0.1153        | 3.0   | 9375  | 0.0937          | 61.71   | 13.056  |
| 0.0918        | 4.0   | 12500 | 0.0867          | 63.8407 | 13.056  |
| 0.0819        | 5.0   | 15625 | 0.0833          | 65.2015 | 13.056  |
| 0.08          | 6.0   | 18750 | 0.0806          | 65.6513 | 13.056  |
| 0.078         | 7.0   | 21875 | 0.0793          | 66.3861 | 13.051  |
| 0.0704        | 8.0   | 25000 | 0.0779          | 66.6447 | 13.051  |
| 0.0724        | 9.0   | 28125 | 0.0759          | 67.2105 | 13.051  |
| 0.0707        | 10.0  | 31250 | 0.0765          | 67.3232 | 13.051  |
| 0.0682        | 11.0  | 34375 | 0.0761          | 67.3443 | 13.051  |
| 0.07          | 12.0  | 37500 | 0.0758          | 67.2605 | 13.051  |

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
Safetensors

  • Model size: 300M params
  • Tensor type: F32