distilled-mt5-small-0.4-0.25 / train_results.json
{
    "epoch": 5.0,
    "train_loss": 11.89774876953125,
    "train_runtime": 2711.7009,
    "train_samples": 10000,
    "train_samples_per_second": 18.439,
    "train_steps_per_second": 4.61
}
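
The throughput fields above can be cross-checked against the other values. The following is a minimal sketch, assuming that train_runtime is in seconds and that train_samples_per_second counts every sample seen across all epochs (train_samples times epoch) divided by train_runtime. The variable names are illustrative, and the batch size it derives is an estimate inferred from the rounded numbers, not a value recorded in this file.

```python
import json

# Values copied from the train_results.json contents shown above.
results = json.loads("""
{
    "epoch": 5.0,
    "train_loss": 11.89774876953125,
    "train_runtime": 2711.7009,
    "train_samples": 10000,
    "train_samples_per_second": 18.439,
    "train_steps_per_second": 4.61
}
""")

# Assumption: throughput is measured over all epochs, not a single pass.
total_samples = results["train_samples"] * results["epoch"]
samples_per_second = total_samples / results["train_runtime"]

# Approximate number of optimizer steps implied by the reported step rate.
approx_total_steps = results["train_steps_per_second"] * results["train_runtime"]

# Estimated samples per step (effective batch size); an inference only.
approx_batch_size = total_samples / approx_total_steps

print(f"recomputed samples/s: {samples_per_second:.3f}")
print(f"reported samples/s:   {results['train_samples_per_second']}")
print(f"approx. total steps:  {approx_total_steps:.0f}")
print(f"approx. samples/step: {approx_batch_size:.2f}")
```

If the recomputed rate matches the reported one to within rounding, the fields are mutually consistent under the assumption stated above.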