root
Upload teacher model pretrained on Wikipedia.
e904c00
{
  "epoch": 1.5,
  "train_loss": 1.5298656570243836,
  "train_runtime": 127603.5055,
  "train_samples": 17029620,
  "train_samples_per_second": 200.621,
  "train_steps_per_second": 0.784
}