mT5

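This model is a fine-tuned version of google/mt5-small. The training dataset was not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set, which match the final training epoch (epoch 8, step 2504):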

  • Loss: 2.7797
  • Rouge1: 17.5958
  • Rouge2: 5.5502
  • Rougel: 14.89
  • Rougelsum: 15.8861

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
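
The hyperparameters above can be expressed as arguments for the Transformers `Seq2SeqTrainingArguments` class. The following is a minimal sketch and not the original training script, which the card does not include. It assumes a single training device (so the reported batch sizes equal the per-device ones), and the names `HYPERPARAMS`, `to_training_kwargs`, `build_training_args`, the output directory, and the `predict_with_generate=True` setting are illustrative assumptions.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 5.6e-05,
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 8,
}


def to_training_kwargs(hp):
    """Map the card's hyperparameter names onto Seq2SeqTrainingArguments keywords."""
    beta1, beta2 = hp["adam_betas"]
    return {
        "learning_rate": hp["learning_rate"],
        # Assumption: one device, so total batch size equals per-device batch size.
        "per_device_train_batch_size": hp["train_batch_size"],
        "per_device_eval_batch_size": hp["eval_batch_size"],
        "seed": hp["seed"],
        "adam_beta1": beta1,
        "adam_beta2": beta2,
        "adam_epsilon": hp["adam_epsilon"],
        "lr_scheduler_type": hp["lr_scheduler_type"],
        "num_train_epochs": hp["num_epochs"],
    }


def build_training_args(output_dir="mt5-small-finetuned"):  # hypothetical directory
    # Imported lazily so the mapping above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        # Assumption: ROUGE is scored on generated text during evaluation.
        predict_with_generate=True,
        **to_training_kwargs(HYPERPARAMS),
    )
```

Calling `build_training_args()` requires the `transformers` package; `to_training_kwargs(HYPERPARAMS)` runs without it and is enough to check the name mapping.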

Training results

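| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| 6.7587        | 1.0   | 313  | 3.0537          | 15.5845 | 4.426  | 12.7262 | 13.9385   |
| 3.6224        | 2.0   | 626  | 2.8799          | 16.4339 | 4.8534 | 13.3138 | 14.9449   |
| 3.3322        | 3.0   | 939  | 2.8378          | 18.1043 | 6.2202 | 15.376  | 16.5012   |
| 3.1974        | 4.0   | 1252 | 2.8008          | 17.8905 | 5.7529 | 15.0379 | 16.3205   |
| 3.1183        | 5.0   | 1565 | 2.7936          | 17.7318 | 5.4565 | 14.8508 | 15.9979   |
| 3.0522        | 6.0   | 1878 | 2.7824          | 17.6328 | 5.5352 | 14.7803 | 15.8202   |
| 3.019         | 7.0   | 2191 | 2.7846          | 17.7348 | 5.4391 | 14.7499 | 15.8859   |
| 2.9889        | 8.0   | 2504 | 2.7797          | 17.5958 | 5.5502 | 14.89   | 15.8861   |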
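
The per-epoch figures above can be inspected with a few lines of Python. This sketch copies four columns from the table, finds the epochs with the highest Rouge1 and the lowest validation loss, and bounds the training-set size from the step count. The size estimate assumes a single device, no gradient accumulation, and that a final partial batch is kept; none of these is confirmed by the card.

```python
# (epoch, step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (1, 313, 3.0537, 15.5845),
    (2, 626, 2.8799, 16.4339),
    (3, 939, 2.8378, 18.1043),
    (4, 1252, 2.8008, 17.8905),
    (5, 1565, 2.7936, 17.7318),
    (6, 1878, 2.7824, 17.6328),
    (7, 2191, 2.7846, 17.7348),
    (8, 2504, 2.7797, 17.5958),
]
TRAIN_BATCH_SIZE = 8  # from the hyperparameter list

# Epoch with the highest Rouge1 and epoch with the lowest validation loss.
best_rouge1 = max(ROWS, key=lambda row: row[3])
lowest_loss = min(ROWS, key=lambda row: row[2])

# Steps per epoch is the step count reached at the end of epoch 1.
steps_per_epoch = ROWS[0][1]

# Assumption: single device, no gradient accumulation, last partial batch kept.
# The final batch then holds between 1 and TRAIN_BATCH_SIZE examples.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"Highest Rouge1: epoch {best_rouge1[0]} ({best_rouge1[3]})")
print(f"Lowest validation loss: epoch {lowest_loss[0]} ({lowest_loss[2]})")
print(f"Estimated training examples: {min_examples} to {max_examples}")
```

On these figures Rouge1 peaks at epoch 3 while validation loss keeps falling until epoch 8, so the reported final checkpoint is the lowest-loss one but not the highest-Rouge1 one.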

Framework versions

  • Transformers 4.32.1
  • PyTorch 1.13.0+cpu
  • Datasets 2.14.4
  • Tokenizers 0.13.3