mt5-small-text-sum-5

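This model is a fine-tuned version of google/mt5-small. The card does not name the fine-tuning dataset. On the evaluation set, the model achieves the following results, which match the step 8500 row (epoch 30.14) of the training results table: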

  • Loss: 2.3673
  • Rouge1: 21.51
  • Rouge2: 6.94
  • Rougel: 20.94
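
The card does not say how the ROUGE scores were computed. As a rough illustration of what Rouge1 measures, the sketch below computes a unigram-overlap F-measure on a 0 to 100 scale. The function name `rouge1_f1`, the whitespace tokenization, and the scaling by 100 are assumptions made for this example; the actual evaluation may have used a different tokenizer, stemming, or aggregation.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure (0 to 100) using whitespace tokens.

    Illustrative only: real ROUGE implementations apply their own
    tokenization and optional stemming.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped unigram overlap: each token counts at most as often
    # as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat"
    print(f"ROUGE-1 F1: {rouge1_f1(reference, candidate):.2f}")
```

Rouge2 and Rougel follow the same idea with bigram overlap and longest common subsequence, respectively.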

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 11
  • eval_batch_size: 11
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
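
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. The base rate and epoch count come from the list above. The value of 282 steps per epoch is inferred from the training results table (for example, step 11000 is logged at epoch 39.01), and the absence of warmup is an assumption, since the card lists no warmup setting. The function name `linear_lr` is hypothetical.

```python
# Values taken from the card's hyperparameter list.
BASE_LR = 1e-4
NUM_EPOCHS = 40

# Assumption: 282 optimizer steps per epoch, inferred from the results table
# (step 11000 is logged at epoch 39.01, step 500 at epoch 1.77).
STEPS_PER_EPOCH = 282
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH

# Assumption: no warmup, since the card lists none.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 2500, 5000, 8500, 11000):
        print(f"step {step:>5}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate falls from 0.0001 at step 0 to zero at the final step, so later checkpoints were trained with progressively smaller updates.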

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel |
|---------------|-------|-------|-----------------|--------|--------|--------|
| 4.5176        | 1.77  | 500   | 2.6172          | 16.23  | 5.35   | 16.14  |
| 3.073         | 3.55  | 1000  | 2.4755          | 17.77  | 5.53   | 17.67  |
| 2.8478        | 5.32  | 1500  | 2.4330          | 18.56  | 5.28   | 18.32  |
| 2.7152        | 7.09  | 2000  | 2.4423          | 18.31  | 5.1    | 18.14  |
| 2.6003        | 8.87  | 2500  | 2.3905          | 19.46  | 5.52   | 19.17  |
| 2.5218        | 10.64 | 3000  | 2.3660          | 19.58  | 5.93   | 19.07  |
| 2.4172        | 12.41 | 3500  | 2.3595          | 19.89  | 6.42   | 19.5   |
| 2.3841        | 14.18 | 4000  | 2.3564          | 20.38  | 6.67   | 19.99  |
| 2.3049        | 15.96 | 4500  | 2.3730          | 20.21  | 6.41   | 19.79  |
| 2.2596        | 17.73 | 5000  | 2.3532          | 20.27  | 6.38   | 19.95  |
| 2.2155        | 19.5  | 5500  | 2.3539          | 19.6   | 6.41   | 19.24  |
| 2.1657        | 21.28 | 6000  | 2.3511          | 21.13  | 6.19   | 20.79  |
| 2.1343        | 23.05 | 6500  | 2.3378          | 20.59  | 6.45   | 20.18  |
| 2.1032        | 24.82 | 7000  | 2.3510          | 19.91  | 6.28   | 19.6   |
| 2.068         | 26.6  | 7500  | 2.3452          | 19.37  | 6.11   | 19.1   |
| 2.0438        | 28.37 | 8000  | 2.3513          | 20.86  | 6.43   | 20.49  |
| 2.0191        | 30.14 | 8500  | 2.3673          | 21.51  | 6.94   | 20.94  |
| 2.0085        | 31.91 | 9000  | 2.3519          | 20.65  | 6.61   | 20.2   |
| 1.9797        | 33.69 | 9500  | 2.3728          | 21.01  | 6.33   | 20.6   |
| 1.9808        | 35.46 | 10000 | 2.3663          | 21.22  | 6.48   | 20.82  |
| 1.9605        | 37.23 | 10500 | 2.3581          | 20.45  | 6.41   | 20.06  |
| 1.9599        | 39.01 | 11000 | 2.3608          | 21.07  | 6.57   | 20.6   |
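
The evaluation results reported at the top of the card match the step 8500 row, which has the highest Rouge1 in the table but not the lowest validation loss. The sketch below reproduces that comparison from the logged values. Whether the released checkpoint was actually selected on Rouge1 is an assumption; the card does not state the selection criterion.

```python
# (step, epoch, validation_loss, rouge1, rouge2, rougel), copied from the table.
ROWS = [
    (500, 1.77, 2.6172, 16.23, 5.35, 16.14),
    (1000, 3.55, 2.4755, 17.77, 5.53, 17.67),
    (1500, 5.32, 2.4330, 18.56, 5.28, 18.32),
    (2000, 7.09, 2.4423, 18.31, 5.1, 18.14),
    (2500, 8.87, 2.3905, 19.46, 5.52, 19.17),
    (3000, 10.64, 2.3660, 19.58, 5.93, 19.07),
    (3500, 12.41, 2.3595, 19.89, 6.42, 19.5),
    (4000, 14.18, 2.3564, 20.38, 6.67, 19.99),
    (4500, 15.96, 2.3730, 20.21, 6.41, 19.79),
    (5000, 17.73, 2.3532, 20.27, 6.38, 19.95),
    (5500, 19.5, 2.3539, 19.6, 6.41, 19.24),
    (6000, 21.28, 2.3511, 21.13, 6.19, 20.79),
    (6500, 23.05, 2.3378, 20.59, 6.45, 20.18),
    (7000, 24.82, 2.3510, 19.91, 6.28, 19.6),
    (7500, 26.6, 2.3452, 19.37, 6.11, 19.1),
    (8000, 28.37, 2.3513, 20.86, 6.43, 20.49),
    (8500, 30.14, 2.3673, 21.51, 6.94, 20.94),
    (9000, 31.91, 2.3519, 20.65, 6.61, 20.2),
    (9500, 33.69, 2.3728, 21.01, 6.33, 20.6),
    (10000, 35.46, 2.3663, 21.22, 6.48, 20.82),
    (10500, 37.23, 2.3581, 20.45, 6.41, 20.06),
    (11000, 39.01, 2.3608, 21.07, 6.57, 20.6),
]

# Checkpoint with the lowest validation loss.
best_by_loss = min(ROWS, key=lambda row: row[2])

# Checkpoint with the highest Rouge1.
best_by_rouge1 = max(ROWS, key=lambda row: row[3])

if __name__ == "__main__":
    print(f"lowest validation loss: step {best_by_loss[0]} ({best_by_loss[2]})")
    print(f"highest Rouge1:         step {best_by_rouge1[0]} ({best_by_rouge1[3]})")
```

Validation loss stays within a narrow band after roughly step 3000 while training loss keeps falling, so the choice of checkpoint depends mainly on which metric is preferred.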

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.10.1
  • Tokenizers 0.13.2