mt5-small

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the metrics):

  • Loss: 2.3524
  • Rouge1: 21.18
  • Rouge2: 6.37
  • RougeL: 20.84
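
Since the checkpoint's repository id, training dataset, and preprocessing are not documented here, the snippet below is only a minimal inference sketch: the repo id and input text are placeholders, not values taken from this card.

```python
# Minimal sketch: loading a fine-tuned mt5-small checkpoint for summarization.
# The repo id below is a placeholder; the card does not document the actual one.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-username/mt5-small"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Document to summarize goes here."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```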

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto training arguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 9
  • eval_batch_size: 9
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
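
A minimal sketch of how these values might map onto `Seq2SeqTrainingArguments` from transformers; the output directory is a placeholder, and the 500-step evaluation cadence is inferred from the results table below, since the card does not document the actual training script.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir is a placeholder; the dataset and training script
# used for this checkpoint are not documented in the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned",   # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=9,
    per_device_eval_batch_size=9,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="steps",
    eval_steps=500,                     # inferred from the 500-step cadence below
    predict_with_generate=True,         # needed to compute ROUGE during evaluation
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the argument defaults
# (adam_beta1, adam_beta2, adam_epsilon), so no explicit setting is required.
```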

Training results

Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | RougeL
4.6211        | 1.45  | 500   | 2.5968          | 16.96  | 4.86   | 16.73
3.1269        | 2.90  | 1000  | 2.4790          | 17.62  | 5.00   | 17.58
2.8840        | 4.35  | 1500  | 2.4077          | 17.67  | 5.06   | 17.40
2.7627        | 5.80  | 2000  | 2.4003          | 18.67  | 5.42   | 18.26
2.6380        | 7.25  | 2500  | 2.3953          | 18.76  | 5.49   | 18.44
2.5427        | 8.70  | 3000  | 2.3837          | 18.97  | 6.04   | 18.62
2.4846        | 10.14 | 3500  | 2.3957          | 20.17  | 6.23   | 19.88
2.3867        | 11.59 | 4000  | 2.3558          | 19.50  | 6.24   | 19.10
2.3651        | 13.04 | 4500  | 2.3225          | 19.60  | 6.18   | 19.20
2.2846        | 14.49 | 5000  | 2.3385          | 19.34  | 6.30   | 18.90
2.2351        | 15.94 | 5500  | 2.3413          | 20.42  | 6.44   | 19.93
2.1862        | 17.39 | 6000  | 2.3418          | 20.04  | 6.35   | 19.51
2.1375        | 18.84 | 6500  | 2.3438          | 21.02  | 6.56   | 20.45
2.0961        | 20.29 | 7000  | 2.3451          | 20.82  | 6.81   | 20.60
2.0686        | 21.74 | 7500  | 2.3571          | 20.46  | 6.57   | 20.03
2.0253        | 23.19 | 8000  | 2.3672          | 20.49  | 6.21   | 20.16
1.9997        | 24.64 | 8500  | 2.3524          | 21.18  | 6.37   | 20.84
1.9627        | 26.09 | 9000  | 2.3780          | 20.90  | 5.96   | 20.40
1.9561        | 27.54 | 9500  | 2.3808          | 21.06  | 6.59   | 20.76
1.9020        | 28.99 | 10000 | 2.3739          | 20.73  | 6.09   | 20.41
1.8837        | 30.43 | 10500 | 2.3786          | 20.65  | 6.27   | 20.35
1.8587        | 31.88 | 11000 | 2.3853          | 20.44  | 6.23   | 20.00
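
The Rouge columns above are on a 0–100 scale. Below is a minimal sketch of computing comparable scores with the `evaluate` library, assuming a recent version whose `compute` returns plain floats; the predictions and references are placeholders.

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["a model-generated summary"]  # placeholder model outputs
references = ["the reference summary"]       # placeholder gold summaries

scores = rouge.compute(predictions=predictions, references=references)
# compute() returns fractions in [0, 1]; scale by 100 to match the table.
print({name: round(value * 100, 2) for name, value in scores.items()})
```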

Framework versions

  • Transformers 4.26.1
  • PyTorch 1.13.1+cu116
  • Datasets 2.10.1
  • Tokenizers 0.13.2