
mt5-small-text-sum-1

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set (a scoring sketch follows the list):

  • Loss: 2.3715
  • Rouge1: 20.75
  • Rouge2: 6.54
  • Rougel: 20.33
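The scores above are ROUGE F-measures scaled to the 0-100 range. Below is a minimal sketch of how such scores can be computed with the `evaluate` library; the prediction and reference strings are placeholders, since the evaluation dataset is not documented here:

```python
# Minimal ROUGE-scoring sketch using the `evaluate` library.
# The prediction/reference texts are placeholders, not the actual eval set.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the model generated summary"]        # placeholder model outputs
references = ["the reference summary for the text"]  # placeholder gold summaries

scores = rouge.compute(predictions=predictions, references=references)
# Scale the 0-1 F-measures to the 0-100 range used in the results above.
print({name: round(value * 100, 2) for name, value in scores.items()})
```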

Model description

More information needed

Intended uses & limitations

More information needed
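Since no intended-use details are documented, the following is only a minimal usage sketch. It assumes the checkpoint is loaded by a repo id or local path (shown here as the hypothetical "mt5-small-text-sum-1") and used with the summarization pipeline, which is how a seq2seq mT5 checkpoint fine-tuned for summarization would typically be run:

```python
# Minimal usage sketch: the model id below is an assumption; adjust to the
# actual Hub repo id or a local checkpoint directory.
from transformers import pipeline

summarizer = pipeline("summarization", model="mt5-small-text-sum-1")

text = "Paste the article or document to be summarized here."
result = summarizer(text, max_length=64, min_length=8, do_sample=False)
print(result[0]["summary_text"])
```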

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
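For reference, here is a sketch of `Seq2SeqTrainingArguments` mirroring the hyperparameters listed above; `output_dir`, the evaluation cadence, and `predict_with_generate` are assumptions (the results table below reports evaluation every 500 steps), not details taken from the original training script:

```python
# Sketch of training arguments mirroring the listed hyperparameters.
# output_dir and the evaluation settings are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-text-sum-1",    # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    evaluation_strategy="steps",          # the results table logs eval every 500 steps
    eval_steps=500,
    predict_with_generate=True,           # generate summaries so ROUGE can be computed
)
```

The Adam betas and epsilon listed above match the Transformers defaults, so they need not be set explicitly.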

Training results

Training Loss   Epoch   Step    Validation Loss   Rouge1   Rouge2   Rougel
4.936           1.29    500     2.6226            15.38    5.14     15.22
3.1573          2.58    1000    2.5081            18.02    5.53     17.8
2.9258          3.87    1500    2.4499            17.19    5.3      17.0
2.7786          5.15    2000    2.4264            18.17    5.02     17.99
2.6786          6.44    2500    2.4088            17.98    5.48     17.6
2.5824          7.73    3000    2.3909            19.43    6.32     19.07
2.5261          9.02    3500    2.3691            19.06    5.94     18.76
2.4372          10.31   4000    2.3580            19.76    6.37     19.49
2.3727          11.6    4500    2.3595            19.96    6.52     19.68
2.3488          12.89   5000    2.3580            19.63    6.14     19.31
2.2868          14.18   5500    2.3595            19.93    6.4      19.72
2.2268          15.46   6000    2.3632            19.95    6.13     19.55
2.2081          16.75   6500    2.3631            20.47    6.34     20.1
2.1583          18.04   7000    2.3562            20.04    6.13     19.71
2.1178          19.33   7500    2.3615            19.55    5.8      19.1
2.0904          20.62   8000    2.3549            20.37    6.6      20.05
2.0697          21.91   8500    2.3859            20.53    6.64     20.22
2.0256          23.2    9000    2.3715            20.75    6.54     20.33
2.0011          24.48   9500    2.3713            20.55    6.72     20.25
1.9899          25.77   10000   2.3582            19.82    5.82     19.4
1.965           27.06   10500   2.3789            20.48    5.8      20.23
1.9518          28.35   11000   2.3822            20.03    6.07     19.67
1.9089          29.64   11500   2.3743            19.62    6.1      19.3

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.10.1
  • Tokenizers 0.13.2