mt5-small-finetuned-news-summary-model-2

This model is a fine-tuned version of google/mt5-small. The training dataset is not specified (the card generator recorded it as None). The model achieves the following results on the evaluation set:

  • Loss: 2.5813
  • Rouge1: 29.4322
  • Rouge2: 11.4361
  • Rougel: 26.3875
  • Rougelsum: 26.297
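
For readers unfamiliar with the metric, the sketch below shows what a ROUGE-1 F1 score measures: unigram overlap between a reference summary and a generated one. It is a simplified illustration, not the scorer used to produce the numbers above. It assumes whitespace tokenization, lowercasing, and no stemming, and it scales the result to 0-100 because the reported values appear to be on that scale. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 on a 0-100 scale (assumption: whitespace tokens, no stemming)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped unigram overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 2))
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence rather than on fixed-size n-grams.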

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 12
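
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate over the run, the sketch below computes a linearly decaying rate. It rests on two assumptions that the card does not state: there are no warmup steps, and the run has 4212 optimizer steps in total (the last step logged in the training results). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int, base_lr: float = 4e-5,
              total_steps: int = 4212, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay.

    Assumptions (not stated in the card): warmup_steps=0 and total_steps=4212.
    """
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1053, 2106, 4212):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 4e-05, falls to half of that midway through training, and reaches zero at the final step.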

Training results

| Training Loss | Epoch   | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|---------|------|-----------------|---------|---------|---------|-----------|
| 9.2632        | 0.9972  | 351  | 3.7059          | 17.3365 | 5.2307  | 15.438  | 15.3776   |
| 4.6719        | 1.9943  | 702  | 3.0896          | 19.5787 | 6.8278  | 18.0637 | 18.0255   |
| 4.1356        | 2.9915  | 1053 | 2.8713          | 22.5668 | 8.2899  | 20.551  | 20.5232   |
| 3.7852        | 3.9886  | 1404 | 2.7729          | 25.7974 | 9.9158  | 23.2398 | 23.2198   |
| 3.6194        | 4.9858  | 1755 | 2.7038          | 26.2572 | 10.0034 | 24.0326 | 23.9956   |
| 3.4864        | 5.9830  | 2106 | 2.6714          | 26.8149 | 9.9056  | 24.2704 | 24.1399   |
| 3.3965        | 6.9801  | 2457 | 2.6361          | 27.5399 | 10.3609 | 24.8286 | 24.7628   |
| 3.3422        | 7.9773  | 2808 | 2.6194          | 28.0298 | 10.6938 | 25.1678 | 25.0924   |
| 3.2879        | 8.9744  | 3159 | 2.5976          | 28.2324 | 10.6412 | 25.2803 | 25.1804   |
| 3.2391        | 9.9716  | 3510 | 2.5894          | 29.0155 | 11.174  | 25.9995 | 25.8843   |
| 3.2128        | 10.9688 | 3861 | 2.5854          | 29.3283 | 11.477  | 26.2235 | 26.1278   |
| 3.2214        | 11.9659 | 4212 | 2.5813          | 29.4322 | 11.4361 | 26.3875 | 26.297    |
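
The per-epoch results can be scanned programmatically to see which evaluation step scored best on a given metric. The sketch below is only an illustration over the values copied from the training results above; the helper `best_row` and the tuple layout are hypothetical, and the card does not say which checkpoint was kept.

```python
# (step, validation_loss, rouge1, rouge2, rougel, rougelsum), copied from the training results.
ROWS = [
    (351, 3.7059, 17.3365, 5.2307, 15.438, 15.3776),
    (702, 3.0896, 19.5787, 6.8278, 18.0637, 18.0255),
    (1053, 2.8713, 22.5668, 8.2899, 20.551, 20.5232),
    (1404, 2.7729, 25.7974, 9.9158, 23.2398, 23.2198),
    (1755, 2.7038, 26.2572, 10.0034, 24.0326, 23.9956),
    (2106, 2.6714, 26.8149, 9.9056, 24.2704, 24.1399),
    (2457, 2.6361, 27.5399, 10.3609, 24.8286, 24.7628),
    (2808, 2.6194, 28.0298, 10.6938, 25.1678, 25.0924),
    (3159, 2.5976, 28.2324, 10.6412, 25.2803, 25.1804),
    (3510, 2.5894, 29.0155, 11.174, 25.9995, 25.8843),
    (3861, 2.5854, 29.3283, 11.477, 26.2235, 26.1278),
    (4212, 2.5813, 29.4322, 11.4361, 26.3875, 26.297),
]


def best_row(rows, column, higher_is_better=True):
    """Return the row with the best value in the given column index."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


best_rouge2 = best_row(ROWS, 3)                        # highest Rouge2
best_loss = best_row(ROWS, 1, higher_is_better=False)  # lowest validation loss
print(best_rouge2[0], best_loss[0])  # prints "3861 4212"
```

Validation loss, Rouge1, Rougel, and Rougelsum are all best at the final step (4212), whereas Rouge2 peaks one evaluation earlier, at step 3861, by a small margin.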

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model size

  • Safetensors: 300M params
  • Tensor type: F32