mt5-small-finetuned-news-summary-kaggle

This model is a fine-tuned version of google/mt5-small on the Kaggle News Summary dataset (linked under Training and evaluation data). It achieves the following results on the evaluation set:

  • Loss: 2.6907
  • Rouge1: 26.6547
  • Rouge2: 10.1
  • Rougel: 24.0137
  • Rougelsum: 23.9999
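For readers unfamiliar with the metric, the following is a simplified sketch of how a ROUGE-1 F1 score is computed for a single summary. It is not the evaluation code used for this card: the whitespace tokenization and the function name `rouge1_f1` are assumptions made for illustration, and real ROUGE implementations add their own tokenization and optional stemming. The scores reported above appear to be on a 0-100 scale, whereas this sketch returns a value between 0 and 1.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two texts.

    Assumption: lowercase whitespace tokenization, no stemming.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f1("the cat", "the cat sat on")
    print(f"ROUGE-1 F1: {score:.4f} (x100 = {100 * score:.2f})")
```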

Model description

More information needed

Intended uses & limitations

More information needed
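Since the model is fine-tuned for news summarization, a minimal inference sketch with the Transformers `pipeline` API might look like the following. The Hub namespace is not given in this card, so `MODEL_ID` contains a placeholder that must be replaced, and the helper name `summarize` and the `max_length` default are illustrative assumptions rather than recommended settings.

```python
# Hypothetical placeholder: replace <namespace> with the account that
# hosts the model; the card does not state it.
MODEL_ID = "<namespace>/mt5-small-finetuned-news-summary-kaggle"


def summarize(text: str, model_id: str = MODEL_ID, max_length: int = 48) -> str:
    """Return a generated summary for a single news article."""
    # Imported lazily so the module can be inspected without
    # transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # truncation=True clips articles that exceed the model's input limit.
    result = summarizer(text, max_length=max_length, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` downloads the checkpoint on first use, so it requires network access and a valid model ID.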

Training and evaluation data

Training and evaluation used the News Summary dataset from Kaggle: https://www.kaggle.com/datasets/sunnysai12345/news-summary

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
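The hyperparameters above could be expressed as Transformers training arguments roughly as follows. This is a sketch, not the original training script: it assumes `Seq2SeqTrainingArguments` was used, that `train_batch_size` and `eval_batch_size` correspond to per-device batch sizes on a single device, and that `predict_with_generate=True` was set so ROUGE could be computed from generated text. The `output_dir` value and the function name `build_training_args` are hypothetical.

```python
# Values copied from the hyperparameter list in this card; the keyword
# names are the corresponding TrainingArguments fields.
HYPERPARAMS = {
    "learning_rate": 5.6e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}


def build_training_args(output_dir: str = "mt5-small-finetuned-news-summary-kaggle"):
    """Build training arguments from the card's hyperparameters."""
    # Imported lazily so the dictionary can be reused without
    # transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,       # hypothetical path
        predict_with_generate=True,  # assumption: needed for ROUGE evaluation
        **HYPERPARAMS,
    )
```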

Training results

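| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| No log        | 1.0   | 220  | 3.9956          | 14.9021 | 3.3744 | 13.4763 | 13.499    |
| 8.3183        | 2.0   | 440  | 3.1550          | 17.9472 | 5.9671 | 16.6974 | 16.6959   |
| 8.3183        | 3.0   | 660  | 2.8950          | 21.2665 | 7.4266 | 19.5041 | 19.4837   |
| 4.0457        | 4.0   | 880  | 2.8087          | 25.063  | 9.4484 | 22.746  | 22.7351   |
| 4.0457        | 5.0   | 1100 | 2.7375          | 25.5269 | 9.4299 | 23.0623 | 23.0075   |
| 3.6505        | 6.0   | 1320 | 2.7091          | 25.8308 | 9.3392 | 23.2001 | 23.1586   |
| 3.6505        | 7.0   | 1540 | 2.6949          | 26.2177 | 9.8536 | 23.5946 | 23.6358   |
| 3.5175        | 8.0   | 1760 | 2.6907          | 26.6547 | 10.1   | 24.0137 | 23.9999   |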
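The step counts in the results relate to the learning-rate schedule as sketched below: 220 optimizer steps per epoch over 8 epochs gives 1760 total steps, and a linear schedule decays the learning rate from 5.6e-05 toward zero across them. The sketch assumes zero warmup steps, which the card does not list, and `linear_lr` is an illustrative helper rather than a library function.

```python
BASE_LR = 5.6e-5          # learning_rate from the card
STEPS_PER_EPOCH = 220     # step count at epoch 1.0 in the results
NUM_EPOCHS = 8            # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = 0          # assumption: no warmup is listed in the card


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay."""
    if step < WARMUP_STEPS:
        # Linear ramp-up during warmup (unused when WARMUP_STEPS == 0).
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for epoch in range(1, NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate is half its initial value at step 880 (end of epoch 4) and reaches zero at step 1760.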

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • Format: Safetensors
  • Parameters: 300M
  • Tensor type: F32