mt5-small-finetuned-news-summary-kaggle

This model is a fine-tuned version of google/mt5-small on an unspecified dataset (the card's dataset field was left empty, so it was recorded as "None"). It achieves the following results on the evaluation set:

  • Loss: 2.5691
  • Rouge1: 29.8831
  • Rouge2: 11.6462
  • Rougel: 26.8481
  • Rougelsum: 26.8856
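
For readers unfamiliar with the metrics above, the following is a minimal sketch of a ROUGE-N F1 score based on whitespace-token n-gram overlap. It is an illustration only: the card does not say which ROUGE implementation produced the reported numbers, and standard packages apply their own tokenization and optional stemming, so this sketch would not reproduce the values exactly. The function name `rouge_n_f1` and the scaling to a 0-100 range (assumed here to match the scale of the reported scores) are assumptions.

```python
from collections import Counter


def rouge_n_f1(prediction, reference, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale (illustrative, not the official scorer)."""

    def ngrams(text):
        # Lowercase and split on whitespace; real scorers tokenize more carefully.
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    # Clipped overlap: each n-gram counts at most as often as it appears in both texts.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    summary = "the cat sat on the mat"
    target = "the cat lay on the mat"
    print(round(rouge_n_f1(summary, target, n=1), 2))  # unigram overlap (Rouge1-style)
    print(round(rouge_n_f1(summary, target, n=2), 2))  # bigram overlap (Rouge2-style)
```

Rougel and Rougelsum are based on the longest common subsequence instead of fixed-length n-grams and are not covered by this sketch.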

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
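
As a rough guide to reproducing this setup, the listed hyperparameters could be mapped onto the Transformers `Seq2SeqTrainingArguments` class as sketched below. This is not the author's training script. The `HPARAMS` dictionary and the `build_training_args` helper are hypothetical names; mapping the batch sizes to per-device values assumes training ran on a single device; and `predict_with_generate=True` is an assumption based on the card reporting ROUGE scores, which require generated text.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments field names.
HPARAMS = {
    "learning_rate": 5.6e-05,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}


def build_training_args(output_dir):
    """Hypothetical helper: build Seq2SeqTrainingArguments from HPARAMS."""
    # Imported lazily so the dictionary above can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: needed to score generated summaries
        **HPARAMS,
    )
```

Calling `build_training_args("some-output-dir")` would return an arguments object that can be passed to a `Seq2SeqTrainer` together with the model, tokenizer, datasets, and a metric function, none of which are specified in this card.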

Training results

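| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|
| 8.1234        | 1.0   | 440  | 3.3123          | 18.1687 | 5.9565  | 16.7581 | 16.7495   |
| 4.2107        | 2.0   | 880  | 2.8404          | 23.0004 | 8.3723  | 20.8424 | 20.9312   |
| 3.738         | 3.0   | 1320 | 2.7354          | 26.5882 | 10.1061 | 23.9299 | 24.0001   |
| 3.4864        | 4.0   | 1760 | 2.6756          | 27.2242 | 10.1775 | 24.4504 | 24.5062   |
| 3.3642        | 5.0   | 2200 | 2.6224          | 28.7857 | 11.5222 | 26.2568 | 26.3167   |
| 3.269         | 6.0   | 2640 | 2.5883          | 29.6623 | 11.7765 | 26.8117 | 26.906    |
| 3.212         | 7.0   | 3080 | 2.5677          | 29.7811 | 11.635  | 26.5844 | 26.6327   |
| 3.186         | 8.0   | 3520 | 2.5691          | 29.8831 | 11.6462 | 26.8481 | 26.8856   |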
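
The step column shows 440 optimizer steps per epoch, or 3520 steps over 8 epochs. Combined with the linear scheduler and the initial learning rate of 5.6e-05, the learning rate at each logged step can be estimated as in the sketch below. It assumes zero warmup steps, since the card lists none; if warmup was used, the actual values would differ. The function name `linear_lr` is hypothetical.

```python
BASE_LR = 5.6e-05
STEPS_PER_EPOCH = 440  # from the Step column: 440 steps at epoch 1.0
NUM_EPOCHS = 8
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 3520, matching the final row


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Estimated learning rate at the end of each epoch, assuming no warmup.
    for epoch in range(1, NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(epoch, step, linear_lr(step))
```

Under these assumptions the learning rate is halved by the end of epoch 4 and reaches zero at step 3520, which is consistent with the shrinking per-epoch improvements in validation loss in the later rows.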

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size

  • 300M params
  • Tensor type: F32 (Safetensors)