
news-summary-t5-model-2

This model is a fine-tuned version of google/mt5-small for news summarization; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 2.5691
  • ROUGE-1: 29.9122
  • ROUGE-2: 11.6784
  • ROUGE-L: 26.812
  • ROUGE-Lsum: 26.8345

Model description

More information needed

Intended uses & limitations

More information needed
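
The card does not document intended usage. As a rough guide, the sketch below shows one way to load the checkpoint for abstractive news summarization with the Transformers library; the repository ID, input truncation length, and generation settings are assumptions, not values taken from this card.

```python
# Minimal inference sketch. Assumptions: the Hub repository ID below is a
# placeholder, and the generation settings are illustrative rather than the
# settings used to produce the metrics reported above.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-username/news-summary-t5-model-2"  # hypothetical repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "..."  # the news article text to summarize

# mT5 uses no task prefix by default; the raw article text is tokenized directly.
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```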

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
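
For reference, here is a minimal sketch of how these settings map onto `Seq2SeqTrainingArguments`; the output directory, evaluation strategy, and `predict_with_generate` flag are assumptions, since the card lists only the hyperparameters above.

```python
# Sketch of Seq2SeqTrainingArguments matching the listed hyperparameters.
# Anything not listed in the card (output_dir, evaluation strategy, generation
# during evaluation) is an assumption. Adam with betas=(0.9, 0.999) and
# epsilon=1e-08 is the Trainer's default optimizer configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="news-summary-t5-model-2",
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=8,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch results table
    predict_with_generate=True,   # assumption: needed to compute ROUGE at eval time
)
```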

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|---------------|-------|------|-----------------|---------|---------|---------|------------|
| 8.1234        | 1.0   | 440  | 3.3123          | 18.1585 | 5.9435  | 16.7364 | 16.7646    |
| 4.2107        | 2.0   | 880  | 2.8404          | 22.9864 | 8.3815  | 20.8354 | 20.9346    |
| 3.738         | 3.0   | 1320 | 2.7354          | 26.5984 | 10.0823 | 23.912  | 23.9585    |
| 3.4864        | 4.0   | 1760 | 2.6756          | 27.1487 | 10.1681 | 24.3788 | 24.4672    |
| 3.3642        | 5.0   | 2200 | 2.6224          | 28.7513 | 11.5416 | 26.2106 | 26.2335    |
| 3.269         | 6.0   | 2640 | 2.5883          | 29.6461 | 11.8038 | 26.7581 | 26.7764    |
| 3.212         | 7.0   | 3080 | 2.5677          | 29.8037 | 11.6582 | 26.5532 | 26.5455    |
| 3.186         | 8.0   | 3520 | 2.5691          | 29.9122 | 11.6784 | 26.812  | 26.8345    |
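
The ROUGE columns can be reproduced with the `evaluate` library, as sketched below; the exact scorer and post-processing used for this card are not documented, so this is an assumption. The table's values appear to be F-measures scaled by 100.

```python
# Sketch of how the ROUGE columns can be computed with the `evaluate` library.
# Assumption: the card does not say which scorer was used; the predictions and
# references below are hypothetical placeholders.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the model's generated summary"]   # hypothetical
references = ["the reference (gold) summary"]     # hypothetical

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Returns F-measures in [0, 1] under the keys rouge1, rouge2, rougeL, rougeLsum;
# the table above appears to report these values multiplied by 100.
print(scores)
```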

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1