metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - summarization
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: news-summary-t5-model-2
    results: []

news-summary-t5-model-2

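This model is a fine-tuned version of google/mt5-small for summarization. The training dataset is not recorded in this card (the auto-generated template lists it as "None"). The model achieves the following results on the evaluation set, which match the final (epoch 8) row of the training results table. The ROUGE values appear to be reported on a 0-100 scale.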

  • Loss: 2.5691
  • Rouge1: 29.9122
  • Rouge2: 11.6784
  • Rougel: 26.812
  • Rougelsum: 26.8345
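The card does not include a usage example, so the following is a minimal sketch of how a summarization checkpoint like this one is typically loaded with the Transformers `pipeline` API. The repository id, the `summarize` helper, and the `max_new_tokens` default are assumptions made for illustration and are not stated in the card.

```python
# Hypothetical repository id: replace it with the id the checkpoint is
# actually published under.
MODEL_ID = "shivraj221/news-summary-t5-model-2"


def summarize(text, max_new_tokens=64):
    """Return a summary string for `text` using the fine-tuned checkpoint.

    `max_new_tokens=64` is an illustrative default, not a value from the card.
    """
    # Imported lazily so that defining this helper needs no heavy dependencies.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs longer than the tokenizer's maximum length.
    result = summarizer(text, max_new_tokens=max_new_tokens, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` downloads the checkpoint on first use and returns a single string. Building the pipeline once and reusing it would be preferable when summarizing many articles.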

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
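As a rough sketch, the hyperparameters listed above could be expressed as keyword arguments for Transformers' `Seq2SeqTrainingArguments`, and the linear schedule they imply can be written out by hand. Several things here are assumptions rather than facts from the card: the output path, the mapping of `train_batch_size` to a per-device value (single device), the absence of warmup, and the total of 3520 steps taken from the training results table.

```python
# Values copied from the hyperparameter list; the key names follow
# Transformers' Seq2SeqTrainingArguments.
TRAINING_KWARGS = {
    "output_dir": "news-summary-t5-model-2",  # hypothetical path
    "learning_rate": 5.6e-05,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}

TOTAL_STEPS = 3520  # final step in the training results table
WARMUP_STEPS = 0    # assumption: the card lists no warmup


def build_training_args():
    """Build the arguments object; requires the transformers package."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)


def linear_lr(step, base_lr=5.6e-05, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the learning rate starts at 5.6e-05, is about half that value at step 1760 (end of epoch 4), and reaches zero at step 3520.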

Training results

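| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|
| 8.1234        | 1.0   | 440  | 3.3123          | 18.1585 | 5.9435  | 16.7364 | 16.7646   |
| 4.2107        | 2.0   | 880  | 2.8404          | 22.9864 | 8.3815  | 20.8354 | 20.9346   |
| 3.738         | 3.0   | 1320 | 2.7354          | 26.5984 | 10.0823 | 23.912  | 23.9585   |
| 3.4864        | 4.0   | 1760 | 2.6756          | 27.1487 | 10.1681 | 24.3788 | 24.4672   |
| 3.3642        | 5.0   | 2200 | 2.6224          | 28.7513 | 11.5416 | 26.2106 | 26.2335   |
| 3.269         | 6.0   | 2640 | 2.5883          | 29.6461 | 11.8038 | 26.7581 | 26.7764   |
| 3.212         | 7.0   | 3080 | 2.5677          | 29.8037 | 11.6582 | 26.5532 | 26.5455   |
| 3.186         | 8.0   | 3520 | 2.5691          | 29.9122 | 11.6784 | 26.812  | 26.8345   |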
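The table above can be read programmatically to see which epoch is best under each metric and to estimate the size of the training set. The sketch below copies a subset of the columns from the table. The training-set estimate assumes a single device, no gradient accumulation, and a data loader that keeps the last partial batch, none of which the card confirms.

```python
# (epoch, step, validation_loss, rouge1, rouge2) copied from the results table.
ROWS = [
    (1, 440, 3.3123, 18.1585, 5.9435),
    (2, 880, 2.8404, 22.9864, 8.3815),
    (3, 1320, 2.7354, 26.5984, 10.0823),
    (4, 1760, 2.6756, 27.1487, 10.1681),
    (5, 2200, 2.6224, 28.7513, 11.5416),
    (6, 2640, 2.5883, 29.6461, 11.8038),
    (7, 3080, 2.5677, 29.8037, 11.6582),
    (8, 3520, 2.5691, 29.9122, 11.6784),
]

# Epoch with the lowest validation loss and epochs with the highest ROUGE scores.
best_loss_epoch = min(ROWS, key=lambda row: row[2])[0]
best_rouge1_epoch = max(ROWS, key=lambda row: row[3])[0]
best_rouge2_epoch = max(ROWS, key=lambda row: row[4])[0]

# 440 optimizer steps per epoch at batch size 8 bounds the training-set size,
# because only the last batch of an epoch may be partially filled.
STEPS_PER_EPOCH = ROWS[0][1]
BATCH_SIZE = 8
max_examples = STEPS_PER_EPOCH * BATCH_SIZE
min_examples = (STEPS_PER_EPOCH - 1) * BATCH_SIZE + 1

print(f"lowest validation loss: epoch {best_loss_epoch}")
print(f"highest Rouge1: epoch {best_rouge1_epoch}")
print(f"highest Rouge2: epoch {best_rouge2_epoch}")
print(f"estimated training examples: {min_examples} to {max_examples}")
```

The metrics disagree slightly about the best checkpoint: validation loss is lowest at epoch 7, Rouge2 peaks at epoch 6, and Rouge1 is highest at epoch 8, which is the checkpoint whose scores the card reports. The small differences across epochs 6 to 8 suggest training had largely plateaued.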

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.2
  • Tokenizers 0.19.1