license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
datasets:
  - wcep-10
metrics:
  - rouge
model-index:
  - name: mt5-small-finetuned-amazon-en-es
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: wcep-10
          type: wcep-10
          config: roberta
          split: validation
          args: roberta
        metrics:
          - name: Rouge1
            type: rouge
            value: 22.6862

mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small on the wcep-10 dataset. After the final training epoch (epoch 8, step 8160), it achieves the following results on the validation split:

  • Loss: 3.1575
  • Rouge1: 22.6862
  • Rouge2: 7.7268
  • Rougel: 19.1961
  • Rougelsum: 19.3808
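For readers unfamiliar with the metric, the sketch below shows the idea behind ROUGE-1 as a unigram-overlap F1 score. It is a simplified illustration, not the scorer used to produce the numbers above: the function name `rouge1_f1` is hypothetical, tokenization is plain lowercased whitespace splitting, and no stemming is applied. It also assumes the reported values are F-measures scaled to a 0-100 range, which the card does not state explicitly.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens.

    Hypothetical helper for illustration only; real scorers add
    tokenization rules and optional stemming.
    """
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on a mat")
print(round(100 * score, 2))  # 66.67 (4 shared unigrams out of 6 on each side)
```

ROUGE-2 applies the same overlap idea to bigrams, and ROUGE-L replaces the overlap count with the length of the longest common subsequence.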

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
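As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup setting) and takes the total step count of 8160 from the training log; `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 5.6e-05    # learning_rate from the list above
TOTAL_STEPS = 8160   # final step in the training log (8 epochs x 1020 steps)
WARMUP_STEPS = 0     # assumption: the card does not list a warmup setting


def linear_lr(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 4080, 8160):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5.6e-05, is halved by the end of epoch 4, and reaches zero at the final step.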

Training results

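| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| 6.5905        | 1.0   | 1020 | 3.4711          | 21.2268 | 7.4345 | 18.5023 | 18.6264   |
| 4.1604        | 2.0   | 2040 | 3.3228          | 21.6354 | 7.3939 | 18.4926 | 18.6047   |
| 3.914         | 3.0   | 3060 | 3.2606          | 21.9787 | 7.5818 | 18.6971 | 18.8603   |
| 3.7698        | 4.0   | 4080 | 3.2058          | 21.8859 | 7.5625 | 18.6413 | 18.8169   |
| 3.679         | 5.0   | 5100 | 3.1824          | 22.6515 | 7.7467 | 19.1196 | 19.3121   |
| 3.6131        | 6.0   | 6120 | 3.1678          | 22.0223 | 7.6153 | 18.7956 | 18.9968   |
| 3.5722        | 7.0   | 7140 | 3.1631          | 22.679  | 7.7952 | 19.1784 | 19.384    |
| 3.5432        | 8.0   | 8160 | 3.1575          | 22.6862 | 7.7268 | 19.1961 | 19.3808   |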
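The step counts in the log can be used to estimate the size of the training set, which the card does not report. The sketch below works through that arithmetic under stated assumptions: a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. The helper `steps_per_epoch` is hypothetical.

```python
import math

TRAIN_BATCH_SIZE = 8     # train_batch_size from the hyperparameters
STEPS_PER_EPOCH = 1020   # step count at epoch 1.0 in the training log


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


# Largest and smallest training-set sizes that yield exactly 1020 steps.
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE            # 8160
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1  # 8153

print(min_examples, max_examples)
```

If those assumptions hold, the training split contained between 8153 and 8160 examples; a multi-device run or gradient accumulation would change this estimate.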

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1