metadata
license: apache-2.0
tags:
  - summarization
  - generated_from_trainer
datasets:
  - mlsum
metrics:
  - rouge
model-index:
  - name: mt5-small-test-ged-mlsum_max_target_length_10
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: mlsum
          type: mlsum
          args: es
        metrics:
          - name: Rouge1
            type: rouge
            value: 74.8229

mt5-small-test-ged-mlsum_max_target_length_10

This model is a fine-tuned version of google/mt5-small on the Spanish (es) configuration of the mlsum dataset. After the final training epoch (epoch 8), it achieves the following results on the evaluation set:

  • Loss: 0.3341
  • Rouge1: 74.8229
  • Rouge2: 68.1808
  • Rougel: 74.8297
  • Rougelsum: 74.8414
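The card does not include usage instructions, so the following is only a minimal sketch of how a checkpoint like this is typically loaded with the Transformers summarization pipeline. The repository id `nestoralvaro/mt5-small-test-ged-mlsum_max_target_length_10` and the 10-token generation limit are assumptions inferred from the model name, not values confirmed by the card; adjust both to match the actual checkpoint.

```python
# Assumed Hub repository id (inferred from the model name; not confirmed by the card).
MODEL_ID = "nestoralvaro/mt5-small-test-ged-mlsum_max_target_length_10"

# Assumed generation limit, guessed from the "max_target_length_10" suffix.
MAX_NEW_TOKENS = 10


def summarize(text: str) -> str:
    """Return a short summary of a Spanish news text.

    The import is done lazily so this module can be loaded without
    Transformers installed; calling the function downloads the checkpoint.
    """
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(text, max_new_tokens=MAX_NEW_TOKENS, truncation=True)
    return result[0]["summary_text"]
```

Calling `summarize("...")` with a Spanish article should return a very short, headline-like string, since the output is capped at a handful of tokens under the assumption above.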

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
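To make the schedule concrete, here is a small sketch of how a linear learning-rate schedule would decay over this run. It takes the 33296 steps per epoch from the Training results table and assumes zero warmup steps, since no warmup setting is listed above; `linear_lr` is an illustrative helper, not a function from the training script.

```python
# Values listed in the hyperparameters above.
LEARNING_RATE = 5.6e-05
NUM_EPOCHS = 8

# Steps per epoch as reported in the Training results table.
STEPS_PER_EPOCH = 33296

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 266368 optimizer steps


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at a given step under linear warmup then linear decay.

    warmup_steps=0 is an assumption: the card does not list a warmup value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Linear decay from the peak learning rate down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)
```

Under these assumptions the rate starts at 5.6e-05, is halved by the end of epoch 4 (step 133184), and reaches zero at the last step.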

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 0.5565 | 1.0 | 33296 | 0.3827 | 69.9041 | 62.821 | 69.8709 | 69.8924 |
| 0.2636 | 2.0 | 66592 | 0.3552 | 72.0701 | 65.4937 | 72.0787 | 72.091 |
| 0.2309 | 3.0 | 99888 | 0.3525 | 72.5071 | 65.8026 | 72.5132 | 72.512 |
| 0.2109 | 4.0 | 133184 | 0.3346 | 74.0842 | 67.4776 | 74.0887 | 74.0968 |
| 0.1972 | 5.0 | 166480 | 0.3398 | 74.6051 | 68.6024 | 74.6177 | 74.6365 |
| 0.1867 | 6.0 | 199776 | 0.3283 | 74.9022 | 68.2146 | 74.9023 | 74.926 |
| 0.1785 | 7.0 | 233072 | 0.3325 | 74.8631 | 68.2468 | 74.8843 | 74.9026 |
| 0.1725 | 8.0 | 266368 | 0.3341 | 74.8229 | 68.1808 | 74.8297 | 74.8414 |
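The reported evaluation numbers correspond to the last row (epoch 8), which is not the best row on every metric. As a quick check, the sketch below copies the validation loss, Rouge1, and Rouge2 columns from the table and finds the best epoch for each; the variable names are illustrative only.

```python
# (epoch, validation_loss, rouge1, rouge2) copied from the results table.
ROWS = [
    (1, 0.3827, 69.9041, 62.821),
    (2, 0.3552, 72.0701, 65.4937),
    (3, 0.3525, 72.5071, 65.8026),
    (4, 0.3346, 74.0842, 67.4776),
    (5, 0.3398, 74.6051, 68.6024),
    (6, 0.3283, 74.9022, 68.2146),
    (7, 0.3325, 74.8631, 68.2468),
    (8, 0.3341, 74.8229, 68.1808),
]

# Lower is better for loss; higher is better for ROUGE.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])
best_by_rouge2 = max(ROWS, key=lambda row: row[3])

print("Lowest validation loss: epoch", best_by_loss[0])
print("Highest Rouge1: epoch", best_by_rouge1[0])
print("Highest Rouge2: epoch", best_by_rouge2[0])
```

Epoch 6 has both the lowest validation loss (0.3283) and the highest Rouge1 (74.9022), while epoch 5 has the highest Rouge2 (68.6024). The differences from epoch 8 are small, but selecting a checkpoint by validation metric rather than taking the last one would have picked an earlier epoch.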

Framework versions

  • Transformers 4.20.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1