---
license: apache-2.0
tags:
  - summarization
  - generated_from_trainer
datasets:
  - null
metrics:
  - rouge
model-index:
  - name: mt5-small-finetuned-amazon-en-es
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        metrics:
          - name: Rouge1
            type: rouge
            value: 12.4927
---

mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9894
  • Rouge1: 12.4927
  • Rouge2: 4.847
  • Rougel: 12.4387
  • Rougelsum: 12.4383
  • Gen Len: 6.1675
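For illustration, here is a minimal inference sketch that was not part of the original card. It loads the checkpoint through the transformers summarization pipeline; the Hub id lewtun/mt5-small-finetuned-amazon-en-es is assumed from the card title, and the sample review is invented.

```python
# Minimal inference sketch; the Hub id below is assumed from the card title.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="lewtun/mt5-small-finetuned-amazon-en-es",
)

# Hypothetical input review; the short Gen Len above suggests the model
# produces title-length summaries.
review = (
    "I bought this for my daughter's birthday and she absolutely loves it. "
    "Setup took five minutes and the battery lasts all week."
)
print(summarizer(review, max_length=20, num_beams=4)[0]["summary_text"])
```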

Model description

google/mt5-small is the smallest checkpoint of mT5, Google's multilingual variant of T5. Per the tags above, it was fine-tuned here for summarization; no further description was recorded in the card.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed. The model name suggests English and Spanish Amazon review data, but the dataset field in this card was left empty.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
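As a reproducibility aid, the following sketch maps the hyperparameters above onto transformers' Seq2SeqTrainingArguments. The output_dir, evaluation strategy, and predict_with_generate flag are assumptions not recorded in the card; the Adam betas and epsilon listed above are the library defaults, shown explicitly here.

```python
# Hedged sketch: hyperparameter values come from the list above; anything
# marked "assumed" is an illustrative guess, not taken from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-amazon-en-es",  # assumed
    learning_rate=4e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=8,
    evaluation_strategy="epoch",   # assumed: results below are logged per epoch
    predict_with_generate=True,    # assumed: needed to compute ROUGE during eval
)
```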

Training results

Training Loss   Epoch   Step    Validation Loss   Rouge1    Rouge2   Rougel    Rougelsum   Gen Len
6.5619          1.0     2202    3.2749            9.2423    3.2813   9.2013    9.1698      5.0354
3.8525          2.0     4404    3.1296            11.1883   4.047    11.1545   11.1885     6.4033
3.5419          3.0     6606    3.0478            11.4905   4.4465   11.3538   11.3805     6.6462
3.4045          4.0     8808    3.0174            11.5798   4.4426   11.5372   11.571      6.6816
3.3091          5.0     11010   3.0080            12.0207   4.5622   11.9232   11.9476     6.4976
3.2457          6.0     13212   2.9981            12.2459   4.6924   12.2306   12.2375     6.1533
3.2179          7.0     15414   2.9943            12.3927   4.6072   12.2888   12.2848     6.3561
3.1898          8.0     17616   2.9894            12.4927   4.847    12.4387   12.4383     6.1675
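The ROUGE columns above are of the kind produced by the rouge metric in the Datasets 1.12.x line (values are mid F-measures scaled to 0-100). The following sketch shows that computation; the prediction and reference strings are placeholders, not data from this model.

```python
# Hedged sketch of the usual ROUGE computation with datasets 1.12.x
# (requires the rouge_score package); the strings are placeholder examples.
from datasets import load_metric

rouge = load_metric("rouge")

predictions = ["great battery life"]    # hypothetical generated titles
references = ["battery life is great"]  # hypothetical reference titles

scores = rouge.compute(predictions=predictions, references=references)
# The card's values (e.g. Rouge1 = 12.4927) are mid F-measures scaled to 0-100.
print({name: round(agg.mid.fmeasure * 100, 4) for name, agg in scores.items()})
```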

Framework versions

  • Transformers 4.10.3
  • Pytorch 1.9.1+cu111
  • Datasets 1.12.1
  • Tokenizers 0.10.3