---
license: apache-2.0
base_model: google/mt5-small
tags:
  - summarization
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mt5-small-finetuned-xlsum-pt
    results: []
---

mt5-small-finetuned-xlsum-pt

This model is a fine-tuned version of google/mt5-small. The training dataset is not documented in this card (the model name suggests the Portuguese portion of XL-Sum). It achieves the following results on the evaluation set:

  • Loss: 0.0986
  • Rouge1: 16.5756
  • Rouge2: 13.7639
  • Rougel: 15.7445
  • Rougelsum: 16.5112

Model description

More information needed

Intended uses & limitations

More information needed
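
No usage notes were provided with this card. As a minimal sketch (assuming the checkpoint is published under the repository id msarmento/mt5-small-finetuned-xlsum-pt, inferred from the model name and author; any Hub id would work the same way), the model can be loaded for abstractive summarization with the standard Transformers API:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumption: repository id inferred from the model name and author of this card.
model_id = "msarmento/mt5-small-finetuned-xlsum-pt"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder Portuguese news text, not taken from the training data.
text = "O governo anunciou nesta quarta-feira um novo pacote de medidas econômicas para conter a inflação."

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4, no_repeat_ngram_size=2)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Beam search with a no-repeat n-gram constraint is a common decoding choice for mT5 summarization checkpoints; the generation settings shown are illustrative, not the settings used to produce the scores above.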

Training and evaluation data

More information needed
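
The training and evaluation data are not documented in this card. The model name suggests the Portuguese portion of XL-Sum, and the step counts in the results table below (125 optimizer steps per epoch at batch size 8) imply roughly 1,000 training examples, so a subset was likely used. Purely as a hedged sketch, assuming the csebuetnlp/xlsum dataset on the Hub with its portuguese configuration:

```python
from datasets import load_dataset

# Assumption: "xlsum-pt" in the model name refers to the Portuguese configuration of XL-Sum.
raw_datasets = load_dataset("csebuetnlp/xlsum", "portuguese")

print(raw_datasets)                            # DatasetDict with train/validation/test splits
print(raw_datasets["train"][0]["text"][:200])  # article body
print(raw_datasets["train"][0]["summary"])     # reference summary
```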

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
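
For reference, a minimal Seq2SeqTrainingArguments sketch that mirrors the hyperparameters listed above (the actual training script is not included in this card; arguments not listed above are assumptions or library defaults):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: reproduces the hyperparameters above; unlisted arguments are assumptions/defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-xlsum-pt",
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=10,
    lr_scheduler_type="linear",    # Adam defaults give betas=(0.9, 0.999), epsilon=1e-08
    evaluation_strategy="epoch",   # assumption, consistent with the per-epoch results below
    predict_with_generate=True,    # assumption, needed to compute ROUGE during evaluation
)
```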

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.7681        | 1.0   | 125  | 0.1393          | 12.9432 | 9.5039  | 12.2871 | 12.7291   |
| 0.5282        | 2.0   | 250  | 0.1231          | 13.4575 | 10.0697 | 12.6449 | 13.2      |
| 0.4132        | 3.0   | 375  | 0.1134          | 16.6964 | 14.0187 | 15.7338 | 16.6025   |
| 0.3534        | 4.0   | 500  | 0.1077          | 16.8961 | 14.2203 | 15.9187 | 16.7712   |
| 0.3126        | 5.0   | 625  | 0.1039          | 16.993  | 14.0876 | 15.8914 | 16.9277   |
| 0.283         | 6.0   | 750  | 0.1023          | 16.7431 | 13.9453 | 15.8758 | 16.6413   |
| 0.2675        | 7.0   | 875  | 0.1008          | 16.6566 | 13.8639 | 15.775  | 16.5481   |
| 0.2509        | 8.0   | 1000 | 0.0987          | 16.6829 | 13.935  | 15.872  | 16.6222   |
| 0.2441        | 9.0   | 1125 | 0.0987          | 16.6085 | 13.7884 | 15.7896 | 16.5412   |
| 0.2401        | 10.0  | 1250 | 0.0986          | 16.5756 | 13.7639 | 15.7445 | 16.5112   |
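
The ROUGE columns use the metric names emitted by the evaluate library (rouge1, rouge2, rougeL, rougeLsum), reported here on a 0–100 scale. The exact evaluation code is not part of this card; a minimal sketch of how such scores are computed from generated summaries:

```python
import evaluate  # the rouge metric also requires the rouge_score package

rouge = evaluate.load("rouge")

# Toy inputs; in practice, predictions are decoded model generations and references the gold summaries.
predictions = ["o governo anunciou novas medidas econômicas"]
references = ["governo anuncia pacote de medidas econômicas"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})  # scale fractions to match the table above
```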

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2