mt5-small-test-ged-mlsum_max_target_length_10

This model is a fine-tuned version of google/mt5-small on the mlsum dataset. After the final training epoch (epoch 8, step 266368), it achieves the following results on the evaluation set:

  • Loss: 0.3341
  • Rouge1: 74.8229
  • Rouge2: 68.1808
  • Rougel: 74.8297
  • Rougelsum: 74.8414
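
To make the Rouge1 figure above easier to interpret, here is a simplified sketch of a unigram-overlap F-measure scaled to 0-100. It assumes whitespace tokenization and lowercasing only. The card does not say which ROUGE implementation, tokenizer, or stemming settings produced the reported scores, so this function (`rouge1_f1` is a name chosen for illustration) will not reproduce them exactly.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure on a 0-100 scale.

    Assumption: whitespace tokens, lowercased, no stemming. The actual
    evaluation behind the reported scores may tokenize differently.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat on the mat", "the cat sat"))
```

Under this definition, a score near 75 means that roughly three quarters of the unigrams are shared between prediction and reference once precision and recall are balanced.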

Model description

More information needed

Intended uses & limitations

More information needed
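
Since usage is not documented, the following is a minimal sketch of how a checkpoint like this is typically loaded with the Transformers library. The repository id is taken from this page. The output cap of 10 tokens is an assumption inferred from the `max_target_length_10` suffix in the model name, and the input truncation length of 512 is likewise an assumption, not a documented setting. The helper name `summarize` is chosen for illustration.

```python
MODEL_ID = "nestoralvaro/mt5-small-test-ged-mlsum_max_target_length_10"

# Assumption: the model name suggests targets were limited to 10 tokens.
ASSUMED_MAX_TARGET_LENGTH = 10


def summarize(text: str, max_length: int = ASSUMED_MAX_TARGET_LENGTH) -> str:
    """Generate a short summary for `text` (downloads the model on first call)."""
    # Imported lazily so that defining this function does not require
    # transformers or a network connection.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: 512 input tokens; the card does not state the input limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...")` with an article in one of the languages covered by the training data should return a very short summary. Which mlsum language configuration was used is not stated on this card.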

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
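
As an illustration of what `lr_scheduler_type: linear` implies for the listed learning rate, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup is listed above) and takes the total step count of 266368 from the final row of the training results. The function name `linear_lr` is chosen for illustration.

```python
BASE_LR = 5.6e-05       # learning_rate from the hyperparameter list
TOTAL_STEPS = 266368    # final step reported in the training results
WARMUP_STEPS = 0        # assumption: no warmup is listed on this card


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in range(0, 9):
        step = epoch * 33296  # steps per epoch, from the results table
        print(f"epoch {epoch}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate falls to half its initial value by the end of epoch 4 and reaches zero at the final step.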

Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|--------|-----------------|---------|---------|---------|-----------|
| 0.5565        | 1.0   | 33296  | 0.3827          | 69.9041 | 62.821  | 69.8709 | 69.8924   |
| 0.2636        | 2.0   | 66592  | 0.3552          | 72.0701 | 65.4937 | 72.0787 | 72.091    |
| 0.2309        | 3.0   | 99888  | 0.3525          | 72.5071 | 65.8026 | 72.5132 | 72.512    |
| 0.2109        | 4.0   | 133184 | 0.3346          | 74.0842 | 67.4776 | 74.0887 | 74.0968   |
| 0.1972        | 5.0   | 166480 | 0.3398          | 74.6051 | 68.6024 | 74.6177 | 74.6365   |
| 0.1867        | 6.0   | 199776 | 0.3283          | 74.9022 | 68.2146 | 74.9023 | 74.926    |
| 0.1785        | 7.0   | 233072 | 0.3325          | 74.8631 | 68.2468 | 74.8843 | 74.9026   |
| 0.1725        | 8.0   | 266368 | 0.3341          | 74.8229 | 68.1808 | 74.8297 | 74.8414   |
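
The per-epoch results above can be checked with a few lines of arithmetic. The sketch below finds the epoch with the best value for selected metrics and estimates the training set size from the step count. The size estimate assumes a single device and no gradient accumulation, neither of which is stated on this card. All names are chosen for illustration.

```python
# (epoch, step, train_loss, val_loss, rouge1, rouge2), copied from the table.
ROWS = [
    (1, 33296, 0.5565, 0.3827, 69.9041, 62.821),
    (2, 66592, 0.2636, 0.3552, 72.0701, 65.4937),
    (3, 99888, 0.2309, 0.3525, 72.5071, 65.8026),
    (4, 133184, 0.2109, 0.3346, 74.0842, 67.4776),
    (5, 166480, 0.1972, 0.3398, 74.6051, 68.6024),
    (6, 199776, 0.1867, 0.3283, 74.9022, 68.2146),
    (7, 233072, 0.1785, 0.3325, 74.8631, 68.2468),
    (8, 266368, 0.1725, 0.3341, 74.8229, 68.1808),
]

TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameter list

# Epoch with the lowest validation loss and the highest ROUGE scores.
best_val_loss_epoch = min(ROWS, key=lambda r: r[3])[0]
best_rouge1_epoch = max(ROWS, key=lambda r: r[4])[0]
best_rouge2_epoch = max(ROWS, key=lambda r: r[5])[0]

# Steps per epoch, and the largest training set size consistent with it.
# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = ROWS[0][1]
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_train_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

if __name__ == "__main__":
    print("best validation loss at epoch", best_val_loss_epoch)
    print("best Rouge1 at epoch", best_rouge1_epoch)
    print("best Rouge2 at epoch", best_rouge2_epoch)
    print("training examples between", min_train_examples,
          "and", max_train_examples)
```

Validation loss (0.3283) and Rouge1 (74.9022) are both best at epoch 6, and Rouge2 peaks at epoch 5 (68.6024), so the final checkpoint is slightly behind the best intermediate results on each of these metrics.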

Framework versions

  • Transformers 4.20.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1

Dataset used to train nestoralvaro/mt5-small-test-ged-mlsum_max_target_length_10: mlsum