mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0150
  • Rouge1: 16.6244
  • Rouge2: 7.4737
  • RougeL: 16.2514
  • RougeLsum: 16.3475
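
The ROUGE scores above measure n-gram overlap between generated and reference summaries (reported here on a 0–100 scale). A minimal sketch of ROUGE-1 F1 for intuition — this is a hypothetical helper assuming plain whitespace tokenization, not the stemmed implementation used to produce the numbers above:

```python
# Minimal ROUGE-1 F1 sketch: unigram overlap between a candidate summary
# and a single reference, with naive lowercase whitespace tokenization.
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat", "the cat sat on the mat"))
```

Production evaluation would instead use the `evaluate`/`rouge_score` packages, which add stemming and handle ROUGE-2, ROUGE-L, and ROUGE-Lsum.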

Model description

More information needed

Intended uses & limitations

More information needed
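
The card does not yet document intended uses, but the checkpoint is a seq2seq summarization model. A minimal usage sketch with the `transformers` pipeline API — the model ID is taken from this repo; the sample review text is illustrative:

```python
# Sketch: load the fine-tuned checkpoint from the Hub and summarize a review.
# Requires `transformers` plus PyTorch; downloads ~1.2 GB of F32 weights.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="kperron1/mt5-small-finetuned-amazon-en-es",
)

review = (
    "Nothing special in this product. The binding is flimsy "
    "and the pages started falling out after a week of use."
)
summary = summarizer(review, max_length=30)[0]["summary_text"]
print(summary)
```
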

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 8
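
The hyperparameters above can be expressed as a `Seq2SeqTrainingArguments` config. This is a hypothetical reconstruction: `output_dir` is illustrative, and `predict_with_generate` is an assumption (it is required for computing ROUGE during evaluation, but is not listed on the card):

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the listed hyperparameters; the AdamW betas/epsilon
# shown on the card are the transformers defaults for `adamw_torch`.
args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-amazon-en-es",  # illustrative
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="linear",
    num_train_epochs=8,
    predict_with_generate=True,  # assumption: needed for ROUGE at eval time
)
```
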

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | RougeL  | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|
| 3.3536        | 1.0   | 1209 | 3.1069          | 17.673  | 8.5372 | 17.1607 | 17.1184   |
| 3.3641        | 2.0   | 2418 | 3.0533          | 17.2312 | 8.728  | 16.7626 | 16.7101   |
| 3.2384        | 3.0   | 3627 | 3.0481          | 15.9593 | 7.8483 | 15.7221 | 15.8227   |
| 3.1555        | 4.0   | 4836 | 3.0504          | 16.3465 | 7.9873 | 15.9367 | 15.9714   |
| 3.0826        | 5.0   | 6045 | 3.0267          | 16.285  | 7.4032 | 15.7749 | 15.8915   |
| 3.0396        | 6.0   | 7254 | 3.0217          | 16.5194 | 7.6637 | 16.1961 | 16.214    |
| 3.0181        | 7.0   | 8463 | 3.0122          | 16.068  | 7.2738 | 15.7885 | 15.8982   |
| 2.9965        | 8.0   | 9672 | 3.0150          | 16.6244 | 7.4737 | 16.2514 | 16.3475   |

Framework versions

  • Transformers 4.46.2
  • PyTorch 2.5.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3
Model size: 300M params (F32, Safetensors)

Model tree for kperron1/mt5-small-finetuned-amazon-en-es

  • Base model: google/mt5-small