mt5-small-finetuned-amazon-en-es

This model is a fine-tuned version of google/mt5-small. The training dataset is not specified (the card records it as None). It achieves the following results on the evaluation set:

  • Loss: 3.3101
  • Rouge1: 13.2641
  • Rouge2: 4.7733
  • Rougel: 13.0439
  • Rougelsum: 13.1146
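
To make the ROUGE figures above easier to read, here is a minimal sketch of a ROUGE-1 F-measure. It is a simplification that assumes lowercased whitespace tokens and no stemming, so it is not the scorer that produced the numbers on this card. The card does not say how the scores are scaled; they are presumably F-measures multiplied by 100. The function name `rouge1_f1` and the example strings are hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure on lowercased whitespace tokens."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped unigram overlap: a token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pair, not taken from the evaluation set.
score = rouge1_f1("great book for kids", "a great book")
print(round(100 * score, 2))  # scaled to 0-100, like the values on this card
```

Rouge2 applies the same idea to bigrams, and Rougel is based on the longest common subsequence rather than n-gram counts.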

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| 4.1164 | 1.0 | 565 | 3.5818 | 12.0154 | 3.1674 | 11.9107 | 11.8538 |
| 3.8955 | 2.0 | 1130 | 3.4628 | 11.1762 | 3.135 | 11.0969 | 11.0566 |
| 3.6555 | 3.0 | 1695 | 3.3872 | 12.5361 | 4.2116 | 12.4509 | 12.3869 |
| 3.5088 | 4.0 | 2260 | 3.3396 | 12.6856 | 4.2888 | 12.4772 | 12.4739 |
| 3.4233 | 5.0 | 2825 | 3.3386 | 12.3451 | 4.269 | 12.1469 | 12.2031 |
| 3.3535 | 6.0 | 3390 | 3.3160 | 12.7596 | 4.6932 | 12.504 | 12.5963 |
| 3.3182 | 7.0 | 3955 | 3.3172 | 13.2367 | 5.0141 | 12.9874 | 13.1011 |
| 3.2891 | 8.0 | 4520 | 3.3101 | 13.2641 | 4.7733 | 13.0439 | 13.1146 |
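
The table above supports a couple of quick sanity checks, sketched here. The training-set size estimate assumes a single device, no gradient accumulation, and a final partial batch that is kept. None of these is stated on the card, so treat the bounds as approximate.

```python
# (epoch, step, validation_loss, rouge1, rouge2) copied from the table above.
results = [
    (1, 565, 3.5818, 12.0154, 3.1674),
    (2, 1130, 3.4628, 11.1762, 3.135),
    (3, 1695, 3.3872, 12.5361, 4.2116),
    (4, 2260, 3.3396, 12.6856, 4.2888),
    (5, 2825, 3.3386, 12.3451, 4.269),
    (6, 3390, 3.3160, 12.7596, 4.6932),
    (7, 3955, 3.3172, 13.2367, 5.0141),
    (8, 4520, 3.3101, 13.2641, 4.7733),
]

train_batch_size = 16              # from the hyperparameter list
steps_per_epoch = results[0][1]    # 565 optimizer steps in epoch 1

# Bounds on the number of training examples under the stated assumptions.
max_train_examples = steps_per_epoch * train_batch_size
min_train_examples = (steps_per_epoch - 1) * train_batch_size + 1

# Which epoch is best depends on the metric.
best_loss_epoch = min(results, key=lambda r: r[2])[0]
best_rouge1_epoch = max(results, key=lambda r: r[3])[0]
best_rouge2_epoch = max(results, key=lambda r: r[4])[0]

print(min_train_examples, max_train_examples)
print(best_loss_epoch, best_rouge1_epoch, best_rouge2_epoch)
```

Validation loss and Rouge1 are best at epoch 8, while Rouge2 peaks at epoch 7, so the final checkpoint is not the best one on every metric.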

Framework versions

  • Transformers 4.41.1
  • Pytorch 2.2.2+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
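
To compare a local environment against the versions listed above, a small check along these lines may help. It uses only the standard library. The distribution names are assumptions about how the packages are installed from PyPI (for example `torch` for Pytorch), and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions as listed on this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.41.1",
    "torch": "2.2.2+cu121",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def version_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches().items():
    print(f"{package}: card lists {wanted}, found {found}")
```

An exact match is not necessarily required to load the model. In particular, the `+cu121` suffix only identifies the CUDA build of Pytorch.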

Model size: 300M params (Safetensors, tensor type F32)