mt5-small-finetuned-amazon-en-de

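This model is a fine-tuned version of google/mt5-small. The auto-generated card records the training dataset as None, so the dataset is not documented here. It achieves the following results on the evaluation set (these match the final epoch, step 5208, in the training results):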

  • Loss: 2.6628
  • Rouge1: 16.6076
  • Rouge2: 9.9027
  • Rougel: 16.1974
  • Rougelsum: 16.2188
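
To make the Rouge1 and Rouge2 figures above easier to interpret, here is a minimal sketch of an n-gram overlap F1 score. It assumes the reported values are F-measures scaled to 0-100, and it uses plain lowercasing and whitespace tokenization; the scores in this card were most likely produced by a dedicated ROUGE library with its own tokenization and stemming, so this sketch will not reproduce them exactly. The function names and the example strings are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F1 on lowercased, whitespace-split tokens (0-100)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Made-up example texts, not taken from the model's data.
reference = "great book for young readers"
candidate = "great book for kids"

print("ROUGE-1:", round(rouge_n(reference, candidate, 1), 2))
print("ROUGE-2:", round(rouge_n(reference, candidate, 2), 2))
```

A Rouge1 of about 16.6 therefore means that, on average, only a small share of unigrams is shared between generated and reference texts.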

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 8
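
As a rough illustration of what `lr_scheduler_type: linear` implies for the learning rate listed above, the sketch below computes the rate at a given optimizer step. It assumes zero warmup steps (the card does not list a warmup setting) and takes the total of 5208 steps and the 651 steps per epoch from the training results in this card. The function name `linear_lr` is hypothetical, not a library API.

```python
BASE_LR = 5.6e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 5208     # final step reported in the training results (8 epochs)
STEPS_PER_EPOCH = 651  # step count at the end of epoch 1
WARMUP_STEPS = 0       # assumption: no warmup is listed in the card


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for epoch in range(0, 9, 2):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 5.6e-05, halves by the end of epoch 4, and reaches zero at the final step.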

Training results

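| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|
| 8.2562        | 1.0   | 651  | 3.1075          | 14.737  | 6.9361  | 14.3272 | 14.3439   |
| 4.0951        | 2.0   | 1302 | 2.8909          | 13.9821 | 7.1908  | 13.5966 | 13.6613   |
| 3.7285        | 3.0   | 1953 | 2.7839          | 14.1682 | 7.1028  | 13.8005 | 13.8592   |
| 3.5478        | 4.0   | 2604 | 2.7192          | 15.8804 | 9.6757  | 15.6891 | 15.7287   |
| 3.4295        | 5.0   | 3255 | 2.6939          | 17.5083 | 10.3927 | 17.0418 | 17.0633   |
| 3.3476        | 6.0   | 3906 | 2.6724          | 16.9379 | 10.2269 | 16.6049 | 16.5903   |
| 3.2964        | 7.0   | 4557 | 2.6725          | 16.8053 | 10.1869 | 16.3812 | 16.441    |
| 3.2723        | 8.0   | 5208 | 2.6628          | 16.6076 | 9.9027  | 16.1974 | 16.2188   |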
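
The per-epoch results above can be read in two ways, and the short sketch below makes both explicit: which epoch has the lowest validation loss versus the highest Rouge1, and roughly how many training examples the step counts imply. The size estimate assumes a single device, no gradient accumulation, and that a final partial batch counts as one step; none of these is confirmed by the card.

```python
# Rows copied from the training results: (epoch, step, validation_loss, rouge1)
rows = [
    (1, 651, 3.1075, 14.737),
    (2, 1302, 2.8909, 13.9821),
    (3, 1953, 2.7839, 14.1682),
    (4, 2604, 2.7192, 15.8804),
    (5, 3255, 2.6939, 17.5083),
    (6, 3906, 2.6724, 16.9379),
    (7, 4557, 2.6725, 16.8053),
    (8, 5208, 2.6628, 16.6076),
]

best_by_loss = min(rows, key=lambda r: r[2])    # lowest validation loss
best_by_rouge = max(rows, key=lambda r: r[3])   # highest Rouge1

TRAIN_BATCH_SIZE = 16           # train_batch_size from the hyperparameters
steps_per_epoch = rows[0][1]    # 651 steps in the first epoch

# Assumed: one device, no gradient accumulation, last partial batch kept.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

print("lowest validation loss at epoch", best_by_loss[0])
print("highest Rouge1 at epoch", best_by_rouge[0])
print(f"estimated training set size: {min_examples} to {max_examples} examples")
```

The two criteria disagree here: validation loss is lowest at the final epoch, which is the checkpoint this card reports, while Rouge1 peaks at epoch 5.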

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 300M params (Safetensors, tensor type F32)