metadata

license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
datasets:
  - wcep-10
metrics:
  - rouge
model-index:
  - name: mt5-small-finetuned-amazon-en-es
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: wcep-10
          type: wcep-10
          config: roberta
          split: validation
          args: roberta
        metrics:
          - name: Rouge1
            type: rouge
            value: 22.6862

mt5-small-finetuned-amazon-en-es

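This model is a fine-tuned version of google/mt5-small on the wcep-10 dataset. Despite the "amazon-en-es" suffix in the model name, the metadata lists wcep-10 as the only dataset. After the final (8th) training epoch it achieves the following results on the validation split: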

Loss: 3.1575
Rouge1: 22.6862
Rouge2: 7.7268
RougeL: 19.1961
RougeLsum: 19.3808
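
For readers unfamiliar with the metric, the idea behind Rouge1 can be sketched as a unigram-overlap F1 between a generated summary and a reference. This is a simplified illustration only: it assumes lowercased whitespace tokenization and no stemming, so it will not reproduce the scores above exactly, and `rouge1_f1` is a hypothetical helper rather than the function used for this evaluation. The scores in this card appear to be on a 0-100 scale, whereas the sketch returns a value between 0 and 1.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap between two texts."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    if not cand_counts or not ref_counts:
        return 0.0
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Illustrative sentences, not taken from the dataset.
score = rouge1_f1("the cat lay on the mat", "the cat sat on the mat")
print(f"{100 * score:.2f}")
```

Here five of the six unigrams match in both directions, so precision, recall, and F1 all equal 5/6.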

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

The metadata lists wcep-10 as the dataset and its validation split as the evaluation set. More information needed.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5.6e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 8
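
The learning rate, scheduler type, and epoch count above fix the learning-rate curve once the number of optimizer steps is known. A minimal sketch, assuming zero warmup steps (none are listed here) and taking the 1020 steps per epoch from the Step column of the training results; `linear_lr` is a hypothetical helper, not a Transformers API:

```python
BASE_LR = 5.6e-05        # learning_rate from the list above
NUM_EPOCHS = 8           # num_epochs from the list above
STEPS_PER_EPOCH = 1020   # from the Step column of the training results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
WARMUP_STEPS = 0         # assumption: no warmup is listed in this card


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 4, 8):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
```

Under these assumptions the rate falls to half of its initial value (2.8e-05) at the end of epoch 4 and reaches zero at step 8160.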

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum
6.5905	1.0	1020	3.4711	21.2268	7.4345	18.5023	18.6264
4.1604	2.0	2040	3.3228	21.6354	7.3939	18.4926	18.6047
3.914	3.0	3060	3.2606	21.9787	7.5818	18.6971	18.8603
3.7698	4.0	4080	3.2058	21.8859	7.5625	18.6413	18.8169
3.679	5.0	5100	3.1824	22.6515	7.7467	19.1196	19.3121
3.6131	6.0	6120	3.1678	22.0223	7.6153	18.7956	18.9968
3.5722	7.0	7140	3.1631	22.679	7.7952	19.1784	19.384
3.5432	8.0	8160	3.1575	22.6862	7.7268	19.1961	19.3808
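
The table makes it possible to check which epoch scored best on each metric, since the reported evaluation numbers match the final epoch rather than a per-metric best. A small sketch with the rows copied from the table above; `best_epoch` is a hypothetical helper introduced for illustration:

```python
FIELDS = ("train_loss", "epoch", "step", "val_loss",
          "rouge1", "rouge2", "rougeL", "rougeLsum")

# Rows copied verbatim from the training results table.
ROWS = [
    (6.5905, 1, 1020, 3.4711, 21.2268, 7.4345, 18.5023, 18.6264),
    (4.1604, 2, 2040, 3.3228, 21.6354, 7.3939, 18.4926, 18.6047),
    (3.9140, 3, 3060, 3.2606, 21.9787, 7.5818, 18.6971, 18.8603),
    (3.7698, 4, 4080, 3.2058, 21.8859, 7.5625, 18.6413, 18.8169),
    (3.6790, 5, 5100, 3.1824, 22.6515, 7.7467, 19.1196, 19.3121),
    (3.6131, 6, 6120, 3.1678, 22.0223, 7.6153, 18.7956, 18.9968),
    (3.5722, 7, 7140, 3.1631, 22.6790, 7.7952, 19.1784, 19.3840),
    (3.5432, 8, 8160, 3.1575, 22.6862, 7.7268, 19.1961, 19.3808),
]

LOWER_IS_BETTER = {"train_loss", "val_loss"}


def best_epoch(metric: str) -> int:
    """Return the epoch with the best value for the given column."""
    idx = FIELDS.index(metric)
    pick = min if metric in LOWER_IS_BETTER else max
    best_row = pick(ROWS, key=lambda row: row[idx])
    return best_row[FIELDS.index("epoch")]


for metric in ("val_loss", "rouge1", "rouge2", "rougeL", "rougeLsum"):
    print(metric, best_epoch(metric))
```

On these rows, validation loss, Rouge1, and RougeL are best at epoch 8, while Rouge2 and RougeLsum peak at epoch 7 by a small margin.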

Framework versions

Transformers 4.41.2
PyTorch 2.3.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1