---
language:
  - de
  - en
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
datasets:
  - lilferrit/wmt14-short
metrics:
  - bleu
model-index:
  - name: ft-wmt14-5
    results:
      - task:
          name: Translation
          type: translation
        dataset:
          name: lilferrit/wmt14-short
          type: lilferrit/wmt14-short
        metrics:
          - name: Bleu
            type: bleu
            value: 20.7584
---

# ft-wmt14-5

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [lilferrit/wmt14-short](https://huggingface.co/datasets/lilferrit/wmt14-short) dataset. It achieves the following results on the evaluation set:

- Loss: 2.0604
- Bleu: 20.7584
- Gen Len: 30.499
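
For quick use, here is a minimal inference sketch with the `transformers` library. The repo id `lilferrit/ft-wmt14-5` and the German-to-English direction are assumptions based on this card's name and metadata, not confirmed by the card itself:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed checkpoint location; replace with the actual repo id or a local path.
model_id = "lilferrit/ft-wmt14-5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# German source sentence (WMT14-style); the translation direction is an assumption.
inputs = tokenizer("Das Haus ist wunderbar.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```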

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adafactor
- lr_scheduler_type: constant
- training_steps: 100000
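
These settings map naturally onto `Seq2SeqTrainingArguments` in Transformers 4.40; the sketch below is a reconstruction under that assumption, since the actual training script is not part of this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ft-wmt14-5",
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # total train batch size: 8 * 2 = 16
    optim="adafactor",
    lr_scheduler_type="constant",
    max_steps=100_000,
    predict_with_generate=True,  # assumed, since BLEU is reported during eval
)
```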

### Training results

| Training Loss | Epoch  | Step   | Bleu    | Gen Len | Validation Loss |
|:-------------:|:------:|:------:|:-------:|:-------:|:---------------:|
| 1.9166        | 0.2778 | 10000  | 15.8119 | 32.097  | 2.3105          |
| 1.7184        | 0.5556 | 20000  | 17.5903 | 31.1153 | 2.1993          |
| 1.6061        | 0.8333 | 30000  | 18.9604 | 30.327  | 2.1380          |
| 1.516         | 1.1111 | 40000  | 19.1444 | 30.2727 | 2.1366          |
| 1.4675        | 1.3889 | 50000  | 19.7588 | 30.1127 | 2.1208          |
| 1.4416        | 1.6667 | 60000  | 19.9263 | 30.4463 | 2.0889          |
| 1.4111        | 1.9444 | 70000  | 20.3323 | 30.1207 | 2.0795          |
| 1.3603        | 2.2222 | 80000  | 20.5373 | 30.5943 | 2.0850          |
| 1.3378        | 2.5    | 90000  | 20.7584 | 30.499  | 2.0604          |
| 1.3381        | 2.7778 | 100000 | 20.6113 | 30.701  | 2.0597          |
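
The Bleu values above are on the 0-100 scale used by sacreBLEU. A minimal sketch of computing that kind of score with the `evaluate` library follows; whether the training script used this exact metric implementation is an assumption:

```python
import evaluate

# sacreBLEU expects a list of candidate strings and a list of reference lists.
sacrebleu = evaluate.load("sacrebleu")

predictions = ["The house is wonderful."]
references = [["The house is wonderful."]]

result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # 100.0 for an exact match
```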

### Framework versions

- Transformers 4.40.0
- Pytorch 2.2.2+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1