---
license: apache-2.0
base_model: google-t5/t5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ft-wmt14
    results: []
---

# ft-wmt14

This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.7469
- Bleu: 23.6596
- Gen Len: 27.526

## Model description

More information needed

## Intended uses & limitations

More information needed
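
In the meantime, a minimal inference sketch (assuming the checkpoint is published as `lilferrit/ft-wmt14`, inferred from this repo's name, and that it was fine-tuned for WMT14-style translation; English to German is the common T5 setup, so adjust the task prefix to the actual language pair):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Model id inferred from the repo name; adjust if the checkpoint lives elsewhere.
model_id = "lilferrit/ft-wmt14"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 expects a task prefix; "English to German" is an assumption here.
text = "translate English to German: The house is small."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```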

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 100000
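
These settings map onto the standard `transformers` seq2seq training setup roughly as follows. This is a hedged reconstruction, not the exact training script: `output_dir` is a placeholder, and `predict_with_generate` is assumed since BLEU and generation length were computed during evaluation.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ft-wmt14",           # placeholder output path
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 8 * 2 = 16
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=100_000,
    predict_with_generate=True,      # assumption: needed for Bleu / Gen Len
)
```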

### Training results

| Training Loss | Epoch  | Step   | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:------:|:---------------:|:-------:|:-------:|
| 1.7738        | 0.2778 | 10000  | 1.9146          | 20.1598 | 28.1563 |
| 1.6498        | 0.5556 | 20000  | 1.8550          | 21.4167 | 27.853  |
| 1.5903        | 0.8333 | 30000  | 1.8277          | 22.604  | 27.7613 |
| 1.5151        | 1.1111 | 40000  | 1.8128          | 22.1273 | 27.3187 |
| 1.4866        | 1.3889 | 50000  | 1.7999          | 22.8295 | 27.419  |
| 1.4696        | 1.6667 | 60000  | 1.7810          | 22.9923 | 27.7387 |
| 1.4508        | 1.9444 | 70000  | 1.7654          | 23.1046 | 27.7057 |
| 1.4053        | 2.2222 | 80000  | 1.7587          | 23.5079 | 27.643  |
| 1.3956        | 2.5    | 90000  | 1.7525          | 23.3848 | 27.6637 |
| 1.3903        | 2.7778 | 100000 | 1.7469          | 23.6596 | 27.526  |
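
The Bleu and Gen Len columns in cards like this are typically computed with sacreBLEU via the `evaluate` library during `Trainer` evaluation. The exact metric script used here is not stated, so the following is only an illustrative sketch:

```python
import evaluate

# sacreBLEU: predictions are strings, references are lists of strings.
bleu = evaluate.load("sacrebleu")
result = bleu.compute(
    predictions=["Das Haus ist klein."],
    references=[["Das Haus ist klein."]],
)
print(round(result["score"], 4))  # 100.0 for an exact match
```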

### Framework versions

- Transformers 4.40.0
- Pytorch 2.2.2+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1