metadata
license: apache-2.0
base_model: google-t5/t5-base
tags:
  - generated_from_trainer
datasets:
  - wmt14
metrics:
  - bleu
model-index:
  - name: T5_base_wmt14_En_Fr_1million
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: wmt14
          type: wmt14
          config: fr-en
          split: validation
          args: fr-en
        metrics:
          - name: Bleu
            type: bleu
            value: 8.5002

T5_base_wmt14_En_Fr_1million

This model is a fine-tuned version of google-t5/t5-base on the fr-en configuration of the wmt14 dataset. After the final training epoch (epoch 20), it achieves the following results on the evaluation set (the validation split):

  • Loss: 1.9945
  • Bleu: 8.5002
  • Gen Len: 18.0143

Model description

More information needed

Intended uses & limitations

More information needed
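
Until this section is filled in, the following is a minimal inference sketch, not a documented usage recipe. It rests on three assumptions that the card does not confirm: the repository id is inferred from the uploader name and the model name, the translation direction (English to French) is inferred from the model name, and the standard T5 task prefix is assumed to have been used during fine-tuning.

```python
# Assumption: repo id inferred from the uploader name and model name on this card.
MODEL_ID = "sriram-sanjeev9s/T5_base_wmt14_En_Fr_1million"
# Assumption: the standard T5 task prefix was used during fine-tuning.
PREFIX = "translate English to French: "


def build_prompt(text: str) -> str:
    """Prepend the (assumed) T5 task prefix to one English sentence."""
    return PREFIX + text.strip()


def translate(texts, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of English sentences; downloads the model on first use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    batch = tokenizer(
        [build_prompt(t) for t in texts],
        return_tensors="pt",
        padding=True,
        truncation=True,
    )
    output_ids = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `translate(["The weather is nice today."])` returns a list with one French string. Given the reported BLEU of about 8.5, output quality should be checked before any real use.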

Training and evaluation data

More information needed
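
The card metadata names the wmt14 dataset with the fr-en config and the validation split for evaluation. A sketch of loading that split and pulling out sentence pairs might look like the following. The helper names are hypothetical, and the card does not say which subset of the training data was used, so only the validation split is shown.

```python
def to_text_pairs(batch, source_lang="en", target_lang="fr"):
    """Split a batch of wmt14 records into source and target sentence lists.

    Each wmt14 record has the form {"translation": {"en": ..., "fr": ...}}.
    """
    pairs = batch["translation"]
    sources = [pair[source_lang] for pair in pairs]
    targets = [pair[target_lang] for pair in pairs]
    return sources, targets


def load_validation_split():
    """Load the evaluation data named in the card metadata (needs network access)."""
    # Imported lazily so to_text_pairs works without datasets installed.
    from datasets import load_dataset

    return load_dataset("wmt14", "fr-en", split="validation")
```

For example, `to_text_pairs(load_validation_split()[:2])` yields two English sources and their French references.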

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 60
  • eval_batch_size: 60
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
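
To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would evolve over training. It assumes zero warmup steps (the card lists none) and takes the total step count, 33340, from the last row of the training results. This is an illustration of a linear decay schedule, not the training script itself.

```python
BASE_LR = 0.001        # learning_rate from the card
TOTAL_STEPS = 33340    # final step in the training results (20 epochs x 1667 steps)
WARMUP_STEPS = 0       # assumption: the card lists no warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (1667 steps per epoch).
for epoch in (1, 5, 10, 20):
    print(epoch, linear_lr(epoch * 1667))
```

Under these assumptions the rate starts at 0.001, is halved by the end of epoch 10, and reaches zero at the final step.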

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 0.9735        | 1.0   | 1667  | 1.1059          | 9.3433 | 17.994  |
| 0.8671        | 2.0   | 3334  | 1.1192          | 9.3551 | 18.008  |
| 0.7975        | 3.0   | 5001  | 1.1509          | 9.4297 | 17.996  |
| 0.737         | 4.0   | 6668  | 1.1819          | 9.0739 | 18.0223 |
| 0.6746        | 5.0   | 8335  | 1.2076          | 9.1258 | 17.9873 |
| 0.6314        | 6.0   | 10002 | 1.2640          | 9.1364 | 18.0207 |
| 0.5833        | 7.0   | 11669 | 1.2948          | 8.8072 | 17.9907 |
| 0.5349        | 8.0   | 13336 | 1.3525          | 8.8513 | 17.9867 |
| 0.5025        | 9.0   | 15003 | 1.4087          | 8.7599 | 18.0027 |
| 0.4614        | 10.0  | 16670 | 1.4562          | 8.6011 | 18.015  |
| 0.4227        | 11.0  | 18337 | 1.5169          | 8.6315 | 18.018  |
| 0.3938        | 12.0  | 20004 | 1.5842          | 8.6045 | 18.0133 |
| 0.358         | 13.0  | 21671 | 1.6334          | 8.459  | 17.9997 |
| 0.3271        | 14.0  | 23338 | 1.6989          | 8.4979 | 17.9937 |
| 0.3056        | 15.0  | 25005 | 1.7529          | 8.5421 | 18.0357 |
| 0.278         | 16.0  | 26672 | 1.8151          | 8.3963 | 18.0027 |
| 0.2548        | 17.0  | 28339 | 1.8812          | 8.3497 | 18.0193 |
| 0.238         | 18.0  | 30006 | 1.9249          | 8.4306 | 18.0227 |
| 0.223         | 19.0  | 31673 | 1.9742          | 8.5156 | 18.013  |
| 0.2112        | 20.0  | 33340 | 1.9945          | 8.5002 | 18.0143 |
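
The headline metrics on this card come from the final epoch, so it can help to check how they compare with earlier checkpoints. The short script below copies the epoch, training loss, validation loss, and Bleu columns from the table above and picks out the best rows. It only summarizes the reported numbers. Whether the intermediate checkpoints were saved is not stated on the card.

```python
# (epoch, training_loss, validation_loss, bleu) copied from the training results table
ROWS = [
    (1, 0.9735, 1.1059, 9.3433), (2, 0.8671, 1.1192, 9.3551),
    (3, 0.7975, 1.1509, 9.4297), (4, 0.7370, 1.1819, 9.0739),
    (5, 0.6746, 1.2076, 9.1258), (6, 0.6314, 1.2640, 9.1364),
    (7, 0.5833, 1.2948, 8.8072), (8, 0.5349, 1.3525, 8.8513),
    (9, 0.5025, 1.4087, 8.7599), (10, 0.4614, 1.4562, 8.6011),
    (11, 0.4227, 1.5169, 8.6315), (12, 0.3938, 1.5842, 8.6045),
    (13, 0.3580, 1.6334, 8.4590), (14, 0.3271, 1.6989, 8.4979),
    (15, 0.3056, 1.7529, 8.5421), (16, 0.2780, 1.8151, 8.3963),
    (17, 0.2548, 1.8812, 8.3497), (18, 0.2380, 1.9249, 8.4306),
    (19, 0.2230, 1.9742, 8.5156), (20, 0.2112, 1.9945, 8.5002),
]

best_bleu = max(ROWS, key=lambda row: row[3])   # row with the highest Bleu
best_loss = min(ROWS, key=lambda row: row[2])   # row with the lowest validation loss
final = ROWS[-1]                                # row reported in the card summary

print(f"best Bleu: {best_bleu[3]} at epoch {best_bleu[0]}")
print(f"lowest validation loss: {best_loss[2]} at epoch {best_loss[0]}")
print(f"final epoch {final[0]}: Bleu {final[3]}, validation loss {final[2]}")
```

In the reported numbers, validation loss is lowest after epoch 1 and rises in every later epoch while training loss keeps falling, and Bleu peaks at epoch 3. That pattern is consistent with overfitting, so an earlier checkpoint (if available) may score better than the final one.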

Framework versions

  • Transformers 4.32.1
  • PyTorch 1.12.1
  • Datasets 2.18.0
  • Tokenizers 0.13.2