T5_base_wmt14_En_Fr_1million

This model is a fine-tuned version of google-t5/t5-base on the wmt14 dataset. After the final (20th) training epoch, it achieves the following results on the evaluation set:

  • Loss: 1.9945
  • Bleu: 8.5002
  • Gen Len: 18.0143

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 60
  • eval_batch_size: 60
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
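
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (no warmup setting is listed above) and takes the total step count, 33340, from the last row of the training results. The function name `linear_lr` and its parameters are hypothetical and not part of the original training script.

```python
def linear_lr(step, base_lr=1e-3, total_steps=33340, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumption: warmup_steps=0, since the card lists no warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


STEPS_PER_EPOCH = 1667  # step count after epoch 1 in the training results

for epoch in (0, 1, 10, 20):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2}  step {step:>5}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.001, is halved by the end of epoch 10, and reaches zero at the final step.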

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 0.9735        | 1.0   | 1667  | 1.1059          | 9.3433 | 17.994  |
| 0.8671        | 2.0   | 3334  | 1.1192          | 9.3551 | 18.008  |
| 0.7975        | 3.0   | 5001  | 1.1509          | 9.4297 | 17.996  |
| 0.737         | 4.0   | 6668  | 1.1819          | 9.0739 | 18.0223 |
| 0.6746        | 5.0   | 8335  | 1.2076          | 9.1258 | 17.9873 |
| 0.6314        | 6.0   | 10002 | 1.2640          | 9.1364 | 18.0207 |
| 0.5833        | 7.0   | 11669 | 1.2948          | 8.8072 | 17.9907 |
| 0.5349        | 8.0   | 13336 | 1.3525          | 8.8513 | 17.9867 |
| 0.5025        | 9.0   | 15003 | 1.4087          | 8.7599 | 18.0027 |
| 0.4614        | 10.0  | 16670 | 1.4562          | 8.6011 | 18.015  |
| 0.4227        | 11.0  | 18337 | 1.5169          | 8.6315 | 18.018  |
| 0.3938        | 12.0  | 20004 | 1.5842          | 8.6045 | 18.0133 |
| 0.358         | 13.0  | 21671 | 1.6334          | 8.459  | 17.9997 |
| 0.3271        | 14.0  | 23338 | 1.6989          | 8.4979 | 17.9937 |
| 0.3056        | 15.0  | 25005 | 1.7529          | 8.5421 | 18.0357 |
| 0.278         | 16.0  | 26672 | 1.8151          | 8.3963 | 18.0027 |
| 0.2548        | 17.0  | 28339 | 1.8812          | 8.3497 | 18.0193 |
| 0.238         | 18.0  | 30006 | 1.9249          | 8.4306 | 18.0227 |
| 0.223         | 19.0  | 31673 | 1.9742          | 8.5156 | 18.013  |
| 0.2112        | 20.0  | 33340 | 1.9945          | 8.5002 | 18.0143 |
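
To make the trend in these results easier to read, the following sketch finds the epoch with the highest BLEU and the epoch with the lowest validation loss. The rows are copied from the table above; the variable names are illustrative only, and nothing here reflects how checkpoints were actually selected for this model.

```python
# (epoch, training loss, validation loss, BLEU), copied from the results table.
ROWS = [
    (1, 0.9735, 1.1059, 9.3433),
    (2, 0.8671, 1.1192, 9.3551),
    (3, 0.7975, 1.1509, 9.4297),
    (4, 0.7370, 1.1819, 9.0739),
    (5, 0.6746, 1.2076, 9.1258),
    (6, 0.6314, 1.2640, 9.1364),
    (7, 0.5833, 1.2948, 8.8072),
    (8, 0.5349, 1.3525, 8.8513),
    (9, 0.5025, 1.4087, 8.7599),
    (10, 0.4614, 1.4562, 8.6011),
    (11, 0.4227, 1.5169, 8.6315),
    (12, 0.3938, 1.5842, 8.6045),
    (13, 0.3580, 1.6334, 8.4590),
    (14, 0.3271, 1.6989, 8.4979),
    (15, 0.3056, 1.7529, 8.5421),
    (16, 0.2780, 1.8151, 8.3963),
    (17, 0.2548, 1.8812, 8.3497),
    (18, 0.2380, 1.9249, 8.4306),
    (19, 0.2230, 1.9742, 8.5156),
    (20, 0.2112, 1.9945, 8.5002),
]

best_bleu = max(ROWS, key=lambda r: r[3])       # row with the highest BLEU
lowest_val = min(ROWS, key=lambda r: r[2])      # row with the lowest validation loss
final = ROWS[-1]                                # last epoch, the reported result

print(f"best BLEU:       epoch {best_bleu[0]} ({best_bleu[3]})")
print(f"lowest val loss: epoch {lowest_val[0]} ({lowest_val[2]})")
print(f"final epoch:     epoch {final[0]} (BLEU {final[3]}, val loss {final[2]})")
```

Training loss falls at every epoch while validation loss rises from the first epoch onward and BLEU peaks at epoch 3, a pattern consistent with overfitting. The reported final-epoch numbers are therefore not the best ones in the table.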

Framework versions

  • Transformers 4.32.1
  • PyTorch 1.12.1
  • Datasets 2.18.0
  • Tokenizers 0.13.2