---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- wmt16
metrics:
- bleu
model-index:
- name: t5-small-finetuned-de-to-en
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: wmt16
      type: wmt16
      args: de-en
    metrics:
    - name: Bleu
      type: bleu
      value: 11.3921
---

# t5-small-finetuned-de-to-en

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the wmt16 dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

- Loss: 1.8219
- Bleu: 11.3921
- Gen Len: 17.2471
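
A minimal inference sketch, not part of the original card: the repo id `marciovbarbosa/t5-small-finetuned-de-to-en` is inferred from this card's author and model name, and the `translate German to English:` task prefix is an assumption based on the usual t5-small fine-tuning setup.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo id inferred from this card's author and model name (assumption).
model_id = "marciovbarbosa/t5-small-finetuned-de-to-en"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints are usually fine-tuned with a task prefix; this one is assumed.
text = "translate German to English: Guten Morgen, wie geht es dir?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```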

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
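
Although the card leaves this section blank, the metadata above points at the WMT16 German-English pairs; a sketch of loading them with 🤗 Datasets:

```python
from datasets import load_dataset

# WMT16 de-en, as referenced in the card metadata; each example holds a
# {'de': ..., 'en': ...} translation pair.
raw = load_dataset("wmt16", "de-en")
print(raw)  # train / validation / test splits
print(raw["train"][0]["translation"])
```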

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the Trainer sketch after this list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
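
As an illustration only, the listed values map onto `Seq2SeqTrainingArguments` roughly as below; the author's actual training script is not part of this card, and `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions consistent with the per-epoch results table.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-de-to-en",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",   # assumed: the results below are per epoch
    predict_with_generate=True,    # assumed: needed to report Bleu / Gen Len
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default,
# so no extra optimizer arguments are required.
```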

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 272   | 2.1014          | 5.5136  | 17.4975 |
| 2.5302        | 2.0   | 544   | 2.0258          | 7.4515  | 17.3941 |
| 2.5302        | 3.0   | 816   | 1.9866          | 8.3061  | 17.3441 |
| 2.3778        | 4.0   | 1088  | 1.9602          | 8.9169  | 17.3588 |
| 2.3778        | 5.0   | 1360  | 1.9382          | 9.3651  | 17.3204 |
| 2.2676        | 6.0   | 1632  | 1.9215          | 9.6428  | 17.3588 |
| 2.2676        | 7.0   | 1904  | 1.9067          | 9.8039  | 17.3418 |
| 2.2096        | 8.0   | 2176  | 1.8984          | 9.8545  | 17.3264 |
| 2.2096        | 9.0   | 2448  | 1.8883          | 10.03   | 17.3278 |
| 2.1501        | 10.0  | 2720  | 1.8797          | 10.2398 | 17.3358 |
| 2.1501        | 11.0  | 2992  | 1.8738          | 10.3086 | 17.3258 |
| 2.1025        | 12.0  | 3264  | 1.8677          | 10.3851 | 17.3181 |
| 2.0638        | 13.0  | 3536  | 1.8623          | 10.489  | 17.3014 |
| 2.0638        | 14.0  | 3808  | 1.8574          | 10.4969 | 17.3204 |
| 2.034         | 15.0  | 4080  | 1.8528          | 10.7067 | 17.3178 |
| 2.034         | 16.0  | 4352  | 1.8493          | 10.6867 | 17.3408 |
| 1.9852        | 17.0  | 4624  | 1.8473          | 10.8333 | 17.3198 |
| 1.9852        | 18.0  | 4896  | 1.8429          | 10.8907 | 17.3001 |
| 1.9646        | 19.0  | 5168  | 1.8405          | 10.9049 | 17.3154 |
| 1.9646        | 20.0  | 5440  | 1.8385          | 10.9549 | 17.3124 |
| 1.9264        | 21.0  | 5712  | 1.8361          | 11.0046 | 17.3068 |
| 1.9264        | 22.0  | 5984  | 1.8338          | 11.1415 | 17.2954 |
| 1.9161        | 23.0  | 6256  | 1.8333          | 11.1041 | 17.2938 |
| 1.882         | 24.0  | 6528  | 1.8323          | 11.0801 | 17.2651 |
| 1.882         | 25.0  | 6800  | 1.8309          | 11.157  | 17.2921 |
| 1.8751        | 26.0  | 7072  | 1.8290          | 11.1713 | 17.2951 |
| 1.8751        | 27.0  | 7344  | 1.8279          | 11.2006 | 17.2861 |
| 1.8425        | 28.0  | 7616  | 1.8267          | 11.1761 | 17.2658 |
| 1.8425        | 29.0  | 7888  | 1.8278          | 11.148  | 17.2841 |
| 1.8306        | 30.0  | 8160  | 1.8261          | 11.1765 | 17.2748 |
| 1.8306        | 31.0  | 8432  | 1.8255          | 11.2723 | 17.2454 |
| 1.8229        | 32.0  | 8704  | 1.8247          | 11.2715 | 17.2621 |
| 1.8229        | 33.0  | 8976  | 1.8231          | 11.2896 | 17.2698 |
| 1.7975        | 34.0  | 9248  | 1.8245          | 11.322  | 17.2491 |
| 1.7919        | 35.0  | 9520  | 1.8238          | 11.3854 | 17.2711 |
| 1.7919        | 36.0  | 9792  | 1.8237          | 11.3304 | 17.2634 |
| 1.7781        | 37.0  | 10064 | 1.8225          | 11.3184 | 17.2644 |
| 1.7781        | 38.0  | 10336 | 1.8230          | 11.3382 | 17.2651 |
| 1.7819        | 39.0  | 10608 | 1.8228          | 11.3656 | 17.2658 |
| 1.7819        | 40.0  | 10880 | 1.8221          | 11.3934 | 17.2544 |
| 1.7592        | 41.0  | 11152 | 1.8223          | 11.3625 | 17.2421 |
| 1.7592        | 42.0  | 11424 | 1.8221          | 11.4068 | 17.2511 |
| 1.7529        | 43.0  | 11696 | 1.8224          | 11.4199 | 17.2541 |
| 1.7529        | 44.0  | 11968 | 1.8224          | 11.4051 | 17.2561 |
| 1.7482        | 45.0  | 12240 | 1.8223          | 11.4195 | 17.2504 |
| 1.7461        | 46.0  | 12512 | 1.8220          | 11.3873 | 17.2497 |
| 1.7461        | 47.0  | 12784 | 1.8220          | 11.4214 | 17.2431 |
| 1.739         | 48.0  | 13056 | 1.8218          | 11.3972 | 17.2441 |
| 1.739         | 49.0  | 13328 | 1.8219          | 11.3952 | 17.2457 |
| 1.7362        | 50.0  | 13600 | 1.8219          | 11.3921 | 17.2471 |
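
The Bleu column above appears to be on the 0-100 scale; a sketch of computing such a score, assuming the `sacrebleu` wrapper that the translation example scripts of this Transformers/Datasets generation used (the card's metric type only says `bleu`):

```python
from datasets import load_metric

# Hypothetical prediction/reference pair; in practice predictions come from
# model.generate over the wmt16 validation set.
metric = load_metric("sacrebleu")
predictions = ["The weather is nice today."]
references = [["The weather is good today."]]  # one list of references per example
result = metric.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```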

### Framework versions

- Transformers 4.12.5
- Pytorch 1.10.0+cu111
- Datasets 1.16.1
- Tokenizers 0.10.3