---
license: cc-by-nc-sa-4.0
tags:
  - generated_from_trainer
datasets:
  - opus_infopankki
metrics:
  - bleu
model-index:
  - name: mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: opus_infopankki
          type: opus_infopankki
          args: en-fa
        metrics:
          - name: Bleu
            type: bleu
            value: 9.5106
---

# mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en

This model is a fine-tuned version of [persiannlp/mt5-small-parsinlu-opus-translation_fa_en](https://huggingface.co/persiannlp/mt5-small-parsinlu-opus-translation_fa_en) on the opus_infopankki dataset. It achieves the following results on the evaluation set:

- Loss: 2.5449
- Bleu: 9.5106
- Gen Len: 13.6434
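
While the sections below are still placeholders, the model loads like any other Transformers seq2seq checkpoint. The sketch below assumes the fine-tuned weights are published under the repo id shown; adjust the path if your copy of the weights lives elsewhere.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed checkpoint id for this fine-tuned model; change it if the weights
# are stored under a different repo or in a local directory.
checkpoint = "PontifexMaximus/mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Persian -> English translation
text = "سلام، حال شما چطور است؟"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```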

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an illustrative mapping to `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 2e-06
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP
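
For reference, here is a minimal sketch of how these values map onto `Seq2SeqTrainingArguments` in Transformers. The original training script is not part of this card, so treat the output directory name and the per-epoch evaluation setting as assumptions rather than the exact configuration used.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en",
    evaluation_strategy="epoch",      # assumed: the results table reports one eval per epoch
    learning_rate=2e-6,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                        # Native AMP mixed-precision training
    predict_with_generate=True,       # required so Bleu / Gen Len can be computed
)
```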

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 151  | 3.1656          | 7.194  | 14.1885 |
| No log        | 2.0   | 302  | 3.0419          | 7.7031 | 14.1005 |
| No log        | 3.0   | 453  | 2.9549          | 8.1502 | 13.9834 |
| 3.5336        | 4.0   | 604  | 2.8857          | 8.4488 | 13.9251 |
| 3.5336        | 5.0   | 755  | 2.8297          | 8.6606 | 13.786  |
| 3.5336        | 6.0   | 906  | 2.7808          | 8.8217 | 13.7983 |
| 3.2511        | 7.0   | 1057 | 2.7386          | 8.9221 | 13.7518 |
| 3.2511        | 8.0   | 1208 | 2.7006          | 9.1988 | 13.7159 |
| 3.2511        | 9.0   | 1359 | 2.6678          | 9.2751 | 13.676  |
| 3.1055        | 10.0  | 1510 | 2.6387          | 9.4142 | 13.6648 |
| 3.1055        | 11.0  | 1661 | 2.6154          | 9.5726 | 13.6841 |
| 3.1055        | 12.0  | 1812 | 2.5945          | 9.6571 | 13.6546 |
| 3.1055        | 13.0  | 1963 | 2.5813          | 9.8303 | 13.6571 |
| 3.0199        | 14.0  | 2114 | 2.5709          | 9.6726 | 13.5855 |
| 3.0199        | 15.0  | 2265 | 2.5619          | 9.632  | 13.6125 |
| 3.0199        | 16.0  | 2416 | 2.5563          | 9.5773 | 13.6256 |
| 2.9862        | 17.0  | 2567 | 2.5538          | 9.5425 | 13.6366 |
| 2.9862        | 18.0  | 2718 | 2.5515          | 9.5359 | 13.6326 |
| 2.9862        | 19.0  | 2869 | 2.5495          | 9.5544 | 13.642  |
| 2.9859        | 20.0  | 3020 | 2.5478          | 9.5183 | 13.6374 |
| 2.9859        | 21.0  | 3171 | 2.5466          | 9.5387 | 13.632  |
| 2.9859        | 22.0  | 3322 | 2.5458          | 9.5183 | 13.6355 |
| 2.9859        | 23.0  | 3473 | 2.5451          | 9.5019 | 13.6376 |
| 2.9731        | 24.0  | 3624 | 2.5449          | 9.5004 | 13.6405 |
| 2.9731        | 25.0  | 3775 | 2.5449          | 9.5106 | 13.6434 |
| 2.9731        | 26.0  | 3926 | 2.5449          | 9.5106 | 13.6434 |
| 2.9671        | 27.0  | 4077 | 2.5449          | 9.5106 | 13.6434 |
| 2.9671        | 28.0  | 4228 | 2.5449          | 9.5106 | 13.6434 |
| 2.9671        | 29.0  | 4379 | 2.5449          | 9.5106 | 13.6434 |
| 2.97          | 30.0  | 4530 | 2.5449          | 9.5106 | 13.6434 |
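
The Bleu and Gen Len columns come from a `compute_metrics` callback passed to the trainer. That code is not included in this card, so the following is only a sketch of the usual pattern for translation fine-tuning, assuming a sacreBLEU-style scorer from the `evaluate` library and a `tokenizer` defined by the surrounding training script.

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Labels are padded with -100; swap in the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    gen_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```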

### Framework versions

- Transformers 4.19.2
- Pytorch 1.7.1+cu110
- Datasets 2.2.2
- Tokenizers 0.12.1