---
license: cc-by-nc-sa-4.0
tags:
  - generated_from_trainer
datasets:
  - opus_infopankki
metrics:
  - bleu
model-index:
  - name: mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: opus_infopankki
          type: opus_infopankki
          args: en-fa
        metrics:
          - name: Bleu
            type: bleu
            value: 9.5106
---

# mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en

This model is a fine-tuned version of [persiannlp/mt5-small-parsinlu-opus-translation_fa_en](https://huggingface.co/persiannlp/mt5-small-parsinlu-opus-translation_fa_en) on the opus_infopankki dataset. It achieves the following results on the evaluation set:

- Loss: 2.5449
- Bleu: 9.5106
- Gen Len: 13.6434
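
While the sections below are still placeholders, the model loads like any other Transformers seq2seq checkpoint. The sketch below assumes the fine-tuned weights are published under the repo id shown; adjust the path if your copy of the weights lives elsewhere.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed checkpoint id for this fine-tuned model; change it if the weights
# are stored under a different repo or in a local directory.
checkpoint = "PontifexMaximus/mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Persian -> English translation
text = "سلام، حال شما چطور است؟"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```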

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an illustrative mapping to `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 2e-06
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP
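
For reference, here is a minimal sketch of how these values map onto `Seq2SeqTrainingArguments` in Transformers. The original training script is not part of this card, so treat the output directory name and the per-epoch evaluation setting as assumptions rather than the exact configuration used.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-parsinlu-opus-translation_fa_en-finetuned-fa-to-en",
    evaluation_strategy="epoch",      # assumed: the results table reports one eval per epoch
    learning_rate=2e-6,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                        # Native AMP mixed-precision training
    predict_with_generate=True,       # required so Bleu / Gen Len can be computed
)
```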

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 151  | 3.1656          | 7.194  | 14.1885 |
| No log        | 2.0   | 302  | 3.0419          | 7.7031 | 14.1005 |
| No log        | 3.0   | 453  | 2.9549          | 8.1502 | 13.9834 |
| 3.5336        | 4.0   | 604  | 2.8857          | 8.4488 | 13.9251 |
| 3.5336        | 5.0   | 755  | 2.8297          | 8.6606 | 13.786  |
| 3.5336        | 6.0   | 906  | 2.7808          | 8.8217 | 13.7983 |
| 3.2511        | 7.0   | 1057 | 2.7386          | 8.9221 | 13.7518 |
| 3.2511        | 8.0   | 1208 | 2.7006          | 9.1988 | 13.7159 |
| 3.2511        | 9.0   | 1359 | 2.6678          | 9.2751 | 13.676  |
| 3.1055        | 10.0  | 1510 | 2.6387          | 9.4142 | 13.6648 |
| 3.1055        | 11.0  | 1661 | 2.6154          | 9.5726 | 13.6841 |
| 3.1055        | 12.0  | 1812 | 2.5945          | 9.6571 | 13.6546 |
| 3.1055        | 13.0  | 1963 | 2.5813          | 9.8303 | 13.6571 |
| 3.0199        | 14.0  | 2114 | 2.5709          | 9.6726 | 13.5855 |
| 3.0199        | 15.0  | 2265 | 2.5619          | 9.632  | 13.6125 |
| 3.0199        | 16.0  | 2416 | 2.5563          | 9.5773 | 13.6256 |
| 2.9862        | 17.0  | 2567 | 2.5538          | 9.5425 | 13.6366 |
| 2.9862        | 18.0  | 2718 | 2.5515          | 9.5359 | 13.6326 |
| 2.9862        | 19.0  | 2869 | 2.5495          | 9.5544 | 13.642  |
| 2.9859        | 20.0  | 3020 | 2.5478          | 9.5183 | 13.6374 |
| 2.9859        | 21.0  | 3171 | 2.5466          | 9.5387 | 13.632  |
| 2.9859        | 22.0  | 3322 | 2.5458          | 9.5183 | 13.6355 |
| 2.9859        | 23.0  | 3473 | 2.5451          | 9.5019 | 13.6376 |
| 2.9731        | 24.0  | 3624 | 2.5449          | 9.5004 | 13.6405 |
| 2.9731        | 25.0  | 3775 | 2.5449          | 9.5106 | 13.6434 |
| 2.9731        | 26.0  | 3926 | 2.5449          | 9.5106 | 13.6434 |
| 2.9671        | 27.0  | 4077 | 2.5449          | 9.5106 | 13.6434 |
| 2.9671        | 28.0  | 4228 | 2.5449          | 9.5106 | 13.6434 |
| 2.9671        | 29.0  | 4379 | 2.5449          | 9.5106 | 13.6434 |
| 2.97          | 30.0  | 4530 | 2.5449          | 9.5106 | 13.6434 |
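
The Bleu and Gen Len columns come from a `compute_metrics` callback passed to the trainer. That code is not included in this card, so the following is only a sketch of the usual pattern for translation fine-tuning, assuming a sacreBLEU-style scorer from the `evaluate` library and a `tokenizer` defined by the surrounding training script.

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Labels are padded with -100; swap in the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    gen_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```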

### Framework versions

- Transformers 4.19.2
- Pytorch 1.7.1+cu110
- Datasets 2.2.2
- Tokenizers 0.12.1