metadata

license: mit
tags:
  - generated_from_trainer
datasets:
  - iva_mt_wslot
metrics:
  - bleu
model-index:
  - name: iva_mt_wslot-m2m100_418M-en-es-plaintext_10e
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: iva_mt_wslot
          type: iva_mt_wslot
          config: en-es
          split: validation
          args: en-es
        metrics:
          - name: Bleu
            type: bleu
            value: 51.1501

iva_mt_wslot-m2m100_418M-en-es-plaintext_10e

This model is a fine-tuned version of facebook/m2m100_418M on the en-es configuration of the iva_mt_wslot dataset. After the final (10th) training epoch, it achieves the following results on the validation split:

Loss: 0.0116
Bleu: 51.1501
Gen Len: 12.6861
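
A minimal inference sketch, assuming the checkpoint loads through the standard M2M100 classes in transformers, as the base model facebook/m2m100_418M does. MODEL_ID is a placeholder: the card gives only the model name, so the Hub namespace (or a local path) has to be supplied. The helper name translate_en_to_es and the max_new_tokens value are illustrative choices, not taken from the card.

```python
# Placeholder: prefix with the owner's Hub namespace, or use a local path.
MODEL_ID = "iva_mt_wslot-m2m100_418M-en-es-plaintext_10e"


def translate_en_to_es(text, model_id=MODEL_ID, max_new_tokens=64):
    """Translate one English sentence to Spanish with an M2M100 checkpoint."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    # M2M100 is multilingual: set the source language on the tokenizer and
    # force the target language token as the first generated token.
    tokenizer.src_lang = "en"
    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.get_lang_id("es"),
        max_new_tokens=max_new_tokens,  # illustrative; average Gen Len is ~12.7
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling translate_en_to_es("turn on the lights in the kitchen") downloads the weights on first use, so it needs network access or a local copy of the checkpoint.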

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
mixed_precision_training: Native AMP
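
The hyperparameters above can be written down as a plain dictionary and used to sketch the learning-rate schedule. The key names follow transformers.TrainingArguments, which is an assumption about how the values were passed. The mapping of train_batch_size to a per-device value assumes a single device. The card does not list warmup, so warmup_steps=0 is also an assumption. The step count per epoch comes from the training results table.

```python
# Hyperparameters as listed in this card. Key names mirror
# transformers.TrainingArguments (assumed, not confirmed by the card).
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,  # assumes a single device
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,  # "Native AMP" mixed precision
}

STEPS_PER_EPOCH = 2104  # from the training results table
TOTAL_STEPS = STEPS_PER_EPOCH * HPARAMS["num_train_epochs"]


def linear_lr(step, base_lr=HPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the end of each epoch.
    for epoch in range(HPARAMS["num_train_epochs"] + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate starts at 2e-05 and reaches zero at step 21040, the last step in the results table. With a batch size of 4 and 2104 steps per epoch, the training split would hold at most 8416 examples, provided no gradient accumulation was used.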

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
0.012	1.0	2104	0.0109	47.9124	12.7523
0.0079	2.0	4208	0.0101	49.9897	12.6763
0.0059	3.0	6312	0.0101	50.5286	12.6435
0.0045	4.0	8416	0.0101	49.6821	12.5472
0.0033	5.0	10520	0.0104	50.3856	12.6638
0.0024	6.0	12624	0.0107	50.359	12.7418
0.0019	7.0	14728	0.0111	50.8234	12.709
0.0014	8.0	16832	0.0111	50.872	12.6671
0.0011	9.0	18936	0.0114	51.3014	12.6291
0.001	10.0	21040	0.0116	51.1501	12.6861
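
To make the trend in the table easier to read, the sketch below copies the rows into Python and reports which epochs have the highest BLEU and the lowest validation loss. The variable and function names are illustrative; the numbers are taken verbatim from the table.

```python
# (epoch, step, training_loss, validation_loss, bleu, gen_len)
ROWS = [
    (1, 2104, 0.012, 0.0109, 47.9124, 12.7523),
    (2, 4208, 0.0079, 0.0101, 49.9897, 12.6763),
    (3, 6312, 0.0059, 0.0101, 50.5286, 12.6435),
    (4, 8416, 0.0045, 0.0101, 49.6821, 12.5472),
    (5, 10520, 0.0033, 0.0104, 50.3856, 12.6638),
    (6, 12624, 0.0024, 0.0107, 50.359, 12.7418),
    (7, 14728, 0.0019, 0.0111, 50.8234, 12.709),
    (8, 16832, 0.0014, 0.0111, 50.872, 12.6671),
    (9, 18936, 0.0011, 0.0114, 51.3014, 12.6291),
    (10, 21040, 0.001, 0.0116, 51.1501, 12.6861),
]


def best_bleu_row(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda r: r[4])


def lowest_val_loss_epochs(rows):
    """Return every epoch that ties for the lowest validation loss."""
    lowest = min(r[3] for r in rows)
    return [r[0] for r in rows if r[3] == lowest]


if __name__ == "__main__":
    top = best_bleu_row(ROWS)
    print(f"highest BLEU: {top[4]} at epoch {top[0]}")
    print(f"lowest validation loss at epochs: {lowest_val_loss_epochs(ROWS)}")
```

The validation loss is lowest at epochs 2 to 4 (0.0101) and rises afterwards while the training loss keeps falling, yet BLEU continues to improve and peaks at epoch 9 (51.3014). The headline figures in this card correspond to epoch 10; the card does not say whether earlier checkpoints were kept.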

Framework versions

Transformers 4.28.1
PyTorch 2.0.1+cu118
Datasets 2.14.4
Tokenizers 0.13.3