m2m100_1.2B-ft-cy-to-en

This model is a fine-tuned version of facebook/m2m100_1.2B for Welsh-to-English (cy-to-en) translation, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6629
  • BLEU: 45.7742
  • Gen Len: 27.7876

Model description

More information needed

Intended uses & limitations

More information needed
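
The card does not document usage, but the model name indicates Welsh-to-English (cy-to-en) machine translation. Below is a minimal inference sketch using the standard Transformers M2M100 API; the example sentence is illustrative and not taken from the card:

```python
import torch
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "DewiBrynJones/m2m100_1.2B-ft-cy-to-en"

tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "cy"  # Welsh source, per the model name
text = "Mae hi'n braf heddiw."  # illustrative input: "It is fine today."

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.get_lang_id("en"),  # force English output
        max_new_tokens=64,
    )
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```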

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows this list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
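
A sketch of how these values map onto Seq2SeqTrainingArguments in Transformers. The training script itself is not published with the card, so output_dir and predict_with_generate below are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="m2m100_1.2B-ft-cy-to-en",  # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    predict_with_generate=True,  # assumed: needed to report BLEU/Gen Len at eval time
)
```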

Training results

| Training Loss | Epoch  | Step  | Validation Loss | BLEU    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 1.4169        | 0.0166 | 2000  | 1.2147          | 30.0616 | 29.7576 |
| 1.1518        | 0.0332 | 4000  | 0.9864          | 37.829  | 26.8354 |
| 1.0168        | 0.0499 | 6000  | 0.8875          | 42.0389 | 25.8031 |
| 0.941         | 0.0665 | 8000  | 0.8239          | 43.8126 | 26.3229 |
| 0.8777        | 0.0831 | 10000 | 0.7891          | 43.8364 | 26.8204 |
| 0.8453        | 0.0997 | 12000 | 0.7567          | 43.9052 | 26.9718 |
| 0.8239        | 0.1164 | 14000 | 0.7393          | 44.1267 | 27.1687 |
| 0.8114        | 0.1330 | 16000 | 0.7199          | 44.7922 | 27.5693 |
| 0.7949        | 0.1496 | 18000 | 0.7043          | 46.0559 | 27.3151 |
| 0.7782        | 0.1662 | 20000 | 0.6946          | 46.1132 | 27.7021 |
| 0.7684        | 0.1829 | 22000 | 0.6841          | 46.3332 | 27.4188 |
| 0.7576        | 0.1995 | 24000 | 0.6762          | 45.9629 | 27.5139 |
| 0.7443        | 0.2161 | 26000 | 0.6694          | 46.5899 | 27.2528 |
| 0.7501        | 0.2327 | 28000 | 0.6650          | 45.9993 | 27.5966 |
| 0.7501        | 0.2494 | 30000 | 0.6629          | 45.7742 | 27.7876 |
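
The BLEU column was most likely produced with sacrebleu via the evaluate library, as in the standard Transformers translation examples; this is an assumption, since the card does not name the metric implementation. A minimal sketch:

```python
import evaluate

# Illustrative strings only; not from the model's evaluation set.
predictions = ["It is fine today."]
references = [["It is fine today."]]

bleu = evaluate.load("sacrebleu")  # assumed metric backend
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```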

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0