m2m100_418M-ft-en-to-cy

This model is a fine-tuned version of facebook/m2m100_418M for English-to-Welsh (en-to-cy) translation. The training dataset is not documented. It achieves the following results on the evaluation set:

  • Loss: 0.7277
  • Bleu: 48.1019
  • Gen Len: 35.226
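Since M2M100 is a multilingual model, generation must be forced to start with the target-language token. A minimal inference sketch (the repo id is taken from the model tree; the example sentence is an arbitrary assumption):

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_name = "DewiBrynJones/m2m100_418M-ft-en-to-cy"
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

# Tell the tokenizer the source language, then force Welsh ("cy") as the
# first generated token so the model decodes into the right language.
tokenizer.src_lang = "en"
inputs = tokenizer("Good morning, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs, forced_bos_token_id=tokenizer.get_lang_id("cy")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Without `forced_bos_token_id`, M2M100 may decode into an unintended language, so this argument is required even for a model fine-tuned on a single language pair.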

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
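The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a sketch, not the author's actual training script: `output_dir`, the eval cadence (every 2000 steps, inferred from the results table), and `predict_with_generate` (needed to compute BLEU and Gen Len) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported hyperparameters; output_dir, eval cadence,
# and predict_with_generate are assumptions, not documented settings.
training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-ft-en-to-cy",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",       # evaluation every 2000 steps, per the table
    eval_steps=2000,
    predict_with_generate=True,  # required for BLEU / Gen Len metrics
)
```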

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 3.5951        | 0.0166 | 2000  | 3.0960          | 8.4556  | 41.3461 |
| 2.329         | 0.0332 | 4000  | 1.9313          | 19.282  | 36.0086 |
| 1.664         | 0.0499 | 6000  | 1.3746          | 29.2697 | 36.1992 |
| 1.3506        | 0.0665 | 8000  | 1.1272          | 35.5321 | 36.6889 |
| 1.1703        | 0.0831 | 10000 | 0.9973          | 38.5762 | 35.3101 |
| 1.0818        | 0.0997 | 12000 | 0.9244          | 41.288  | 35.508  |
| 1.0269        | 0.1164 | 14000 | 0.8714          | 42.9136 | 35.4216 |
| 0.9739        | 0.1330 | 16000 | 0.8337          | 44.36   | 37.1801 |
| 0.9373        | 0.1496 | 18000 | 0.8046          | 44.6704 | 35.1619 |
| 0.9144        | 0.1662 | 20000 | 0.7843          | 45.8625 | 35.598  |
| 0.8825        | 0.1829 | 22000 | 0.7637          | 46.5404 | 35.2287 |
| 0.8773        | 0.1995 | 24000 | 0.7493          | 47.0536 | 35.0837 |
| 0.8497        | 0.2161 | 26000 | 0.7383          | 47.8017 | 35.0227 |
| 0.8571        | 0.2327 | 28000 | 0.7302          | 47.8125 | 35.2915 |
| 0.8568        | 0.2494 | 30000 | 0.7277          | 48.1019 | 35.226  |
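A quick sanity check over the trajectory above (values copied from the table) shows that validation loss falls and BLEU rises at every evaluation step, with the gains flattening sharply by step 30000:

```python
# (step, validation loss, BLEU) triples copied from the table above.
results = [
    (2000, 3.0960, 8.4556), (4000, 1.9313, 19.2820), (6000, 1.3746, 29.2697),
    (8000, 1.1272, 35.5321), (10000, 0.9973, 38.5762), (12000, 0.9244, 41.2880),
    (14000, 0.8714, 42.9136), (16000, 0.8337, 44.3600), (18000, 0.8046, 44.6704),
    (20000, 0.7843, 45.8625), (22000, 0.7637, 46.5404), (24000, 0.7493, 47.0536),
    (26000, 0.7383, 47.8017), (28000, 0.7302, 47.8125), (30000, 0.7277, 48.1019),
]

losses = [loss for _, loss, _ in results]
bleus = [bleu for _, _, bleu in results]

# Loss is strictly decreasing and BLEU strictly increasing at every eval.
assert all(a > b for a, b in zip(losses, losses[1:]))
assert all(a < b for a, b in zip(bleus, bleus[1:]))

# But improvement per 2000-step interval has nearly flattened.
first_gain = bleus[1] - bleus[0]
last_gain = bleus[-1] - bleus[-2]
print(f"first interval: +{first_gain:.2f} BLEU, last interval: +{last_gain:.2f} BLEU")
```

The first 2000-step interval gains over 10 BLEU while the last gains under 0.3, which suggests training had largely converged by the 30000-step budget.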

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0