m2m100_1.2B-ft-en-to-cy

This model is a fine-tuned version of facebook/m2m100_1.2B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5864
  • Bleu: 54.8016
  • Gen Len: 33.9191
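
The card does not include a usage example. Below is a minimal sketch using the standard M2M100 generation API, assuming (from the model name) that the direction is English (`en`) to Welsh (`cy`); the example sentence is illustrative only:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "DewiBrynJones/m2m100_1.2B-ft-en-to-cy"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# M2M100 is multilingual: set the source language on the tokenizer and
# force the target language at generation time.
tokenizer.src_lang = "en"
inputs = tokenizer("The weather is lovely today.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("cy"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```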

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
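
These hyperparameters could be expressed as `Seq2SeqTrainingArguments`. This is a sketch, not the original training script: `output_dir` is a placeholder, and the evaluation cadence is inferred from the 2000-step intervals in the results table below.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_1.2B-ft-en-to-cy",  # placeholder, not from the original run
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",
    eval_steps=2000,  # inferred from the results table, not stated in the card
    predict_with_generate=True,
)
```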

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 2.3252        | 0.0166 | 2000  | 1.9613          | 21.0612 | 35.9022 |
| 1.5651        | 0.0332 | 4000  | 1.2919          | 34.3962 | 34.8431 |
| 1.1755        | 0.0499 | 6000  | 0.9977          | 42.2725 | 34.4593 |
| 0.9801        | 0.0665 | 8000  | 0.8545          | 46.4396 | 33.9573 |
| 0.8697        | 0.0831 | 10000 | 0.7763          | 48.9327 | 34.0146 |
| 0.813         | 0.0997 | 12000 | 0.7224          | 50.2154 | 33.8613 |
| 0.779         | 0.1164 | 14000 | 0.6911          | 51.4013 | 33.9477 |
| 0.7436        | 0.1330 | 16000 | 0.6648          | 52.2204 | 33.9345 |
| 0.7224        | 0.1496 | 18000 | 0.6437          | 52.9165 | 33.9964 |
| 0.7034        | 0.1662 | 20000 | 0.6279          | 53.6142 | 33.9663 |
| 0.6783        | 0.1829 | 22000 | 0.6134          | 53.7386 | 33.9527 |
| 0.6765        | 0.1995 | 24000 | 0.6029          | 54.4546 | 33.955  |
| 0.656         | 0.2161 | 26000 | 0.5941          | 54.5817 | 33.9145 |
| 0.6522        | 0.2327 | 28000 | 0.5884          | 54.728  | 33.9163 |
| 0.6562        | 0.2494 | 30000 | 0.5864          | 54.8016 | 33.9191 |
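
The Bleu and Gen Len columns are consistent with a sacreBLEU score and the mean generated sequence length. The card does not publish the metric code, so the following is an assumed sketch of a `Seq2SeqTrainer`-style `compute_metrics` function using the `evaluate` library:

```python
import numpy as np
import evaluate
from transformers import M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_1.2B")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    """Sketch of a metric function for generation eval (assumed, not the original)."""
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Labels use -100 for padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Mean length of generated sequences, excluding padding.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```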

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1