nllb-200-1.3B-ft-cym-to-eng

This model is a fine-tuned version of facebook/nllb-200-1.3B for Welsh-to-English (cym→eng) translation, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5352
  • Bleu: 47.7998
  • Gen Len: 21.638
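Since the card does not include usage code, here is a minimal inference sketch assuming the standard NLLB workflow in transformers. The FLORES-200 language codes cym_Latn (Welsh) and eng_Latn (English) follow the base model's convention, and the example sentence is illustrative only.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "DewiBrynJones/nllb-200-1.3B-ft-cym-to-eng"

# NLLB models use FLORES-200 language codes: cym_Latn (Welsh), eng_Latn (English).
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="cym_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Mae'r tywydd yn braf heddiw."  # Illustrative input: "The weather is nice today."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the English language token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```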

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
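For reference, here is a sketch of how the hyperparameters above might map onto Seq2SeqTrainingArguments. The output_dir is a placeholder, the batch sizes are assumed to be per-device on a single device, and the actual training script and data pipeline are not documented in this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-1.3B-ft-cym-to-eng",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=32,  # assuming a single device
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",         # evaluation every 2000 steps, per the table below
    eval_steps=2000,
    predict_with_generate=True,    # required so BLEU and Gen Len can be computed
)
```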

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 0.863         | 0.0455 | 2000  | 0.7542          | 37.4638 | 27.7533 |
| 0.7963        | 0.0910 | 4000  | 0.6780          | 39.4827 | 28.2858 |
| 0.7547        | 0.1365 | 6000  | 0.6339          | 42.4413 | 24.1947 |
| 0.725         | 0.1820 | 8000  | 0.6053          | 39.7103 | 24.1497 |
| 0.6904        | 0.2275 | 10000 | 0.5866          | 43.7372 | 23.2546 |
| 0.6841        | 0.2730 | 12000 | 0.5748          | 46.6501 | 21.7253 |
| 0.6633        | 0.3185 | 14000 | 0.5652          | 47.3222 | 21.5977 |
| 0.6608        | 0.3640 | 16000 | 0.5570          | 45.4937 | 23.0514 |
| 0.6582        | 0.4094 | 18000 | 0.5518          | 47.0155 | 22.1634 |
| 0.656         | 0.4549 | 20000 | 0.5471          | 47.7442 | 21.8685 |
| 0.6512        | 0.5004 | 22000 | 0.5429          | 47.6474 | 22.071  |
| 0.6373        | 0.5459 | 24000 | 0.5401          | 46.9893 | 22.4271 |
| 0.6389        | 0.5914 | 26000 | 0.5374          | 47.6756 | 21.6087 |
| 0.6437        | 0.6369 | 28000 | 0.5359          | 47.2965 | 21.9883 |
| 0.6359        | 0.6824 | 30000 | 0.5352          | 47.7998 | 21.638  |
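The Bleu and Gen Len columns are typically produced by a compute_metrics function passed to the trainer. The exact metric code is not documented in this card, so the following is only a sketch of the standard translation pattern, using the evaluate library's sacrebleu metric and the base NLLB tokenizer.

```python
import evaluate
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-1.3B", src_lang="cym_Latn")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)

    # Replace the -100 padding used for labels before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-padding tokens in the generated sequences.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```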

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0