---
license: cc-by-nc-4.0
base_model: facebook/nllb-200-distilled-600M
tags:
  - translation
  - generated_from_trainer
model-index:
  - name: nllb_600m-en-kik-kam-luo-mer-som-swh-drL-24_5-filtered-v24_28_5
    results: []
---

# nllb_600m-en-kik-kam-luo-mer-som-swh-drL-24_5-filtered-v24_28_5

This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

- Loss: 1.0835
- BLEU eng_Latn→kik_Latn: 3.5356
- BLEU kik_Latn→eng_Latn: 40.2928
- BLEU eng_Latn→kam_Latn: 0.4342
- BLEU kam_Latn→eng_Latn: 29.3834
- BLEU eng_Latn→mer_Latn: 0.0506
- BLEU mer_Latn→eng_Latn: 33.5991
- BLEU eng_Latn→luo_Latn: 6.0172
- BLEU luo_Latn→eng_Latn: 38.2707
- BLEU eng_Latn→som_Latn: 6.1787
- BLEU som_Latn→eng_Latn: 52.6612
- BLEU eng_Latn→swh_Latn: 61.4777
- BLEU swh_Latn→eng_Latn: 65.1554
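
A minimal usage sketch, assuming the checkpoint is published under the `mutisya` namespace (the repo id below is an assumption; adjust it to the actual location) and that the fine-tune keeps NLLB's language codes (`eng_Latn`, `kik_Latn`, `kam_Latn`, `luo_Latn`, `mer_Latn`, `som_Latn`, `swh_Latn`):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repo id; replace with wherever this checkpoint is actually hosted.
model_id = "mutisya/nllb_600m-en-kik-kam-luo-mer-som-swh-drL-24_5-filtered-v24_28_5"

# src_lang tells the NLLB tokenizer which language tag to prepend to the input.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Good morning, how are you?", return_tensors="pt")

# Force the decoder to start with the target-language tag (Swahili here).
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("swh_Latn"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

To translate into a different target language, swap `"swh_Latn"` for another of the codes above; to translate out of one of the African languages, set `src_lang` accordingly.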

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
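
For reference, a sketch of an equivalent `Seq2SeqTrainingArguments` configuration in Transformers 4.43 (the `output_dir` is a placeholder; the Adam betas and epsilon listed above are the optimizer defaults, so they need no explicit arguments):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb_600m-finetune",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,    # 8 per device x 8 steps = effective batch of 64
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                        # Native AMP mixed precision
)
```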

### Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU eng_Latn→kik_Latn | BLEU kik_Latn→eng_Latn | BLEU eng_Latn→kam_Latn | BLEU kam_Latn→eng_Latn | BLEU eng_Latn→mer_Latn | BLEU mer_Latn→eng_Latn | BLEU eng_Latn→luo_Latn | BLEU luo_Latn→eng_Latn | BLEU eng_Latn→som_Latn | BLEU som_Latn→eng_Latn | BLEU eng_Latn→swh_Latn | BLEU swh_Latn→eng_Latn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.3341 | 1.0000 | 10662 | 1.2351 | 0.6472 | 36.0303 | 0.2230 | 26.3692 | 0.0361 | 29.3951 | 1.6063 | 35.0198 | 7.2785 | 48.1600 | 63.1216 | 61.0783 |
| 1.1816 | 2.0000 | 21324 | 1.1546 | 4.0887 | 37.8187 | 0.5897 | 27.7776 | 0.0928 | 31.5033 | 13.8255 | 36.3146 | 5.2457 | 50.0253 | 48.6244 | 62.7818 |
| 1.0942 | 2.9999 | 31986 | 1.1111 | 2.8378 | 39.1437 | 0.7869 | 28.6359 | 0.0871 | 32.7888 | 23.2739 | 37.3564 | 12.2084 | 51.4490 | 24.1228 | 63.9610 |
| 1.0198 | 4.0 | 42649 | 1.0889 | 2.2290 | 39.9930 | 0.4666 | 29.2854 | 0.0775 | 33.2951 | 6.9817 | 38.1357 | 13.6031 | 52.3044 | 50.5211 | 64.7839 |
| 0.9859 | 4.9999 | 53310 | 1.0835 | 3.5356 | 40.2928 | 0.4342 | 29.3834 | 0.0506 | 33.5991 | 6.0172 | 38.2707 | 6.1787 | 52.6612 | 61.4777 | 65.1554 |
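
The BLEU columns can in principle be recomputed from decoded outputs with sacreBLEU; a minimal sketch for one direction, where the hypothesis and reference strings shown are purely illustrative:

```python
import sacrebleu

# Illustrative strings; real scores come from decoding the held-out eval set.
hypotheses = ["Habari za asubuhi."]     # model outputs, one string per sentence
references = [["Habari ya asubuhi."]]   # outer list = reference streams, inner = per sentence

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.4f}")
```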

### Framework versions

- Transformers 4.43.3
- PyTorch 2.3.1+cu121
- Datasets 2.14.7
- Tokenizers 0.19.1