---
license: cc-by-nc-4.0
base_model: facebook/nllb-200-distilled-600M
tags:
  - translation
  - generated_from_trainer
model-index:
  - name: nllb_600m-en-kik-kam-luo-mer-som-swh-drL-24_5-filtered-v24_28_5
    results: []
---

# nllb_600m-en-kik-kam-luo-mer-som-swh-drL-24_5-filtered-v24_28_5

This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

- Loss: 1.0835
- BLEU eng_Latn→kik_Latn: 3.5356
- BLEU kik_Latn→eng_Latn: 40.2928
- BLEU eng_Latn→kam_Latn: 0.4342
- BLEU kam_Latn→eng_Latn: 29.3834
- BLEU eng_Latn→mer_Latn: 0.0506
- BLEU mer_Latn→eng_Latn: 33.5991
- BLEU eng_Latn→luo_Latn: 6.0172
- BLEU luo_Latn→eng_Latn: 38.2707
- BLEU eng_Latn→som_Latn: 6.1787
- BLEU som_Latn→eng_Latn: 52.6612
- BLEU eng_Latn→swh_Latn: 61.4777
- BLEU swh_Latn→eng_Latn: 65.1554
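
A minimal usage sketch, assuming the checkpoint is published under the `mutisya` namespace (the repo id below is an assumption; adjust it to the actual location) and that the fine-tune keeps NLLB's language codes (`eng_Latn`, `kik_Latn`, `kam_Latn`, `luo_Latn`, `mer_Latn`, `som_Latn`, `swh_Latn`):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repo id; replace with wherever this checkpoint is actually hosted.
model_id = "mutisya/nllb_600m-en-kik-kam-luo-mer-som-swh-drL-24_5-filtered-v24_28_5"

# src_lang tells the NLLB tokenizer which language tag to prepend to the input.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Good morning, how are you?", return_tensors="pt")

# Force the decoder to start with the target-language tag (Swahili here).
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("swh_Latn"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

To translate into a different target language, swap `"swh_Latn"` for another of the codes above; to translate out of one of the African languages, set `src_lang` accordingly.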

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
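
For reference, a sketch of an equivalent `Seq2SeqTrainingArguments` configuration in Transformers 4.43 (the `output_dir` is a placeholder; the Adam betas and epsilon listed above are the optimizer defaults, so they need no explicit arguments):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb_600m-finetune",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,    # 8 per device x 8 steps = effective batch of 64
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                        # Native AMP mixed precision
)
```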

### Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU eng_Latn→kik_Latn | BLEU kik_Latn→eng_Latn | BLEU eng_Latn→kam_Latn | BLEU kam_Latn→eng_Latn | BLEU eng_Latn→mer_Latn | BLEU mer_Latn→eng_Latn | BLEU eng_Latn→luo_Latn | BLEU luo_Latn→eng_Latn | BLEU eng_Latn→som_Latn | BLEU som_Latn→eng_Latn | BLEU eng_Latn→swh_Latn | BLEU swh_Latn→eng_Latn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.3341 | 1.0000 | 10662 | 1.2351 | 0.6472 | 36.0303 | 0.2230 | 26.3692 | 0.0361 | 29.3951 | 1.6063 | 35.0198 | 7.2785 | 48.1600 | 63.1216 | 61.0783 |
| 1.1816 | 2.0000 | 21324 | 1.1546 | 4.0887 | 37.8187 | 0.5897 | 27.7776 | 0.0928 | 31.5033 | 13.8255 | 36.3146 | 5.2457 | 50.0253 | 48.6244 | 62.7818 |
| 1.0942 | 2.9999 | 31986 | 1.1111 | 2.8378 | 39.1437 | 0.7869 | 28.6359 | 0.0871 | 32.7888 | 23.2739 | 37.3564 | 12.2084 | 51.4490 | 24.1228 | 63.9610 |
| 1.0198 | 4.0 | 42649 | 1.0889 | 2.2290 | 39.9930 | 0.4666 | 29.2854 | 0.0775 | 33.2951 | 6.9817 | 38.1357 | 13.6031 | 52.3044 | 50.5211 | 64.7839 |
| 0.9859 | 4.9999 | 53310 | 1.0835 | 3.5356 | 40.2928 | 0.4342 | 29.3834 | 0.0506 | 33.5991 | 6.0172 | 38.2707 | 6.1787 | 52.6612 | 61.4777 | 65.1554 |
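
The BLEU columns can in principle be recomputed from decoded outputs with sacreBLEU; a minimal sketch for one direction, where the hypothesis and reference strings shown are purely illustrative:

```python
import sacrebleu

# Illustrative strings; real scores come from decoding the held-out eval set.
hypotheses = ["Habari za asubuhi."]     # model outputs, one string per sentence
references = [["Habari ya asubuhi."]]   # outer list = reference streams, inner = per sentence

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.4f}")
```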

### Framework versions

- Transformers 4.43.3
- PyTorch 2.3.1+cu121
- Datasets 2.14.7
- Tokenizers 0.19.1