nllb-200-1.3B-ft-cym-to-eng

This model is a fine-tuned version of facebook/nllb-200-1.3B for Welsh-to-English (cym→eng) translation, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5352
  • Bleu: 47.7998
  • Gen Len: 21.638
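Since the card does not include usage code, here is a minimal inference sketch assuming the standard NLLB workflow in transformers. The FLORES-200 language codes cym_Latn (Welsh) and eng_Latn (English) follow the base model's convention, and the example sentence is illustrative only.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "DewiBrynJones/nllb-200-1.3B-ft-cym-to-eng"

# NLLB models use FLORES-200 language codes: cym_Latn (Welsh), eng_Latn (English).
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="cym_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Mae'r tywydd yn braf heddiw."  # Illustrative input: "The weather is nice today."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the English language token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```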

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
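For reference, here is a sketch of how the hyperparameters above might map onto Seq2SeqTrainingArguments. The output_dir is a placeholder, the batch sizes are assumed to be per-device on a single device, and the actual training script and data pipeline are not documented in this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-1.3B-ft-cym-to-eng",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=32,  # assuming a single device
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",         # evaluation every 2000 steps, per the table below
    eval_steps=2000,
    predict_with_generate=True,    # required so BLEU and Gen Len can be computed
)
```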

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 0.863         | 0.0455 | 2000  | 0.7542          | 37.4638 | 27.7533 |
| 0.7963        | 0.0910 | 4000  | 0.6780          | 39.4827 | 28.2858 |
| 0.7547        | 0.1365 | 6000  | 0.6339          | 42.4413 | 24.1947 |
| 0.725         | 0.1820 | 8000  | 0.6053          | 39.7103 | 24.1497 |
| 0.6904        | 0.2275 | 10000 | 0.5866          | 43.7372 | 23.2546 |
| 0.6841        | 0.2730 | 12000 | 0.5748          | 46.6501 | 21.7253 |
| 0.6633        | 0.3185 | 14000 | 0.5652          | 47.3222 | 21.5977 |
| 0.6608        | 0.3640 | 16000 | 0.5570          | 45.4937 | 23.0514 |
| 0.6582        | 0.4094 | 18000 | 0.5518          | 47.0155 | 22.1634 |
| 0.656         | 0.4549 | 20000 | 0.5471          | 47.7442 | 21.8685 |
| 0.6512        | 0.5004 | 22000 | 0.5429          | 47.6474 | 22.071  |
| 0.6373        | 0.5459 | 24000 | 0.5401          | 46.9893 | 22.4271 |
| 0.6389        | 0.5914 | 26000 | 0.5374          | 47.6756 | 21.6087 |
| 0.6437        | 0.6369 | 28000 | 0.5359          | 47.2965 | 21.9883 |
| 0.6359        | 0.6824 | 30000 | 0.5352          | 47.7998 | 21.638  |
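The Bleu and Gen Len columns are typically produced by a compute_metrics function passed to the trainer. The exact metric code is not documented in this card, so the following is only a sketch of the standard translation pattern, using the evaluate library's sacrebleu metric and the base NLLB tokenizer.

```python
import evaluate
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-1.3B", src_lang="cym_Latn")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)

    # Replace the -100 padding used for labels before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-padding tokens in the generated sequences.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```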

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0