m2m100_418M-ft-en-to-cy

This model is a fine-tuned version of facebook/m2m100_418M for English-to-Welsh (en-to-cy) translation. The training dataset is not documented. It achieves the following results on the evaluation set:

  • Loss: 0.7277
  • Bleu: 48.1019
  • Gen Len: 35.226
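Since M2M100 is a multilingual model, generation must be forced to start with the target-language token. A minimal inference sketch (the repo id is taken from the model tree; the example sentence is an arbitrary assumption):

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_name = "DewiBrynJones/m2m100_418M-ft-en-to-cy"
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

# Tell the tokenizer the source language, then force Welsh ("cy") as the
# first generated token so the model decodes into the right language.
tokenizer.src_lang = "en"
inputs = tokenizer("Good morning, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs, forced_bos_token_id=tokenizer.get_lang_id("cy")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Without `forced_bos_token_id`, M2M100 may decode into an unintended language, so this argument is required even for a model fine-tuned on a single language pair.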

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
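The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a sketch, not the author's actual training script: `output_dir`, the eval cadence (every 2000 steps, inferred from the results table), and `predict_with_generate` (needed to compute BLEU and Gen Len) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported hyperparameters; output_dir, eval cadence,
# and predict_with_generate are assumptions, not documented settings.
training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-ft-en-to-cy",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",       # evaluation every 2000 steps, per the table
    eval_steps=2000,
    predict_with_generate=True,  # required for BLEU / Gen Len metrics
)
```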

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 3.5951        | 0.0166 | 2000  | 3.0960          | 8.4556  | 41.3461 |
| 2.329         | 0.0332 | 4000  | 1.9313          | 19.282  | 36.0086 |
| 1.664         | 0.0499 | 6000  | 1.3746          | 29.2697 | 36.1992 |
| 1.3506        | 0.0665 | 8000  | 1.1272          | 35.5321 | 36.6889 |
| 1.1703        | 0.0831 | 10000 | 0.9973          | 38.5762 | 35.3101 |
| 1.0818        | 0.0997 | 12000 | 0.9244          | 41.288  | 35.508  |
| 1.0269        | 0.1164 | 14000 | 0.8714          | 42.9136 | 35.4216 |
| 0.9739        | 0.1330 | 16000 | 0.8337          | 44.36   | 37.1801 |
| 0.9373        | 0.1496 | 18000 | 0.8046          | 44.6704 | 35.1619 |
| 0.9144        | 0.1662 | 20000 | 0.7843          | 45.8625 | 35.598  |
| 0.8825        | 0.1829 | 22000 | 0.7637          | 46.5404 | 35.2287 |
| 0.8773        | 0.1995 | 24000 | 0.7493          | 47.0536 | 35.0837 |
| 0.8497        | 0.2161 | 26000 | 0.7383          | 47.8017 | 35.0227 |
| 0.8571        | 0.2327 | 28000 | 0.7302          | 47.8125 | 35.2915 |
| 0.8568        | 0.2494 | 30000 | 0.7277          | 48.1019 | 35.226  |
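A quick sanity check over the trajectory above (values copied from the table) shows that validation loss falls and BLEU rises at every evaluation step, with the gains flattening sharply by step 30000:

```python
# (step, validation loss, BLEU) triples copied from the table above.
results = [
    (2000, 3.0960, 8.4556), (4000, 1.9313, 19.2820), (6000, 1.3746, 29.2697),
    (8000, 1.1272, 35.5321), (10000, 0.9973, 38.5762), (12000, 0.9244, 41.2880),
    (14000, 0.8714, 42.9136), (16000, 0.8337, 44.3600), (18000, 0.8046, 44.6704),
    (20000, 0.7843, 45.8625), (22000, 0.7637, 46.5404), (24000, 0.7493, 47.0536),
    (26000, 0.7383, 47.8017), (28000, 0.7302, 47.8125), (30000, 0.7277, 48.1019),
]

losses = [loss for _, loss, _ in results]
bleus = [bleu for _, _, bleu in results]

# Loss is strictly decreasing and BLEU strictly increasing at every eval.
assert all(a > b for a, b in zip(losses, losses[1:]))
assert all(a < b for a, b in zip(bleus, bleus[1:]))

# But improvement per 2000-step interval has nearly flattened.
first_gain = bleus[1] - bleus[0]
last_gain = bleus[-1] - bleus[-2]
print(f"first interval: +{first_gain:.2f} BLEU, last interval: +{last_gain:.2f} BLEU")
```

The first 2000-step interval gains over 10 BLEU while the last gains under 0.3, which suggests training had largely converged by the 30000-step budget.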

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0