m2m100_1.2B-ft-en-to-cy

This model is a fine-tuned version of facebook/m2m100_1.2B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5864
  • Bleu: 54.8016
  • Gen Len: 33.9191
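
The card does not include a usage example. Below is a minimal sketch using the standard M2M100 generation API, assuming (from the model name) that the direction is English (`en`) to Welsh (`cy`); the example sentence is illustrative only:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "DewiBrynJones/m2m100_1.2B-ft-en-to-cy"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# M2M100 is multilingual: set the source language on the tokenizer and
# force the target language at generation time.
tokenizer.src_lang = "en"
inputs = tokenizer("The weather is lovely today.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("cy"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```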

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
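
These hyperparameters could be expressed as `Seq2SeqTrainingArguments`. This is a sketch, not the original training script: `output_dir` is a placeholder, and the evaluation cadence is inferred from the 2000-step intervals in the results table below.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_1.2B-ft-en-to-cy",  # placeholder, not from the original run
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",
    eval_steps=2000,  # inferred from the results table, not stated in the card
    predict_with_generate=True,
)
```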

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 2.3252        | 0.0166 | 2000  | 1.9613          | 21.0612 | 35.9022 |
| 1.5651        | 0.0332 | 4000  | 1.2919          | 34.3962 | 34.8431 |
| 1.1755        | 0.0499 | 6000  | 0.9977          | 42.2725 | 34.4593 |
| 0.9801        | 0.0665 | 8000  | 0.8545          | 46.4396 | 33.9573 |
| 0.8697        | 0.0831 | 10000 | 0.7763          | 48.9327 | 34.0146 |
| 0.813         | 0.0997 | 12000 | 0.7224          | 50.2154 | 33.8613 |
| 0.779         | 0.1164 | 14000 | 0.6911          | 51.4013 | 33.9477 |
| 0.7436        | 0.1330 | 16000 | 0.6648          | 52.2204 | 33.9345 |
| 0.7224        | 0.1496 | 18000 | 0.6437          | 52.9165 | 33.9964 |
| 0.7034        | 0.1662 | 20000 | 0.6279          | 53.6142 | 33.9663 |
| 0.6783        | 0.1829 | 22000 | 0.6134          | 53.7386 | 33.9527 |
| 0.6765        | 0.1995 | 24000 | 0.6029          | 54.4546 | 33.955  |
| 0.656         | 0.2161 | 26000 | 0.5941          | 54.5817 | 33.9145 |
| 0.6522        | 0.2327 | 28000 | 0.5884          | 54.728  | 33.9163 |
| 0.6562        | 0.2494 | 30000 | 0.5864          | 54.8016 | 33.9191 |
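
The Bleu and Gen Len columns are consistent with a sacreBLEU score and the mean generated sequence length. The card does not publish the metric code, so the following is an assumed sketch of a `Seq2SeqTrainer`-style `compute_metrics` function using the `evaluate` library:

```python
import numpy as np
import evaluate
from transformers import M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_1.2B")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    """Sketch of a metric function for generation eval (assumed, not the original)."""
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Labels use -100 for padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Mean length of generated sequences, excluding padding.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```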

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1