m2m100_1.2B-ft-cy-to-en

This model is a fine-tuned version of facebook/m2m100_1.2B for Welsh-to-English (cy-to-en) translation, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6629
  • BLEU: 45.7742
  • Gen Len: 27.7876

Model description

More information needed

Intended uses & limitations

More information needed
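
The card does not document usage, but the model name indicates Welsh-to-English (cy-to-en) machine translation. Below is a minimal inference sketch using the standard Transformers M2M100 API; the example sentence is illustrative and not taken from the card:

```python
import torch
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "DewiBrynJones/m2m100_1.2B-ft-cy-to-en"

tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "cy"  # Welsh source, per the model name
text = "Mae hi'n braf heddiw."  # illustrative input: "It is fine today."

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.get_lang_id("en"),  # force English output
        max_new_tokens=64,
    )
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```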

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows this list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
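
A sketch of how these values map onto Seq2SeqTrainingArguments in Transformers. The training script itself is not published with the card, so output_dir and predict_with_generate below are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="m2m100_1.2B-ft-cy-to-en",  # assumed output directory
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    predict_with_generate=True,  # assumed: needed to report BLEU/Gen Len at eval time
)
```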

Training results

| Training Loss | Epoch  | Step  | Validation Loss | BLEU    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 1.4169        | 0.0166 | 2000  | 1.2147          | 30.0616 | 29.7576 |
| 1.1518        | 0.0332 | 4000  | 0.9864          | 37.829  | 26.8354 |
| 1.0168        | 0.0499 | 6000  | 0.8875          | 42.0389 | 25.8031 |
| 0.941         | 0.0665 | 8000  | 0.8239          | 43.8126 | 26.3229 |
| 0.8777        | 0.0831 | 10000 | 0.7891          | 43.8364 | 26.8204 |
| 0.8453        | 0.0997 | 12000 | 0.7567          | 43.9052 | 26.9718 |
| 0.8239        | 0.1164 | 14000 | 0.7393          | 44.1267 | 27.1687 |
| 0.8114        | 0.1330 | 16000 | 0.7199          | 44.7922 | 27.5693 |
| 0.7949        | 0.1496 | 18000 | 0.7043          | 46.0559 | 27.3151 |
| 0.7782        | 0.1662 | 20000 | 0.6946          | 46.1132 | 27.7021 |
| 0.7684        | 0.1829 | 22000 | 0.6841          | 46.3332 | 27.4188 |
| 0.7576        | 0.1995 | 24000 | 0.6762          | 45.9629 | 27.5139 |
| 0.7443        | 0.2161 | 26000 | 0.6694          | 46.5899 | 27.2528 |
| 0.7501        | 0.2327 | 28000 | 0.6650          | 45.9993 | 27.5966 |
| 0.7501        | 0.2494 | 30000 | 0.6629          | 45.7742 | 27.7876 |
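
The BLEU column was most likely produced with sacrebleu via the evaluate library, as in the standard Transformers translation examples; this is an assumption, since the card does not name the metric implementation. A minimal sketch:

```python
import evaluate

# Illustrative strings only; not from the model's evaluation set.
predictions = ["It is fine today."]
references = [["It is fine today."]]

bleu = evaluate.load("sacrebleu")  # assumed metric backend
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```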

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0