
ko-en-m2m

This model is a fine-tuned version of facebook/m2m100_418M for Korean-to-English translation (the fine-tuning dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 0.4282
  • BLEU: 25.8137
  • Gen Len: 10.9556
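
Since the card does not include a usage example, here is a minimal inference sketch for Korean-to-English translation using the standard M2M100 API from Transformers; the repository id `your-username/ko-en-m2m` is a placeholder for this checkpoint's actual Hub id.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Placeholder repo id; substitute the actual Hub id of this checkpoint.
model_id = "your-username/ko-en-m2m"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "ko"  # source language: Korean
inputs = tokenizer("안녕하세요, 만나서 반갑습니다.", return_tensors="pt")

# Force the decoder to begin with the English language token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("en"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```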

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
  • mixed_precision_training: Native AMP
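
As a point of reference, this is a minimal sketch of how the settings above map onto Transformers' Seq2SeqTrainingArguments; the output directory is a placeholder, and the Adam betas/epsilon listed above are the library defaults, so they need no override.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ko-en-m2m",        # placeholder output directory
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    fp16=True,                     # "Native AMP" mixed-precision training
    predict_with_generate=True,    # needed so evaluation can report BLEU / Gen Len
)
```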

Training results

| Training Loss | Epoch | Step  | Validation Loss | BLEU    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 0.5891        | 0.3   | 5000  | 0.7640          | 12.7212 | 10.465  |
| 0.5653        | 0.6   | 10000 | 0.7211          | 13.4957 | 11.5844 |
| 0.5464        | 0.91  | 15000 | 0.6875          | 13.5204 | 10.6604 |
| 0.5254        | 1.21  | 20000 | 0.6690          | 14.5273 | 10.5754 |
| 0.5308        | 1.51  | 25000 | 0.6757          | 14.1623 | 11.9493 |
| 0.5192        | 1.81  | 30000 | 0.6458          | 15.1048 | 10.8811 |
| 0.502         | 2.11  | 35000 | 0.6423          | 14.7989 | 11.047  |
| 0.4971        | 2.42  | 40000 | 0.6259          | 15.6324 | 11.0428 |
| 0.502         | 2.72  | 45000 | 0.6047          | 16.684  | 10.9814 |
| 0.4544        | 3.02  | 50000 | 0.5834          | 16.9704 | 10.9722 |
| 0.4541        | 3.32  | 55000 | 0.5722          | 17.6061 | 10.8485 |
| 0.4362        | 3.63  | 60000 | 0.5523          | 19.1337 | 10.7972 |
| 0.4285        | 3.93  | 65000 | 0.5325          | 19.4546 | 10.6665 |
| 0.3851        | 4.23  | 70000 | 0.5159          | 20.4035 | 10.6171 |
| 0.3891        | 4.53  | 75000 | 0.4926          | 21.8822 | 10.8857 |
| 0.3602        | 4.83  | 80000 | 0.4740          | 22.737  | 11.0248 |
| 0.336         | 5.14  | 85000 | 0.4570          | 23.7202 | 10.7115 |
| 0.3355        | 5.44  | 90000 | 0.4415          | 24.9891 | 10.9077 |
| 0.3244        | 5.74  | 95000 | 0.4282          | 25.8137 | 10.9556 |
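
The BLEU and Gen Len columns are the kind of metrics a Seq2SeqTrainer reports through a compute_metrics hook. The exact hook used for this model is not included in the card; the following is a hedged sketch of a typical implementation with the evaluate library (the function and its post-processing are assumptions, not the author's code).

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    """Assumed metrics hook: decode predictions and references, then score
    BLEU and the mean generated length in tokens."""
    preds, labels = eval_preds
    # Labels are padded with -100; restore pad tokens so they can be decoded.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)

    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    bleu = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[ref] for ref in decoded_labels],
    )
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": bleu["score"], "gen_len": float(gen_len)}
```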

Framework versions

  • Transformers 4.25.1
  • PyTorch 1.13.0+cu116
  • Datasets 2.7.1
  • Tokenizers 0.13.2