
mbartLarge_koja_mid2_run1

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1251
  • Bleu: 30.7351
  • Gen Len: 18.1559

Model description

More information needed

Intended uses & limitations

More information needed
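The base checkpoint and the "koja" in the model name suggest Korean-to-Japanese translation. Below is a minimal inference sketch using the standard mBART-50 API; the repo id, the language codes (`ko_KR`, `ja_XX`), and the translation direction are assumptions inferred from the name, not confirmed by this card.

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "mbartLarge_koja_mid2_run1"  # assumed repo id; adjust to the actual hub path
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

# mBART-50 steers translation with language codes; ko -> ja is inferred
# from "koja" in the model name and may need to be swapped.
tokenizer.src_lang = "ko_KR"
inputs = tokenizer("안녕하세요, 만나서 반갑습니다.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["ja_XX"],
    max_length=48,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```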

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
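For reference, here is a minimal sketch of how these settings map onto `Seq2SeqTrainingArguments`; the output directory, `predict_with_generate`, and the per-epoch evaluation strategy are assumptions, and the batch-size totals above follow from 4 per-device examples × 4 GPUs × 2 accumulation steps under a multi-GPU launcher such as `torchrun`.

```python
from transformers import Seq2SeqTrainingArguments

# Launched across 4 GPUs (e.g. with torchrun), these per-device settings
# reproduce the totals above: train 4 * 4 * 2 = 32, eval 4 * 4 = 16.
training_args = Seq2SeqTrainingArguments(
    output_dir="mbartLarge_koja_mid2_run1",  # illustrative path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,   # needed to report BLEU / Gen Len at eval time
    evaluation_strategy="epoch",  # assumption, consistent with the per-epoch table below
)
```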

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 1.1823        | 1.0   | 11354 | 1.1695          | 29.4501 | 18.8118 |
| 0.9207        | 2.0   | 22708 | 1.1251          | 30.842  | 18.0892 |
| 0.7127        | 3.0   | 34062 | 1.1687          | 31.2642 | 18.1188 |
| 0.5406        | 4.0   | 45416 | 1.2619          | 30.9531 | 17.9958 |
| 0.4027        | 5.0   | 56770 | 1.3789          | 30.7923 | 18.0582 |
| 0.286         | 6.0   | 68124 | 1.4784          | 30.9393 | 18.1183 |
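Validation loss bottoms out at epoch 2 and rises thereafter while training loss keeps falling, a typical overfitting pattern; the headline loss of 1.1251 matches the epoch-2 checkpoint, and only 6 of the configured 10 epochs are recorded, suggesting early stopping. For reference, BLEU and Gen Len in tables like this are typically produced by a `compute_metrics` hook along these lines; the use of sacreBLEU via the `evaluate` library is an assumption, since the card does not say which scorer was used.

```python
import numpy as np
import evaluate

bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    # `tokenizer` is the MBart50TokenizerFast loaded in the inference sketch above.
    preds, labels = eval_preds
    # Replace the -100 padding used for labels before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    # Gen Len: mean count of non-pad tokens in the generated sequences.
    gen_lens = [np.count_nonzero(np.array(p) != tokenizer.pad_token_id) for p in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```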

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1
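If the results need to be reproduced exactly, pinning these versions is the safest route; a quick sanity check for the environment:

```python
# Sanity-check that the installed versions match the ones listed above.
import transformers, torch, datasets, tokenizers

assert transformers.__version__.startswith("4.34"), transformers.__version__
assert torch.__version__.startswith("2.1.0"), torch.__version__
assert datasets.__version__.startswith("2.14"), datasets.__version__
assert tokenizers.__version__.startswith("0.14"), tokenizers.__version__
```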