koja_mbartLarge_55p_run2

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9303
  • Bleu: 57.3778
  • Gen Len: 16.682

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
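
The two total batch sizes listed above follow from the per-device values. A minimal sketch of that arithmetic, assuming the usual convention that the training total is per-device batch size times number of devices times gradient accumulation steps, and that evaluation uses no accumulation (the helper name `effective_batch_size` is hypothetical, not part of any library):

```python
# Values copied from the hyperparameter list above.
train_batch_size = 4             # per-device training batch size
eval_batch_size = 4              # per-device evaluation batch size
num_devices = 4
gradient_accumulation_steps = 2


def effective_batch_size(per_device, devices, accumulation_steps=1):
    """Hypothetical helper: examples consumed per optimizer step (or per eval step)."""
    return per_device * devices * accumulation_steps


# Training accumulates gradients over several forward/backward passes.
total_train_batch_size = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)
# Evaluation is assumed to run without accumulation.
total_eval_batch_size = effective_batch_size(eval_batch_size, num_devices)

print(total_train_batch_size)  # 32
print(total_eval_batch_size)   # 16
```

Both values agree with total_train_batch_size and total_eval_batch_size as reported in the list.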

Training results

Training Loss  Epoch  Step   Validation Loss  Bleu     Gen Len
1.0633         0.48   8000   1.0419           52.4575  17.4003
0.9731         0.97   16000  0.9550           55.7136  16.9686
0.7608         1.45   24000  0.9372           56.8788  16.7537
0.7213         1.93   32000  0.9303           57.4421  16.6742
0.5702         2.42   40000  0.9622           56.774   16.4703
0.5416         2.9    48000  0.9697           57.4192  16.6763
0.4226         3.38   56000  1.0399           56.5425  16.4626

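
The logged rows can be inspected programmatically to find the best checkpoint and to estimate how many optimizer steps make up one epoch. The following is a rough sketch rather than part of the original training code: the rows are copied from the table above, the Epoch column is rounded to two decimals (so the steps-per-epoch figure is approximate), and the training-set size estimate assumes each step consumes total_train_batch_size = 32 examples.

```python
from statistics import median

# (training_loss, epoch, step, validation_loss, bleu, gen_len), copied from the table.
rows = [
    (1.0633, 0.48, 8000, 1.0419, 52.4575, 17.4003),
    (0.9731, 0.97, 16000, 0.9550, 55.7136, 16.9686),
    (0.7608, 1.45, 24000, 0.9372, 56.8788, 16.7537),
    (0.7213, 1.93, 32000, 0.9303, 57.4421, 16.6742),
    (0.5702, 2.42, 40000, 0.9622, 56.774, 16.4703),
    (0.5416, 2.9, 48000, 0.9697, 57.4192, 16.6763),
    (0.4226, 3.38, 56000, 1.0399, 56.5425, 16.4626),
]

# Checkpoint with the lowest validation loss and the one with the highest BLEU.
best_by_loss = min(rows, key=lambda r: r[3])
best_by_bleu = max(rows, key=lambda r: r[4])
print("lowest validation loss at step", best_by_loss[2])
print("highest BLEU at step", best_by_bleu[2])

# Rough steps-per-epoch estimate; the median damps the rounding error in Epoch.
steps_per_epoch = median(step / epoch for _, epoch, step, *_ in rows)
print("approx. steps per epoch:", round(steps_per_epoch))

# Assumption: every optimizer step sees 32 examples (total_train_batch_size).
approx_train_examples = steps_per_epoch * 32
print("approx. training examples:", round(approx_train_examples))
```

Both criteria select step 32000, whose validation loss (0.9303) matches the loss reported at the top of this card. After that step the training loss keeps falling while the validation loss rises, which is the usual sign of overfitting.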

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1