
mbart-large-50-en-es-translation-lr-1e-05-weight-decay-0.1

This model is a fine-tuned version of facebook/mbart-large-50 for English-to-Spanish translation; the fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set:

  • Loss: 0.9532
  • BLEU: 45.1551
  • ROUGE-1: 0.7071
  • ROUGE-2: 0.5241
  • ROUGE-L: 0.6865
  • ROUGE-Lsum: 0.6868

Model description

mBART-50 (facebook/mbart-large-50) is a multilingual sequence-to-sequence Transformer pretrained on 50 languages. This checkpoint fine-tunes it for English-to-Spanish (en→es) translation using a learning rate of 1e-05 and a weight decay of 0.1, as reflected in the model name.

Intended uses & limitations

This model is intended for translating English text into Spanish. As with any fine-tuned translation model, quality may degrade on domains that differ from the (unspecified) training data, and outputs should be reviewed before use in sensitive settings.
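A minimal inference sketch using the standard mBART-50 Transformers API, assuming this checkpoint keeps the stock mBART-50 language codes (en_XX as source, es_XX as target):

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_id = "DrishtiSharma/mbart-large-50-en-es-translation-lr-1e-05-weight-decay-0.1"
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

# mBART-50 requires explicit language codes: set the source language and
# force the decoder to start with the Spanish language token.
tokenizer.src_lang = "en_XX"
inputs = tokenizer("Machine translation is improving quickly.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["es_XX"],
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```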

Training and evaluation data

More information needed. (The step counts in the training results below, 4,500 steps per epoch at a train batch size of 8, imply roughly 36,000 training pairs.)

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction in code follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
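
Below is a sketch of this setup as Seq2SeqTrainingArguments. It is a reconstruction, not the published training script: weight_decay=0.1 is inferred from the model name rather than the list above, and evaluation_strategy and predict_with_generate are assumptions needed to produce the per-epoch BLEU/ROUGE numbers reported below.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the listed hyperparameters.
training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-50-en-es-translation-lr-1e-05-weight-decay-0.1",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    weight_decay=0.1,             # from the model name; not in the list above
    num_train_epochs=4,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",  # assumption: metrics reported once per epoch
    predict_with_generate=True,   # generate translations during eval for BLEU/ROUGE
)
```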

Training results

| Training Loss | Epoch | Step  | Validation Loss | BLEU    | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|---------------|-------|-------|-----------------|---------|---------|---------|---------|------------|
| 1.4485        | 1.0   | 4500  | 1.0236          | 42.1586 | 0.6728  | 0.4866  | 0.6508  | 0.6508     |
| 0.8867        | 2.0   | 9000  | 0.9542          | 44.1945 | 0.6933  | 0.5091  | 0.6722  | 0.6724     |
| 0.7112        | 3.0   | 13500 | 0.9408          | 44.9173 | 0.7048  | 0.5200  | 0.6839  | 0.6842     |
| 0.6075        | 4.0   | 18000 | 0.9532          | 45.2020 | 0.7070  | 0.5239  | 0.6863  | 0.6867     |
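
For reference, a minimal sketch of how BLEU and ROUGE figures like those above can be recomputed. This assumes the standard sacreBLEU and ROUGE implementations from the evaluate library were used; the exact metric script behind this card is not published.

```python
import evaluate

bleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")

predictions = ["El clima es agradable hoy."]  # decoded model outputs (illustrative)
references = [["Hoy hace buen tiempo."]]      # gold Spanish translations

# sacreBLEU takes one list of references per prediction; ROUGE takes flat strings.
print(bleu.compute(predictions=predictions, references=references)["score"])
print(rouge.compute(predictions=predictions,
                    references=[r[0] for r in references]))
```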

Framework versions

  • Transformers 4.33.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.4.dev0
  • Tokenizers 0.13.3