enko_mbartLarge_36p_tokenize_run1

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1249
  • Bleu: 38.8566
  • Gen Len: 16.4716

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
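
The total batch sizes above follow from the per-device values, and the learning rate follows a linear schedule with warmup. The sketch below reproduces that arithmetic in plain Python. The function names are illustrative, and `total_steps=100_000` is a hypothetical placeholder, since the card does not state the planned number of optimizer steps. The schedule shape follows the usual linear warmup then linear decay to zero; it is an approximation of what the Trainer does, not a copy of its code.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Number of examples that contribute to one optimizer (or eval) step."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step: int, base_lr: float = 5e-5, warmup_steps: int = 500,
              total_steps: int = 100_000) -> float:
    """Learning rate at a given optimizer step.

    total_steps is a HYPOTHETICAL value; the card does not report it.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values taken from the hyperparameter list.
train_total = effective_batch_size(per_device=4, num_devices=4, grad_accum_steps=2)
eval_total = effective_batch_size(per_device=4, num_devices=4)  # no accumulation at eval

print("total_train_batch_size:", train_total)
print("total_eval_batch_size:", eval_total)
print("lr at step 250:", linear_lr(250))
print("lr at step 500:", linear_lr(500))
```

With the listed values this gives 4 × 4 × 2 = 32 for training and 4 × 4 = 16 for evaluation, matching the reported totals.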

Training results

Training Loss   Epoch   Step    Validation Loss   Bleu      Gen Len
1.3157          0.46    5000    1.2895            34.4176   16.4931
1.2575          0.93    10000   1.2279            35.0029   16.8009
1.1578          1.39    15000   1.1733            36.9282   16.5838
1.0885          1.86    20000   1.1464            37.6913   16.6649
1.0451          2.32    25000   1.1437            37.7875   16.5188
1.0465          2.79    30000   1.1425            37.895    16.4987
1.0156          3.25    35000   1.1464            37.8434   16.5515
0.9893          3.72    40000   1.1544            37.358    16.6096
0.8779          4.18    45000   1.1419            38.1772   16.457
0.8565          4.65    50000   1.1249            38.8455   16.4749
0.7293          5.11    55000   1.1566            38.4853   16.3462
0.7294          5.57    60000   1.1824            37.8822   16.3295
0.7254          6.04    65000   1.2153            37.3612   16.381
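
The training results can be read programmatically to find the strongest checkpoint. The sketch below copies the rows from the table, then picks the step with the lowest validation loss and the step with the highest BLEU. It also makes a rough estimate of optimizer steps per epoch from the last row. The variable and field names are illustrative, and the steps-per-epoch figure is only approximate because the epoch column is rounded to two decimals.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, bleu, gen_len)
ROWS = [
    (1.3157, 0.46, 5000, 1.2895, 34.4176, 16.4931),
    (1.2575, 0.93, 10000, 1.2279, 35.0029, 16.8009),
    (1.1578, 1.39, 15000, 1.1733, 36.9282, 16.5838),
    (1.0885, 1.86, 20000, 1.1464, 37.6913, 16.6649),
    (1.0451, 2.32, 25000, 1.1437, 37.7875, 16.5188),
    (1.0465, 2.79, 30000, 1.1425, 37.8950, 16.4987),
    (1.0156, 3.25, 35000, 1.1464, 37.8434, 16.5515),
    (0.9893, 3.72, 40000, 1.1544, 37.3580, 16.6096),
    (0.8779, 4.18, 45000, 1.1419, 38.1772, 16.4570),
    (0.8565, 4.65, 50000, 1.1249, 38.8455, 16.4749),
    (0.7293, 5.11, 55000, 1.1566, 38.4853, 16.3462),
    (0.7294, 5.57, 60000, 1.1824, 37.8822, 16.3295),
    (0.7254, 6.04, 65000, 1.2153, 37.3612, 16.3810),
]

FIELDS = ("train_loss", "epoch", "step", "val_loss", "bleu", "gen_len")
records = [dict(zip(FIELDS, row)) for row in ROWS]

# Checkpoint with the lowest validation loss and the one with the highest BLEU.
best_by_loss = min(records, key=lambda r: r["val_loss"])
best_by_bleu = max(records, key=lambda r: r["bleu"])

# Rough estimate of optimizer steps per epoch from the final logged row.
last = records[-1]
steps_per_epoch = last["step"] / last["epoch"]

print("lowest validation loss:", best_by_loss["val_loss"], "at step", best_by_loss["step"])
print("highest BLEU:", best_by_bleu["bleu"], "at step", best_by_bleu["step"])
print("approx. steps per epoch:", round(steps_per_epoch))
```

Both criteria select step 50000, whose validation loss of 1.1249 matches the loss reported at the top of the card. After that step the training loss keeps falling while the validation loss rises, which is the usual sign of overfitting.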

Framework versions

  • Transformers 4.34.1
  • PyTorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1