mBART_translator_json_2

This model is a fine-tuned version of facebook/mbart-large-cc25 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1203
  • Bleu: 77.8658
  • Gen Len: 36.1527
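No usage instructions are provided, but a checkpoint like this can normally be loaded with the standard transformers mBART API. A minimal inference sketch follows; the Hub repo id and the source/target language codes are placeholders, since neither the hosting path nor the language pair is documented here:

```python
from transformers import MBartForConditionalGeneration, MBartTokenizer

# Hypothetical Hub id; replace with the actual repo path of this checkpoint.
model_id = "your-username/mBART_translator_json_2"

# src_lang/tgt_lang are assumptions; the card does not say which language
# pair (or JSON format) the model was fine-tuned on.
tokenizer = MBartTokenizer.from_pretrained(model_id, src_lang="en_XX", tgt_lang="ro_RO")
model = MBartForConditionalGeneration.from_pretrained(model_id)

inputs = tokenizer("Example input text", return_tensors="pt")
generated = model.generate(
    **inputs,
    # mbart-large-cc25 derivatives start decoding from the target-language token.
    decoder_start_token_id=tokenizer.lang_code_to_id["ro_RO"],
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```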

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
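These settings map directly onto transformers' Seq2SeqTrainingArguments. A sketch of the equivalent configuration, where output_dir and the per-epoch evaluation strategy are assumptions (the per-epoch metrics below suggest evaluation ran once per epoch):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mBART_translator_json_2",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                    # Native AMP mixed precision
    evaluation_strategy="epoch",  # assumption, inferred from per-epoch logs
    predict_with_generate=True,   # required to compute Bleu/Gen Len
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default
    # optimizer, so it needs no explicit flags.
)
```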

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 1.7858        | 1.0   | 1912  | 0.6568          | 55.2937 | 75.6389 |
| 0.994         | 2.0   | 3824  | 0.4015          | 71.3655 | 35.744  |
| 0.7267        | 3.0   | 5736  | 0.2971          | 66.7522 | 34.5473 |
| 0.5916        | 4.0   | 7648  | 0.2437          | 80.0233 | 37.4331 |
| 0.502         | 5.0   | 9560  | 0.2072          | 80.9632 | 36.9833 |
| 0.433         | 6.0   | 11472 | 0.1767          | 69.9384 | 36.6381 |
| 0.3581        | 7.0   | 13384 | 0.1566          | 64.615  | 34.8954 |
| 0.3244        | 8.0   | 15296 | 0.1382          | 77.5563 | 36.1736 |
| 0.2815        | 9.0   | 17208 | 0.1259          | 76.1662 | 36.1548 |
| 0.2555        | 10.0  | 19120 | 0.1203          | 77.8658 | 36.1527 |
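Bleu here is the sacrebleu corpus score and Gen Len the mean length of the generated sequences. A sketch of a compute_metrics function that would produce both, modeled on the standard transformers translation example rather than the author's undocumented training script (the repo id is again a placeholder):

```python
import numpy as np
import evaluate  # sacrebleu backend; the evaluate package is not among the versions listed below

from transformers import MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained("your-username/mBART_translator_json_2")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    """Decode generated ids and label ids, then score predictions."""
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Labels are padded with -100 by the data collator; restore pad ids first.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean count of non-pad tokens in each generated sequence.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```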

Framework versions

  • Transformers 4.23.1
  • Pytorch 1.12.1+cu113
  • Datasets 2.5.2
  • Tokenizers 0.13.1