metadata
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: mBART_translator_json_sentence_split
    results: []

mBART_translator_json_sentence_split

This model is a fine-tuned version of facebook/mbart-large-cc25. The training dataset is not specified. It achieves the following results on the evaluation set at the final checkpoint (epoch 5, step 14890):

  • Loss: 0.0769
  • Bleu: 87.2405
  • Gen Len: 27.425

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
  • mixed_precision_training: Native AMP
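
As a rough illustration of what these settings imply, the sketch below restates them as a plain dictionary and computes the learning rate that a linear schedule would produce at a given step. The key names follow the parameters of `transformers.Seq2SeqTrainingArguments`, but that mapping is an assumption, as are the single-device setup and the zero warmup steps (the card lists no warmup value). The total step count of 14890 is taken from the training results. The `linear_lr` helper is hypothetical and only mirrors the usual "linear warmup, then linear decay to zero" rule.

```python
# Hyperparameters copied from the list above. The key names follow
# transformers.Seq2SeqTrainingArguments; that mapping is an assumption.
HYPERPARAMS = {
    "learning_rate": 1e-05,
    "per_device_train_batch_size": 8,  # assumes training on a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
    "fp16": True,  # "Native AMP" mixed precision
}

TOTAL_STEPS = 14890  # final step reported in the training results
WARMUP_STEPS = 0     # assumption: the card does not list a warmup value


def linear_lr(step, base_lr=HYPERPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the start, after one epoch,
    # at the midpoint, and at the final step.
    for step in (0, 2978, 7445, 14890):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 1e-05, is halved at the midpoint of training, and reaches zero at the final step.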

Training results

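| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 2.0011        | 1.0   | 2978  | 0.5458          | 63.8087 | 32.3819 |
| 1.1978        | 2.0   | 5956  | 0.1854          | 76.5291 | 27.6781 |
| 0.9276        | 3.0   | 8934  | 0.1123          | 84.7194 | 27.5773 |
| 0.776         | 4.0   | 11912 | 0.0845          | 87.505  | 27.2845 |
| 0.6889        | 5.0   | 14890 | 0.0769          | 87.2405 | 27.425  |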
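
The results above can be inspected with a few lines of Python. The sketch below transcribes the rows, finds the epochs with the highest BLEU and the lowest validation loss, and estimates an upper bound on the training set size from the steps per epoch. The size estimate assumes one optimizer step per batch of 8 on a single device, with no gradient accumulation; the card does not confirm this. The helper names are hypothetical.

```python
# Rows transcribed from the training results:
# (training_loss, epoch, step, validation_loss, bleu, gen_len)
ROWS = [
    (2.0011, 1.0, 2978, 0.5458, 63.8087, 32.3819),
    (1.1978, 2.0, 5956, 0.1854, 76.5291, 27.6781),
    (0.9276, 3.0, 8934, 0.1123, 84.7194, 27.5773),
    (0.776, 4.0, 11912, 0.0845, 87.505, 27.2845),
    (0.6889, 5.0, 14890, 0.0769, 87.2405, 27.425),
]

TRAIN_BATCH_SIZE = 8  # from the hyperparameter list


def best_bleu_row(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda r: r[4])


def lowest_loss_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def max_train_examples(rows, batch_size):
    """Upper bound on examples per epoch: steps in epoch 1 times batch size.

    Assumes one optimizer step per batch (no gradient accumulation,
    single device). The last batch may be partial, so the true count
    can be up to batch_size - 1 lower.
    """
    steps_per_epoch = rows[0][2]
    return steps_per_epoch * batch_size


if __name__ == "__main__":
    print("best BLEU at epoch", best_bleu_row(ROWS)[1])
    print("lowest validation loss at epoch", lowest_loss_row(ROWS)[1])
    print("at most", max_train_examples(ROWS, TRAIN_BATCH_SIZE),
          "training examples per epoch")
```

BLEU peaks at epoch 4 (87.505) and dips slightly at epoch 5 (87.2405), while validation loss keeps falling through epoch 5. The headline numbers reported for the evaluation set correspond to the final epoch, not to the best-BLEU checkpoint.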

Framework versions

  • Transformers 4.23.1
  • Pytorch 1.12.1+cu113
  • Datasets 2.6.1
  • Tokenizers 0.13.1