tags:
  - generated_from_trainer
base_model: jq/nllb-1.3B-many-to-many-step-2k
datasets:
  - generator
model-index:
  - name: nllb-1.3B-many-to-many-pronouncorrection-charaug
    results: []

nllb-1.3B-many-to-many-pronouncorrection-charaug

This model is a fine-tuned version of jq/nllb-1.3B-many-to-many-step-2k on the generator dataset. It achieves the following results on the evaluation set, taken from the last logged evaluation (step 700):

  • Loss: 1.2075
  • Bleu Ach Eng: 28.371
  • Bleu Lgg Eng: 30.45
  • Bleu Lug Eng: 41.978
  • Bleu Nyn Eng: 32.296
  • Bleu Teo Eng: 30.422
  • Bleu Eng Ach: 20.972
  • Bleu Eng Lgg: 22.362
  • Bleu Eng Lug: 30.359
  • Bleu Eng Nyn: 15.305
  • Bleu Eng Teo: 21.391
  • Bleu Mean: 27.391
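The Bleu Mean figure is consistent with an unweighted average of the ten per-direction scores listed above. The short sketch below reproduces that arithmetic and also splits the average by direction; the variable names are illustrative and not part of the training code.

```python
# Per-direction BLEU scores copied from the evaluation results above.
to_english = {"ach": 28.371, "lgg": 30.45, "lug": 41.978, "nyn": 32.296, "teo": 30.422}
from_english = {"ach": 20.972, "lgg": 22.362, "lug": 30.359, "nyn": 15.305, "teo": 21.391}


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Average over all ten translation directions.
bleu_mean = mean(list(to_english.values()) + list(from_english.values()))

# Averages per direction group (into English vs. out of English).
mean_to_english = mean(to_english.values())
mean_from_english = mean(from_english.values())

print(f"Bleu Mean:            {bleu_mean:.3f}")
print(f"Mean xx -> eng:       {mean_to_english:.3f}")
print(f"Mean eng -> xx:       {mean_from_english:.3f}")
```

Splitting the average this way shows that translation into English scores roughly ten BLEU points higher on average than translation out of English.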

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 25
  • eval_batch_size: 25
  • seed: 42
  • gradient_accumulation_steps: 120
  • total_train_batch_size: 3000
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 1500
  • mixed_precision_training: Native AMP
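As a rough sanity check on the hyperparameters above, the sketch below recomputes the total train batch size and evaluates a linear learning-rate decay at a given step. It assumes a single device and zero warmup steps; neither is stated in this card, and the function name is hypothetical.

```python
# Values taken from the hyperparameter list above.
train_batch_size = 25
gradient_accumulation_steps = 120
learning_rate = 3e-4
training_steps = 1500

# Assumption: one device, which is consistent with 25 * 120 = 3000.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, base_lr=learning_rate, total_steps=training_steps, warmup_steps=0):
    """Learning rate under a linear schedule with optional linear warmup.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
print("lr at step 0:", linear_lr(0))
print("lr at step 700:", linear_lr(700))
print("lr at step 1500:", linear_lr(1500))
```

Under these assumptions, each optimizer step sees 3000 examples, and the learning rate falls linearly from 0.0003 to zero over the 1500 configured steps.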

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu Ach Eng | Bleu Lgg Eng | Bleu Lug Eng | Bleu Nyn Eng | Bleu Teo Eng | Bleu Eng Ach | Bleu Eng Lgg | Bleu Eng Lug | Bleu Eng Nyn | Bleu Eng Teo | Bleu Mean |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No log | 0.0667 | 100 | 1.1541 | 29.033 | 31.47 | 41.596 | 34.169 | 32.442 | 19.677 | 19.657 | 27.889 | 14.554 | 19.143 | 26.963 |
| No log | 1.0301 | 200 | 1.1570 | 27.473 | 31.853 | 41.934 | 32.575 | 31.606 | 20.25 | 20.634 | 28.592 | 13.672 | 19.997 | 26.859 |
| No log | 1.0968 | 300 | 1.1288 | 29.086 | 33.257 | 43.387 | 33.678 | 33.579 | 20.377 | 20.91 | 28.906 | 14.992 | 21.013 | 27.919 |
| No log | 2.0603 | 400 | 1.1620 | 28.122 | 31.46 | 42.491 | 33.304 | 32.331 | 20.282 | 21.604 | 29.577 | 14.961 | 20.94 | 27.507 |
| 0.7273 | 3.0237 | 500 | 1.1661 | 28.311 | 32.122 | 42.825 | 32.333 | 32.415 | 19.799 | 22.287 | 29.558 | 15.708 | 21.948 | 27.731 |
| 0.7273 | 3.0904 | 600 | 1.1652 | 28.593 | 30.62 | 41.964 | 33.383 | 32.08 | 21.142 | 21.8 | 30.215 | 14.717 | 21.744 | 27.626 |
| 0.7273 | 4.0538 | 700 | 1.2075 | 28.371 | 30.45 | 41.978 | 32.296 | 30.422 | 20.972 | 22.362 | 30.359 | 15.305 | 21.391 | 27.391 |
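The logged evaluations above can be compared programmatically to see which checkpoint scored best. The sketch below copies the Step, Validation Loss, and Bleu Mean columns into a list and picks the best entry by each criterion; the data structure and variable names are illustrative only.

```python
# (step, validation_loss, bleu_mean) for each logged evaluation above.
eval_log = [
    (100, 1.1541, 26.963),
    (200, 1.1570, 26.859),
    (300, 1.1288, 27.919),
    (400, 1.1620, 27.507),
    (500, 1.1661, 27.731),
    (600, 1.1652, 27.626),
    (700, 1.2075, 27.391),
]

# Lowest validation loss and highest mean BLEU across logged steps.
best_by_loss = min(eval_log, key=lambda row: row[1])
best_by_bleu = max(eval_log, key=lambda row: row[2])
final = eval_log[-1]

print("Best step by validation loss:", best_by_loss[0])
print("Best step by Bleu Mean:      ", best_by_bleu[0])
print("Bleu Mean gap, best vs final:", round(best_by_bleu[2] - final[2], 3))
```

Both criteria point to step 300, whose Bleu Mean is about half a point above the step 700 figures reported at the top of this card.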

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.2.0
  • Datasets 2.19.0
  • Tokenizers 0.19.1