---
tags:
- generated_from_trainer
base_model: jq/nllb-1.3B-many-to-many-step-2k
datasets:
- generator
model-index:
- name: nllb-1.3B-many-to-many-pronouncorrection-charaug
  results: []
---


# nllb-1.3B-many-to-many-pronouncorrection-charaug

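This model is a fine-tuned version of [jq/nllb-1.3B-many-to-many-step-2k](https://huggingface.co/jq/nllb-1.3B-many-to-many-step-2k). The Trainer metadata identifies the training data only as `generator`, so the underlying corpus is not named here.
It achieves the following results on the evaluation set at step 700, the last evaluation logged in the training results table. Each BLEU entry names the source language first and the target language second: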
- Loss: 1.2075
- Bleu Ach Eng: 28.371
- Bleu Lgg Eng: 30.45
- Bleu Lug Eng: 41.978
- Bleu Nyn Eng: 32.296
- Bleu Teo Eng: 30.422
- Bleu Eng Ach: 20.972
- Bleu Eng Lgg: 22.362
- Bleu Eng Lug: 30.359
- Bleu Eng Nyn: 15.305
- Bleu Eng Teo: 21.391
- Bleu Mean: 27.391
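
The `Bleu Mean` figure is consistent with an unweighted arithmetic mean of the ten per-direction scores. A minimal check, with the scores copied from the list above; the dictionary keys are illustrative `source_target` labels, not necessarily the metric names used in the training code:

```python
# Per-direction BLEU at step 700, copied from the results list above.
# Keys are illustrative "source_target" labels (an assumption for this sketch).
bleu = {
    "ach_eng": 28.371,
    "lgg_eng": 30.45,
    "lug_eng": 41.978,
    "nyn_eng": 32.296,
    "teo_eng": 30.422,
    "eng_ach": 20.972,
    "eng_lgg": 22.362,
    "eng_lug": 30.359,
    "eng_nyn": 15.305,
    "eng_teo": 21.391,
}

# Unweighted mean over all ten directions.
bleu_mean = sum(bleu.values()) / len(bleu)

# Separate means for translation into and out of English.
into_eng = [v for k, v in bleu.items() if k.endswith("_eng")]
from_eng = [v for k, v in bleu.items() if k.startswith("eng_")]
mean_into_eng = sum(into_eng) / len(into_eng)
mean_from_eng = sum(from_eng) / len(from_eng)

print(round(bleu_mean, 3))  # 27.391
print(round(mean_into_eng, 3), round(mean_from_eng, 3))
```

Splitting the mean by direction shows that scores for translation into English are noticeably higher than scores for translation out of English.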

## Model description

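A translation model fine-tuned from the NLLB-based checkpoint [jq/nllb-1.3B-many-to-many-step-2k](https://huggingface.co/jq/nllb-1.3B-many-to-many-step-2k). The evaluation covers both directions between English (`eng`) and five languages, identified by ISO 639-3 code: Acholi (`ach`), Lugbara (`lgg`), Luganda (`lug`), Runyankole (`nyn`), and Ateso (`teo`).

More information needed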

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 25
- eval_batch_size: 25
- seed: 42
- gradient_accumulation_steps: 120
- total_train_batch_size: 3000
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 1500
- mixed_precision_training: Native AMP
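
The relationship between these values can be sketched in a few lines. This is an illustration under two stated assumptions, not the training script: a single device (so the total batch size is just per-device batch size times accumulation steps), and no warmup, since none is listed above.

```python
# Values copied from the hyperparameter list above.
learning_rate = 3e-4
train_batch_size = 25
gradient_accumulation_steps = 120
training_steps = 1500

# Assumption: one device. With several devices the product scales by their count.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, training_steps - step)
    return learning_rate * remaining / max(1, training_steps - warmup_steps)


print(total_train_batch_size)  # 3000
print(linear_lr(0), linear_lr(700), linear_lr(1500))
```

Under these assumptions, each optimizer step consumes 3000 examples, and the learning rate at step 700 has decayed to roughly 1.6e-4.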

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Bleu Ach Eng | Bleu Lgg Eng | Bleu Lug Eng | Bleu Nyn Eng | Bleu Teo Eng | Bleu Eng Ach | Bleu Eng Lgg | Bleu Eng Lug | Bleu Eng Nyn | Bleu Eng Teo | Bleu Mean |
|:-------------:|:------:|:----:|:---------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:---------:|
| No log        | 0.0667 | 100  | 1.1541          | 29.033       | 31.47        | 41.596       | 34.169       | 32.442       | 19.677       | 19.657       | 27.889       | 14.554       | 19.143       | 26.963    |
| No log        | 1.0301 | 200  | 1.1570          | 27.473       | 31.853       | 41.934       | 32.575       | 31.606       | 20.25        | 20.634       | 28.592       | 13.672       | 19.997       | 26.859    |
| No log        | 1.0968 | 300  | 1.1288          | 29.086       | 33.257       | 43.387       | 33.678       | 33.579       | 20.377       | 20.91        | 28.906       | 14.992       | 21.013       | 27.919    |
| No log        | 2.0603 | 400  | 1.1620          | 28.122       | 31.46        | 42.491       | 33.304       | 32.331       | 20.282       | 21.604       | 29.577       | 14.961       | 20.94        | 27.507    |
| 0.7273        | 3.0237 | 500  | 1.1661          | 28.311       | 32.122       | 42.825       | 32.333       | 32.415       | 19.799       | 22.287       | 29.558       | 15.708       | 21.948       | 27.731    |
| 0.7273        | 3.0904 | 600  | 1.1652          | 28.593       | 30.62        | 41.964       | 33.383       | 32.08        | 21.142       | 21.8         | 30.215       | 14.717       | 21.744       | 27.626    |
| 0.7273        | 4.0538 | 700  | 1.2075          | 28.371       | 30.45        | 41.978       | 32.296       | 30.422       | 20.972       | 22.362       | 30.359       | 15.305       | 21.391       | 27.391    |
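
The logged rows stop at step 700, short of the configured 1500 training steps, and the results reported at the top of this card come from that last row. A small sketch for comparing checkpoints from the table; the tuple layout is illustrative:

```python
# (step, validation_loss, bleu_mean) copied from the table above.
rows = [
    (100, 1.1541, 26.963),
    (200, 1.1570, 26.859),
    (300, 1.1288, 27.919),
    (400, 1.1620, 27.507),
    (500, 1.1661, 27.731),
    (600, 1.1652, 27.626),
    (700, 1.2075, 27.391),
]

# Highest mean BLEU and lowest validation loss among the logged evaluations.
best_by_bleu = max(rows, key=lambda r: r[2])
best_by_loss = min(rows, key=lambda r: r[1])

print(best_by_bleu[0], best_by_loss[0])  # 300 300
```

Both criteria select step 300, where the validation loss is 1.1288 and the mean BLEU is 27.919, compared with 1.2075 and 27.391 at step 700.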


### Framework versions

- Transformers 4.40.1
- PyTorch 2.2.0
- Datasets 2.19.0
- Tokenizers 0.19.1