# nllb-200-distilled-es-ja

## Model Overview
This model was developed as part of a workshop organized by Yasmin Moslem, focused on speech-to-text pipelines. The workshop's primary goal was to transcribe and translate spoken source languages into written target languages accurately, while exploring both end-to-end and cascaded approaches.
This model is a fine-tuned version of facebook/nllb-200-distilled-600M, trained on the voxpopuli_es-ja dataset for Spanish-to-Japanese translation.
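
As a quick reference, here is a minimal inference sketch using the Transformers library; the NLLB-200 language codes spa_Latn and jpn_Jpan, the example sentence, and the generation settings are assumptions rather than details taken from this card:

```python
# Minimal inference sketch: translate Spanish text into Japanese with this checkpoint.
# Language codes follow the NLLB-200 convention (spa_Latn -> jpn_Jpan); the sentence is illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "Marianoleiras/nllb-200-distilled-es-ja"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="spa_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("El informe fue aprobado por el Parlamento.", return_tensors="pt")

# Force the decoder to start generating in Japanese.
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("jpn_Jpan"),
    max_length=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```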
The model achieves the following results on this dataset (a BLEU scoring sketch follows the lists below):
Evaluation Set:
- Loss: 0.2088
- BLEU: 37.6263
Test Set:
- BLEU: 36.8192
(Baseline BLEU on the test set: 21.33662)
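
The BLEU figures are corpus-level scores. Below is a sketch of how such a score can be computed with the sacrebleu metric via the evaluate library; whether this exact tooling and tokenization produced the numbers above is an assumption:

```python
# Corpus-level BLEU sketch with sacrebleu via the evaluate library.
# `predictions` holds model hypotheses, `references` the gold Japanese translations (one list per sample).
# Whether this exact setup (including the ja-mecab tokenizer) was used for this card is an assumption.
import evaluate

sacrebleu = evaluate.load("sacrebleu")
predictions = ["報告書は議会で承認された。"]
references = [["報告書は議会によって承認された。"]]

result = sacrebleu.compute(
    predictions=predictions,
    references=references,
    tokenize="ja-mecab",  # Japanese-aware tokenization provided by sacrebleu
)
print(round(result["score"], 4))
```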
Using Whisper-Small-es together with this model highlights the strength of cascaded architectures, which achieve higher translation accuracy than end-to-end solutions such as Whisper-Small-es-ja.
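
As a rough illustration, here is a minimal cascade sketch using the Transformers pipeline API; the Whisper repo ID and the audio path are placeholders for the linked Whisper-Small-es checkpoint and your own input:

```python
# Cascaded speech translation sketch: Whisper ASR (Spanish) -> NLLB MT (Spanish -> Japanese).
# The Whisper repo ID and audio file are placeholders, not taken from this card.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="Marianoleiras/whisper-small-es")
mt = pipeline(
    "translation",
    model="Marianoleiras/nllb-200-distilled-es-ja",
    src_lang="spa_Latn",
    tgt_lang="jpn_Jpan",
)

spanish_text = asr("audio_es.wav")["text"]                # step 1: transcribe Spanish speech
japanese_text = mt(spanish_text)[0]["translation_text"]   # step 2: translate into Japanese
print(japanese_text)
```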
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a training-arguments sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 5000
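
For orientation, here is a Seq2SeqTrainingArguments sketch mirroring the values above; the output directory, evaluation cadence, and predict_with_generate flag are assumptions inferred from the results table below, not settings stated in this card:

```python
# Sketch of Seq2SeqTrainingArguments mirroring the listed hyperparameters
# (Transformers 4.45 argument names). Values not listed above are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-distilled-es-ja",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,          # effective train batch size of 16
    max_steps=5000,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",                  # assumed: evaluation every 250 steps, per the results table
    eval_steps=250,
    save_steps=250,
    predict_with_generate=True,             # assumed: needed to compute BLEU during evaluation
)
```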
### Training results
Training Loss | Epoch | Step | Validation Loss | Bleu |
---|---|---|---|---|
3.3461 | 0.3965 | 250 | 0.6368 | 30.6487 |
0.2447 | 0.7930 | 500 | 0.2263 | 33.0129 |
0.2114 | 1.1895 | 750 | 0.2187 | 32.5117 |
0.1922 | 1.5860 | 1000 | 0.2121 | 34.6996 |
0.1903 | 1.9826 | 1250 | 0.2080 | 35.5595 |
0.165 | 2.3791 | 1500 | 0.2098 | 35.9749 |
0.1574 | 2.7756 | 1750 | 0.2072 | 36.6129 |
0.1406 | 3.1721 | 2000 | 0.2078 | 36.6204 |
0.1419 | 3.5686 | 2250 | 0.2074 | 36.6043 |
0.1417 | 3.9651 | 2500 | 0.2059 | 36.9861 |
0.1247 | 4.3616 | 2750 | 0.2079 | 37.0112 |
0.1262 | 4.7581 | 3000 | 0.2072 | 36.9232 |
0.1196 | 5.1546 | 3250 | 0.2078 | 36.9248 |
0.1152 | 5.5511 | 3500 | 0.2076 | 37.3149 |
0.1137 | 5.9477 | 3750 | 0.2077 | 37.4817 |
0.105 | 6.3442 | 4000 | 0.2088 | 37.6263 |
0.1105 | 6.7407 | 4250 | 0.2084 | 35.4415 |
0.102 | 7.1372 | 4500 | 0.2088 | 37.3749 |
0.1029 | 7.5337 | 4750 | 0.2089 | 37.3476 |
0.1018 | 7.9302 | 5000 | 0.2090 | 37.5204 |
### Framework versions
- Transformers 4.45.2
- Pytorch 2.4.0+cu124
- Datasets 3.2.0
- Tokenizers 0.20.3
## Linked Models
- Whisper-Small-es-ja: An end-to-end model trained on this dataset.
- Whisper-Small-es: The ASR model used in the cascaded approach, trained on this dataset.
## Model Card Contact
Mariano González (marianoleiras@hotmail.com)