whisper-small-es-ja

Model Overview

This model was developed as part of a workshop on speech-to-text pipelines organized by Yasmin Moslem. The workshop's primary goal was to enable accurate transcription and translation of spoken source languages into written target languages, while exploring both end-to-end and cascaded approaches.

This model is an end-to-end solution for Spanish-to-Japanese speech-to-text (STT) tasks: it maps Spanish speech directly to Japanese text. It is a fine-tuned version of OpenAI's Whisper-small, trained on the Marianoleiras/voxpopuli_es-ja dataset.
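A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub as Marianoleiras/whisper-small-es-ja (inferred from the model name and the dataset namespace, not stated explicitly in this card) and that the generation settings saved with the checkpoint already produce Japanese output. The helper name translate_speech is hypothetical.

```python
# Hub repository id: an assumption based on the author namespace and model name.
MODEL_ID = "Marianoleiras/whisper-small-es-ja"


def translate_speech(audio_path: str) -> str:
    """Return the Japanese text the model generates for a Spanish audio file."""
    # Imported lazily so that defining this helper neither loads the library
    # nor downloads any weights.
    from transformers import pipeline

    # Whisper checkpoints are served through the ASR pipeline; this fine-tuned
    # model is expected to emit Japanese text rather than a Spanish transcript.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)  # decoding a file path requires ffmpeg
    return result["text"]
```

Calling translate_speech("sample_es.wav"), where the path is a hypothetical local recording, would download the weights on first use. Whisper works on 30-second windows, so longer recordings may need the pipeline's chunk_length_s argument.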

The model achieves the following results on the provided dataset:

Evaluation Set:

Loss: 1.1724
BLEU: 22.2850

Test Set:

BLEU: 20.8607
ChrF++: 23.3571
COMET: 77.6979

(Baseline evaluation on test set: BLEU 0.4793)

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
training_steps: 3500
mixed_precision_training: Native AMP
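To make the linear schedule concrete, the sketch below computes the learning rate at a given step, decaying from 1e-05 to zero over the 3500 training steps. The card does not list a warmup setting, so warmup_steps is an assumption that defaults to 0, and the function name linear_lr is hypothetical. It mirrors the shape of a linear schedule with warmup rather than reproducing the library's code.

```python
def linear_lr(step: int, base_lr: float = 1e-05, total_steps: int = 3500,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under a linear warmup/decay schedule."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr (unused when warmup_steps is 0,
        # which is an assumption; the card does not report warmup).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Inspect the schedule at the start, midpoint, and end of training.
for step in (0, 1750, 3500):
    print(step, linear_lr(step))
```

With no warmup, the rate is halved at step 1750 and reaches zero at step 3500.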

Training results

Training Loss	Epoch	Step	Bleu	Validation Loss
1.5787	0.3962	250	11.6756	1.5196
1.3535	0.7924	500	16.0514	1.3470
1.0658	1.1886	750	17.7743	1.2533
1.0303	1.5848	1000	19.1894	1.2046
0.9893	1.9810	1250	20.1198	1.1591
0.7569	2.3772	1500	21.0054	1.1546
0.7571	2.7734	1750	21.6425	1.1378
0.5557	3.1696	2000	21.7563	1.1500
0.5612	3.5658	2250	21.1391	1.1395
0.5581	3.9620	2500	22.0412	1.1343
0.4144	4.3582	2750	22.2850	1.1724
0.4114	4.7544	3000	22.1925	1.1681
0.3005	5.1506	3250	21.4948	1.1947
0.2945	5.5468	3500	22.1454	1.1921
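The evaluation-set figures reported above (loss 1.1724, BLEU 22.2850) coincide with the step-2750 row, which has the highest BLEU in the table. This suggests the final checkpoint was selected by BLEU rather than by validation loss, although the card does not say so. A small sketch that compares the two selection criteria, using values copied from the table:

```python
# (step, BLEU, validation loss) copied from the training results table.
RESULTS = [
    (250, 11.6756, 1.5196), (500, 16.0514, 1.3470), (750, 17.7743, 1.2533),
    (1000, 19.1894, 1.2046), (1250, 20.1198, 1.1591), (1500, 21.0054, 1.1546),
    (1750, 21.6425, 1.1378), (2000, 21.7563, 1.1500), (2250, 21.1391, 1.1395),
    (2500, 22.0412, 1.1343), (2750, 22.2850, 1.1724), (3000, 22.1925, 1.1681),
    (3250, 21.4948, 1.1947), (3500, 22.1454, 1.1921),
]

# Checkpoint with the highest BLEU on the evaluation set.
best_by_bleu = max(RESULTS, key=lambda row: row[1])
# Checkpoint with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda row: row[2])

print("best BLEU:", best_by_bleu)
print("lowest validation loss:", best_by_loss)
```

The two criteria disagree: BLEU peaks at step 2750, while validation loss is lowest at step 2500. Training loss keeps falling after that point while validation loss rises, which is the usual sign of overfitting.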

Framework versions

Transformers 4.47.1
PyTorch 2.4.0+cu124
Datasets 3.2.0
Tokenizers 0.21.0

Linked Models

Whisper-Small-es: The ASR model of the cascaded approach built using this dataset.
NLLB-200-Distilled-es-ja: The MT model of the cascaded approach built using this dataset.
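The two linked models form the cascaded counterpart of this model: the ASR model first produces a Spanish transcript, and the MT model then translates it into Japanese, whereas this model performs both steps in one pass. The composition can be sketched as follows. The function names and the stub models are hypothetical placeholders that return labeled dummy strings, not real model output.

```python
from typing import Callable

Audio = bytes  # stand-in type for raw audio in this sketch


def make_cascade(asr: Callable[[Audio], str],
                 mt: Callable[[str], str]) -> Callable[[Audio], str]:
    """Compose an ASR step and an MT step into one speech-to-text function."""
    def run(audio: Audio) -> str:
        transcript = asr(audio)   # Spanish speech -> Spanish text
        return mt(transcript)     # Spanish text -> Japanese text
    return run


# Placeholder stages standing in for the linked ASR and MT models.
def stub_asr(audio: Audio) -> str:
    return "<es transcript>"


def stub_mt(text: str) -> str:
    return f"<ja translation of {text}>"


cascade = make_cascade(stub_asr, stub_mt)
print(cascade(b""))
```

Because the MT stage only sees the transcript, any ASR error propagates into the translation. The end-to-end model avoids that intermediate step.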

Model Card Contact

Mariano González (marianoleiras@hotmail.com)
