Marianoleiras/whisper-small-es-ja

Marianoleiras committed on Jan 13

Commit dde4de1 · verified · 1 Parent(s): 9a17c43

Update README.md

Files changed (1)

README.md +17 -7

@@ -21,15 +21,25 @@ should probably proofread and complete it, then remove this comment. -->
 # whisper-small-es-ja
-This model is a fine-tuned version of OpenAI's whisper-small on the Marianoleiras/voxpopuli_es-ja dataset, designed for Spanish-to-Japanese speech-to-text (STT) tasks.
-It leverages OpenAI's Whisper architecture, which is well-suited for multilingual speech recognition and translation tasks.
-The model achieves the following results on the evaluation set:
-- Loss: 1.1724
-- Bleu: 22.2850
-And the following result on the test set:
-- Bleu: 21.4557
+## Model Overview
+This model is a fine-tuned version of OpenAI's Whisper-small, specifically trained on the **Marianoleiras/voxpopuli_es-ja** dataset for Spanish-to-Japanese speech-to-text (STT) tasks.
+It employs the Whisper architecture, which is known for its robustness in multilingual speech recognition and translation scenarios.
+The primary goal of this model is to enable accurate end-to-end transcription and translation of spoken Spanish into written Japanese.
+It was developed as part of a **three-week workshop organized by Yasmin Moslem**, focusing on speech-to-text pipelines.
+The workshop involved:
+1. **Dataset creation** during the first week.
+2. **Model training and optimization** during the second week.
+3. **In-depth exploration and evaluation** in the third week.
+The model achieves the following results on this dataset:
+**Evaluation Set:**
+- Loss: **1.1724**
+- BLEU: **22.2850**
+**Test Set:**
+- BLEU: **21.4557**
 ## Training procedure
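
The updated overview describes end-to-end translation of spoken Spanish into written Japanese. A minimal sketch of how the model might be called follows, assuming the checkpoint loads through the standard `transformers` automatic-speech-recognition pipeline; the README does not confirm this. The helper names `load_pipeline` and `translate_file` and the file `sample_es.wav` are hypothetical, and the checkpoint may additionally need language or task settings at generation time that the README does not state.

```python
# Sketch only: assumes the checkpoint works with the generic ASR pipeline.
MODEL_ID = "Marianoleiras/whisper-small-es-ja"  # repository name shown on this page


def load_pipeline(model_id: str = MODEL_ID):
    """Build a speech-recognition pipeline for the given checkpoint."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def translate_file(asr, audio_path: str) -> str:
    """Run the pipeline on one audio file and return the generated text."""
    result = asr(audio_path)  # the ASR pipeline returns a dict with a "text" key
    return result["text"].strip()


# Example usage (downloads the model; "sample_es.wav" is a hypothetical file):
# asr = load_pipeline()
# print(translate_file(asr, "sample_es.wav"))
```

Keeping the pipeline construction separate from the per-file call means the model is loaded once and reused across many recordings.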