Commit dde4de1 by Marianoleiras · verified · 1 Parent(s): 9a17c43

Update README.md

Files changed (1)
  1. README.md +17 -7
README.md CHANGED
@@ -21,15 +21,25 @@ should probably proofread and complete it, then remove this comment. -->
 
  # whisper-small-es-ja
 
- This model is a fine-tuned version of OpenAI's whisper-small on the Marianoleiras/voxpopuli_es-ja dataset, designed for Spanish-to-Japanese speech-to-text (STT) tasks.
- It leverages OpenAI's Whisper architecture, which is well-suited for multilingual speech recognition and translation tasks.
+ ## Model Overview
+ This model is a fine-tuned version of OpenAI's Whisper-small, specifically trained on the **Marianoleiras/voxpopuli_es-ja** dataset for Spanish-to-Japanese speech-to-text (STT) tasks.
+ It employs the Whisper architecture, which is known for its robustness in multilingual speech recognition and translation scenarios.
 
- The model achieves the following results on the evaluation set:
- - Loss: 1.1724
- - Bleu: 22.2850
+ The primary goal of this model is to enable accurate end-to-end transcription and translation of spoken Spanish into written Japanese.
+ It was developed as part of a **three-week workshop organized by Yasmin Moslem**, focusing on speech-to-text pipelines.
+ The workshop involved:
+ 1. **Dataset creation** during the first week.
+ 2. **Model training and optimization** during the second week.
+ 3. **In-depth exploration and evaluation** in the third week.
 
- And the following result on the test set:
- - Bleu: 21.4557
+ The model achieves competitive performance metrics on the provided dataset:
+
+ **Evaluation Set:**
+ - Loss: **1.1724**
+ - BLEU: **22.2850**
+
+ **Test Set:**
+ - BLEU: **21.4557**
 
  ## Training procedure
 