kurianbenoy committed
Commit 41ee953
1 Parent(s): e2a480f

Update README.md

Files changed (1)
  1. README.md +26 -3
README.md CHANGED
@@ -4,16 +4,39 @@ datasets:
  - thennal/IMaSC
  language:
  - ml
+ model-index:
+ - name: Malwhisper-v1-small - Kurian Benoy
+   results:
+   - task:
+       type: automatic-speech-recognition
+       name: Automatic Speech Recognition
+     dataset:
+       name: Common Voice 11.0
+       type: mozilla-foundation/common_voice_11_0
+       config: ml
+       split: test
+       args: ml
+     metrics:
+     - type: wer
+       value: 24.83
+       name: WER
+ library_name: transformers
  ---
 
- On Evaluating the model:
+ ## kurianbenoy/Malwhisper-v1-small
 
- In Mozilla CommonVoice dataset:
+ This model is a version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) fine-tuned on the [IMaSC dataset](https://www.kaggle.com/datasets/thennal/imasc).
+
+ IMaSC is a Malayalam text and speech corpus made available by ICFOSS for the purpose of developing speech technology for Malayalam, particularly text-to-speech. The corpus contains 34,473 text-audio pairs of Malayalam sentences spoken by 8 speakers, totalling approximately 50 hours of audio.
+
+ The fine-tuned model was evaluated on the following datasets:
+
+ **In Mozilla CommonVoice 11.0 dataset (Malayalam subset):**
 
  WER - 24.83
  CER - 12.84
 
- In SMC dataset:
+ **In SMC Malayalam Speech Corpus dataset:**
 
  WER - 27.28
  CER - 14.64
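
The WER and CER figures above are edit-distance error rates. The README does not say how they were computed, so the following is only a minimal sketch of the standard definitions, assuming the scores are percentages and that no text normalization is applied. The function names `edit_distance`, `wer`, and `cer` are illustrative and do not come from the model card.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # At the start of iteration i, prev[j] is the distance
    # between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a percentage (assumed convention)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a percentage (assumed convention)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For example, `wer("a b c d", "a x c d")` counts one substitution over four reference words. In practice an established package such as `jiwer` or the Hugging Face `evaluate` library would normally be used. Scores can also differ depending on how the transcripts are normalized before comparison.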