poonehmousavi committed on
Commit
9dfef9f
1 Parent(s): c35e918

Update README.md

  1. README.md +14 -14
README.md CHANGED
@@ -1,6 +1,6 @@
1
  ---
2
  language:
3
- - ar
4
  thumbnail: null
5
  pipeline_tag: automatic-speech-recognition
6
  tags:
@@ -16,31 +16,31 @@ metrics:
16
  - wer
17
  - cer
18
  model-index:
19
- - name: asr-whisper-medium-commonvoice-ar
20
  results:
21
  - task:
22
  name: Automatic Speech Recognition
23
  type: automatic-speech-recognition
24
  dataset:
25
- name: CommonVoice 10.0 (Arabic)
26
  type: mozilla-foundation/common_voice_14_0
27
- config: ar
28
  split: test
29
  args:
30
- language: ar
31
  metrics:
32
  - name: Test WER
33
  type: wer
34
- value: '14.82'
35
  ---
36
 
37
  <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=medium" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
38
  <br/><br/>
39
 
40
- # whisper medium fine-tuned on CommonVoice-14.0 Arabic
41
 
42
  This repository provides all the necessary tools to perform automatic speech
43
- recognition from an end-to-end whisper model fine-tuned on CommonVoice (Arabic Language) within
44
  SpeechBrain. For a better experience, we encourage you to learn more about
45
  [SpeechBrain](https://speechbrain.github.io).
46
 
@@ -48,14 +48,14 @@ The performance of the model is the following:
48
 
49
  | Release | Test CER | Test WER | GPUs |
50
  |:-------------:|:--------------:|:--------------:| :--------:|
51
- | 1-08-23 | 4.95 | 14.82 | 1xV100 32GB |
52
 
53
  ## Pipeline description
54
 
55
  This ASR system is composed of whisper encoder-decoder blocks:
56
  - The pretrained whisper-medium encoder is frozen.
57
  - The pretrained Whisper tokenizer is used.
58
- - A pretrained Whisper-medium decoder ([openai/whisper-medium](https://huggingface.co/openai/whisper-medium)) is finetuned on CommonVoice ar.
59
  The obtained final acoustic representation is given to the greedy decoder.
60
 
61
  The system is trained with recordings sampled at 16kHz (single channel).
@@ -72,14 +72,14 @@ pip install speechbrain transformers
72
  Please note that we encourage you to read our tutorials and learn more about
73
  [SpeechBrain](https://speechbrain.github.io).
74
 
75
- ### Transcribing your own audio files (in Arabic)
76
 
77
  ```python
78
 
79
  from speechbrain.pretrained import WhisperASR
80
 
81
- asr_model = WhisperASR.from_hparams(source="speechbrain/asr-whisper-medium-commonvoice-ar", savedir="pretrained_models/asr-whisper-medium-commonvoice-ar")
82
- asr_model.transcribe_file("speechbrain/asr-whisper-medium-commonvoice-ar/example-ar.mp3")
83
 
84
 
85
  ```
@@ -129,4 +129,4 @@ SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to b
129
 
130
  Website: https://speechbrain.github.io/
131
 
132
- GitHub: https://github.com/speechbrain/speechbrain
 
1
  ---
2
  language:
3
+ - fr
4
  thumbnail: null
5
  pipeline_tag: automatic-speech-recognition
6
  tags:
 
16
  - wer
17
  - cer
18
  model-index:
19
+ - name: asr-whisper-medium-commonvoice-fr
20
  results:
21
  - task:
22
  name: Automatic Speech Recognition
23
  type: automatic-speech-recognition
24
  dataset:
25
+ name: CommonVoice 10.0 (French)
26
  type: mozilla-foundation/common_voice_14_0
27
+ config: fr
28
  split: test
29
  args:
30
+ language: fr
31
  metrics:
32
  - name: Test WER
33
  type: wer
34
+ value: '9.65'
35
  ---
36
 
37
  <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=medium" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
38
  <br/><br/>
39
 
40
+ # whisper medium fine-tuned on CommonVoice-14.0 French
41
 
42
  This repository provides all the necessary tools to perform automatic speech
43
+ recognition from an end-to-end whisper model fine-tuned on CommonVoice (French Language) within
44
  SpeechBrain. For a better experience, we encourage you to learn more about
45
  [SpeechBrain](https://speechbrain.github.io).
46
 
 
48
 
49
  | Release | Test CER | Test WER | GPUs |
50
  |:-------------:|:--------------:|:--------------:| :--------:|
51
+ | 1-08-23 | 3.26 | 9.65 | 1xV100 32GB |
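
To make the two metrics in the table above concrete, here is a minimal sketch of how word error rate (WER) and character error rate (CER) can be computed from an edit distance. It is an illustration only: the function names are hypothetical, the sample sentences are made up, and it does not reproduce whatever text normalization or corpus-level aggregation the SpeechBrain recipe applied to obtain the reported numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (counts substitutions, insertions and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance as a percentage of the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(reference, hypothesis):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate: tokens are single characters."""
    return error_rate(list(reference), list(hypothesis))


# Hypothetical example: one wrong word out of four.
print(wer("le chat dort ici", "le chien dort ici"))  # 25.0
```

A lower value is better for both metrics. CER is usually lower than WER for the same output because a single wrong character makes the whole word count as an error.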
52
 
53
  ## Pipeline description
54
 
55
  This ASR system is composed of whisper encoder-decoder blocks:
56
  - The pretrained whisper-medium encoder is frozen.
57
  - The pretrained Whisper tokenizer is used.
58
+ - A pretrained Whisper-medium decoder ([openai/whisper-medium](https://huggingface.co/openai/whisper-medium)) is finetuned on CommonVoice fr.
59
  The obtained final acoustic representation is given to the greedy decoder.
60
 
61
  The system is trained with recordings sampled at 16kHz (single channel).
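
As a rough illustration of the input format mentioned above, the sketch below checks whether a WAV file is 16 kHz and single channel using only the Python standard library. The helper names are hypothetical and not part of SpeechBrain, and the `wave` module reads PCM WAV files only, so compressed formats such as MP3 would need a different reader.

```python
import os
import tempfile
import wave


def is_16k_mono(path):
    """Return True if the WAV file at `path` is 16 kHz, single channel."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == 16000 and wav.getnchannels() == 1


def write_silence(path, sample_rate, channels, seconds=0.1):
    """Write a short 16-bit PCM WAV file of silence (for demonstration only)."""
    n_frames = int(sample_rate * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames * channels)


# Demonstration on two synthetic files.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "mono16k.wav")
    bad = os.path.join(tmp, "stereo44k.wav")
    write_silence(good, 16000, 1)
    write_silence(bad, 44100, 2)
    print(is_16k_mono(good), is_16k_mono(bad))  # True False
```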
 
72
  Please note that we encourage you to read our tutorials and learn more about
73
  [SpeechBrain](https://speechbrain.github.io).
74
 
75
+ ### Transcribing your own audio files (in French)
76
 
77
  ```python
78
 
79
  from speechbrain.pretrained import WhisperASR
80
 
81
+ asr_model = WhisperASR.from_hparams(source="speechbrain/asr-whisper-medium-commonvoice-fr", savedir="pretrained_models/asr-whisper-medium-commonvoice-fr")
82
+ asr_model.transcribe_file("speechbrain/asr-whisper-medium-commonvoice-fr/example-fr.mp3")
83
 
84
 
85
  ```
 
129
 
130
  Website: https://speechbrain.github.io/
131
 
132
+ GitHub: https://github.com/speechbrain/speechbrain