chaanks committed on
Commit
9441246
1 Parent(s): 47a089b

Create README.md

Files changed (1)
  1. README.md +138 -0
README.md ADDED
@@ -0,0 +1,138 @@
+ ---
+ language:
+ - en
+ thumbnail: null
+ pipeline_tag: automatic-speech-recognition
+ tags:
+ - whisper
+ - pytorch
+ - speechbrain
+ - Transformer
+ - hf-asr-leaderboard
+ license: apache-2.0
+ model-index:
+ - name: asr-whisper-tiny-sb
+   results:
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: LibriSpeech (clean)
+       type: librispeech_asr
+       config: clean
+       split: test
+       args:
+         language: en
+     metrics:
+     - name: Test WER
+       type: wer
+       value: 7.54
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: LibriSpeech (other)
+       type: librispeech_asr
+       config: other
+       split: test
+       args:
+         language: en
+     metrics:
+     - name: Test WER
+       type: wer
+       value: 17.15
+   - task:
+       name: Automatic Speech Recognition
+       type: automatic-speech-recognition
+     dataset:
+       name: Common Voice 11.0
+       type: mozilla-foundation/common_voice_11_0
+       config: hi
+       split: test
+       args:
+         language: hi
+     metrics:
+     - name: Test WER
+       type: wer
+       value: 141
+ ---
+
+ <iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
+ <br/><br/>
+
+ # Whisper tiny SpeechBrain
+
+ This repository provides all the necessary tools to perform automatic speech
+ recognition with an end-to-end Whisper model within
+ SpeechBrain. For a better experience, we encourage you to learn more about
+ [SpeechBrain](https://speechbrain.github.io).
+
+ ## Install SpeechBrain
+
+ First, install transformers and SpeechBrain with the following command:
+
+ ```
+ pip install speechbrain transformers==4.28.0
+ ```
+
+ We also encourage you to read our tutorials and learn more about
+ [SpeechBrain](https://speechbrain.github.io).
+
+ ### Transcribing your own audio files (in English)
+
+ ```python
+ from speechbrain.pretrained import WhisperASR
+
+ # Fetch the pretrained model and cache it locally in `savedir`.
+ asr_model = WhisperASR.from_hparams(source="chaanks/asr-whisper-tiny-sb", savedir="pretrained_models/asr-whisper-tiny-sb")
+ # Transcribe a single audio file.
+ asr_model.transcribe_file("speechbrain/chaanks/asr-whisper-tiny-sb/example.wav")
+ ```
+ ### Inference on GPU
+ To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
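
The `run_opts` argument described above could be used along these lines. This is a minimal sketch, not part of the original model card: the helper names `pick_run_opts` and `load_asr` and the audio path in the usage comment are hypothetical, and the CPU fallback is an added convenience rather than something SpeechBrain does for you.

```python
def pick_run_opts(prefer_gpu=True):
    """Build a run_opts dict for `from_hparams`, falling back to CPU."""
    try:
        import torch  # normally installed alongside SpeechBrain
        has_cuda = torch.cuda.is_available()
    except ImportError:
        has_cuda = False
    device = "cuda" if (prefer_gpu and has_cuda) else "cpu"
    return {"device": device}


def load_asr(run_opts):
    """Load the pretrained model on the device named in run_opts."""
    # Imported lazily so pick_run_opts works even without SpeechBrain installed.
    from speechbrain.pretrained import WhisperASR

    return WhisperASR.from_hparams(
        source="chaanks/asr-whisper-tiny-sb",
        savedir="pretrained_models/asr-whisper-tiny-sb",
        run_opts=run_opts,
    )


# Usage (fetches the model on first call; the audio path is a placeholder):
# asr_model = load_asr(pick_run_opts())
# print(asr_model.transcribe_file("path/to/your_audio.wav"))
```

Selecting the device at run time keeps the same script usable on machines without a GPU.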
+
+ ### Training
+ The model was trained with SpeechBrain.
+ To train it from scratch, follow these steps:
+ 1. Clone SpeechBrain:
+ ```bash
+ git clone https://github.com/speechbrain/speechbrain/
+ ```
+ 2. Install it:
+ ```bash
+ cd speechbrain
+ pip install -r requirements.txt
+ pip install -e .
+ ```
+
+ 3. Run training:
+ ```bash
+ cd recipes/CommonVoice/ASR/transformer/
+ python train_with_whisper.py hparams/train_ar_hf_whisper.yaml --data_folder=your_data_folder
+ ```
+
+ You can find our training results (models, logs, etc.) [here](https://drive.google.com/drive/folders/10mYPYfj9NpDNAa0nO16Zd_K1bIEUOIpx?usp=share_link).
+
+ ### Limitations
+ The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
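
Because performance on other datasets is not guaranteed, it can help to measure word error rate (WER) on your own data. The sketch below is a plain-Python illustration of the usual definition, (substitutions + deletions + insertions) / reference words, expressed as a percentage. The function name `word_error_rate` is hypothetical, this is not SpeechBrain's own scoring code, and no text normalization is applied, so scores may differ from those in the metadata.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent between two whitespace-tokenized strings."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)
```

Since insertions count as errors, WER is not capped at 100: a hypothesis much longer than its reference can score well above it, which is one way a value such as the 141 listed for the `hi` configuration can arise.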
+
+ #### Referencing SpeechBrain
+
+ ```
+ @misc{SB2021,
+ author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
+ title = {SpeechBrain},
+ year = {2021},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://github.com/speechbrain/speechbrain}},
+ }
+ ```
+
+ #### About SpeechBrain
+ SpeechBrain is an open-source, all-in-one speech toolkit designed to be simple, extremely flexible, and user-friendly. It achieves competitive or state-of-the-art performance in various domains.
+
+ Website: https://speechbrain.github.io/
+
+ GitHub: https://github.com/speechbrain/speechbrain