shhossain/whisper-base-bn

shhossain committed on Jan 16, 2023

Commit d570704 • 1 Parent(s): 66b5c77

Update README.md

Files changed (1)

README.md +71 -0

@@ -4,9 +4,80 @@ datasets:
 - mozilla-foundation/common_voice_11_0
 language:
 - en
+- bn
 metrics:
 - wer
 library_name: transformers
 pipeline_tag: automatic-speech-recognition
 ---
+## Results
+- WER 46
+# Use with banglaSpeech2text
+## Installation
+```bash
+pip install banglaspeech2text
+```
+__Note__: git and Git LFS must be installed. For more information, see the banglaspeech2text documentation [here](https://github.com/shhossain/BanglaSpeech2Text#download-git).
+## Usage
+### Use with file
+```python
+from banglaspeech2text import Model
+base_model = Model('whisper_base_bn_sifat')
+base_model.load() # Load the pipeline. The first load takes longer because the model has to be downloaded.
+audio_file = "test.wav" # .wav, .mp3, .mp4, .ogg, etc.
+print(base_model.recognize(audio_file))
+```
+### Use with SpeechRecognition
+```python
+import speech_recognition as sr
+from banglaspeech2text import Model, available_models
+# Load a model
+models = available_models()
+model = models[0] # select a model
+model = Model(model) # load the model
+model.load()
+r = sr.Recognizer()
+with sr.Microphone() as source:
+    print("Say something!")
+    audio = r.listen(source)
+    output = model.recognize(audio)
+print(output) # output will be a dict containing the text
+print(output['text'])
+```
+__Note__: For more use cases and models, see [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text).
+# Use with transformers
+## Installation
+```bash
+pip install transformers
+pip install torch
+```
+## Usage
+### Use with file
+```python
+from transformers import pipeline
+pipe = pipeline('automatic-speech-recognition','shhossain/whisper-base-bn')
+def transcribe(audio_path):
+  return pipe(audio_path)['text']
+audio_file = "test.wav"
+print(transcribe(audio_file))
+```
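
The Results section of this README reports a WER (word error rate) figure. As a rough illustration of what that metric measures, here is a minimal sketch of the conventional definition: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` is hypothetical and not part of banglaspeech2text or transformers, and this is not necessarily how the reported score was produced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration only; it splits on whitespace and
    applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word and one dropped word against a 4-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

To score a real transcription, pass the ground-truth sentence as `reference` and the `['text']` value returned by the pipeline as `hypothesis`. Published scores usually normalize punctuation and casing first, which this sketch does not do.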