---
license: apache-2.0
datasets:
- mozilla-foundation/common_voice_11_0
language:
- en
- bn
metrics:
- wer
library_name: transformers
pipeline_tag: automatic-speech-recognition
---

## Results
- WER 46

# Use with banglaSpeech2text

## Installation
```bash
pip install banglaspeech2text
```

__Note__: Git and Git LFS must be installed. For more information, see the banglaspeech2text documentation [here](https://github.com/shhossain/BanglaSpeech2Text#download-git).

## Usage

### Use with file
```python
from banglaspeech2text import Model

base_model = Model('whisper_base_bn_sifat')
# Load the pipeline. The first load takes longer because the model has not been downloaded yet.
base_model.load()

audio_file = "test.wav"  # .wav, .mp3, .mp4, .ogg, etc.
print(base_model.recognize(audio_file))
```

### Use with SpeechRecognition
```python
import speech_recognition as sr
from banglaspeech2text import Model, available_models

# List the available models and select one
models = available_models()
model_name = models[0]

# Create the model wrapper and load it
model = Model(model_name)
model.load()

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
    output = model.recognize(audio)

print(output)  # output is a dict containing the text
print(output['text'])
```

__Note__: For more use cases and models, see [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text).

# Use with transformers

## Installation
```bash
pip install transformers
pip install torch
```

## Usage

### Use with file
```python
from transformers import pipeline

pipe = pipeline('automatic-speech-recognition', 'shhossain/whisper-base-bn')

def transcribe(audio_path):
    # The pipeline returns a dict; the transcription is under the 'text' key
    return pipe(audio_path)['text']

audio_file = "test.wav"
print(transcribe(audio_file))
```
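
### Checking WER on your own transcripts

The Results section reports quality as word error rate (WER). As a rough illustration of what that metric measures, here is a minimal sketch that computes WER as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the sample sentences are hypothetical, and this is not the evaluation script used to produce the reported score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference transcript and model output (one word differs out of four)
reference = "আমি বাংলায় গান গাই"
hypothesis = "আমি বাংলা গান গাই"
print(word_error_rate(reference, hypothesis))  # 0.25
```

The hypothesis string could come from `transcribe(audio_file)` above. This sketch splits on whitespace only and applies no text normalization, so its numbers may differ from those of an established tool such as `jiwer` or the `wer` metric in Hugging Face `evaluate`.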