---
license: apache-2.0
language:
- en
- bn
metrics:
- wer
library_name: transformers
pipeline_tag: automatic-speech-recognition
---

## Results

- Word Error Rate (WER): 74

# Use with [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text)

## Test it in Google Colab

- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/shhossain/BanglaSpeech2Text/blob/main/banglaspeech2text_in_colab.ipynb)

## Installation

You can install the library using pip:

```bash
pip install banglaspeech2text
```
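To see which pre-trained models your installed version knows about, you can use the `available_models` helper that the gradio example further down also imports. A minimal sketch, assuming `available_models()` is callable and returns a printable listing:

```python
from banglaspeech2text import available_models

# Assumption: available_models() returns a printable listing of the
# model names the package can load; check your version if it differs.
print(available_models())
```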
## Usage

### Model Initialization

To use the library, initialize the Speech2Text class with the desired model. By default it uses the "base" model, but you can choose from the pre-trained sizes "tiny", "small", "base", "medium", or "large", or pass a Hugging Face model id directly, as in the example below:

```python
from banglaspeech2text import Speech2Text

stt = Speech2Text(model="shhossain/whisper-tiny-bn")
```
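If you would rather use one of the named sizes than a specific Hugging Face checkpoint, the same constructor takes the size string. A minimal sketch, assuming the size names listed above are valid `model` values:

```python
from banglaspeech2text import Speech2Text

# Assumption: the size names quoted above ("tiny" ... "large") are
# accepted directly as the model argument; "base" is the documented default.
stt_large = Speech2Text(model="large")
```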
### Transcribing Audio Files

You can transcribe an audio file by calling the transcribe method with the path to the file; it returns the transcribed text as a string. Here's an example:

```python
transcription = stt.transcribe("audio.wav")
print(transcription)
```
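Because transcribe takes a plain path and returns a string, transcribing a whole folder is just a loop. A minimal sketch, where the `clips/` directory of WAV files is illustrative:

```python
from pathlib import Path

# Hypothetical folder of recordings; replace with your own path.
for wav in sorted(Path("clips").glob("*.wav")):
    print(wav.name, "->", stt.transcribe(str(wav)))
```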
### Use with SpeechRecognition

You can use the [SpeechRecognition](https://pypi.org/project/SpeechRecognition/) package to capture audio from the microphone and transcribe it. Here's an example:

```python
import speech_recognition as sr
from banglaspeech2text import Speech2Text

stt = Speech2Text(model="shhossain/whisper-tiny-bn")

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
    output = stt.recognize(audio)

print(output)
```

### Use GPU

You can use a GPU for faster inference. Here's an example:

```python
stt = Speech2Text(model="shhossain/whisper-tiny-bn", use_gpu=True)
```
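Before enabling GPU inference, it is worth confirming that PyTorch can actually see a CUDA device. A minimal sketch using the standard `torch.cuda.is_available()` check:

```python
import torch
from banglaspeech2text import Speech2Text

# Enable GPU only when PyTorch reports a usable CUDA device.
stt = Speech2Text(
    model="shhossain/whisper-tiny-bn",
    use_gpu=torch.cuda.is_available(),
)
```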
### Advanced GPU Usage

For more fine-grained control, you can use the `device` or `device_map` parameter instead. Here's an example:

```python
stt = Speech2Text(model="shhossain/whisper-tiny-bn", device="cuda:0")
```

```python
stt = Speech2Text(model="shhossain/whisper-tiny-bn", device_map="auto")
```

__NOTE__: Read more about the [PyTorch device](https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device) argument.
### Instantly Check with gradio

You can instantly check the model with a [gradio](https://www.gradio.app/) interface. Here's an example:

```python
from banglaspeech2text import Speech2Text, available_models
import gradio as gr

stt = Speech2Text(model="shhossain/whisper-tiny-bn", use_gpu=True)

# share=True gives a public URL, so you can also open it on a mobile device
gr.Interface(
    fn=stt.transcribe,
    inputs=gr.Audio(source="microphone", type="filepath"),
    outputs="text",
).launch(share=True)
```

__Note__: For more use cases and models, see [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text).

# Use with transformers

## Installation

```bash
pip install transformers
pip install torch
```

## Usage

### Use with file

```python
from transformers import pipeline

pipe = pipeline('automatic-speech-recognition', 'shhossain/whisper-tiny-bn')

def transcribe(audio_path):
    return pipe(audio_path)['text']

audio_file = "test.wav"

print(transcribe(audio_file))
```
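Whisper models process audio in 30-second windows, so for recordings longer than that you can let the pipeline chunk the input. A minimal sketch using the pipeline's standard chunking options, where `long_audio.wav` is a placeholder path and `return_timestamps=True` is optional:

```python
from transformers import pipeline

# chunk_length_s splits long recordings into 30-second windows.
pipe = pipeline(
    'automatic-speech-recognition',
    'shhossain/whisper-tiny-bn',
    chunk_length_s=30,
)

result = pipe("long_audio.wav", return_timestamps=True)
print(result['text'])
for chunk in result['chunks']:
    print(chunk['timestamp'], chunk['text'])
```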