shhossain committed on
Commit
d4909fe
1 Parent(s): 5a5ab73

Update README.md

Files changed (1)
  1. README.md +104 -18
README.md CHANGED
@@ -1,30 +1,116 @@
  ---
  license: apache-2.0
- base_model: openai/whisper-tiny
- tags:
- - generated_from_trainer
- metrics:
- - wer
- model-index:
- - name: whisper-tiny-bn
-   results: []
  language:
  - bn
  pipeline_tag: automatic-speech-recognition
  ---

- # whisper-tiny-bn

- This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the None dataset.
- It achieves the following results on the evaluation set:
- Loss: 0.4041
- Wer: 74.0213

- ### Framework versions

- Transformers 4.33.2
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3
 
  ---
  license: apache-2.0
  language:
+ - en
  - bn
+ metrics:
+ - wer
+ library_name: transformers
  pipeline_tag: automatic-speech-recognition
  ---
+ ## Results
+ - WER: 74.02 (evaluation set)
+
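WER (word error rate) counts the word-level substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. A minimal pure-Python sketch follows. The function name `word_error_rate` and the sample sentences are illustrative, and this is not necessarily the routine that produced the figure above, which may have applied its own text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (ref word missing from hyp)
                curr[j - 1] + 1,     # insertion (extra word in hyp)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substituted word out of four reference words
score = word_error_rate("আমি বাংলায় গান গাই", "আমি বাংলা গান গাই")
print(f"WER: {score * 100:.2f}")
```

Multiplying the ratio by 100 gives the percentage-style form in which WER is commonly reported.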
+ # Use with [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text)
+
+ ## Test it in Google Colab
+ - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/shhossain/BanglaSpeech2Text/blob/main/BanglaSpeech2Text_in_Colab.ipynb)
+
+ ## Installation
+ You can install the library using pip:
+
+ ```bash
+ pip install banglaspeech2text
+ ```
+
+ ## Usage
+ ### Model Initialization
+ To use the library, initialize the `Speech2Text` class with the desired model. By default, it uses the "base" model, but you can choose from several pre-trained models: "tiny", "base", "small", "medium", or "large". Here's an example:
+
+ ```python
+ from banglaspeech2text import Speech2Text
+
+ stt = Speech2Text(model="shhossain/whisper-tiny-bn")
+ ```
+
+ ### Transcribing Audio Files
+ To transcribe an audio file, call the `transcribe` method with the path to the file. It returns the transcribed text as a string. Here's an example:
+
+ ```python
+ transcription = stt.transcribe("audio.wav")
+ print(transcription)
+ ```
+
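Before transcribing, it can help to check what a WAV file actually contains. The sketch below uses only Python's standard `wave` module to report the sample rate, channel count, and duration. The helper name `describe_wav` and the generated `silence.wav` file are illustrative and not part of banglaspeech2text. Whisper models operate on 16 kHz audio; whether the library resamples other rates for you is not stated here, so treat the 16 kHz check as a precaution.

```python
import os
import tempfile
import wave


def describe_wav(path: str) -> dict:
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


# Write one second of 16 kHz mono silence so the example runs on its own
wav_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(wav_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)  # 16-bit samples
    wav.setframerate(16000)
    wav.writeframes(b"\x00\x00" * 16000)

info = describe_wav(wav_path)
print(info)
if info["sample_rate"] != 16000:
    print("Consider resampling to 16 kHz before transcription")
```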
+ ### Use with SpeechRecognition
+ You can use the [SpeechRecognition](https://pypi.org/project/SpeechRecognition/) package to capture audio from a microphone and transcribe it. Here's an example:
+ ```python
+ import speech_recognition as sr
+ from banglaspeech2text import Speech2Text
+
+ stt = Speech2Text(model="shhossain/whisper-tiny-bn")
+
+ r = sr.Recognizer()
+ with sr.Microphone() as source:
+     print("Say something!")
+     audio = r.listen(source)  # record until a pause is detected
+     output = stt.recognize(audio)
+
+ print(output)
+ ```
+
+ ### Use GPU
+ You can use a GPU for faster inference. Here's an example:
+ ```python
+ stt = Speech2Text(model="shhossain/whisper-tiny-bn", use_gpu=True)
+ ```
+ ### Advanced GPU Usage
+ For finer control over device placement, pass either the `device` or the `device_map` parameter. Here's an example:
+ ```python
+ stt = Speech2Text(model="shhossain/whisper-tiny-bn", device="cuda:0")
+ ```
+ ```python
+ stt = Speech2Text(model="shhossain/whisper-tiny-bn", device_map="auto")
+ ```
+ __NOTE__: Read more about [PyTorch devices](https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device).
+
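If the same script has to run on machines with and without a GPU, the device can be chosen at run time. This is a minimal sketch, assuming PyTorch may or may not be installed. `pick_device` is a hypothetical helper, and passing `"cpu"` as `device` to `Speech2Text` is an assumption based on the PyTorch device link above rather than something this README confirms.

```python
def pick_device() -> str:
    """Return "cuda:0" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU can be used
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")

# Assumed usage, based on the `device` parameter shown above:
# stt = Speech2Text(model="shhossain/whisper-tiny-bn", device=device)
```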
+ ### Instantly Check with Gradio
+ You can try the model right away in a Gradio web interface. Here's an example:
+ ```python
+ from banglaspeech2text import Speech2Text
+ import gradio as gr
+
+ stt = Speech2Text(model="shhossain/whisper-tiny-bn", use_gpu=True)
+
+ # share=True creates a public URL that you can also open on a mobile device
+ gr.Interface(
+     fn=stt.transcribe,
+     # Gradio 3.x syntax; Gradio 4+ expects sources=["microphone"] instead
+     inputs=gr.Audio(source="microphone", type="filepath"),
+     outputs="text",
+ ).launch(share=True)
+ ```
+
+ __Note__: For more use cases and models, see [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text).
+
+ # Use with transformers
+ ## Installation
+ ```bash
+ pip install transformers
+ pip install torch
+ ```

+ ## Usage

+ ### Use with file
+ ```python
+ from transformers import pipeline

+ pipe = pipeline("automatic-speech-recognition", model="shhossain/whisper-tiny-bn")

+ def transcribe(audio_path):
+     # The pipeline returns a dict; the transcript is under the "text" key
+     return pipe(audio_path)["text"]

+ audio_file = "test.wav"

+ print(transcribe(audio_file))
+ ```
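
To transcribe several recordings in one go, the `transcribe` function above can be applied to every WAV file in a folder. The sketch below keeps the file handling separate from the model so it can be tried without downloading anything. `transcribe_folder` and the stand-in `fake_transcribe` are illustrative names, not part of transformers.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict


def transcribe_folder(folder: str, transcribe_fn: Callable[[str], str]) -> Dict[str, str]:
    """Apply transcribe_fn to each .wav file in folder, in name order."""
    results = {}
    for wav_path in sorted(Path(folder).glob("*.wav")):
        results[wav_path.name] = transcribe_fn(str(wav_path))
    return results


def fake_transcribe(audio_path: str) -> str:
    # Stand-in for the real model call; pass `transcribe` from above instead
    return f"<transcript of {Path(audio_path).name}>"


# Demo on a temporary folder with two placeholder .wav files and one other file
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.wav", "a.wav", "notes.txt"):
        (Path(folder) / name).touch()
    transcripts = transcribe_folder(folder, fake_transcribe)

for name, text in transcripts.items():
    print(name, "->", text)
```

Passing the transcription function as an argument means the same loop works with either the transformers pipeline or any other callable that maps a file path to text.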