shhossain committed on
Commit
d570704
1 Parent(s): 66b5c77

Update README.md

  1. README.md +71 -0
README.md CHANGED
@@ -4,9 +4,80 @@ datasets:
  - mozilla-foundation/common_voice_11_0
  language:
  - en
+ - bn
  metrics:
  - wer
  library_name: transformers
  pipeline_tag: automatic-speech-recognition
  ---
+ ## Results
+ - WER 46
 
+ # Use with BanglaSpeech2Text
+ ## Installation
+ ```bash
+ pip install banglaspeech2text
+ ```
+ __Note__: Git and Git LFS must be installed. For details, see the BanglaSpeech2Text documentation [here](https://github.com/shhossain/BanglaSpeech2Text#download-git).
+
+
+ ## Usage
+
+ ### Use with file
+ ```python
+ from banglaspeech2text import Model
+
+ base_model = Model('whisper_base_bn_sifat')
+ base_model.load()  # Load the pipeline. The first call is slow because the model has to be downloaded.
+
+ audio_file = "test.wav"  # .wav, .mp3, .mp4, .ogg, etc.
+
+ print(base_model.recognize(audio_file))
+
+ ```
+ ### Use with SpeechRecognition
+ ```python
+ import speech_recognition as sr
+ from banglaspeech2text import Model, available_models
+
+ # Load a model
+ models = available_models()
+ model = models[0]  # select a model
+ model = Model(model)  # wrap the selected model
+ model.load()
+
+
+ r = sr.Recognizer()
+ with sr.Microphone() as source:
+     print("Say something!")
+     audio = r.listen(source)
+     output = model.recognize(audio)
+
+ print(output)  # output is a dict containing the text
+ print(output['text'])
+ ```
+
+ __Note__: For more use cases and models, see [BanglaSpeech2Text](https://github.com/shhossain/BanglaSpeech2Text).
+
+ # Use with transformers
+ ## Installation
+ ```bash
+ pip install transformers
+ pip install torch
+ ```
+
+ ## Usage
+
+ ### Use with file
+ ```python
+ from transformers import pipeline
+
+ pipe = pipeline('automatic-speech-recognition', model='shhossain/whisper-base-bn')
+
+ def transcribe(audio_path):
+     return pipe(audio_path)['text']
+
+ audio_file = "test.wav"
+
+ print(transcribe(audio_file))
+ ```
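
The `Results` section added in this commit reports `WER 46` without showing how such a figure is computed. As a minimal sketch, assuming the standard definition of word error rate (word-level edit distance divided by the number of reference words), the metric could be computed as below. The function name `word_error_rate` and the sample strings are hypothetical and are not part of BanglaSpeech2Text or transformers. The commit does not say which tool or test split produced its number, so this is not necessarily how the author measured it.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up strings for illustration only, not model output.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

The function returns a fraction, so if the reported `WER 46` is a percentage (an assumption, since the commit gives no unit), it would correspond to a return value of about 0.46 averaged over a test set. Whitespace splitting is the only tokenization applied here; any text normalization before scoring would change the result.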