sivan22 committed
Commit b7bc7e4
Parent: 38bec15

Update README.md

Files changed (1): README.md (+5, -61)
README.md CHANGED
@@ -134,75 +134,19 @@ It is a 1550M-parameter multi-lingual ASR solution.
 To transcribe audio samples, the model has to be used alongside a [`WhisperProcessor`](https://huggingface.co/docs/transformers/model_doc/whisper#transformers.WhisperProcessor).
 
 ```python
- import librosa
- import torch
- from transformers import WhisperProcessor, WhisperForConditionalGeneration
-
- SAMPLING_RATE = 16000
-
- has_cuda = torch.cuda.is_available()
- model_path = 'ivrit-ai/whisper-large-v2-tuned'
-
- model = WhisperForConditionalGeneration.from_pretrained(model_path)
- if has_cuda:
-     model.to('cuda:0')
-
- processor = WhisperProcessor.from_pretrained(model_path)
-
- # `entry` is assumed to be an item from an existing dataset.
- # Alternatively, the audio can be loaded from a file.
- audio_resample = librosa.resample(entry['audio']['array'], orig_sr=entry['audio']['sampling_rate'], target_sr=SAMPLING_RATE)
-
- input_features = processor(audio_resample, sampling_rate=SAMPLING_RATE, return_tensors="pt").input_features
- if has_cuda:
-     input_features = input_features.to('cuda:0')
-
- predicted_ids = model.generate(input_features, language='he', num_beams=5)
- transcript = processor.batch_decode(predicted_ids, skip_special_tokens=True)
-
- print(f'Transcript: {transcript[0]}')
+ from faster_whisper import WhisperModel
+
+ model = WhisperModel("sivan22/faster-whisper-ivrit-ai-whisper-large-v2-tuned")
+
+ segments, info = model.transcribe("audio.mp3")
+ for segment in segments:
+     print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
  ```
 
 ## Evaluation
 
 You can use the [evaluate_model.py](https://github.com/yairl/ivrit.ai/blob/master/evaluate_model.py) reference on GitHub to evaluate the model's quality.
 
- ## Long-Form Transcription
-
- The Whisper model is intrinsically designed to work on audio samples of up to 30s in duration. However, by using a chunking
- algorithm, it can be used to transcribe audio samples of arbitrary length. This is possible through the Transformers
- [`pipeline`](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.AutomaticSpeechRecognitionPipeline)
- method. Chunking is enabled by setting `chunk_length_s=30` when instantiating the pipeline. With chunking enabled, the pipeline
- can be run with batched inference. It can also be extended to predict sequence-level timestamps by passing `return_timestamps=True`:
-
- ```python
- >>> import torch
- >>> from transformers import pipeline
- >>> from datasets import load_dataset
-
- >>> device = "cuda:0" if torch.cuda.is_available() else "cpu"
-
- >>> pipe = pipeline(
- ...     "automatic-speech-recognition",
- ...     model="ivrit-ai/whisper-large-v2-tuned",
- ...     chunk_length_s=30,
- ...     device=device,
- ... )
-
- >>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
- >>> sample = ds[0]["audio"]
-
- >>> prediction = pipe(sample.copy(), batch_size=8)["text"]
- " Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel."
-
- >>> # we can also return timestamps for the predictions
- >>> prediction = pipe(sample.copy(), batch_size=8, return_timestamps=True)["chunks"]
- [{'text': ' Mr. Quilter is the apostle of the middle classes and we are glad to welcome his gospel.',
-   'timestamp': (0.0, 5.44)}]
- ```
-
- Refer to the blog post [ASR Chunking](https://huggingface.co/blog/asr-chunking) for more details on the chunking algorithm.
-
-
 
 ### BibTeX entry and citation info

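The snippet this commit adds runs on CPU with default settings. As a variant (a sketch, assuming a CUDA-capable machine; `device`, `compute_type`, `language`, and `beam_size` are standard faster-whisper arguments, not part of this commit), the same model can be loaded on GPU and given a Hebrew language hint:

```python
from faster_whisper import WhisperModel

# Load the CTranslate2 model on GPU with 16-bit weights;
# switch to device="cpu" when no CUDA device is available.
model = WhisperModel(
    "sivan22/faster-whisper-ivrit-ai-whisper-large-v2-tuned",
    device="cuda",
    compute_type="float16",
)

# language="he" skips Whisper's language-detection pass; beam_size=5 mirrors
# the num_beams=5 used by the removed transformers-based snippet.
segments, info = model.transcribe("audio.mp3", language="he", beam_size=5)
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```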
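For the Evaluation section above: ASR quality is conventionally reported as word error rate (WER). A minimal sketch of that metric using the third-party `jiwer` package (an assumption for illustration; not necessarily how the referenced evaluate_model.py computes its scores):

```python
from jiwer import wer  # pip install jiwer

# Hypothetical reference/hypothesis pair; in practice the reference is a
# ground-truth transcript and the hypothesis is the model's output for the same audio.
reference = "shalom and welcome to the weekly podcast"
hypothesis = "shalom and welcome to the weekly podcasts"

print(f"WER: {wer(reference, hypothesis):.3f}")  # 1 substitution / 7 words ≈ 0.143
```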