pere committed on
Commit
3bc1038
1 Parent(s): 1c94076

updated template

  1. README.md +26 -0
README.md CHANGED
@@ -232,6 +232,32 @@ $ ./main -l no -m models/nb-medium-ggml-model.bin king.wav
  $ ./main -l no -m models/nb-medium-ggml-model-q5_0.bin king.wav
  ```
 
+ ### WhisperX and Speaker Diarization
+ Speaker diarization is a technique in natural language processing and automatic speech recognition that identifies and separates the different speakers in an audio recording. It segments the audio by who is speaking, which improves transcripts of meetings and phone calls. We find that [WhisperX](https://github.com/m-bain/whisperX) is the easiest way to use our models for diarizing speech. WhisperX also uses phoneme-based Wav2Vec models to improve the alignment of the timestamps, and as of December 2023 it natively supports the nb-wav2vec models. The diarization itself is currently done by [PyAnnote-audio](https://github.com/pyannote/pyannote-audio). This package has a fairly strict licence that requires you to agree to its user terms. Follow the instructions below.
+
+ ```bash
+ # Follow the install instructions on https://github.com/m-bain/whisperX
+ # Make sure you have a HuggingFace account and have agreed to the pyannote terms
+
+ # Log in (or supply the HF token on the command line)
+ huggingface-cli login
+
+ # Download a test file
+ wget -N https://github.com/NbAiLab/nb-whisper/raw/main/audio/knuthamsun.mp3
+
+ # Optional. If you get complaints about missing support for Norwegian, do:
+ pip uninstall whisperx && pip install git+https://github.com/m-bain/whisperx.git@8540ff5985fceee764acbed94f656063d7f56540
+
+ # Transcribe the test file. All transcripts will end up in the directory of the mp3 file
+ whisperx knuthamsun.mp3 --model NbAiLabBeta/nb-whisper-medium-semantic --language no --diarize
+
+ ```
+
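The comment about supplying the HF token on the command line can be sketched as a small wrapper function. This is only an illustration, not part of the commit: the function name and the `HF_TOKEN`, `OUT_DIR` and `DRY_RUN` variables are hypothetical, and it assumes the WhisperX CLI accepts `--hf_token` and `--output_dir` (check `whisperx --help` for your installed version).

```bash
# Hypothetical helper: run the diarization command from above without an
# interactive login. HF_TOKEN, OUT_DIR and DRY_RUN are illustrative names.
transcribe_with_diarization() {
  audio="$1"
  out_dir="${OUT_DIR:-transcripts}"

  # Same model and language as in the README command.
  set -- whisperx "$audio" \
    --model NbAiLabBeta/nb-whisper-medium-semantic \
    --language no \
    --diarize \
    --output_dir "$out_dir"

  # Assumption: --hf_token passes the Hugging Face token to pyannote.
  if [ -n "${HF_TOKEN:-}" ]; then
    set -- "$@" --hf_token "$HF_TOKEN"
  fi

  if [ -n "${DRY_RUN:-}" ]; then
    # Print the assembled command instead of running it.
    printf '%s\n' "$*"
  else
    mkdir -p "$out_dir"
    "$@"
  fi
}
```

Running `DRY_RUN=1 HF_TOKEN=<your token> transcribe_with_diarization knuthamsun.mp3` prints the assembled command, which is a cheap way to confirm the flags before starting a long transcription.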
+ You can also run WhisperX from Python. See the instructions on the [WhisperX homepage](https://github.com/m-bain/whisperX).
+
+
+
+
  ### API
  Instructions for accessing the models via a simple API are included in the demos under Spaces. Note that these demos are temporary and will only be available for a few weeks.