speechbrainteam committed
Commit
edc9f92
1 Parent(s): f475678

Update README.md

Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -92,13 +92,14 @@ Sometimes it is useful to jointly visualize the VAD output with the input signal
 To do it:
 
 ```python
-upsampled_boundaries = VAD.upsample_boundaries(boundaries, audio_file)
-torchaudio.save('vad_final.wav', upsampled_boundaries.cpu(), sample_rate)
+import torchaudio
+upsampled_boundaries = VAD.upsample_boundaries(boundaries, 'pretrained_model_checkpoints/example_vad.wav')
+torchaudio.save('vad_final.wav', upsampled_boundaries.cpu(), 16000)
 ```
 
 This creates a "VAD signal" with the same dimensionality as the original signal.
 
-You can now open *vad_final.wav* and *speechbrain/vad_example.wav* with software like audacity to visualize them jointly.
+You can now open *vad_final.wav* and *pretrained_model_checkpoints/example_vad.wav* with software like audacity to visualize them jointly.
 
 
 ### VAD pipeline details
@@ -117,6 +118,7 @@ We designed the VAD such that you can have access to all of these steps (this mi
 ```python
 
 # 1- Let's compute frame-level posteriors first
+audio_file = 'pretrained_model_checkpoints/example_vad.wav'
 prob_chunks = VAD.get_speech_prob_file(audio_file)
 
 # 2- Let's apply a threshold on top of the posteriors
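For readers who prefer a programmatic view over Audacity, the two files produced by the snippet above can also be overlaid in a plot. This is a minimal sketch, not part of the commit: it assumes the file names used in the updated README and a mono signal, and uses matplotlib for display.

```python
import matplotlib.pyplot as plt
import torchaudio

# Load the original audio and the upsampled "VAD signal" saved above
signal, fs = torchaudio.load('pretrained_model_checkpoints/example_vad.wav')
vad_signal, _ = torchaudio.load('vad_final.wav')

# Overlay the waveform and the binary VAD curve on a single axis;
# both tensors have the same length, so they line up sample-by-sample
plt.plot(signal.squeeze(), label='waveform')
plt.plot(vad_signal.squeeze(), label='VAD output')
plt.legend()
plt.show()
```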
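The second hunk's context ends at the step-2 comment. For orientation only, here is a hedged sketch of how that step typically continues with the `speechbrain.pretrained` VAD interface; the method names are assumptions if your SpeechBrain version differs, and the commit itself does not touch these lines.

```python
# Assumed speechbrain.pretrained.VAD methods (not part of this commit):
# binarize the frame-level posteriors with activation/deactivation thresholds
prob_th = VAD.apply_threshold(prob_chunks).float()

# Derive candidate speech segment boundaries from the thresholded posteriors
boundaries = VAD.get_boundaries(prob_th)
```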