Is it possible to silence the non-verbal parts of an audio file?

#116
by rdx2k7 - opened

Hi,

Basically, any time I am not talking, the audio is either silent, contains some other noise, or contains throat clearing. My recording is free of background noise, so the speech is quite clear. I want to keep only the verbal parts without changing the audio length, as it's synced to video.

Is there any tool or API that can do this? I tried a few online splitter tools, but they failed to strip the throat clearing out of the spoken parts.

I thought maybe I could use the Whisper API here to detect the timestamps where there is speech and silence everything else. Is that feasible?
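To make the idea concrete, here is a minimal sketch of the masking step only, assuming the speech intervals are already available as (start, end) pairs in seconds. The function name `silence_outside_segments` and the `pad` parameter are made up for illustration; they are not part of Whisper or any other library. The output has the same number of samples as the input, so it would stay in sync with the video.

```python
def silence_outside_segments(samples, sample_rate, segments, pad=0.0):
    """Return a copy of `samples` with everything outside `segments` zeroed.

    samples:     sequence of mono sample values (e.g. 16-bit PCM integers)
    sample_rate: samples per second
    segments:    iterable of (start_sec, end_sec) speech intervals
    pad:         hypothetical safety margin, in seconds, kept around each interval
    """
    n = len(samples)
    out = [0] * n  # start fully silent, same length as the input
    for start, end in segments:
        # Convert seconds to sample indices and clamp to the valid range.
        lo = max(0, int((start - pad) * sample_rate))
        hi = min(n, int((end + pad) * sample_rate))
        if hi > lo:
            out[lo:hi] = samples[lo:hi]  # copy the speech back in
    return out


# Toy example: 5 seconds of "audio" at 2 samples per second.
samples = list(range(1, 11))
speech = [(1.0, 2.0), (4.0, 4.5)]
masked = silence_outside_segments(samples, 2, speech)
```

With the open-source `whisper` package, the intervals could probably be taken from the `start` and `end` fields of each entry in `result["segments"]` returned by `model.transcribe(...)`. One caveat: a segment may span a throat clearing that sits between two words, so segment-level timestamps alone might not be fine-grained enough. For 80 hours of audio a NumPy array would be much faster than a Python list, but the logic would be the same.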

It's about 80 hours of audio (~200 files).

I attached a very small sample if you want to test it:

I tried this code, which uses the original Whisper API, on this audio, but it didn't silence the throat clearing:
https://paste.ofcode.org/Gc9MUy83K9UHATUHPDVyZ4

Thanks a lot in advance.
