Generates garbage output (can verify when you decode it)
I noticed that with the bad generations, if you decode the audio from the nzb the tool creates, the result is garbage.
I've decoded other nzbs fine, but the ones created from this tool all have this sound.
Are there specific audio formats required for the uploaded wav files for this tool?
I tried 32-bit float, and 16-bit and 24-bit PCM. All three generate different types of garbage, so this tool must require a specific bit depth / format that isn't mentioned anywhere.
I see what you mean. Wow, I guess I should have tested more before pushing that. I didn't know gradio basically always exports audio as 16-bit when you use an audio input element.
It's been fixed by casting from int16 to float32 and dividing by 32767.0, and by making sure the audio gets flattened whether the channel axis is 0 or 1.
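For anyone hitting the same thing elsewhere, here is a rough sketch of that kind of fix, not the actual code from this Space. The helper name `to_mono_float32` is made up for illustration, and it assumes the audio arrives as a NumPy array that is either 1-D or 2-D with far more samples than channels. Averaging the channels down to mono is also an assumption about what "flattens" should mean.

```python
import numpy as np


def to_mono_float32(audio: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return mono float32 audio in roughly [-1, 1]."""
    if audio.dtype == np.int16:
        # The cast described above: int16 -> float32, scaled by 32767.0.
        audio = audio.astype(np.float32) / 32767.0
    else:
        # Assumption: anything else is already float audio in [-1, 1].
        audio = audio.astype(np.float32)

    if audio.ndim == 2:
        # Assumption: the shorter axis is the channel axis, so both
        # (samples, channels) and (channels, samples) layouts work.
        channel_axis = 0 if audio.shape[0] < audio.shape[1] else 1
        audio = audio.mean(axis=channel_axis)

    return audio
```

With a gradio audio input set to return NumPy data, the component value is a `(sample_rate, data)` tuple, so something like `to_mono_float32(data)` would go right before the audio is handed to the model.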