phoonnx-vocoders

Neural vocoders (mel → waveform) for two-stage phoonnx voices (GlowTTS, Matcha-TTS, OptiSpeech). One vocoder per subfolder; each ships a vocoder.json declaring its vocoder_type and any mel-preprocessing flags (e.g. stats_norm with mel_mean/mel_std). Part of the phoonnx community mirror.
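As a rough illustration of how a vocoder.json with stats_norm might be consumed, the sketch below parses a config and normalizes a mel spectrogram per bin. Only the field names vocoder_type, stats_norm, mel_mean and mel_std come from this card. The example values, the assumption that mel_mean/mel_std are per-bin lists, the (mel - mean) / std direction, and the helper names load_vocoder_config and preprocess_mel are all hypothetical; check the actual vocoder.json and the phoonnx code before relying on any of it.

```python
import json


def load_vocoder_config(text):
    """Parse the contents of a vocoder.json file into a dict."""
    return json.loads(text)


def preprocess_mel(mel, config):
    """Normalize a mel spectrogram when the config sets stats_norm.

    mel is a list of frames, each frame a list of n_mels floats.
    Assumption: mel_mean and mel_std hold one value per mel bin and the
    vocoder expects (mel - mean) / std.
    """
    if not config.get("stats_norm", False):
        return mel  # no preprocessing requested
    mean = config["mel_mean"]
    std = config["mel_std"]
    return [
        [(value - m) / s for value, m, s in zip(frame, mean, std)]
        for frame in mel
    ]


# Hypothetical vocoder.json contents with a toy 2-bin mel, for illustration only.
example_json = (
    '{"vocoder_type": "melgan", "stats_norm": true, '
    '"mel_mean": [1.0, 2.0], "mel_std": [2.0, 4.0]}'
)
config = load_vocoder_config(example_json)
normalized = preprocess_mel([[3.0, 6.0], [1.0, 2.0]], config)
```

A config without stats_norm passes the mel through unchanged, which is the behavior one would expect for the vocoders in this repository that are not marked stats-norm.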

| subfolder | type | source |
|---|---|---|
| wavenext-22khz | wavenext (raw) | BSC-LT/wavenext-mel (apache-2.0) |
| alvocat-vocos-22khz | vocos (ISTFT) | projecte-aina/alvocat-vocos-22khz (cc-by-nc-4.0) |
| vocos-mel-22khz-univ | vocos (universal) | BSC-LT/vocos-mel-22khz |
| larynx-hifigan-vctk-small | hifigan | rhasspy/larynx (MIT) |
| tr-common-voice-hifigan | hifigan | coqui-ai/TTS |
| be-common-voice-hifigan | hifigan | coqui-ai/TTS |
| en-ljspeech-multiband-melgan | melgan (stats-norm) | coqui-ai/TTS |
| uk-mai-multiband-melgan | melgan (stats-norm) | coqui-ai/TTS |

A vocoder only works with an acoustic model whose mel features match it (sample rate, FFT/hop, n_mels, fmin/fmax, normalization). See the vocoder guide.
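The matching requirement above can be checked mechanically before pairing a model with a vocoder. The following is a minimal sketch, not part of phoonnx: the key names (sample_rate, n_fft, hop_length, n_mels, fmin, fmax, normalization), the helper mel_mismatches, and all example values are assumptions chosen to mirror the parameters listed in the paragraph.

```python
# Hypothetical key names mirroring the parameters listed above.
MEL_KEYS = (
    "sample_rate",
    "n_fft",
    "hop_length",
    "n_mels",
    "fmin",
    "fmax",
    "normalization",
)


def mel_mismatches(acoustic, vocoder):
    """Return {key: (acoustic_value, vocoder_value)} for each differing parameter.

    An empty dict means the two mel configurations agree on every checked key.
    """
    return {
        key: (acoustic.get(key), vocoder.get(key))
        for key in MEL_KEYS
        if acoustic.get(key) != vocoder.get(key)
    }


# Illustrative values only; real settings come from the model and vocoder configs.
acoustic_mel = {
    "sample_rate": 22050, "n_fft": 1024, "hop_length": 256,
    "n_mels": 80, "fmin": 0, "fmax": 8000, "normalization": "log",
}
vocoder_mel = dict(acoustic_mel, fmax=11025, normalization="stats_norm")

problems = mel_mismatches(acoustic_mel, vocoder_mel)
```

Reporting every mismatch at once, rather than stopping at the first, makes it easier to see whether a pairing is nearly compatible or belongs to a different mel domain altogether.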

alvocat-vocos-22khz is Vocos fine-tuned on Catalan (Matxa); vocos-mel-22khz-univ is a universal Vocos for any model that produces HiFi-GAN-style 80-bin mels (Mixer-TTS etc.). The two use different mel domains, so they are not interchangeable.
