nl_NL-mls-medium voice output is weird

#13

by BramNH - opened Apr 8, 2024

Apr 8, 2024

Are the nl_NL models verified? The 7432-low and 5809-low voices sound weird, but are understandable.
The mls-medium model just produces gibberish. Am I doing something wrong? I tested them by manually installing piper in a Python venv and writing .wav files, and also by installing the Docker container and testing within Home Assistant.

synesthesiam

Rhasspy org Apr 16, 2024

Those models were trained from audio books, and so they perform poorly with shorter sentences. They are also VERY sensitive to punctuation. Some things that help:

Always use a period at the end of your phrase
Use 0.333 for noise-scale and noise-scale-w
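The two tips above can be applied from a small wrapper. This is only a sketch: it assumes the `piper` executable is on your PATH and accepts `--model`, `--output_file`, `--noise_scale` and `--noise_w` (check `piper --help` for your version, since flag spelling has varied). The helper names and the model path in the usage note are hypothetical.

```python
import subprocess


def ensure_final_period(text: str) -> str:
    """Append a period unless the phrase already ends in . ! or ?"""
    text = text.strip()
    if text and text[-1] not in ".!?":
        text += "."
    return text


def build_piper_command(model_path: str, output_path: str,
                        noise_scale: float = 0.333,
                        noise_w: float = 0.333) -> list[str]:
    """Build the argument list; flag names are assumed, not verified here."""
    return [
        "piper",
        "--model", model_path,
        "--output_file", output_path,
        "--noise_scale", str(noise_scale),
        "--noise_w", str(noise_w),
    ]


def synthesize(text: str, model_path: str, output_path: str) -> None:
    """Send the punctuated phrase to piper on stdin and write a .wav file."""
    subprocess.run(
        build_piper_command(model_path, output_path),
        input=ensure_final_period(text).encode("utf-8"),
        check=True,
    )
```

For example, `synthesize("Hallo wereld", "nl_NL-mls-medium.onnx", "out.wav")` would send "Hallo wereld." to the model, assuming that file exists locally.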

In the Piper sample generator, we have a method for correcting the short sentence problem that hasn't made it into Piper itself yet. We basically just repeat the phrase over and over, and then pull out the audio of the last spoken instance.
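A rough sketch of that repeat-and-extract idea follows. It is not the sample generator's actual code: how the last instance is located is not described here, so this version assumes a simple amplitude threshold over 16-bit PCM samples and cuts after the last sufficiently long quiet gap. The function names, `threshold` and `min_gap` are all illustrative and would need tuning on real audio.

```python
def repeat_phrase(text: str, times: int = 3) -> str:
    """Repeat the phrase so the model sees a longer input."""
    phrase = text.strip()
    if phrase and phrase[-1] not in ".!?":
        phrase += "."
    return " ".join([phrase] * times)


def last_spoken_segment(samples: list[int], threshold: int = 500,
                        min_gap: int = 2000) -> list[int]:
    """Return the audio after the last quiet run of at least min_gap samples.

    Assumption: repetitions are separated by near-silence, which real
    output may not guarantee.
    """
    # Drop trailing silence so it is not mistaken for the separating gap.
    end = len(samples)
    while end > 0 and abs(samples[end - 1]) <= threshold:
        end -= 1

    # Walk backwards, counting consecutive quiet samples.
    quiet = 0
    for i in range(end - 1, -1, -1):
        if abs(samples[i]) <= threshold:
            quiet += 1
            if quiet >= min_gap:
                # The quiet run spans i .. i + quiet - 1; speech starts after it.
                return samples[i + quiet:end]
        else:
            quiet = 0

    # No gap found: fall back to the whole (trimmed) clip.
    return samples[:end]
```

In use, you would synthesize `repeat_phrase("Hallo")`, read the resulting samples (for example with the standard `wave` module), and keep only `last_spoken_segment(samples)`.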

What we really need is more Dutch audio datasets with people reading specific Dutch phrases.

BramNH

Apr 16, 2024

I only tested with shorter sentences. Longer sentences do indeed produce something intelligible, but the speech is very slow.

I assume that retraining the model will not make it perfect and that more Dutch audio datasets are simply required. Could you provide links to where I can help provide these datasets?

For now I will stick to the Belgian Dutch models, those are working fine!

synesthesiam

Rhasspy org Apr 16, 2024

If you're interested in contributing (or know someone who is), send me an e-mail at voice@nabucasa.com and I can get you a login code for the contribution website. Another option is to install Piper Recording Studio locally and record a dataset.

Thanks!

m0nsky

May 11, 2024

Just ran into the same issue. I've tried them with 0.333 for both noise-scale and noise-scale-w (and verified that the settings are actually being applied!) but no luck. The output doesn't sound like Dutch (or any language at all), except for the very long sample sentence.

I will also stick to the nl_BE voices for now, which are working great. Sadly, I don't have any datasets to contribute for nl_NL. I hope to see a working nl_NL voice in the future.
