zongxiao/speecht5_tts_voxpopuli_nl_2 · Can't push (basically) the same model as TTS

Oct 9, 2023

Hey man, I'm trying to finish the audio course and I'm kind of stuck on the third exercise, the text-to-speech one. I think it's because the models I push get labeled as Text-to-Audio, but you somehow got it working for this model. It's clearly labeled as Text-to-Speech, so I wanted to ask: did you do anything different with this model? Have you run into the same issue? Do you have any advice for me?

Thanks in advance!

zongxiao

Owner Oct 9, 2023

edited Oct 9, 2023

I'm not sure I understand your problem. For the TTS task I tried "cmn_hans_cn", but the output is just noise (passed). So I guess I need to convert the Chinese text to pinyin first. Since the TTS task has no baseline_metric, I think just following https://huggingface.co/learn/audio-course/chapter6/fine-tuning will be OK. These are the kwargs I used:
kwargs = {
    "dataset_tags": "facebook/voxpopuli",
    "dataset": "VoxPopuli",  # a 'pretty' name for the training dataset
    "dataset_args": "config: nl, split: train",
    "language": "nl",
    "model_name": "SpeechT5 TTS Dutch",  # a 'pretty' name for your model
    "finetuned_from": "microsoft/speecht5_tts",
    "tasks": "text-to-speech",
    "tags": "text-to-speech",
}

Stopwolf

Oct 9, 2023

Thank you! It seems that besides setting the task to text-to-speech you also have to set tags to text-to-speech, otherwise the model gets tagged as Text-to-Audio and isn't counted for the course exercises.
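For anyone else hitting this, here is a rough sketch of a check I'd run before pushing. The helper names (tts_card_kwargs, missing_tts_fields) are made up for illustration and are not part of transformers, and the idea that the Hub label depends on "tags" is only what I observed in this thread, not something I found documented.

```python
TTS = "text-to-speech"


def tts_card_kwargs(language, model_name, dataset_config):
    """Build model-card kwargs like the ones above (hypothetical helper)."""
    return {
        "dataset_tags": "facebook/voxpopuli",
        "dataset": "VoxPopuli",
        "dataset_args": f"config: {dataset_config}, split: train",
        "language": language,
        "model_name": model_name,
        "finetuned_from": "microsoft/speecht5_tts",
        "tasks": TTS,
        "tags": TTS,  # assumption from this thread: needed for the Text-to-Speech label
    }


def missing_tts_fields(kwargs):
    """Return which of 'tasks' / 'tags' do not mention text-to-speech."""
    missing = []
    for key in ("tasks", "tags"):
        value = kwargs.get(key) or ""
        # Works for a plain string as well as a list of strings.
        if TTS not in value:
            missing.append(key)
    return missing


kwargs = tts_card_kwargs("nl", "SpeechT5 TTS Dutch", "nl")
problems = missing_tts_fields(kwargs)
if problems:
    print("Not set to text-to-speech:", ", ".join(problems))
else:
    print("Both tasks and tags are set to text-to-speech.")

# With a trained `trainer` from the course notebook, the push would then be:
# trainer.push_to_hub(**kwargs)
```

The push line is left commented out because it needs the trainer object from the fine-tuning notebook.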

But yeah, the models still output pure noise.