Licensing / F5-TTS PR for Model sharing
#2, opened by mame82
Hey folks, great work. This model greatly outperforms every attempt I made at German F5-TTS training with the financial resources of a private person. As your 8x H100 training seems to be funded by the BMBF (at least partially), would you be willing to:
- Add a clear license statement for the model (trained on a mix of Emilia and Mozilla Common Voice)
- File a PR against the legacy F5-TTS repo to add the model to shared.md
- If possible, file an additional PR or feature request against the origin repo to add BigVGAN to the infer-gradio interface for convenience. The authors recently added customizable model arch configs to the Gradio interface, yet "vocos" is still the default vocoder there, so the code has to be modified in order to use your BigVGAN models. See here for reference: https://github.com/SWivid/F5-TTS/blob/main/src/f5_tts/infer/infer_gradio.py#L56
Also, would it be possible to train a "small" German model on your end, i.e. {"dim": 768, "depth": 18, "heads": 12, "ff_mult": 2, "text_dim": 512, "conv_layers": 4} instead of {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}?
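To make the difference between the two configs quoted above easier to see, here is a small sketch that diffs them as plain Python dicts. The variable names `small` and `base` and the helper `config_diff` are just labels I picked for illustration; they are not taken from the F5-TTS code.

```python
# The two architecture configs quoted above, as plain Python dicts.
# "small" and "base" are illustrative labels, not official names.
small = {"dim": 768, "depth": 18, "heads": 12, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
base = {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}


def config_diff(a, b):
    """Return {key: (a_value, b_value)} for every key of `a` whose value differs in `b`."""
    return {k: (a[k], b.get(k)) for k in a if a[k] != b.get(k)}


diff = config_diff(small, base)
print(diff)  # {'dim': (768, 1024), 'depth': (18, 22), 'heads': (12, 16)}

# The per-head width (dim / heads) is identical in both configs.
head_dim_small = small["dim"] // small["heads"]
head_dim_base = base["dim"] // base["heads"]
print(head_dim_small, head_dim_base)  # 64 64
```

So only `dim`, `depth` and `heads` would change; `ff_mult`, `text_dim` and `conv_layers` stay the same, and the width per attention head is 64 in both cases.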