Add Fish Speech

#48
by lengyue233 - opened

Hi everyone,

We are thrilled to announce that we have open sourced our new text-to-speech model, Fish Speech 1, today! You can find the model and more details on our Hugging Face blog post: https://huggingface.co/blog/lengyue233/fish-speech-1.

We have prepared two demos for you to try out:

  • The medium pretrain demo, which excels at general speaking, can be found at Fish Audio.
  • The large SFT demo, which works particularly well on ACGN content, is available on Hugging Face Space.

To better understand our model's performance, we are eager to integrate the medium pretrain model into TTS Arena for evaluation. We believe this will provide valuable insights into how Fish Speech 1 compares to other state-of-the-art TTS models. If the TTS Arena team requires any assistance or support during the integration process, we are more than happy to provide any necessary resources or guidance.

Best regards,
The Fish Audio Team

TTS AGI org

Hi, congratulations on your launch!! Are there any plans to switch to an open source license?

Hi, Fish Speech is an open-source model. The code is available under the BSD-3-Clause license, and the model weights are released under the CC BY-NC-SA 4.0 license.
Feel free to use it for any non-commercial purposes.

TTS AGI org

Thanks! Are there any plans to release the weights under an open source license (see OSD)?

Currently, we don't have any plan to release the weights for commercial use.

We have a very strong release coming soon; it's close to ElevenLabs now. Some samples here:



We have a very strong release coming soon; it's close to ElevenLabs now. Some samples here:

With that kind of confident statement, I have to be honest here. While it is better than half of the current models in the Arena, I predict that it will score below StyleTTS and XTTS if added. Nowhere near ElevenLabs. It feels unstable, as in, it always has a slight stutter. 😕

Of course that is for the voting public to decide.
