Add Kokoro, the #1 🥇 TTS Model in TTS-Spaces-Arena with only 82M params
Hello, I'd like to request that https://hf.co/spaces/hexgrad/Kokoro-TTS be added to this Arena.
Kokoro is only 82M params. The weights are currently private but its StyleTTS2 architecture is open.
At the time of this post, Kokoro is internally versioned at v0.19 (checkpoint from 22 Nov 2024) and ranks 🥇 on @Pendrokar's https://hf.co/spaces/Pendrokar/TTS-Spaces-Arena over:
2. Microsoft's EdgeTTS (? params)
3. XTTS v2 (467M params)
4. MetaVoice-1B (1B params)
5. Parler Mini (880M params)
At v0.19, Kokoro might not be as flexible as some of these larger models in voice cloning or language support (yet), but much like an NBA 3-point specialist (e.g. Ray Allen, Kyle Korver), Kokoro really excels at its strengths: precise English speech that earns a high Elo.
I understand everyone wants their TTS model listed in this Arena. But Kokoro stands out from the rest since it is already a proven contender in another Arena, does more with less, and can be accessed immediately via a semi-private Gradio API.
Feel free to DM @rzvzn on Discord to coordinate. I have also DM'd @mrfakename.
"Kokoro is only 82M params. The weights are currently private but its StyleTTS2 architecture is open."
@lengyue233 Yes. Feel free to audit the inference code in https://hf.co/spaces/hexgrad/Kokoro-TTS/tree/main
- The weights are loaded starting at Line 20 in app.py: https://hf.co/spaces/hexgrad/Kokoro-TTS/blob/main/app.py#L20
- The param count assert is on Line 34: https://hf.co/spaces/hexgrad/Kokoro-TTS/blob/main/app.py#L34
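For anyone unfamiliar with this kind of check, here is a minimal sketch of what a param count assert can look like. It is not the code from app.py: the helper names `count_params` and `assert_param_budget`, the tolerance, and the toy layer shapes are all made up for illustration. It works on plain shape tuples so it runs without any ML library.

```python
from math import prod


def count_params(shapes):
    """Sum the number of scalar values across all tensors.

    `shapes` maps a tensor name to its shape tuple. With a PyTorch model the
    equivalent count would be sum(p.numel() for p in model.parameters()).
    """
    return sum(prod(shape) for shape in shapes.values())


def assert_param_budget(shapes, expected_millions, tolerance_millions=0.5):
    """Fail loudly if the total param count drifts from the advertised size."""
    total = count_params(shapes)
    millions = total / 1e6
    assert abs(millions - expected_millions) <= tolerance_millions, (
        f"expected ~{expected_millions}M params, got {millions:.3f}M"
    )
    return total


# Hypothetical toy shapes for illustration only, NOT Kokoro's real layers.
toy_shapes = {
    "embedding.weight": (178, 512),
    "encoder.linear.weight": (512, 512),
    "encoder.linear.bias": (512,),
}

total = assert_param_budget(toy_shapes, expected_millions=0.35, tolerance_millions=0.01)
print(f"{total:,} params")
```

An assert like this, run right after the weights are loaded, lets a reader confirm the advertised size by reading the source rather than taking it on trust.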
@mrfakename @reach-vb @Steveeeeeeen
Edit: this particular voice is actually v0.22x, still a WIP and a bit shaky, but happy to submit anything from v0.19 and up, whatever gets me in the door.
Hey!
Thanks for notifying us. We will add the model alongside the others requested in the coming days or weeks.