Add Kokoro, the #1 🥇 TTS Model in TTS-Spaces-Arena 🏆 with only 82M params 🤏

#70
by hexgrad - opened

Hello, I'd like to request that https://hf.co/spaces/hexgrad/Kokoro-TTS is added to this Arena.

Kokoro is only 82M params. The weights are currently private but its StyleTTS2 architecture is open.

At the time of this post, Kokoro is internally versioned at v0.19 (checkpoint from 22 Nov 2024), and ranks 🥇 on @Pendrokar's https://hf.co/spaces/Pendrokar/TTS-Spaces-Arena over:
2. Microsoft's EdgeTTS (? params)
3. XTTS v2 (467M params)
4. MetaVoice-1B (1B params)
5. Parler Mini (880M params)

At v0.19, Kokoro might not be as flexible as some of these larger models in voice cloning or language support (yet), but much like an NBA 3-point specialist (e.g. Ray Allen, Kyle Korver), Kokoro excels at its strengths, delivering precise, high-Elo English speech.
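To put "does more with less" in rough numbers, here is a back-of-envelope weight-footprint comparison for the ranked models, assuming fp16 storage (2 bytes per parameter); EdgeTTS is omitted since its size is unknown, and the counts are the ones quoted above:

```python
# Rough fp16 weight footprints for the models ranked in this thread.
# Assumes 2 bytes per parameter; ignores activations, KV caches, etc.
PARAMS = {
    "Kokoro": 82e6,
    "XTTS v2": 467e6,
    "MetaVoice-1B": 1e9,
    "Parler Mini": 880e6,
}

def fp16_mb(n_params: float) -> float:
    """Megabytes needed to store n_params weights at 2 bytes each."""
    return n_params * 2 / 1e6

for name, n in PARAMS.items():
    print(f"{name}: ~{fp16_mb(n):.0f} MB")
# Kokoro comes out around 164 MB vs. ~2000 MB for MetaVoice-1B.
```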

I understand everyone wants their TTS model listed in this Arena. But Kokoro stands out from the rest since it is already a proven contender in another Arena, does more with less, and can be accessed immediately via a semi-private Gradio API.
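For anyone wanting to try the Gradio API access mentioned above, a minimal sketch with `gradio_client` might look like the following. The Space name comes from this thread, but the endpoint name and argument order are assumptions, not the Space's documented API; check `client.view_api()` for the real signature first.

```python
# Hedged sketch: calling a Gradio Space programmatically via gradio_client.
# The api_name ("/generate") and argument order are guesses.

def synthesize(text: str, space: str = "hexgrad/Kokoro-TTS") -> str:
    """Return the local path of the audio file produced by the Space."""
    # Imported lazily so the sketch can be read offline;
    # requires `pip install gradio_client`.
    from gradio_client import Client

    client = Client(space)  # negotiates the Space's API over HTTP
    return client.predict(text, api_name="/generate")  # endpoint is a guess

if __name__ == "__main__":
    print(synthesize("Kokoro is only 82M params."))
```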

Feel free to DM @rzvzn on Discord to coordinate. I have also DM'd @mrfakename.


"Kokoro is only 82M params. The weights are currently private but its StyleTTS2 architecture is open."

"Kokoro is only 82M params. The weights are currently private but its StyleTTS2 architecture is open."

@lengyue233 Yes 😊 Feel free to audit the inference code in https://hf.co/spaces/hexgrad/Kokoro-TTS/tree/main

@mrfakename @reach-vb @Steveeeeeeen

Edit: this particular voice is actually v0.22x, still a WIP and a bit shaky, but happy to submit anything from v0.19 and up, whatever gets me in the door.

Hey!

Thanks for notifying us. We will add the model alongside the others requested in the coming days or weeks.

Kokoro v0.19 has been open sourced under Apache 2.0 at https://hf.co/hexgrad/Kokoro-82M along with the voicepack(s) used in the other Arena.

Leaderboard maintainers are free to run the model locally on their private inference server if they wish. The provided inference code is a bit hacky, but it should get the job done.
