Pocket TTS ONNX (English)

This repository provides ONNX format conversions of Kyutai's Pocket TTS model (English).

These ONNX models were exported and optimized to power local, real-time voice generation in Core Dialog / NPC Talk Engine (Local TTS Assets), developed by Ignitive Labs.

While these files are tailored for the Core Dialog package, they are open-source. The model weights are licensed under CC-BY-4.0, and the conversion scripts and packaging are licensed under the MIT License. Anyone is free to download, use, and integrate these files into their own projects, provided they credit Kyutai Labs as required by CC-BY-4.0 and follow the acceptable use conditions set out in this document.

Model Components Included:

flow_lm_main.onnx / flow_lm_main_int8.onnx: Main flow language model graph.
flow_lm_flow.onnx / flow_lm_flow_int8.onnx: Flow step (iterative refinement).
mimi_decoder.onnx / mimi_decoder_int8.onnx: Mimi codec decoder (tokens to audio).
mimi_encoder.onnx / mimi_encoder_int8.onnx: Mimi codec encoder (audio to tokens).
text_conditioner.onnx / text_conditioner_int8.onnx: Text to embedding conditioner.
tokenizer.model: SentencePiece tokenizer.
bundle.json: Configuration metadata.
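As a minimal sketch of how a project might consume the component list above, the following Python checks that a local folder contains the expected files before loading them. The helper names (`expected_files`, `missing_files`, `load_sessions`) are hypothetical and not part of this repository. It also assumes that the `_int8` files are quantized counterparts of the unsuffixed ones, which the filenames suggest but this document does not state, and that ONNX Runtime is the inference backend.

```python
from pathlib import Path

# Component base names copied from the list above. Each one ships as
# <name>.onnx and <name>_int8.onnx.
COMPONENTS = [
    "flow_lm_main",
    "flow_lm_flow",
    "mimi_decoder",
    "mimi_encoder",
    "text_conditioner",
]
# Non-ONNX files shared by both variants.
EXTRA_FILES = ["tokenizer.model", "bundle.json"]


def expected_files(int8: bool = False) -> list[str]:
    """Return the file names needed for one variant (hypothetical helper)."""
    suffix = "_int8" if int8 else ""
    return [f"{name}{suffix}.onnx" for name in COMPONENTS] + EXTRA_FILES


def missing_files(model_dir, int8: bool = False) -> list[str]:
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [f for f in expected_files(int8) if not (root / f).is_file()]


def load_sessions(model_dir, int8: bool = False) -> dict:
    """Open one ONNX Runtime session per component (assumes onnxruntime is installed)."""
    import onnxruntime as ort  # third-party; imported lazily so the checks above work without it

    root = Path(model_dir)
    suffix = "_int8" if int8 else ""
    return {
        name: ort.InferenceSession(
            str(root / f"{name}{suffix}.onnx"),
            providers=["CPUExecutionProvider"],
        )
        for name in COMPONENTS
    }
```

Calling `missing_files("path/to/models", int8=True)` before `load_sessions` gives a clear error list instead of a failure partway through loading. How the sessions are chained into a full text-to-speech pipeline depends on the graph inputs and outputs, which can be inspected with each session's `get_inputs()` and `get_outputs()` methods.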

Attribution & Original Authors

This is a compiled/format-converted release of Pocket TTS weights developed and published by Kyutai Labs.

Original PyTorch Weights: kyutai/pocket-tts (CC-BY-4.0 / gated)
Original Repository: kyutai-labs/pocket-tts
Export Tool Used: KevinAHM/pocket-tts-onnx-export

License

Original Model Weights: Licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0). You must provide appropriate credit and attribution to the original creators, Kyutai Labs.
ONNX Conversion, Code, & Packaging: Licensed under the MIT License.

🛡️ Kyutai Acceptable Use Policy (AUP) & Ethical Guidelines

This model is capable of high-quality voice cloning from short audio samples. To prevent misuse, the use of this converted model is subject to Kyutai's strict ethical and safety conditions. By downloading or using these files, you agree not to use the model for the following prohibited activities:

Voice Impersonation / Unauthorized Cloning: Cloning or synthesizing any individual's voice without their explicit, lawful, and documented consent.
Misinformation & Deception: Generating fake news, deceptive content, fraudulent phone calls, or presenting AI-synthesized audio as a genuine recording of a real person or event.
Harmful or Unlawful Content: Generating defamatory, libelous, abusive, harassing, discriminatory, hateful, or privacy-invasive content.
Failure to Disclose: Representing synthetic audio output generated by this model as a real human voice to end-users without clearly disclosing that it is AI-generated.

Disclaimer of Liability & Affiliation

Format Conversion Only: Ignitive Labs has only performed a format conversion (PyTorch weights to ONNX) for optimization purposes. Ignitive Labs did not design, train, tune, or host the underlying AI algorithms, model weights, training data, or model logic.
No Association: Ignitive Labs is completely independent and has no association, partnership, or official affiliation with Kyutai Labs.
No Warranty & No Liability: Both Kyutai Labs (the original model creators) and Ignitive Labs (the compiler and uploader of these ONNX files) provide these files "AS IS", without warranties or conditions of any kind, either express or implied.
User Sole Responsibility: Both parties disclaim all liability, responsibility, and legal accountability for any outcomes, outputs, actions, or damages resulting from downloading, using, or distributing these converted files. The end-user assumes full responsibility for compliance with all voice recording, privacy, likeness, and intellectual property laws.
