Pocket TTS ONNX (English)

This repository provides ONNX format conversions of Kyutai's Pocket TTS model (English).

These ONNX models were converted and optimized to power local, real-time AI voice generation in Core Dialog / NPC Talk Engine (Local TTS Assets), developed by Ignitive Labs.

While these files are tailored for the Core Dialog package, they are open-source. The model weights are licensed under CC-BY-4.0, and the conversion scripts and packaging structure are licensed under MIT. Anyone is free to download, use, and integrate these files into their own projects, subject to attribution.


Model Components Included:

  • flow_lm_main.onnx / flow_lm_main_int8.onnx: Main flow language model graph.
  • flow_lm_flow.onnx / flow_lm_flow_int8.onnx: Flow step (iterative refinement).
  • mimi_decoder.onnx / mimi_decoder_int8.onnx: Mimi codec decoder (tokens to audio).
  • mimi_encoder.onnx / mimi_encoder_int8.onnx: Mimi codec encoder (audio to tokens).
  • text_conditioner.onnx / text_conditioner_int8.onnx: Text to embedding conditioner.
  • tokenizer.model: SentencePiece tokenizer.
  • bundle.json: Configuration metadata.
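
As a rough illustration of how the files listed above might be checked and inspected before use, here is a minimal Python sketch. The helper names (`component_files`, `missing_files`, `describe_graphs`) are hypothetical and not part of this repository. The sketch assumes the `_int8` files are drop-in variants of the same graphs, and that ONNX Runtime is installed for the optional inspection step. It does not run synthesis; the input and output names of each graph are read from the files rather than assumed.

```python
from pathlib import Path

# Graph base names, taken from the component list above.
GRAPHS = [
    "flow_lm_main",
    "flow_lm_flow",
    "mimi_decoder",
    "mimi_encoder",
    "text_conditioner",
]
# Non-graph assets shipped alongside the ONNX files.
EXTRA_FILES = ["tokenizer.model", "bundle.json"]


def component_files(int8: bool = False) -> list[str]:
    """Return the expected file names for one variant (default or _int8)."""
    suffix = "_int8" if int8 else ""
    return [f"{name}{suffix}.onnx" for name in GRAPHS] + EXTRA_FILES


def missing_files(model_dir: str, int8: bool = False) -> list[str]:
    """List the expected files that are not present in model_dir."""
    root = Path(model_dir)
    return [f for f in component_files(int8) if not (root / f).is_file()]


def describe_graphs(model_dir: str, int8: bool = False) -> dict:
    """Open each graph with ONNX Runtime and report its input/output names.

    Needs `pip install onnxruntime`. The import is local so the two helpers
    above work without it.
    """
    import onnxruntime as ort

    info = {}
    for fname in component_files(int8):
        if not fname.endswith(".onnx"):
            continue
        session = ort.InferenceSession(
            str(Path(model_dir) / fname),
            providers=["CPUExecutionProvider"],
        )
        info[fname] = {
            "inputs": [(i.name, i.shape) for i in session.get_inputs()],
            "outputs": [o.name for o in session.get_outputs()],
        }
    return info
```

Calling `missing_files("path/to/download")` first gives a quick check that a download is complete; `describe_graphs` can then be used to see which tensors each graph expects before wiring them together.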

Attribution & Original Authors

This is a format-converted release of the Pocket TTS weights, which were developed and published by Kyutai Labs.

License

The model weights are licensed under CC-BY-4.0. The conversion scripts and packaging structure are licensed under MIT.


🛡️ Kyutai Acceptable Use Policy (AUP) & Ethical Guidelines

This model is capable of high-quality voice cloning from short audio samples. To prevent misuse, the use of this converted model is subject to Kyutai's strict ethical and safety conditions. By downloading or using these files, you agree not to use the model for the following prohibited activities:

  1. Voice Impersonation / Unauthorized Cloning: Cloning or synthesizing any individual's voice without their explicit, lawful, and documented consent.
  2. Misinformation & Deception: Generating fake news, deceptive content, fraudulent phone calls, or presenting AI-synthesized audio as a genuine recording of a real person or event.
  3. Harmful or Unlawful Content: Generating defamatory, libelous, abusive, harassing, discriminatory, hateful, or privacy-invasive content.
  4. Failure to Disclose: Representing synthetic audio output generated by this model as a real human voice to end-users without clearly disclosing that it is AI-generated.

Liability Disclaimer & Disclaimer of Affiliation

  • Format Conversion Only: Ignitive Labs has only performed a format conversion (from PyTorch to ONNX) for optimization purposes. Ignitive Labs did not design, train, tune, or host the underlying AI algorithms, model weights, training data, or model logic.
  • No Association: Ignitive Labs is completely independent and has no association, partnership, or official affiliation with Kyutai Labs.
  • No Warranty & No Liability: Both Kyutai Labs (the original model creators) and Ignitive Labs (the compiler and uploader of these ONNX files) provide these files "AS IS", without warranties or conditions of any kind, either express or implied.
  • User Sole Responsibility: Both parties disclaim all liability, responsibility, and legal accountability for any outcomes, outputs, actions, or damages resulting from downloading, using, or distributing these converted files. The end-user assumes full responsibility for compliance with all voice recording, privacy, likeness, and intellectual property laws.