How to use soniqo/Supertonic-3-ONNX-INT8 with Supertonic:
    from supertonic import TTS

    tts = TTS(auto_download=True)
    style = tts.get_voice_style(voice_name="M1")
    text = "The train delay was announced at 4:45 PM on Wed, Apr 3, 2024 due to track maintenance."
    wav, duration = tts.synthesize(text, voice_style=style)
    tts.save_audio(wav, "output.wav")
SupertonicTTS-3: int8 ONNX (on-device)
Dynamic int8 quantization of Supertone/supertonic-3 for low-memory on-device inference. Same 4-graph
non-autoregressive flow-matching pipeline (duration_predictor → text_encoder → vector_estimator ×N → vocoder), 31 languages, 44.1 kHz, ONNX Runtime; just smaller and lighter.
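The `vector_estimator ×N` stage can be pictured as a fixed-step ODE solve: start from noise and apply the estimator N times. The sketch below is a toy illustration only. It assumes a plain Euler solver and uses a hypothetical stand-in, `toy_vector_estimator`, in place of the ONNX graph; the real graph's inputs, conditioning, and solver are not described in this card.

```python
from typing import Callable, List

Vector = List[float]


def euler_flow_sampling(
    vector_estimator: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int = 8,
) -> Vector:
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt                      # current time in [0, 1)
        v = vector_estimator(x, t)         # one call to the estimator per step
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical stand-in for the vector_estimator graph: a velocity field
# that transports any starting point to TARGET along a straight line.
TARGET = [0.5, -1.0, 2.0]


def toy_vector_estimator(x: Vector, t: float) -> Vector:
    return [(ti - xi) / (1.0 - t) for ti, xi in zip(TARGET, x)]


noise = [0.1, 0.3, -0.7]
latent = euler_flow_sampling(toy_vector_estimator, noise, num_steps=8)
print(latent)
```

With this toy field, the 8 Euler steps land on `TARGET` up to floating-point error. In the real pipeline the result would be the latent handed to the vocoder.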
What changed vs the base
- Weights quantized to int8 via onnxruntime.quantization.quantize_dynamic(QInt8); activations fp32.
- ONNX weights: 398 MB → 102 MB. Inference peak RSS: ~1026 MB → ~327 MB (3.1×) (measured, Apple Silicon, ORT CPU).
- RTF ≈ neutral on Apple Silicon (~0.20 @ 8 steps); the int8 speed win lands on weaker ARM CPUs / NPUs.
- Roundtrip verified for English, German, Korean.
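The quantization step above can be sketched roughly as follows. This is a minimal sketch, not the exact script used for this repository. The directory names `fp32/onnx` and `int8/onnx` and the helpers `plan_quantization`, `quantize_all`, and `reduction_factor` are hypothetical, and any extra options the author may have passed to `quantize_dynamic` are unknown.

```python
from pathlib import Path
from typing import List, Tuple

# The four graphs named in this card.
GRAPHS = ["duration_predictor", "text_encoder", "vector_estimator", "vocoder"]


def plan_quantization(src_dir: str, dst_dir: str) -> List[Tuple[Path, Path]]:
    """Pair each fp32 graph path with its int8 output path."""
    return [
        (Path(src_dir) / f"{name}.onnx", Path(dst_dir) / f"{name}.onnx")
        for name in GRAPHS
    ]


def quantize_all(src_dir: str, dst_dir: str) -> None:
    """Dynamically quantize the weights of every graph to int8."""
    # Imported lazily so the planning helper works without onnxruntime.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in plan_quantization(src_dir, dst_dir):
        quantize_dynamic(
            model_input=str(src),
            model_output=str(dst),
            weight_type=QuantType.QInt8,  # weights int8, activations stay fp32
        )


def reduction_factor(before_mb: float, after_mb: float) -> float:
    """How many times smaller the 'after' figure is."""
    return before_mb / after_mb


# Figures quoted in this card.
print(f"weights:  {reduction_factor(398, 102):.1f}x smaller")
print(f"peak RSS: {reduction_factor(1026, 327):.1f}x smaller")
```

Calling `quantize_all("fp32/onnx", "int8/onnx")` would need the fp32 graphs from the base model on disk and `onnxruntime` installed.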
Contents
onnx/{duration_predictor,text_encoder,vector_estimator,vocoder}.onnx (int8) · onnx/tts.json ·
onnx/unicode_indexer.json · voice_styles/*.json (10 voices) · config.json · LICENSE.
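After downloading, the layout listed above can be sanity-checked before loading anything. The sketch below assumes the repository has been copied to a local directory; `missing_files` and `describe_graphs` are hypothetical helpers. The graphs' input and output names are not documented in this card, so the second helper prints whatever ONNX Runtime reports.

```python
from pathlib import Path
from typing import List

# Files listed in the Contents section of this card.
EXPECTED_FILES = [
    "onnx/duration_predictor.onnx",
    "onnx/text_encoder.onnx",
    "onnx/vector_estimator.onnx",
    "onnx/vocoder.onnx",
    "onnx/tts.json",
    "onnx/unicode_indexer.json",
    "config.json",
    "LICENSE",
]


def missing_files(model_dir: str) -> List[str]:
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    missing = [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
    voices = root / "voice_styles"
    if not (voices.is_dir() and any(voices.glob("*.json"))):
        missing.append("voice_styles/*.json")
    return missing


def describe_graphs(model_dir: str) -> None:
    """Print the input and output signature of each ONNX graph."""
    # Imported lazily so missing_files() works without onnxruntime.
    import onnxruntime as ort

    for path in sorted((Path(model_dir) / "onnx").glob("*.onnx")):
        session = ort.InferenceSession(
            str(path), providers=["CPUExecutionProvider"]
        )
        inputs = [(i.name, i.shape, i.type) for i in session.get_inputs()]
        outputs = [(o.name, o.shape, o.type) for o in session.get_outputs()]
        print(path.name, "inputs:", inputs, "outputs:", outputs)
```

An empty list from `missing_files("path/to/model")` means the layout matches. `describe_graphs` is a reasonable first step before wiring the four graphs together by hand.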
Usage
Drop-in for the supertonic package via model_dir, or run the
4 graphs directly with ONNX Runtime. The text front-end is G2P-free (NFKD + unicode_indexer.json
lookup; no espeak/phonemizer).
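The G2P-free front-end amounts to Unicode normalization followed by a table lookup. The sketch below shows the idea with a toy table. The real schema of `unicode_indexer.json` is not described in this card, so the character-to-id dictionary, the `unknown_id` fallback, and the helper `text_to_ids` are all assumptions.

```python
import unicodedata
from typing import Dict, List


def text_to_ids(text: str, indexer: Dict[str, int], unknown_id: int = 0) -> List[int]:
    """Normalize text with NFKD, then map each code point to an integer id."""
    normalized = unicodedata.normalize("NFKD", text)
    return [indexer.get(ch, unknown_id) for ch in normalized]


# Toy table standing in for unicode_indexer.json (hypothetical format).
toy_indexer = {"c": 1, "a": 2, "f": 3, "e": 4, "\u0301": 5}

# NFKD splits the precomposed U+00E9 into "e" plus the combining
# acute accent U+0301, so both get their own id.
ids = text_to_ids("caf\u00e9", toy_indexer)
print(ids)  # [1, 2, 3, 4, 5]
```

Because NFKD decomposes accented letters and compatibility characters into base code points, a compact table can cover many scripts without a phonemizer.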
Attribution & license
Derivative of Supertone/supertonic-3 (commit 3cadd1ee6394adea1bd021217a0e650ede09a323) by Supertone, Inc. (paper
arXiv:2503.23108). Licensed under BigScience OpenRAIL-M; the
upstream use-based restrictions carry over (no non-consensual impersonation/deepfakes, no undisclosed
machine-generated content, etc.) and must pass through to downstream users. This card marks it as a modified
(quantized) artifact per the license. The original LICENSE is included.
Other Supertonic-3 formats
- Supertonic-3 LiteRT: Android / Qualcomm NPU (.tflite).
- Supertonic-3 CoreML: iOS / Apple Neural Engine (.mlpackage).
Ecosystem
- soniqo.audio: website / use-case explorer (transcription, voice cloning, live ASR, voice agents).
- speech-core: C++ orchestration library; Supertonic plugs in as a TTSInterfaceONNX model.
- speech-swift: Apple Silicon MLX + CoreML runtime.
- speech-android: Android SDK consuming on-device LiteRT bundles.
Other ONNX models in this collection