TADA-1B — GGUF (ggml-quantised)

GGUF / ggml conversion of HumeAI/tada-1b for use with CrispStrobe/CrispASR.

TADA-1B is a text-to-speech model built on Meta Llama 3.2 1B with a flow-matching (FM) speech decoder and a custom Hume codec. TADA uses 1:1 token alignment: every text token maps to exactly one speech vector, and the codec decoder then renders those vectors as 24 kHz mono PCM. This repo packages the talker model, the required codec decoder, and a ready-to-use reference prompt GGUF for CrispASR's tada backend.

License: Llama 3.2 Community License. See the upstream HumeAI/tada-1b model card and LICENSE file for the original model terms.

Pair the talker with tada-codec-f16.gguf (included in this repo). The talker outputs continuous acoustic vectors; the codec converts those vectors to waveform audio.

Files

| File | Quant | Size | Notes |
|------|-------|------|-------|
| tada-tts-1b-f16.gguf | F16 | ~3.1 GB | Reference-quality talker model |
| tada-tts-1b-q4_k.gguf | Q4_K | ~1.7 GB | Recommended for CrispASR auto-download |
| tada-codec-f16.gguf | F16 | ~1.0 GB | Codec decoder, required companion |
| tada-ref.gguf | F32 | ~456 KB | Reference voice prompt for --voice; also used by CrispASR's TADA diff harness |

The Q4_K file uses a TADA-aware quantization policy: large transformer block projection matrices are quantized, while talker.token_embd.* and all tada.* tensors are preserved at source precision. This keeps the flow-matching head, acoustic conditioning, and timing path stable.
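
The preservation rule can be sketched as a simple name filter. This is a minimal sketch, assuming tensors are matched by the name prefixes quoted above; `tada_quant_policy` is a hypothetical helper and not part of CrispASR, the example tensor names are illustrative, and the sketch does not model which block projections count as "large". It only captures the exclusion list.

```shell
# Hypothetical helper: classify a tensor name under the policy described above.
# Prints "keep" for tensors preserved at source precision, and "candidate"
# for everything else (tensors that may be quantized, subject to size rules
# this sketch does not model).
tada_quant_policy() {
  case "$1" in
    talker.token_embd.*|tada.*) echo keep ;;
    *) echo candidate ;;
  esac
}

# Illustrative tensor names (assumed, not read from the actual GGUF files).
for name in talker.token_embd.weight tada.example.weight talker.blk.0.attn_q.weight; do
  printf '%-32s %s\n' "$name" "$(tada_quant_policy "$name")"
done
```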

Architecture

Text Input
  |
BPE Tokenize (Llama-3.2 vocab)
  |
Llama-3.2-1B AR Forward
  + acoustic embedding + gray-code time embedding
  |
Flow-Matching Speech Head
  |-- Euler ODE denoising: noise -> speech vector
  |
TADA Codec Decoder
  |-- speech vectors -> 24 kHz PCM
  |
Output: float32 mono @ 24 kHz
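
The diagram names a gray-code time embedding without defining it. As background only, and assuming the term refers to the standard binary-reflected Gray code, the sketch below shows the integer-to-code mapping, in which consecutive integers differ in exactly one bit. `gray_code` is a hypothetical helper; which integer TADA encodes, the bit width, and how the bits become an embedding are not specified in this card.

```shell
# Binary-reflected Gray code of a non-negative integer: n XOR (n >> 1).
# Background illustration only; not taken from the TADA implementation.
gray_code() {
  echo $(( $1 ^ ($1 >> 1) ))
}

# Print the codes for 0..7; neighbouring values differ in a single bit.
for n in 0 1 2 3 4 5 6 7; do
  printf '%d -> %d\n' "$n" "$(gray_code "$n")"
done
```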

Quick start

# 1. Build CrispASR
git clone https://github.com/CrispStrobe/CrispASR
cd CrispASR
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j --target crispasr

# 2. Pull the model, codec, and reference prompt
huggingface-cli download cstr/tada-tts-1b-GGUF \
  tada-tts-1b-q4_k.gguf tada-codec-f16.gguf tada-ref.gguf \
  --local-dir .

# 3. Synthesize
./build/bin/crispasr --backend tada-1b --gpu-backend cpu \
  -m tada-tts-1b-q4_k.gguf \
  --codec-model tada-codec-f16.gguf \
  --voice tada-ref.gguf \
  --tts "Please call Stella." \
  --tts-output tada.wav \
  --seed 42

For F16 quality, replace tada-tts-1b-q4_k.gguf with tada-tts-1b-f16.gguf.
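
After synthesis, it can be useful to confirm that the output file really is mono at 24 kHz. The following is a minimal sketch, assuming the WAV file has a canonical RIFF header with the `fmt ` chunk first, so that the channel count sits at byte offset 22 and the sample rate at offset 24; `wav_channels_and_rate` is a hypothetical helper, and files with extra chunks before `fmt ` would need a real parser.

```shell
# Hypothetical helper: print "<channels> <sample_rate>" from a canonical WAV header.
# Assumes the fmt chunk starts at byte 12 (true for simple writers, not guaranteed).
wav_channels_and_rate() {
  # Read bytes 22..27 as unsigned decimals:
  #   22-23 = channel count, 24-27 = sample rate, both little-endian.
  set -- $(od -An -t u1 -j 22 -N 6 "$1")
  echo "$(( $1 + ($2 << 8) )) $(( $3 + ($4 << 8) + ($5 << 16) + ($6 << 24) ))"
}

# Usage (run after step 3):
#   wav_channels_and_rate tada.wav
# Given the output format stated in this card, a canonical header should
# report 1 channel at 24000 Hz.
```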

Recent CrispASR builds can also resolve this repo through the model registry:

./build/bin/crispasr --backend tada-1b -m auto --auto-download \
  --voice tada-ref.gguf \
  --tts "Hello from TADA one billion." \
  --tts-output hello.wav

Source model

HumeAI/tada-1b

Validation

The uploaded Q4_K model was smoke-tested locally with CrispASR by synthesizing:

Please call Stella.

and transcribing the generated WAV with ggml-tiny.en.bin. The ASR round trip returned the input text, differing only in the final punctuation mark:

Please call Stella!
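
A round-trip check like this is easier to automate if case and punctuation are ignored when comparing the two strings. The sketch below assumes that comparison rule; `normalize_text` and `roundtrip_matches` are hypothetical helpers, and the transcription command itself is left out because the exact ASR invocation is not given in this card.

```shell
# Hypothetical helper: lowercase, drop punctuation, collapse whitespace.
normalize_text() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -d '[:punct:]' \
    | tr -s '[:space:]' ' ' \
    | sed 's/^ //; s/ $//'
}

# Succeeds (exit status 0) when the two strings match after normalization.
roundtrip_matches() {
  [ "$(normalize_text "$1")" = "$(normalize_text "$2")" ]
}

# Compare the synthesis prompt with the transcript reported above.
if roundtrip_matches "Please call Stella." "Please call Stella!"; then
  echo "round trip OK"
else
  echo "round trip MISMATCH"
fi
```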

Checksums

7be26395d37412dff5fd2bbeb47b3f584c3172a4cd0ac3793208c82b107b28cf  tada-tts-1b-f16.gguf
035b6edbf0f58e6e0c5ec77943aec233df1946e68e4b09c2bf002b113abe3a9a  tada-tts-1b-q4_k.gguf
ef5652e7a346c8a55dd6692676da2827320fd141042e87175880e032e1953082  tada-codec-f16.gguf
7efcc96795dd2b27577a4a81eb52d0c3add5ffa67f325fba5a938f3f98067ace  tada-ref.gguf
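
The digests above can be checked after download with a short script. This is a minimal sketch, assuming a system that provides `sha256sum` (GNU coreutils; on macOS, `shasum -a 256` produces the same digest format); `tada_checksums` and `verify_tada_files` are hypothetical helper names. Files that were not downloaded are skipped rather than treated as errors.

```shell
# Expected SHA-256 digests, copied from the list above.
tada_checksums() {
  cat <<'EOF'
7be26395d37412dff5fd2bbeb47b3f584c3172a4cd0ac3793208c82b107b28cf  tada-tts-1b-f16.gguf
035b6edbf0f58e6e0c5ec77943aec233df1946e68e4b09c2bf002b113abe3a9a  tada-tts-1b-q4_k.gguf
ef5652e7a346c8a55dd6692676da2827320fd141042e87175880e032e1953082  tada-codec-f16.gguf
7efcc96795dd2b27577a4a81eb52d0c3add5ffa67f325fba5a938f3f98067ace  tada-ref.gguf
EOF
}

# Check every listed file found in directory $1 (default: current directory).
# Prints one line per file: "ok", "MISMATCH", or "skip" (file not present).
verify_tada_files() {
  dir="${1:-.}"
  tada_checksums | while read -r sum name; do
    if [ ! -f "$dir/$name" ]; then
      echo "skip $name (not downloaded)"
      continue
    fi
    actual=$(sha256sum "$dir/$name" | cut -d' ' -f1)
    if [ "$actual" = "$sum" ]; then
      echo "ok $name"
    else
      echo "MISMATCH $name"
    fi
  done
}

# Usage: verify_tada_files .
```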

Notes

  • Use a recent CrispASR build with the TADA runtime fixes for prompt timing, codec expansion, and PyTorch-compatible MT19937 noise generation.
  • tada-ref.gguf is a ready-to-use reference prompt, not a Python-only cache. Pass it directly via --voice.
  • Custom voice references can be packed with CrispASR's TADA reference conversion tooling.