BlueMagpie-TTS — GGUF

GGUF conversions of OpenFormosa/BlueMagpie-TTS, a Taiwanese-Mandarin text-to-speech model, for use with llama.rn and codec.cpp.

BlueMagpie is a continuous-latent autoregressive-diffusion TTS model: VoxCPM2 with its Text-Semantic LM swapped from MiniCPM-4 to Barbet (a Mamba2 + attention hybrid, 1B params). The AudioVAE decodes the continuous latent sequence into a 48 kHz waveform.

Files

Text-Semantic LM (Barbet-1B backbone, runs in llama.cpp / llama.rn)

File	Quant	Size
`BlueMagpie-Barbet-1B-q4_k_m.gguf`	Q4_K_M	661 MB
`BlueMagpie-Barbet-1B-q5_k_m.gguf`	Q5_K_M	756 MB
`BlueMagpie-Barbet-1B-q6_k.gguf`	Q6_K	857 MB
`BlueMagpie-Barbet-1B-q8_0.gguf`	Q8_0	1.08 GB
`BlueMagpie-Barbet-1B-f16.gguf`	F16	2.03 GB

The BPE (GPT2-family) tokenizer is baked into every GGUF, so llama.cpp can tokenize text natively — no external tokenizer runtime needed.
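
As a rough aid for choosing among the quants, the sketch below picks the largest backbone file that fits a given budget alongside the codec file. The sizes are copied from this README; the function name `pick_quant`, the 1 GB = 1000 MB conversion, and the idea of budgeting by file size alone are assumptions for illustration. Real memory use will be higher, since the runtime also allocates state and compute buffers.

```python
# File sizes in MB, copied from the tables in this README.
# Assumption: 1 GB is treated as 1000 MB.
QUANT_SIZES_MB = {
    "Q4_K_M": 661,
    "Q5_K_M": 756,
    "Q6_K": 857,
    "Q8_0": 1080,  # 1.08 GB
    "F16": 2030,   # 2.03 GB
}
CODEC_SIZE_MB = 1760  # BlueMagpie-AudioVAE.gguf, 1.76 GB (F16)


def pick_quant(budget_mb, include_codec=True):
    """Return the largest quant whose file fits the budget, or None.

    Hypothetical helper: it compares file sizes only and ignores
    runtime buffers, so treat the result as an upper bound on quality.
    """
    overhead = CODEC_SIZE_MB if include_codec else 0
    fitting = [
        (size, name)
        for name, size in QUANT_SIZES_MB.items()
        if size + overhead <= budget_mb
    ]
    return max(fitting)[1] if fitting else None
```

For example, `pick_quant(3000)` considers each backbone together with the 1.76 GB codec file, while `pick_quant(800, include_codec=False)` budgets for the backbone alone.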

Codec (AudioVAE + LM adaptor stack, runs in codec.cpp)

File	Size
`BlueMagpie-AudioVAE.gguf`	1.76 GB (F16)

This bundle carries every continuous-latent codec_lm component the AR loop needs: tslm_adapter, FSQ, RALM (MiniCPM4-8L), LocEnc, LocDiT (12L CFM diffusion), enc_to_lm_proj, enc_to_tslm_proj, lm_to_dit_proj, res_to_dit_proj, the AudioVAE decoder, and the stop head. At load, codec.cpp probes codec_common's codec_lm_get_info() and expects is_continuous == true.
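
Before handing a downloaded file to llama.rn or codec.cpp, it can help to confirm it is an intact GGUF. The sketch below reads only the fixed 24-byte GGUF header (magic, version, tensor count, metadata key/value count); it assumes a little-endian GGUF of version 2 or later and does not parse the metadata itself. The helper name `read_gguf_header` is illustrative and is not part of either project.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_FORMAT = "<4sIQQ"  # magic, uint32 version, uint64 tensors, uint64 kv pairs
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) for a GGUF file.

    Assumes little-endian GGUF v2+; raises ValueError if the file is
    too short or does not start with the GGUF magic bytes.
    """
    with open(path, "rb") as handle:
        header = handle.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file too short to be a GGUF file")
    magic, version, tensor_count, kv_count = struct.unpack(HEADER_FORMAT, header)
    if magic != GGUF_MAGIC:
        raise ValueError(f"unexpected magic bytes: {magic!r}")
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("BlueMagpie-AudioVAE.gguf")` on a complete download should return three integers; a `ValueError` suggests a truncated or mislabeled file.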

Runtime

The llama.rn side loads Barbet as the backbone context and the codec GGUF (BlueMagpie-AudioVAE.gguf) as the vocoder. getFormattedAudioCompletion returns flow = "continuous_embd" with embedding = true. The standard completion loop then drives the codec_lm step machine once per llama_decode, accumulating latent patches into result.embeddings; decodeAudioEmbeddings turns these into 48 kHz PCM via the AudioVAE.
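
Once the PCM has been exported from the app, it can be saved for listening with a few lines of standard-library Python. This is a minimal sketch that assumes the samples are mono floats in [-1.0, 1.0]; the README does not state the sample format or channel count, so adjust if the actual output differs. `write_wav` is a hypothetical helper, not a llama.rn API.

```python
import struct
import wave

SAMPLE_RATE_HZ = 48_000  # AudioVAE output rate stated in this README


def write_wav(path, samples, sample_rate=SAMPLE_RATE_HZ):
    """Write mono float samples as 16-bit PCM WAV; return duration in seconds.

    Assumption: samples are floats in [-1.0, 1.0]. Values outside that
    range are clipped before conversion.
    """
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono (assumed)
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))
    return len(ints) / sample_rate
```

The returned duration is simply the sample count divided by the sample rate, which is a quick way to check that the exported audio has the expected length.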

Requires the llama.rn codec branch or later, which adds LLM_ARCH_BARBET, the Mamba2/attention hybrid graph builder, and the codec_common continuous-latent completion-loop hook.

License

Model weights follow the upstream Apache-2.0 license.

Provenance

Backbone converted via scripts/vendor/convert_barbet_to_gguf.py (llama.rn), which fuses the 5 Mamba2 in-projections + 3 conv1d into the ssm_in / ssm_conv1d tensors llama.cpp expects and bakes the GPT2 BPE tokenizer from the upstream tokenizer.json.
Codec converted via scripts/convert-to-gguf.py --model-type bluemagpie (codec.cpp).
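
To illustrate what fusing projections means in general, the toy example below stacks several small weight matrices into one and checks that a single fused matrix-vector product gives the same outputs as the separate ones. It is not taken from convert_barbet_to_gguf.py: the tensor names, shapes, and the z/x/B/C/dt ordering are assumptions chosen for illustration only.

```python
# Toy illustration of fusing separate linear projections by stacking rows.
# Names, shapes, and ordering are assumptions, not the converter's actual layout.

def matvec(weight, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def fuse_projections(weights):
    """Stack [out_i x in] matrices into one [sum(out_i) x in] matrix."""
    fused = []
    for weight in weights:
        fused.extend(weight)
    return fused


# Five tiny stand-in projections that share an input width of 2.
w_z = [[1.0, 0.0]]
w_x = [[0.0, 1.0]]
w_b = [[1.0, 1.0]]
w_c = [[2.0, -1.0]]
w_dt = [[0.5, 0.5]]
parts = [w_z, w_x, w_b, w_c, w_dt]

hidden = [3.0, 4.0]
separate = [value for w in parts for value in matvec(w, hidden)]
fused_out = matvec(fuse_projections(parts), hidden)

# One fused projection reproduces the five separate outputs, in order.
assert separate == fused_out
```

The practical benefit of this kind of fusion is that the runtime performs one larger matrix multiply and then slices the result, instead of issuing several small ones.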
