MeloTTS English V2 — GGUF

This is the myshell-ai/MeloTTS English V2 checkpoint, converted to GGUF for use with CrispASR.

Model Details

Architecture: VITS2 (52M params)
Output: 44.1 kHz mono PCM
Speakers: EN-US (0), EN-BR (1), EN-INDIA (2), EN-AU (4)
Format: F16 GGUF (97 MB)
License: MIT
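
To sanity-check a downloaded file before passing it to CrispASR, the fixed-size GGUF header can be read directly. The sketch below assumes GGUF version 2 or later (little-endian, with 64-bit counts); `read_gguf_header` is a hypothetical helper name and is not part of CrispASR.

```python
import struct

def read_gguf_header(path: str) -> dict:
    """Read the fixed 24-byte GGUF header (assumes GGUF v2+ layout)."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 4-byte magic "GGUF", uint32 version, uint64 tensor count,
    # uint64 metadata key/value count, all little-endian.
    if len(header) < 24 or header[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Calling `read_gguf_header("melotts-en-v2-f16.gguf")` on an intact download should return a small dictionary; a truncated or HTML error-page download raises `ValueError` instead.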

Usage

# Auto-download + synthesize
crispasr --backend melotts -m auto --tts "Hello world." --tts-output hello.wav

# Manual download
wget https://huggingface.co/cstr/melotts-en-v2-GGUF/resolve/main/melotts-en-v2-f16.gguf
crispasr -m melotts-en-v2-f16.gguf --tts "Hello world." --tts-output hello.wav
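
After either command, the output can be checked against the format listed under Model Details (44.1 kHz, mono). This is a minimal sketch using Python's standard `wave` module; it assumes `hello.wav` is integer PCM, which is the only WAV encoding that module reads. `describe_wav` is a hypothetical helper name.

```python
import wave

def describe_wav(path: str) -> dict:
    """Return basic format information for an integer-PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": frames / rate,
        }
```

For example, `describe_wav("hello.wav")` should report `sample_rate_hz` of 44100 and `channels` of 1 if the file matches the stated output format.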

Conversion

Converted from the official PyTorch checkpoint using:

python models/convert-melotts-to-gguf.py \
    --ckpt checkpoint.pth --config config.json \
    --output melotts-en-v2-f16.gguf

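Weight-norm pairs are fused into single weight tensors. Embeddings and the stochastic duration predictor (SDP) and duration predictor (DP) weights are stored as F32 for precision; all other tensors are F16.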
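
As an illustration of what fusing a weight-norm pair means, the sketch below recombines a magnitude tensor `g` and a direction tensor `v` into one weight, `w = g * v / ||v||`. It assumes PyTorch's default weight-norm convention (one norm per output channel, taken over all remaining axes) and uses NumPy; it is not taken from the conversion script, and `fuse_weight_norm` is a hypothetical name.

```python
import numpy as np

def fuse_weight_norm(weight_g: np.ndarray, weight_v: np.ndarray) -> np.ndarray:
    """Fuse a weight-norm pair into a single weight: w = g * v / ||v||.

    Assumes the norm is computed per output channel (axis 0), i.e. over
    every axis except the first.
    """
    axes = tuple(range(1, weight_v.ndim))
    # Accumulate the norm in float64 to limit rounding error.
    norm = np.sqrt(np.sum(weight_v.astype(np.float64) ** 2, axis=axes, keepdims=True))
    return (weight_g * weight_v / norm).astype(np.float32)

# Toy Conv1d-shaped weight: (out_channels=2, in_channels=1, kernel=2).
v = np.array([[[3.0, 4.0]], [[0.0, 2.0]]], dtype=np.float32)
g = np.array([[[10.0]], [[1.0]]], dtype=np.float32)
w = fuse_weight_norm(g, v)  # channel 0 -> [6, 8], channel 1 -> [0, 1]
```

Once fused, only `w` needs to be stored, so the runtime does not have to recompute the normalization on every load.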
