MeloTTS English V2 - GGUF

The myshell-ai MeloTTS English V2 checkpoint, converted to GGUF for use with CrispASR.

Model Details

  • Architecture: VITS2 (52M params)
  • Output: 44.1 kHz mono PCM
  • Speakers: EN-US (0), EN-BR (1), EN-INDIA (2), EN-AU (4)
  • Format: F16 GGUF (97 MB)
  • License: MIT
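
As a quick sanity check on a downloaded file, the fixed-size GGUF header can be read with a few lines of Python. This is a minimal sketch, not part of CrispASR: the function name `read_gguf_header` is hypothetical, and it assumes a little-endian GGUF file of version 2 or later, where the header is the 4-byte magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes little-endian GGUF v2+ (uint64 counts); hypothetical helper.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4-byte magic + uint32 + two uint64
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("melotts-en-v2-f16.gguf")` on an intact download should return without raising; a truncated or HTML error page download fails the magic check.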

Usage

# Auto-download + synthesize
crispasr --backend melotts -m auto --tts "Hello world." --tts-output hello.wav

# Manual download
wget https://huggingface.co/cstr/melotts-en-v2-GGUF/resolve/main/melotts-en-v2-f16.gguf
crispasr -m melotts-en-v2-f16.gguf --tts "Hello world." --tts-output hello.wav
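
To confirm that the synthesized file matches the 44.1 kHz mono output listed under Model Details, the WAV header can be inspected with Python's standard `wave` module. This is a sketch under one assumption: that `hello.wav` is written as integer PCM, which is what `wave` can read. The helper name `wav_info` is hypothetical.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        channels = w.getnchannels()
        duration = w.getnframes() / rate
    return rate, channels, duration
```

For the commands above, `wav_info("hello.wav")` would be expected to report a sample rate of 44100 and a channel count of 1.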

Conversion

Converted from the official PyTorch checkpoint using:

python models/convert-melotts-to-gguf.py --ckpt checkpoint.pth --config config.json --output melotts-en-v2-f16.gguf

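Weight-norm parameter pairs are fused into a single weight tensor during conversion. Embeddings and the weights of the stochastic duration predictor (SDP) and duration predictor (DP) are stored as F32 for precision; everything else is stored as F16.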
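
The weight-norm fusion mentioned above can be illustrated in plain Python. This is a sketch of the general technique, not the converter's actual code: weight normalization stores a gain `g` and a direction `v` per output channel, and the fused weight is `w = g * v / ||v||`, with the L2 norm taken over all values belonging to that channel. The function name `fuse_weight_norm` and the list-based layout (one flattened row per output channel) are assumptions made for illustration; a real converter would operate on tensors.

```python
import math


def fuse_weight_norm(g, v):
    """Fuse weight-norm parameters into plain weights.

    g: per-output-channel gains, one float per row of v
    v: list of rows, each the flattened direction values of one output channel
    Returns w with w[i] = g[i] * v[i] / ||v[i]||_2.
    """
    fused = []
    for gain, row in zip(g, v):
        norm = math.sqrt(sum(x * x for x in row))  # L2 norm of this channel
        fused.append([gain * x / norm for x in row])
    return fused
```

Fusing once at conversion time means the runtime only needs to load a single weight tensor per layer instead of recomputing the normalization on every forward pass.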
