MeloTTS-EN: QHexRT NPU bundle (Hexagon v79)

Precompiled MeloTTS English text-to-speech for the QHexRT runtime on Qualcomm Hexagon v79 (Snapdragon 8 Elite / SM8750, e.g. Galaxy S25). The three acoustic graphs (encoder → flow → decoder) are Qualcomm AI Hub qnn_context_binary graphs; the g2p frontend, duration alignment, and vocoder windowing run in QHexRT's own host pipeline. Device-validated: text → 44.1 kHz mono audio.

Measured on S25/v79: ~1.0 s to synthesize 4.56 s of speech (≈4.5× real-time).
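
The real-time figure is the audio duration divided by the synthesis time. A minimal sketch of that arithmetic, using the two measurements quoted above; `rtf` is a hypothetical helper for illustration and is not part of QHexRT:

```shell
# Real-time factor = seconds of audio produced / seconds spent synthesizing.
# rtf is a hypothetical helper, not a QHexRT command.
rtf() {
  awk -v audio="$1" -v synth="$2" 'BEGIN { printf "%.2f\n", audio / synth }'
}

# 4.56 s of speech synthesized in about 1.0 s
rtf 4.56 1.0
```

Because the synthesis time is quoted as approximately 1.0 s, the result is best read as roughly 4.5× rather than as an exact figure.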

Contents (v79/)

file              what
melotts-en.json   QHexRT manifest (TTS family, tts_synthesize plan)
melo_encoder.bin  text/phoneme encoder (→ durations + priors)
melo_flow.bin     flow (normalizing-flow latent)
melo_decoder.bin  vocoder/decoder (→ waveform)
melo_tokens.txt   phoneme token table (g2p)
melo_lexicon.txt  open pronunciation lexicon (g2p)

Run (QHexRT CLI)

huggingface-cli download runanywhere/melotts_en_HNPU --local-dir melotts_en_HNPU
# QNN libs come from the QAIRT SDK (lib/aarch64-android) + the v79 HTP skel; push them next to qhx_say.
adb push melotts_en_HNPU/v79 /data/local/tmp/wq/melotts
adb shell "cd /data/local/tmp/wq && LD_LIBRARY_PATH=. ADSP_LIBRARY_PATH=. \
  ./qhx_say melotts/melotts-en.json libQnnHtp.so libQnnSystem.so melotts 'Hello from Hexagon.' melotts/out.wav"
# -> melotts/out.wav (44.1 kHz mono)
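
To confirm the output format after the run, the WAV header can be inspected on the host. The sketch below is not part of QHexRT: `wav_info` is a hypothetical helper, and it assumes a canonical PCM WAV layout (the `fmt ` chunk directly after the RIFF header) and a little-endian host such as x86-64 or arm64.

```shell
# wav_info is a hypothetical helper, not a QHexRT tool.
# Assumes a canonical PCM WAV header and a little-endian host.
wav_info() {
  # Channel count: 16-bit unsigned value at byte offset 22.
  channels=$(od -An -tu2 -j22 -N2 "$1" | tr -d ' \n')
  # Sample rate: 32-bit unsigned value at byte offset 24.
  rate=$(od -An -tu4 -j24 -N4 "$1" | tr -d ' \n')
  echo "channels=$channels rate=$rate"
}

# Usage, after copying the file off the device:
#   adb pull /data/local/tmp/wq/melotts/out.wav .
#   wav_info out.wav
```

For the 44.1 kHz mono output described above, this should report one channel and a rate of 44100, provided the header follows the assumed layout.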

Notes

  • Arch: v79 only (context binaries are dsp-arch-pinned).
  • No custom op-package needed: the graphs are pure-native AI Hub graphs.
  • v1: no BERT prosody (ja_bert=0). English only.
  • Source: MeloTTS-EN, compiled via Qualcomm AI Hub for qualcomm-snapdragon-8-elite-for-galaxy.
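
Because the context binaries only load on v79, it can help to check the device's SoC before pushing files. This is a sketch under assumptions: `is_v79_soc` is a hypothetical helper, and it assumes the device reports `SM8750` through the Android `ro.soc.model` property, which may differ by vendor build.

```shell
# is_v79_soc is a hypothetical helper, not part of QHexRT or the QAIRT SDK.
# Assumption: a supported device reports its SoC model as "SM8750".
is_v79_soc() {
  case "$1" in
    SM8750*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Usage (needs a connected device):
#   model=$(adb shell getprop ro.soc.model | tr -d '\r')
#   is_v79_soc "$model" || echo "unsupported SoC: $model" >&2
```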