whisper-base: QHexRT NPU bundle (Hexagon v79)

Precompiled Whisper-base ASR for the QHexRT runtime on Qualcomm Hexagon v79 (Snapdragon 8 Elite / SM8750, e.g. Galaxy S25). The encoder and decoder are Qualcomm AI Hub qnn_context_binary graphs (float/fp16); the host pipeline (log-mel extraction, decode loop, detokenization) is QHexRT's own. Device-validated: the transcription matches the HF openai/whisper-base reference exactly.

Contents (v79/)

| file | what |
| --- | --- |
| whisper-base.json | QHexRT manifest (ASR family, asr_transcribe plan) |
| encoder.bin | AI Hub Whisper encoder (audio mel → 12 cross-attn KV) |
| decoder.bin | AI Hub Whisper decoder (greedy step: ids + mask + self/cross-KV → logits) |
| whisper_base_mel_filters.bin | HF mel filter bank [201,80] f32 (host log-mel) |
| tokenizer.json | Whisper multilingual tokenizer (vocab 51865) |
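
The decoder.bin entry above describes a greedy step: token ids go in, logits come out, and the host picks the next token. The actual QHexRT decode loop is not shown in this card, so the following is only a minimal sketch of greedy decoding in general. The `step` callable is hypothetical and stands in for the decoder graph plus its mask and KV inputs; the default `max_new_tokens` is an arbitrary illustrative cap, and whether QHexRT uses this exact prompt is an assumption. The special-token ids are those of the multilingual Whisper tokenizer.

```python
# Special-token ids of the multilingual Whisper tokenizer (vocab 51865).
SOT = 50258            # <|startoftranscript|>
EOT = 50257            # <|endoftext|>
EN = 50259             # <|en|>
TRANSCRIBE = 50359     # <|transcribe|>
NO_TIMESTAMPS = 50363  # <|notimestamps|>


def greedy_decode(step, prompt, max_new_tokens=200):
    """Greedy decoding: append the argmax token each step, stop at EOT.

    `step` is a hypothetical callable mapping the token ids so far to a list
    of next-token logits. On device that role is played by decoder.bin with
    its mask and self/cross-KV inputs; that plumbing is not modelled here.
    """
    ids = list(prompt)
    for _ in range(max_new_tokens):
        logits = step(ids)
        # Index of the largest logit is the greedy choice.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == EOT:
            break
        ids.append(next_id)
    # Return only the newly generated tokens, without the prompt.
    return ids[len(prompt):]
```

A typical English transcription prompt would be `[SOT, EN, TRANSCRIBE, NO_TIMESTAMPS]`; the returned ids would then be passed to the tokenizer for detokenization.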

Run (QHexRT CLI)

huggingface-cli download runanywhere/whisper_base_HNPU --local-dir whisper_base_HNPU
# QNN libs come from the QAIRT SDK (lib/aarch64-android) + the v79 HTP skel; push them next to qhx_asr.
adb push whisper_base_HNPU/v79 /data/local/tmp/wq/whisper
adb shell "cd /data/local/tmp/wq && LD_LIBRARY_PATH=. ADSP_LIBRARY_PATH=. \
  ./qhx_asr whisper/whisper-base.json libQnnHtp.so libQnnSystem.so whisper whisper/<audio16k>.wav"
# -> TRANSCRIPT: ...
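
Before the adb push step, it can help to confirm that the download is complete. The sketch below checks the v79/ directory against the file names listed under Contents. It also assumes that whisper_base_mel_filters.bin is a raw float32 dump with no header, in which case a [201,80] array occupies 201 * 80 * 4 = 64320 bytes; that layout is an assumption, not something this card states. `check_bundle` is a hypothetical helper and not part of QHexRT.

```python
from pathlib import Path

# File names taken from the Contents list of this card.
EXPECTED_FILES = [
    "whisper-base.json",
    "encoder.bin",
    "decoder.bin",
    "whisper_base_mel_filters.bin",
    "tokenizer.json",
]
# Assumption: raw float32 with no header, shape [201, 80].
MEL_FILTER_BYTES = 201 * 80 * 4


def check_bundle(bundle_dir):
    """Return a list of problems found in bundle_dir (empty list means OK)."""
    root = Path(bundle_dir)
    problems = []
    for name in EXPECTED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing: {name}")
    mel = root / "whisper_base_mel_filters.bin"
    if mel.is_file() and mel.stat().st_size != MEL_FILTER_BYTES:
        problems.append(
            f"whisper_base_mel_filters.bin: expected {MEL_FILTER_BYTES} bytes, "
            f"got {mel.stat().st_size}"
        )
    return problems
```

For example, `check_bundle("whisper_base_HNPU/v79")` returns an empty list when every file is present and the filter bank has the assumed size.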

Audio: mono WAV (PCM16 or float32); resampled to 16 kHz host-side. Clips ≤ 30 s.
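
The input limits above can be checked on the host before pushing a clip. This is a minimal sketch using Python's standard `wave` module, which reads PCM WAV only, so float32 files are outside its scope. `check_wav` is a hypothetical helper, not part of QHexRT. The sample rate is reported but not enforced, since resampling to 16 kHz happens host-side.

```python
import wave

MAX_SECONDS = 30.0  # clip length limit stated in this card


def check_wav(path):
    """Return (ok, reason) for a PCM WAV file against the stated input limits."""
    # wave.open raises wave.Error for non-PCM files such as float32 WAV.
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        rate = wav.getframerate()
        frames = wav.getnframes()
    duration = frames / rate
    if channels != 1:
        return False, f"expected mono, got {channels} channels"
    if sample_width != 2:
        return False, f"expected 16-bit PCM, got {8 * sample_width}-bit"
    if duration > MAX_SECONDS:
        return False, f"clip is {duration:.2f} s, limit is {MAX_SECONDS:.0f} s"
    return True, f"{duration:.2f} s at {rate} Hz"
```

Longer recordings would need to be split into clips of at most 30 s before being passed to qhx_asr.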

Notes

  • Arch: v79 only (context binaries are dsp-arch-pinned). For other architectures, re-export from AI Hub.
  • No custom op-package needed; these are pure-native AI Hub graphs.
  • Source model: openai/whisper-base, compiled via Qualcomm AI Hub for qualcomm-snapdragon-8-elite-for-galaxy.