Qwen3-ASR 0.6B · OpenASR

Multilingual speech recognition across 52 languages & dialects: the fast, lightweight Qwen3-ASR

Native speech-to-text in the OpenASR runtime, engineered for peak performance on CPU & GPU, with no Python at inference time.


✨ Highlights

  • 🌍 52 languages & dialects: 30 languages plus 22 Chinese dialects, with built-in spoken-language identification
  • 🎧 Robust on hard audio: clean speech, singing voice, and songs over background music
  • ⚡ Fast & light: the efficiency-tuned member of the Qwen3-ASR family; one model for both offline and streaming
  • 🦀 Native in OpenASR: .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU

🚀 Quickstart

# 1. Install the OpenASR CLI  ·  https://openasr.org
# 2. Pull a build (pick a quant; see the table below)
openasr pull qwen3-asr-0.6b:q8

# 3. Transcribe
openasr transcribe audio.wav --model qwen3-asr-0.6b

All builds for this model:

openasr pull qwen3-asr-0.6b:fp16
openasr pull qwen3-asr-0.6b:q8
openasr pull qwen3-asr-0.6b:q4
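
For scripting, the commands above can be driven from Python. The following is a minimal sketch, assuming the `openasr` binary is on `PATH`. The helper names (`pull_command`, `transcribe_command`, `run`) are hypothetical, and the only CLI arguments used are the ones shown in this section. The output format of `openasr transcribe` is not described here, so the sketch returns stdout unparsed.

```python
import subprocess

MODEL = "qwen3-asr-0.6b"
QUANT_TAGS = ("fp16", "q8", "q4")  # build tags exactly as listed above


def pull_command(quant):
    """Build the argv list for `openasr pull <model>:<quant>`."""
    if quant not in QUANT_TAGS:
        raise ValueError(f"unknown build tag: {quant!r}")
    return ["openasr", "pull", f"{MODEL}:{quant}"]


def transcribe_command(audio_path):
    """Build the argv list for `openasr transcribe <audio> --model <model>`."""
    return ["openasr", "transcribe", audio_path, "--model", MODEL]


def run(command):
    """Run a command and return its stdout as text.

    Raises subprocess.CalledProcessError if the CLI exits non-zero.
    """
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

With the CLI installed, `run(pull_command("q8"))` followed by `run(transcribe_command("audio.wav"))` mirrors steps 2 and 3 of the Quickstart.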

📦 Available builds

| Quant | File (.oasr) | Size | RAM peak | RTF · M1 CPU | RTF · M1 GPU | JFK ΔWER vs fp16 |
|---|---|---|---|---|---|---|
| fp16 | qwen3-asr-0.6b-fp16.oasr | 1.88 GB | 4.51 GB | 0.58× | 0.41× | 0.0% |
| q8_0 | qwen3-asr-0.6b-q8_0.oasr | 1.01 GB | 2.86 GB | 0.55× | 0.27× | 0.0% |
| q4_k | qwen3-asr-0.6b-q4_k.oasr | 599 MB | 3.50 GB | 0.52× | 0.20× | 0.0% |

RTF = real-time factor on the fixed 11 s JFK clip (lower is faster); RAM peak is measured per pack in an isolated subprocess. JFK ΔWER compares each quantized build's JFK transcript to this model's fp16 JFK transcript, so it measures quantization drift rather than absolute recognition accuracy. q8_0 is the recommended default: near-reference quality at a fraction of the footprint.
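
To make the two metrics concrete, here is a minimal sketch of how they can be computed. It assumes the usual definition of RTF (processing time divided by audio duration) and a plain word-level edit-distance WER with lowercasing and whitespace tokenization. The exact text normalization used for the table is not stated here, so the function names and the normalization are illustrative assumptions, not the benchmark's actual code.

```python
def processing_seconds(rtf, audio_seconds):
    """Assumed definition: RTF = processing time / audio duration."""
    return rtf * audio_seconds


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    For the drift metric, `reference` would be the fp16 transcript and
    `hypothesis` the transcript from a quantized build.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # ref word missing from hyp
            insertion = curr[j - 1] + 1  # extra word in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Under that definition, an RTF of 0.55 on the 11 s clip corresponds to about 6 s of processing, and a ΔWER of 0.0% means the quantized transcript matches the fp16 transcript word for word after normalization.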

🧠 About Qwen3-ASR 0.6B

Qwen3-ASR-0.6B is the compact, efficiency-optimized member of Alibaba's Qwen3-ASR family, built on the Qwen3-Omni audio-understanding foundation. It performs language identification and speech recognition across 30 languages and 22 Chinese dialects (52 in total), and stays robust on challenging audio: clean speech, singing voice, and songs with background music. A single unified checkpoint handles both offline and real-time streaming transcription and can process long audio. The 0.6B size targets a strong accuracy-vs-efficiency trade-off (the Qwen team reports up to ~2000× throughput at high concurrency), making it the family's go-to for lightweight, high-throughput deployments.

This OpenASR repo repackages the original weights as .oasr packs that run natively in the OpenASR runtime, with no Python at inference time. The q8_0 build is the recommended default (near-reference accuracy at roughly half the fp16 file size). q4_k has the smallest file and suits storage-constrained devices, though its measured RAM peak (3.50 GB) is higher than q8_0's (2.86 GB). fp16 is for verification or maximum fidelity. For word-level timestamps, pair it upstream with Qwen3-ForcedAligner-0.6B.

⚙️ How these packs were made

Converted from Qwen/Qwen3-ASR-0.6B with the OpenASR importer:

openasr model-pack import-qwen-local <src> <out>.oasr \
  --package-id qwen3-asr-0.6b --quantization {fp16,q8-0,q4-k}

The .oasr container is GGUF-backed; packs use zero-copy mmap weight binding and graph buffer reuse to keep peak memory low.
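
As a general illustration of what zero-copy mmap binding means, the sketch below maps a file read-only and exposes its bytes as a typed view without first reading them into a separate buffer. This is not OpenASR's implementation and does not follow the .oasr or GGUF layout: the flat float32 file and the helper names are made-up stand-ins.

```python
import mmap
import os
import struct
import tempfile


def write_demo_weights(path, values):
    """Write float32 values (native byte order) to a flat binary file."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}f", *values))


def map_weights(path):
    """Map the file read-only and return (mmap object, float32 view).

    The view is backed directly by the mapping: the OS pages data in on
    access, and nothing is copied into a Python-owned buffer up front.
    """
    with open(path, "rb") as f:
        # mmap keeps its own handle, so the file object can be closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mm).cast("f")
    return mm, view


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo_weights.bin")
    write_demo_weights(demo_path, [0.5, 1.5, -2.0])
    mapping, weights = map_weights(demo_path)
    print(len(weights), weights[1])
    # The view must be released before the mapping can be closed.
    weights.release()
    mapping.close()
```

Because the view borrows the mapped pages, several tensors can be bound to slices of one mapping, and resident memory grows only as pages are actually touched.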

⚖️ License

These packs inherit the upstream model's license: Apache-2.0 (source). OpenASR packaging retains the upstream copyright and NOTICE; the only modifications are format conversion and quantization.

🙏 Acknowledgements

This pack is a redistribution of Qwen3-ASR-0.6B, created and open-sourced by the Qwen team at Alibaba (Qwen/Qwen3-ASR-0.6B). All credit for the original architecture, training, and weights belongs to them; the license is inherited from and identical to the upstream model (Apache-2.0). The GGUF quantization recipe and bit-identity verification methodology were informed by the community GGUF work at cstr/qwen3-asr-1.7b-GGUF. Thank you to both teams for releasing their work openly.
