Breeze-ASR-26 - MLX (fp16)

Apple MLX port of MediaTek-Research/Breeze-ASR-26, the Taiwanese Hokkien (Taigi) ASR model from MediaTek Research's MR Breeze 3 series.

Runs on Apple Silicon Macs (M1/M2/M3/M4) via the mlx-whisper package, with the same API as mlx-community/whisper-* checkpoints.

Files

file                  size
weights.safetensors   ~3.1 GB
config.json           <1 KB

Usage

import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="fredchu/breeze-asr-26-mlx-fp16",
    language="zh",
)
print(result["text"])

CLI:

mlx_whisper audio.wav --model fredchu/breeze-asr-26-mlx-fp16 --language zh
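
Besides "text", the dictionary returned by mlx_whisper.transcribe carries a "segments" list whose entries have "start", "end", and "text" fields, following the upstream Whisper result layout. A minimal sketch for printing them is below; the helper name format_segments is illustrative and not part of mlx-whisper.

```python
def format_segments(result):
    """Render each segment of a transcribe() result as '[start -> end] text'."""
    lines = []
    for seg in result.get("segments", []):
        # start/end are seconds from the beginning of the audio
        lines.append(f"[{seg['start']:7.2f} -> {seg['end']:7.2f}] {seg['text'].strip()}")
    return lines


def transcribe_and_format(audio_path, repo="fredchu/breeze-asr-26-mlx-fp16"):
    # Imported lazily so format_segments can be used without mlx-whisper installed.
    import mlx_whisper

    result = mlx_whisper.transcribe(audio_path, path_or_hf_repo=repo, language="zh")
    return format_segments(result)
```

Calling transcribe_and_format("audio.wav") on an Apple Silicon Mac should return one line per segment.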

Performance (M1 Max, mlx-whisper)

sample                            duration   RTF
Mandarin financial speech (60s)   60.0s      8.72× real-time
Taiwanese Hokkien sample (25s)    25.0s      6.28× real-time
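
The card does not define how the speed figure is computed. The sketch below assumes it is audio duration divided by wall-clock transcription time, so a 60.0 s clip at 8.72× would take about 6.9 s to process. The function names are hypothetical, and the `transcribe` argument stands in for whatever callable you use (for example a small wrapper around mlx_whisper.transcribe).

```python
import time


def speed_factor(audio_seconds, elapsed_seconds):
    """Return how many times faster than real time the run was (assumed definition)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return audio_seconds / elapsed_seconds


def time_transcription(transcribe, audio_path, audio_seconds):
    """Time one call to transcribe(audio_path) and return the speed factor."""
    start = time.perf_counter()
    transcribe(audio_path)
    elapsed = time.perf_counter() - start
    return speed_factor(audio_seconds, elapsed)
```

The first run usually includes model download and load time, so a warm-up call before timing gives a fairer number.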

When to use this vs. the 4-bit variant

A companion fredchu/breeze-asr-26-mlx-4bit (877 MB, palette quantized) is also available.

In our 2026-04-30 evaluation on a real Taigi sample (a Mandarin-speaking creator using the Taigi word "漂泊", pio-pôa), the 4-bit variant transcribed "漂泊" correctly while the fp16 variant produced "瀟灑" instead. This is a counterintuitive result, possibly because palette quantization re-calibrates outlier weights in a way that helps generalize to underrepresented Taigi tokens. Worth verifying if you have a Taigi-heavy use case.

For pure Mandarin or read-speech benchmarks, fp16 should remain the safer choice.
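
One way to check this on your own audio is to transcribe the same clip with both variants and score each against a reference transcript you wrote by hand. The sketch below uses character error rate (edit distance divided by reference length), which is a common choice for Chinese-character output but is an assumption here, not the method used in the evaluation above. The helper names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate; 0.0 means an exact match."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def compare_variants(audio_path, reference, repos, transcribe=None):
    """Return {repo: CER} for each model repo on the same audio file."""
    if transcribe is None:
        import mlx_whisper  # lazy import; needs Apple Silicon
        transcribe = mlx_whisper.transcribe
    scores = {}
    for repo in repos:
        text = transcribe(audio_path, path_or_hf_repo=repo, language="zh")["text"]
        scores[repo] = cer(reference, text.strip())
    return scores
```

A single clip says little either way; a handful of representative recordings gives a more trustworthy comparison.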

Limitations (inherited from base model)

  • Outputs Mandarin Chinese characters, not Taigi orthography (台語正字 / 台羅)
  • Trained on ~10,000 hours of synthetic Taigi speech, so there is a distribution gap with real spontaneous speech
  • English brand/proper nouns are aggressively transliterated: in our Mandarin test, Hello became 哈囉, Austin became Alsted, Netflix became Nathalie 的時事. ASR-25 (MediaTek-Research/Breeze-ASR-25) handles these correctly. Do not use this model for content with frequent English code-switching.
  • All segments come back as one ~30-second block regardless of audio content (model training behaviour, not framework setting). Post-process if you need finer subtitle granularity.
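
For the last limitation, one possible post-processing step is to cut each long segment at punctuation and spread its time span across the pieces in proportion to their character counts. This is only a heuristic: the timings are estimates, not alignments, and if the model emits no punctuation the text is simply chopped every max_chars characters. The function name and the max_chars default are illustrative.

```python
import re


def split_segment(seg, max_chars=20):
    """Split one segment dict into shorter ones with proportionally estimated times."""
    pieces = []
    # Break after common CJK punctuation, keeping the mark with the preceding text.
    for part in re.split(r"(?<=[。！？，、；])", seg["text"].strip()):
        part = part.strip()
        while part:
            pieces.append(part[:max_chars])  # hard cap on subtitle length
            part = part[max_chars:]
    total = sum(len(p) for p in pieces)
    span = seg["end"] - seg["start"]
    out, cursor = [], seg["start"]
    for p in pieces:
        dur = span * len(p) / total  # time share proportional to character count
        out.append({"start": cursor, "end": cursor + dur, "text": p})
        cursor += dur
    return out
```

Applying split_segment to every entry of result["segments"] yields a flat list that can be written out as subtitles.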

Conversion

Built with a custom wrapper around mlx-examples/whisper/convert.py that adds sharded-safetensors loader support (the source repo ships weights as 5 GB + 1 GB shards, which the upstream converter doesn't handle).
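
The idea behind a sharded loader can be sketched as follows. This is not the actual conversion script; it assumes the source repo follows the usual Hugging Face layout, where a model.safetensors.index.json file has a "weight_map" from tensor name to shard file name. The function names are hypothetical, and load_file is passed in by the caller.

```python
import json
from pathlib import Path


def read_index(model_dir):
    """Read the shard index (assumed file name: model.safetensors.index.json)."""
    return json.loads((Path(model_dir) / "model.safetensors.index.json").read_text())


def shard_names(index):
    """Distinct shard file names referenced by the index, in first-seen order."""
    return list(dict.fromkeys(index["weight_map"].values()))


def merge_shards(index, model_dir, load_file):
    """Load every shard with load_file(path) and merge into one name -> tensor dict."""
    weights = {}
    for name in shard_names(index):
        weights.update(load_file(str(Path(model_dir) / name)))
    missing = set(index["weight_map"]) - set(weights)
    if missing:
        raise KeyError(f"tensors listed in index but not found in shards: {sorted(missing)}")
    return weights
```

On a Mac, mlx.core.load can serve as load_file, since it reads a .safetensors file into a dictionary of arrays; the merged dictionary can then be handed to the rest of the conversion.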

The conversion script is open source on GitHub: search for convert_breeze_asr_26.py in For_Claude/scripts/asr-eval/.

License

Apache 2.0, inherited from the base model.

Acknowledgments
