Moonshine Tiny · OpenASR
Tiny 27M-parameter English ASR built for real-time, on-device transcription
Native speech-to-text in the OpenASR runtime, engineered for peak performance on CPU & GPU, with no Python at inference time.
Highlights
- Just 27M parameters: the smallest Moonshine, sized for memory- and compute-constrained edge hardware
- Real-time on-device: engineered by Useful Sensors for live transcription and voice commands on low-cost devices
- Accurate for its size: beats similarly-sized ASR systems on standard English benchmarks (per the Moonshine paper)
- English speech-to-text: sequence-to-sequence ASR trained on 200K hours of audio
- Native in OpenASR: .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU
Quickstart
# 1. Install the OpenASR CLI · https://openasr.org
# 2. Pull a build (pick a quant; see the table below)
openasr pull moonshine-tiny:q8
# 3. Transcribe
openasr transcribe audio.wav --model moonshine-tiny
All builds for this model:
openasr pull moonshine-tiny:fp16
openasr pull moonshine-tiny:q8
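To transcribe more than one file, the step 3 command can be wrapped in a loop. This is a minimal sketch: `transcribe_dir` is a hypothetical helper name, and it assumes that `openasr transcribe` prints the transcript to standard output, which the text above does not confirm. Only the command form shown in the Quickstart is used.

```shell
# Hypothetical helper: transcribe every .wav file in a directory.
# ASSUMPTION: `openasr transcribe` writes the transcript to stdout.
transcribe_dir() {
  dir="$1"
  model="${2:-moonshine-tiny}"   # default model name taken from the Quickstart
  for f in "$dir"/*.wav; do
    [ -e "$f" ] || continue      # skip the literal pattern when nothing matches
    # Write each transcript next to its audio file: clip.wav -> clip.txt
    openasr transcribe "$f" --model "$model" > "${f%.wav}.txt" || return 1
  done
}
```

Calling `transcribe_dir ./recordings` would then leave one `.txt` file beside each `.wav`. If the CLI writes its output elsewhere, the redirection needs to be adapted.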
Available builds
| Quant | File (.oasr) | Size | RAM peak | RTF · M1 CPU | RTF · M1 GPU | JFK ΔWER vs fp16 |
|---|---|---|---|---|---|---|
| fp16 | moonshine-tiny-fp16.oasr | 109 MB | 323 MB | 0.04× | 0.03× | 0.0% |
| q8_0 | moonshine-tiny-q8_0.oasr | 34 MB | 306 MB | 0.03× | 0.03× | 0.0% |
RTF = real-time factor (processing time divided by audio duration) on the fixed 11 s JFK clip; lower is faster. RAM peak is measured per pack in an isolated subprocess. JFK ΔWER compares each quantized build's JFK transcript to this model's fp16 JFK transcript, so it measures quantization drift rather than absolute recognition accuracy. q8_0 is the recommended default: near-reference quality at roughly a third of the file size.
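The two derived columns can be sanity-checked by hand. The sketch below assumes the usual definitions: RTF is processing time divided by audio duration, and WER is the word-level edit distance divided by the number of reference words. The helper names `rtf_seconds` and `wer_percent` are illustrative, and the lowercase folding is an assumption; the normalization OpenASR applies before computing the WER delta is not stated here.

```shell
# Convert a real-time factor and a clip length (seconds) into wall-clock seconds.
rtf_seconds() {
  awk -v rtf="$1" -v dur="$2" 'BEGIN { printf "%.2f\n", rtf * dur }'
}

# Word error rate (percent) of a hypothesis transcript against a reference.
# ASSUMPTION: words are split on whitespace and compared case-insensitively.
wer_percent() {
  awk -v ref="$1" -v hyp="$2" 'BEGIN {
    n = split(tolower(ref), r, " ")
    m = split(tolower(hyp), h, " ")
    if (n == 0) { print "undefined"; exit }   # no reference words to divide by
    # Levenshtein distance over words, keeping only two rows of the table.
    for (j = 0; j <= m; j++) prev[j] = j
    for (i = 1; i <= n; i++) {
      cur[0] = i
      for (j = 1; j <= m; j++) {
        cost = (r[i] == h[j]) ? 0 : 1
        best = prev[j - 1] + cost                          # match or substitution
        if (prev[j] + 1 < best) best = prev[j] + 1         # deletion
        if (cur[j - 1] + 1 < best) best = cur[j - 1] + 1   # insertion
        cur[j] = best
      }
      for (j = 0; j <= m; j++) prev[j] = cur[j]
    }
    printf "%.1f\n", 100 * prev[m] / n
  }'
}

rtf_seconds 0.04 11          # prints 0.44
wer_percent "a b c d" "a x c"  # prints 50.0 (one substitution, one deletion)
```

Read this way, an RTF of 0.04× on the 11 s clip corresponds to about 0.44 s of processing, and a ΔWER of 0.0% means the quantized transcript matches the fp16 transcript word for word.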
About Moonshine Tiny
Moonshine Tiny is the smallest model in Useful Sensors' Moonshine family: a 27M-parameter,
sequence-to-sequence English speech-recognition model designed for real-time, on-device
transcription on hardware that is severely constrained in memory and compute. Trained on 200,000
hours of audio, it transcribes English speech to text and, despite its size, is reported in the Moonshine paper to be more accurate
than existing ASR systems of comparable scale on standard benchmarks. It targets developers building
live transcription and voice-command experiences on low-cost devices. Like other autoregressive ASR
models it can occasionally hallucinate or repeat on very short or clipped segments, so robust
in-domain evaluation is recommended before deployment. This OpenASR repo repackages the original
weights as .oasr packs that run natively in the OpenASR runtime, with no Python at inference time. The
q8_0 build is the recommended default (near-reference accuracy at roughly a third of the
file size); fp16 is for verification or maximum fidelity.
How these packs were made
Converted from UsefulSensors/moonshine-tiny with the OpenASR importer:
openasr model-pack import-moonshine-local <src> <out>.oasr \
--package-id moonshine-tiny --quantization {fp16,q8-0,q4-k}
The .oasr container is GGUF-backed; packs use zero-copy mmap weight binding and graph
buffer reuse to keep peak memory low.
License
These packs inherit the upstream model's license: MIT (source). OpenASR packaging retains the upstream copyright and NOTICE; the only modifications are format conversion and quantization.
Acknowledgements
This pack is a redistribution of Moonshine Tiny, created and open-sourced by Useful Sensors (UsefulSensors/moonshine-tiny). All credit for the original architecture, training, and weights belongs to them; the license is inherited from and identical to the upstream model (MIT). Thank you to the Moonshine authors (Nat Jeffries, Evan King, Manjunath Kudlur, Guy Nicholson, James Wang, and Pete Warden) for releasing their work openly.
Links
- OpenASR: https://github.com/QuintinShaw/OpenASR
- Website: https://openasr.org
- Upstream model: UsefulSensors/moonshine-tiny