Whisper Tiny · OpenASR
The smallest multilingual Whisper for fast local transcription
Native speech-to-text in the OpenASR runtime, engineered for peak performance on CPU & GPU, with no Python at inference time.
Highlights
- Multilingual ASR: transcribes many languages and can translate speech to English
- 39M parameters: the smallest Whisper checkpoint, the fastest and lightest to run
- Weak-supervision scale: trained on Whisper's 680k-hour labelled speech corpus
- Native in OpenASR: .oasr packs run with no Python at inference, engineered for peak performance on CPU & GPU
Quickstart
# 1. Install the OpenASR CLI · https://openasr.org
# 2. Pull a build (pick a quant; see the table below)
openasr pull whisper-tiny:q8
# 3. Transcribe
openasr transcribe audio.wav --model whisper-tiny
All builds for this model:
openasr pull whisper-tiny:fp16
openasr pull whisper-tiny:q8
openasr pull whisper-tiny:q4
Available builds
| Quant | File (.oasr) | Size | RAM peak | RTF · M1 CPU | RTF · M1 GPU | JFK ΔWER vs fp16 |
|---|---|---|---|---|---|---|
| fp16 | whisper-tiny-fp16.oasr | 79 MB | 321 MB | 0.04× | 0.03× | 0.0% |
| q8_0 | whisper-tiny-q8_0.oasr | 63 MB | 275 MB | 0.04× | 0.03× | 0.0% |
| q4_k | whisper-tiny-q4_k.oasr | 61 MB | 274 MB | 0.04× | 0.03× | 0.0% |
RTF = real-time factor (processing time divided by audio duration) on the fixed 11 s JFK clip; lower is faster. RAM peak is measured per pack in an isolated subprocess. JFK ΔWER compares each quantized build's JFK transcript to this model's fp16 JFK transcript, so it measures quantization drift rather than absolute recognition accuracy. q8_0 is the recommended default: near-reference quality at a smaller footprint than fp16 (63 MB vs 79 MB on disk).
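As a rough illustration of how these two metrics are conventionally computed, the sketch below defines two shell helpers. It assumes RTF is processing time divided by audio duration, and that WER is the word-level edit distance divided by the number of reference words. `rtf` and `wer` are hypothetical helper names, not OpenASR commands, and any text normalization OpenASR applies before scoring is not modelled here.

```shell
# Hypothetical helper: real-time factor = processing seconds / audio seconds.
rtf() {
  awk -v proc="$1" -v dur="$2" 'BEGIN { printf "%.2f\n", proc / dur }'
}

# Hypothetical helper: word error rate (percent) of a hypothesis transcript
# against a reference transcript, via word-level Levenshtein distance.
# An empty reference is reported as 0.0 rather than treated as an error.
wer() {
  awk -v ref="$1" -v hyp="$2" 'BEGIN {
    n = split(ref, r, " ")   # reference words
    m = split(hyp, h, " ")   # hypothesis words
    for (i = 0; i <= n; i++) d[i, 0] = i
    for (j = 0; j <= m; j++) d[0, j] = j
    for (i = 1; i <= n; i++) {
      for (j = 1; j <= m; j++) {
        cost = (r[i] == h[j]) ? 0 : 1
        best = d[i-1, j-1] + cost                          # match or substitution
        if (d[i-1, j] + 1 < best) best = d[i-1, j] + 1     # deletion
        if (d[i, j-1] + 1 < best) best = d[i, j-1] + 1     # insertion
        d[i, j] = best
      }
    }
    printf "%.1f\n", (n > 0) ? 100 * d[n, m] / n : 0
  }'
}
```

For example, `rtf 0.44 11` prints `0.04`: at that factor the 11 s clip takes roughly 0.44 s to transcribe. To estimate quantization drift, pass the fp16 transcript as the reference and a quantized build's transcript as the hypothesis; identical transcripts yield `0.0`.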
About Whisper Tiny
Whisper Tiny is OpenAI's 39M-parameter multilingual Whisper checkpoint, the smallest member of
the Whisper family. It uses the standard Whisper encoder-decoder architecture for automatic
speech recognition and speech translation, trained with large-scale weak supervision on 680k
hours of labelled speech. The tiny model trades some accuracy for the lowest footprint and
fastest inference, which suits low-resource devices and latency-sensitive use. This OpenASR
repo repackages the original openai/whisper-tiny weights as .oasr packs that run natively
in the OpenASR runtime with no Python at inference time. For most users the q8_0 build is the
recommended default; q4_k is for the tightest memory budgets and fp16 is for verification or
maximum fidelity.
How these packs were made
Converted from openai/whisper-tiny with the OpenASR importer:
openasr model-pack import-whisper-local <src> <out>.oasr \
--package-id whisper-tiny --quantization {fp16,q8-0,q4-k}
The .oasr container is GGUF-backed; packs use zero-copy mmap weight binding and graph
buffer reuse to keep peak memory low.
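To produce all three packs in one pass, the import command above could be wrapped in a loop. This is a sketch only: `pack_name` and `import_all` are hypothetical helpers, it assumes `--quantization` takes one value per invocation, and it assumes the output names follow the file names in the builds table (with `-` in the flag value mapped to `_`).

```shell
# Hypothetical helper: derive the pack file name from a --quantization value,
# e.g. q8-0 -> whisper-tiny-q8_0.oasr (naming assumed from the builds table).
pack_name() {
  printf 'whisper-tiny-%s.oasr\n' "$(printf '%s' "$1" | tr '-' '_')"
}

# Hypothetical helper: run the importer once per quantization.
# $1 is the local source path (the <src> placeholder in the command above).
import_all() {
  src="$1"
  for q in fp16 q8-0 q4-k; do
    openasr model-pack import-whisper-local "$src" "$(pack_name "$q")" \
      --package-id whisper-tiny --quantization "$q" || return 1
  done
}
```

Calling `import_all` with the path to a local copy of the upstream weights would then write one pack per quantization into the current directory, stopping at the first failed import.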
License
These packs inherit the upstream model's license: Apache-2.0 (source). OpenASR packaging retains the upstream copyright and NOTICE; the only modifications are format conversion and quantization.
Acknowledgements
This pack is a redistribution of Whisper Tiny, released by OpenAI
(openai/whisper-tiny).
All credit for the original model, training recipe, and weights belongs to OpenAI. The
upstream Hugging Face model card declares Apache-2.0 licensing; OpenASR only converts the
weights into .oasr packages and adds quantized builds for local runtime use.
Links
- OpenASR: https://github.com/QuintinShaw/openasr
- Website: https://openasr.org
- Upstream model: openai/whisper-tiny