openai/whisper-base: 4-Graph ONNX Export
Self-exported 4-graph Whisper ONNX for asrjs/speech-recognition.
Model
openai/whisper-base: 74M parameters, 6 encoder / 6 decoder layers.
Format
whisper-browser-self-export-v1, a 4-graph KV-cache split:
| Graph | Input | Output | Runs |
|---|---|---|---|
| encoder_model.onnx | mel [1, 80, 3000] | hidden [1, 1500, 512] | Once per chunk |
| decoder_init.onnx | prompt_ids + encoder hidden | logits + full KV cache | Once per chunk |
| decoder_step.onnx | token_id + past KV | logits + updated self-attn KV | Per token |
| decoder_align.onnx | all token ids + encoder hidden | alignment [1, T, S] | Once after generation |
No external data files: all weights are inline (the model fits well under the 2 GB protobuf limit).
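The call order implied by the table above can be sketched as follows. This is a minimal, runtime-agnostic sketch: the four callables stand in for the four ONNX graphs, and their signatures, the greedy argmax decoding, and the `eot_id` and `max_new_tokens` parameters are illustrative assumptions, not the asrjs/speech-recognition API.

```python
from typing import Any, Callable, List, Sequence, Tuple

Logits = Sequence[float]  # next-token logits, one value per vocabulary entry


def argmax(values: Logits) -> int:
    """Index of the largest logit (greedy decoding, assumed for simplicity)."""
    return max(range(len(values)), key=values.__getitem__)


def transcribe_chunk(
    mel: Any,
    prompt_ids: List[int],
    encoder: Callable[[Any], Any],                                  # encoder_model.onnx
    decoder_init: Callable[[List[int], Any], Tuple[Logits, Any]],   # decoder_init.onnx
    decoder_step: Callable[[int, Any], Tuple[Logits, Any]],         # decoder_step.onnx
    decoder_align: Callable[[List[int], Any], Any],                 # decoder_align.onnx
    eot_id: int,            # hypothetical end-of-text token id supplied by the caller
    max_new_tokens: int,    # hypothetical cap on generated tokens
) -> Tuple[List[int], Any]:
    hidden = encoder(mel)                           # once per chunk
    logits, kv = decoder_init(prompt_ids, hidden)   # once per chunk, builds the full KV cache
    tokens = list(prompt_ids)
    for _ in range(max_new_tokens):
        next_id = argmax(logits)
        if next_id == eot_id:
            break
        tokens.append(next_id)
        logits, kv = decoder_step(next_id, kv)      # per token, reuses the past KV
    alignment = decoder_align(tokens, hidden)       # once after generation
    return tokens, alignment
```

Passing the graphs in as callables keeps the loop independent of any particular runtime, so the same control flow can be exercised with stub functions before real sessions are wired in.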
Variants
| Dir | Precision | Total Size | Encoder | Init | Step | Align |
|---|---|---|---|---|---|---|
| fp32/ | float32 | 753 MB | 79 MB | 300 MB | 187 MB | 189 MB |
| fp16/ | float16 (export-time) | 377 MB | 39 MB | 150 MB | 93 MB | 94 MB |
| q8/ | int8 dynamic | 256 MB | 22 MB | 75 MB | 110 MB | 48 MB |
Each variant directory is self-contained: manifest.json, ONNX graphs, tokenizer.json, config files.
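A quick way to confirm that a downloaded variant directory is complete is to look for the files named in this README. The sketch below assumes the four graphs sit at the top level of the variant directory under the names used in the Format table; the unnamed "config files" are not checked because their names are not listed here.

```python
from pathlib import Path
from typing import List

# File names taken from this README; the flat layout is an assumption.
REQUIRED_FILES = (
    "manifest.json",
    "tokenizer.json",
    "encoder_model.onnx",
    "decoder_init.onnx",
    "decoder_step.onnx",
    "decoder_align.onnx",
)


def missing_files(variant_dir: str) -> List[str]:
    """Return the required file names that are absent from variant_dir."""
    root = Path(variant_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

For example, `missing_files("./whisper-base-onnx-4graph/fp16")` returns an empty list when every listed file is present.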
Dimensions
- d_model: 512
- decoder_layers: 6
- decoder_attention_heads: 8
- head_dim: 64
- num_mel_bins: 80
- max_source_positions: 1500 (encoder output frames, 3000 mel input)
- max_target_positions: 448
- vocab_size: 51865
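These dimensions determine how large the KV cache can get. The arithmetic below is a rough sketch that assumes one key and one value tensor per decoder layer, each shaped [1, heads, seq_len, head_dim]; that layout is an assumption and is not confirmed by the export.

```python
# Values copied from the Dimensions section.
D_MODEL = 512
DECODER_LAYERS = 6
HEADS = 8
HEAD_DIM = 64
MAX_SOURCE_POSITIONS = 1500
MAX_TARGET_POSITIONS = 448

# Sanity check: the heads partition the model width.
assert HEADS * HEAD_DIM == D_MODEL


def kv_elements(seq_len: int) -> int:
    """Elements in K plus V across all decoder layers for seq_len positions.

    Assumed layout: one K and one V per layer, each [1, HEADS, seq_len, HEAD_DIM].
    """
    return DECODER_LAYERS * 2 * HEADS * seq_len * HEAD_DIM


def kv_bytes(seq_len: int, bytes_per_element: int) -> int:
    """Cache size in bytes for a given element width (4 for float32, 2 for float16)."""
    return kv_elements(seq_len) * bytes_per_element


cross_fp32 = kv_bytes(MAX_SOURCE_POSITIONS, 4)  # cross-attention KV, fixed per chunk
self_fp32 = kv_bytes(MAX_TARGET_POSITIONS, 4)   # self-attention KV at maximum length
print(cross_fp32, self_fp32)  # 36864000 11010048
```

Under these assumptions the cross-attention cache (about 36.9 MB in float32) is larger than the self-attention cache at full length (about 11 MB), which is consistent with decoder_step.onnx returning only the updated self-attention KV.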
Alignment heads
From generation_config.alignment_heads:
[3,1], [4,2], [4,3], [4,7], [5,1], [5,2], [5,4], [5,6]
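Each pair above is a (layer, head) index into the decoder's cross-attention. How decoder_align.onnx combines these heads is not stated here; the sketch below assumes a plain mean over the selected heads, purely to illustrate how the list is indexed. Any normalization, filtering, or time warping is out of scope.

```python
from typing import List, Sequence, Tuple

# (layer, head) pairs copied from the Alignment heads section.
ALIGNMENT_HEADS: List[Tuple[int, int]] = [
    (3, 1), (4, 2), (4, 3), (4, 7), (5, 1), (5, 2), (5, 4), (5, 6),
]

Matrix = List[List[float]]  # [T][S] attention weights for one head


def average_alignment_heads(
    cross_attn: Sequence[Sequence[Matrix]],  # indexed as [layer][head][t][s]
    heads: Sequence[Tuple[int, int]] = ALIGNMENT_HEADS,
) -> Matrix:
    """Average the [T][S] cross-attention maps of the selected heads (assumed scheme)."""
    selected = [cross_attn[layer][head] for layer, head in heads]
    n_tokens, n_frames = len(selected[0]), len(selected[0][0])
    return [
        [sum(m[t][s] for m in selected) / len(selected) for s in range(n_frames)]
        for t in range(n_tokens)
    ]
```

All eight pairs fall inside the 6-layer, 8-head decoder described under Dimensions, and most come from the last two layers.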
Usage
TypeScript (asrjs/speech-recognition)
import { loadSplitGraphLocalModel } from '@asrjs/speech-recognition/models/whisper-seq2seq';
const model = loadSplitGraphLocalModel('./whisper-base-onnx-4graph', { variant: 'fp32' });
// or: { variant: 'fp16' }, { variant: 'q8' }
Python export (reproduce)
cd tools/whisper-onnx-export
# fp32
.venv/bin/python export_whisper.py openai/whisper-base ./output --device cpu --variant fp32
# fp16 (export-time, ORT-safe)
.venv/bin/python export_whisper.py openai/whisper-base ./output --device cpu --variant fp16
# q8 (post-export dynamic quantization)
.venv/bin/python export_whisper.py openai/whisper-base ./output --device cpu --variant q8
Validation
- ONNX checker: pass (all variants, all graphs)
- ORT CPU load: pass (all variants, all graphs)
- Audit: 52 passes, 0 failures
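The first two checks can be approximated with the public `onnx` and `onnxruntime` Python packages. This is a sketch under assumptions: the directory layout (`<root>/<variant>/<graph>`) is inferred from the Variants table, and the 52-item audit is not reproduced because its contents are not described here.

```python
from pathlib import Path
from typing import List

GRAPHS = (
    "encoder_model.onnx",
    "decoder_init.onnx",
    "decoder_step.onnx",
    "decoder_align.onnx",
)
VARIANTS = ("fp32", "fp16", "q8")


def graph_paths(root: str) -> List[Path]:
    """All variant/graph paths under root, e.g. root/fp32/encoder_model.onnx."""
    return [Path(root) / variant / graph for variant in VARIANTS for graph in GRAPHS]


def check_graph(path: Path) -> None:
    """Run the ONNX checker, then load the graph on ONNX Runtime's CPU provider."""
    import onnx                # third-party: pip install onnx
    import onnxruntime as ort  # third-party: pip install onnxruntime

    onnx.checker.check_model(str(path))  # raises if the model is invalid
    ort.InferenceSession(str(path), providers=["CPUExecutionProvider"])


# Usage (requires the onnx and onnxruntime packages and the exported files):
#   for p in graph_paths("./whisper-base-onnx-4graph"):
#       check_graph(p)
```

The third-party imports live inside `check_graph` so the path helper can be used on machines that do not have either package installed.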
License
Apache-2.0 (same as openai/whisper-base)