Canary-1B-v2 — CoreML (ANE)
CoreML conversion of nvidia/canary-1b-v2 for Apple Silicon (Neural Engine), packaged for FluidAudio.
Canary is an attention encoder-decoder (AED) ASR model that pairs a FastConformer encoder with a Transformer decoder. It covers 25 European languages and uses a 16384-token SentencePiece BPE vocabulary. Decoding is autoregressive: the Transformer decoder cross-attends to the encoder output, a 1024→16384 projection head maps each hidden state to vocabulary logits, and tokens are emitted greedily until EOS (id 3).
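
The greedy loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the FluidAudio implementation: `step_logits` is a hypothetical callback standing in for the decoder plus projection head, and the single-BOS prompt is an assumption (the actual Canary prompt format is not described in this card).

```python
from typing import Callable, List, Sequence

EOS_ID = 3       # end-of-sequence id stated in this card
BOS_ID = 4       # begin-of-sequence id stated in this card
MAX_STEPS = 256  # decoder step budget stated in this card


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (the first one wins on ties)."""
    best = 0
    for i, v in enumerate(values):
        if v > values[best]:
            best = i
    return best


def greedy_decode(
    step_logits: Callable[[List[int]], Sequence[float]],
    prompt: Sequence[int] = (BOS_ID,),
    max_steps: int = MAX_STEPS,
) -> List[int]:
    """Pick the argmax token repeatedly until EOS or the step budget.

    `step_logits` is hypothetical: it receives the tokens so far and
    returns one row of vocabulary logits for the next position.
    Assumption: the budget counts the prompt tokens as well.
    """
    tokens = list(prompt)
    while len(tokens) < max_steps:
        next_id = argmax(step_logits(tokens))
        if next_id == EOS_ID:
            break
        tokens.append(next_id)
    # Return only the generated ids, without the prompt or EOS.
    return tokens[len(prompt):]
```

In a real pipeline the callback would run DecoderInt4 on the padded token buffer, take the hidden state at the current position, and pass it through Projection to get the 16384 logits.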
Files
| File | Role | Precision |
|---|---|---|
| Preprocessor.mlmodelc | waveform [1,240000] → mel [1,128,1501] | fp32 (CPU) |
| EncoderInt4.mlmodelc | mel → encoder [1,1024,188] | int4 (ANE, iOS18) |
| DecoderInt4.mlmodelc | autoregressive transformer → hidden [1,256,1024] | int4 (ANE, iOS18) |
| Projection.mlmodelc | hidden [1,1024] → logits [1,16384] | fp16 (ANE) |
| vocab.json | 16384 SentencePiece pieces (id → piece) | — |
| projection_weights.npz | raw projection weights (for Python reference pipelines) | fp32 |
| metadata.json | shapes, sample rate, special token ids | — |
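
As an illustration of how the id → piece table in vocab.json might be used, the sketch below turns decoded ids back into text. It assumes the file is a JSON object keyed by string ids, which this card does not confirm. `load_vocab` and `detokenize` are hypothetical helper names, and the special ids are the pad, eos, and bos values stated in this card.

```python
import json
from typing import Dict, Iterable

SPECIAL_IDS = {2, 3, 4}  # pad, eos, bos as stated in this card
WORD_BOUNDARY = "\u2581"  # SentencePiece marks word starts with this character


def load_vocab(json_text: str) -> Dict[int, str]:
    """Parse an id -> piece mapping (assumed layout: {"0": "<unk>", ...})."""
    return {int(key): piece for key, piece in json.loads(json_text).items()}


def detokenize(ids: Iterable[int], vocab: Dict[int, str]) -> str:
    """Join pieces, drop special tokens, and restore spaces."""
    pieces = [vocab[i] for i in ids if i not in SPECIAL_IDS]
    return "".join(pieces).replace(WORD_BOUNDARY, " ").strip()
```

Other control tokens (for example language tags) would need their own filtering; only the three ids listed in this card are skipped here.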
Contract: 15 s window (240000 samples @ 16 kHz), 256 decoder steps,
eos=3, pad=2, bos=4. int4 weight payloads require iOS 18 / macOS 15.
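
The fixed 15 s window means every input must be exactly 240000 samples long. A minimal sketch of the padding and chunking this implies is shown below. The helper names are hypothetical, and splitting long audio into non-overlapping windows is an assumption; this card does not say how FluidAudio handles clips longer than 15 s.

```python
from typing import List, Sequence

SAMPLE_RATE = 16_000       # Hz, from the contract
WINDOW_SAMPLES = 240_000   # 15 s * 16 kHz, from the contract


def fit_to_window(
    samples: Sequence[float], window: int = WINDOW_SAMPLES
) -> List[float]:
    """Truncate or zero-pad mono samples to exactly `window` values."""
    clipped = list(samples[:window])
    return clipped + [0.0] * (window - len(clipped))


def split_into_windows(
    samples: Sequence[float], window: int = WINDOW_SAMPLES
) -> List[List[float]]:
    """Cut audio into consecutive windows and zero-pad the last one.

    Assumption: windows do not overlap. Empty input yields no windows.
    """
    return [
        fit_to_window(samples[start:start + window], window)
        for start in range(0, len(samples), window)
    ]
```

Each window would then be fed to Preprocessor.mlmodelc as a [1,240000] array.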
Variants
- int4 (the default here): ANE-runnable, ~573 MB, fastest. Weights use symmetric per-block quantization with a block size of 32.
- fp16: exact parity with PyTorch, supports iOS 17, ~1.8 GB (not included here by default).
- int8 per-channel: decodes correctly only on CPU (it crashes the GPU/ANE MPSGraph backend), so it is not recommended. Use int4 for a small ANE-resident build.
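
To make "per-block-32 symmetric" concrete, the sketch below quantizes one weight row in blocks of 32 values, each block with its own scale and no zero point. It is a plain-Python illustration of the general technique, not the code in quantize_int4.py; the scale rule (max |w| mapped to 7) is an assumption, and the converter may choose scales differently.

```python
from typing import List, Sequence, Tuple

BLOCK_SIZE = 32
QMIN, QMAX = -8, 7  # signed 4-bit integer range

Block = Tuple[List[int], float]  # (int4 codes, per-block scale)


def quantize_block(block: Sequence[float]) -> Block:
    """Symmetric quantization: one scale per block, zero maps to zero."""
    max_abs = max((abs(w) for w in block), default=0.0)
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    codes = [max(QMIN, min(QMAX, round(w / scale))) for w in block]
    return codes, scale


def quantize_row(row: Sequence[float], block_size: int = BLOCK_SIZE) -> List[Block]:
    """Split a weight row into consecutive blocks and quantize each one."""
    return [
        quantize_block(row[start:start + block_size])
        for start in range(0, len(row), block_size)
    ]


def dequantize_row(blocks: Sequence[Block]) -> List[float]:
    """Reconstruct approximate weights: code * scale for every element."""
    return [code * scale for codes, scale in blocks for code in codes]
```

Smaller blocks track local weight magnitudes more closely at the cost of storing more scales, which is one reason the int4 package is somewhat larger than half a byte per parameter.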
Accuracy / speed (LibriSpeech test-clean, ≤15 s, int4, M-series ANE)
| Metric | Value |
|---|---|
| WER | ~2.1% |
| RTFx | ~7x |
fp16 CoreML output is byte-identical to the NeMo PyTorch greedy decode.
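
For readers unfamiliar with the two metrics, the sketch below shows how WER and RTFx are conventionally computed. It is not the evaluation script behind the numbers above: the text normalization used for this card is not stated, so the example simply splits on whitespace, and the function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance divided by the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio handled per second of compute."""
    return audio_seconds / processing_seconds
```

Under this definition, an RTFx of about 7 means a 15 s window is transcribed in roughly 2 s.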
Usage (FluidAudio)
let manager = try await CanaryManager.load(precision: .int4)
let text = try await manager.transcribe(audioURL: url)
Conversion
See the mobius conversion pipeline
(models/stt/canary-1b-v2/coreml/): convert-coreml.py (NeMo→CoreML),
quantize_int4.py, build_projection.py, validate.py, stage_hf.py.
License
Inherits cc-by-4.0 from the base model nvidia/canary-1b-v2.