Shenava — Koochik 1.0 (114M) · CoreML iOS15 NeuralNetwork fp16
CoreML NeuralNetwork (not ML Program) fp16 export of Reza2kn/Shenava-Koochik-1.0 — built so older Apple devices capped at iOS 15 (e.g. iPad Air 2 / iOS 15.8) can load and run it. ML Program packages require iOS 16+; this targets NeuralNetwork / CoreML spec v5 with iOS 14 availability, so it runs on iOS 15.
This is the cache-aware streaming step (one 170 ms prediction), the same kind of artifact as shenava-fa-fastconformer-streaming-32m-coreml-ios15-fp16.
The Shenava-1 family (CoreML iOS15)
- Shenava-Koochik-1.0-CoreML-iOS15-fp16: Koochik 1.0 (114M) · teacher / flagship
- Shenava-Rizeh-v1.0-CoreML-iOS15-fp16: Rizeh v1.0 (32M) · mid-tier
- Shenava-Rizeh-Pizeh-v1.0-CoreML-iOS15-fp16: Rizeh Pizeh v1.0 (6.9M) · tiniest
Benchmark — fair WER/CER (parent model, decoded @ [70,13])
| Member | golden-6669 WER | CER | FLEURS-fa WER | CER |
|---|---|---|---|---|
| Koochik 1.0 (114M) | 7.49% | 2.30% | 10.64% | 3.79% |
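For readers who want to compare their own transcripts against the table, WER and CER are conventionally edit distance divided by reference length. The following is a minimal sketch of that convention only. The text normalization behind the "fair" numbers is not described here, and counting spaces as characters in CER is an assumption of this sketch.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, insertions, deletions all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces count as characters (an assumption)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Both functions expect a non-empty reference; any normalization (punctuation, digits, ZWNJ handling) would have to be applied to both strings beforehand.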
CoreML contract (cache-aware streaming CTC step, att_context [70,0])
Inputs:
- processed_signal: Float32 [1, 80, 17]
- cache_last_channel: Float32 [17, 1, 70, 512]
- cache_last_time: Float32 [17, 1, 512, 8]
Outputs:
- logits: Float32 [1, 1, 1025]
- cache_last_channel_next: Float32 [17, 1, 70, 512]
- cache_last_time_next: Float32 [17, 1, 512, 8]
Streaming geometry: feature_frames per prediction = 17 (pre_encode_cache 9 + chunk 8), audio window 170 ms, constant cache length 70, d_model=512, 17 conformer layers, ×8 subsampling (80 ms/frame).
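The geometry above implies a sliding window over the mel features: each step consumes 8 new frames plus the 9 preceding ones, and the two cache outputs are fed back as the next step's cache inputs. The sketch below illustrates that loop under stated assumptions: `predict` is a hypothetical callable standing in for the CoreML prediction call, the first steps are left-padded with zero frames, a trailing partial chunk is dropped, and the 10 ms feature hop is inferred from 170 ms / 17 frames. Features are kept frame-major here; converting each window to the [1, 80, 17] layout is left to the caller.

```python
from typing import Any, Callable, Dict, List, Tuple

N_MELS = 80
PRE_ENCODE_CACHE = 9                 # left-context frames re-fed each step
CHUNK = 8                            # new feature frames per step
WINDOW = PRE_ENCODE_CACHE + CHUNK    # 17 frames per prediction
FRAME_MS = 10                        # assumption: inferred from 170 ms / 17
ZERO_FRAME = [0.0] * N_MELS          # assumption: zero left padding


def step_windows(num_frames: int) -> List[Tuple[int, int, int]]:
    """Return (pad_left, start, end) per step: frames[start:end] preceded by
    pad_left padding frames always totals WINDOW frames."""
    steps = []
    for chunk_start in range(0, num_frames - CHUNK + 1, CHUNK):
        start = chunk_start - PRE_ENCODE_CACHE
        steps.append((max(0, -start), max(0, start), chunk_start + CHUNK))
    return steps


def run_stream(predict: Callable[[Dict[str, Any]], Dict[str, Any]],
               features: List[List[float]],
               cache_channel: Any, cache_time: Any) -> List[Any]:
    """Run every step, carrying the caches forward; returns per-step logits."""
    logits = []
    for pad_left, start, end in step_windows(len(features)):
        window = [ZERO_FRAME] * pad_left + features[start:end]
        out = predict({"processed_signal": window,
                       "cache_last_channel": cache_channel,
                       "cache_last_time": cache_time})
        logits.append(out["logits"])
        cache_channel = out["cache_last_channel_next"]
        cache_time = out["cache_last_time_next"]
    return logits
```

The initial `cache_channel` and `cache_time` would typically be zero-filled arrays of the shapes listed in the contract; that initialization is also an assumption, not something stated above.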
Compatibility (Xcode coremlc)
- model type: MLModelType_neuralNetwork
- storage precision: Float16
- specification version: 5
- availability: iOS 14.0, macOS 11.0
coremlc compile shenava_koochik_1_0_ctc_streaming_att70_0_ios15_fp16.mlmodel /tmp/out --deployment-target 15.0 --platform ios
Files
- shenava_koochik_1_0_ctc_streaming_att70_0_ios15_fp16.mlmodel: fp16 NeuralNetwork model (~212 MB)
- tokens.json, preprocessor.json, mel_filters_slaney_80x257.json: sidecars (ve_tok_v4, shared across the family)
- shenava_koochik_1_0_ctc_streaming_att70_0_ios15_fp16_manifest.json: export manifest
- export_koochik10_streaming_coreml.py: reproducible export script
Tokenizer: ve_tok_v4 (SentencePiece BPE-1024 + blank, digit/punct/«»-aware). Numbers are emitted in spoken form; apply Persian inverse text normalization (ITN) at display time to render them as digits. Part of VisualEars / Shenava.
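Because the model emits one CTC logits frame per step, turning the stream into text needs a CTC decode over the 1025-way output. A minimal greedy sketch follows, with loud assumptions: the blank is taken to be the last index (1024), `vocab` is a hypothetical list of piece strings indexed by token id (the layout of tokens.json is not described here), and pieces use the usual SentencePiece "▁" word-boundary marker.

```python
from typing import List, Sequence

BLANK_ID = 1024  # assumption: blank is the last of the 1025 logits


def argmax(row: Sequence[float]) -> int:
    """Index of the largest value in one logits frame."""
    return max(range(len(row)), key=row.__getitem__)


def ctc_greedy_ids(frames: Sequence[Sequence[float]],
                   blank_id: int = BLANK_ID) -> List[int]:
    """Greedy CTC: take the argmax per frame, collapse repeats, drop blanks."""
    ids: List[int] = []
    prev = blank_id
    for row in frames:
        best = argmax(row)
        if best != prev and best != blank_id:
            ids.append(best)
        prev = best
    return ids


def ids_to_text(ids: Sequence[int], vocab: Sequence[str]) -> str:
    """Join pieces and turn the SentencePiece marker into spaces.
    `vocab` is a hypothetical id-to-piece list, not the tokens.json schema."""
    return "".join(vocab[i] for i in ids).replace("\u2581", " ").strip()
```

In a streaming setting the `prev` state would need to persist across steps so that a token repeated over a chunk boundary is not emitted twice; this sketch decodes a complete list of frames at once.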
Export stack: coremltools 9.0, torch 2.7.0, NeMo 2.7.3. fp16 vs fp32 argmax agreement: 1.000.
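An argmax-agreement figure such as the one above can be read as the fraction of frames on which two models pick the same top token. The following is a minimal sketch of that metric; which audio and how many frames were used for the reported 1.000 is not stated here, and the function name is illustrative.

```python
from typing import Sequence


def argmax_agreement(logits_a: Sequence[Sequence[float]],
                     logits_b: Sequence[Sequence[float]]) -> float:
    """Fraction of frames where both logit sequences share the same argmax."""
    if len(logits_a) != len(logits_b):
        raise ValueError("both sequences must have the same number of frames")
    if not logits_a:
        raise ValueError("need at least one frame")

    def top(row: Sequence[float]) -> int:
        return max(range(len(row)), key=row.__getitem__)

    same = sum(1 for a, b in zip(logits_a, logits_b) if top(a) == top(b))
    return same / len(logits_a)
```

Agreement on the argmax is the relevant check for greedy CTC decoding, since small fp16 rounding differences in the logits only matter when they change the winning token.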
Model tree for Reza2kn/Shenava-Koochik-1.0-CoreML-iOS15-fp16
Base model
nvidia/stt_fa_fastconformer_hybrid_large