multilingual-e5-small: CoreML (int8) for Hark
A CoreML (mlprogram, int8-weight-quantized) conversion of
intfloat/multilingual-e5-small
(384-dim, multilingual), packaged for on-device vault search in
Hark, a local-first, macOS-only meeting
transcription app. Runs on the Apple Neural Engine (ANE); nothing is sent off the
machine (Hark embeds the whole vault locally).
This repo exists so Hark can download a ready-to-run CoreML artifact instead of shipping it in the app bundle. It is a faithful conversion (see Provenance and Validation), not a new model.
Files
| File | What |
|---|---|
| MultilingualE5Small.mlpackage/ | the CoreML model (int8 weights, ~113 MB) |
| tokenizer.json, tokenizer_config.json, special_tokens_map.json | XLM-RoBERTa tokenizer (SentencePiece Unigram) |
| sentencepiece.bpe.model | the SentencePiece model |
Hark's loader snapshots this repo at a pinned revision into its app-support
models dir, compiles the .mlpackage to the ANE, and runs fully offline
thereafter.
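The snapshot step can be approximated outside Hark. The sketch below is a Python stand-in, not Hark's loader: it assumes the third-party huggingface_hub package is installed, and the names fetch_model, models_dir, and the local folder name are hypothetical. The pinned revision is not stated in this README, so it is left as a parameter.

```python
from pathlib import Path

# Repo ID of this model card on the Hugging Face Hub.
REPO_ID = "tuanda2912/hark-multilingual-e5-small-coreml"


def fetch_model(revision, models_dir):
    """Download this repo at one pinned revision and return the .mlpackage path.

    revision:   commit hash or tag to pin (hypothetical parameter; supply your own).
    models_dir: local directory that should hold the snapshot (hypothetical).
    """
    # Imported lazily so this file can be imported without the dependency.
    from huggingface_hub import snapshot_download

    target = Path(models_dir) / "multilingual-e5-small-coreml"
    snapshot_download(repo_id=REPO_ID, revision=revision, local_dir=str(target))
    return target / "MultilingualE5Small.mlpackage"
```

Pinning the revision means a later push to the repo cannot silently change the weights or tokenizer an installed app uses. Compiling the package for the ANE is a separate step and is not shown here.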
I/O contract
- inputs: input_ids (int32[1, L]), attention_mask (int32[1, L]), flexible L ∈ 1..512
- output: last_hidden_state (float32[1, L, 384])
- Hark applies masked mean-pooling + L2-normalization in Swift, and the e5 asymmetric prefixes ("query: " / "passage: "). Reproduce those if you reuse this model directly.
Provenance
- Converted from intfloat/multilingual-e5-small at source revision 614241f622f53c4eeff9890bdc4f31cfecc418b3 via engine/scripts/convert-embedder-coreml.py (coremltools 9, convert_to="mlprogram", minimum_deployment_target=macOS14).
- int8 weight quantization (per-channel, symmetric) via engine/scripts/quantize-embedder-int8.py (coremltools.optimize.coreml.linear_quantize_weights).
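To make "per-channel, symmetric" concrete, here is a small pure-Python illustration of the arithmetic. It is not the code in quantize-embedder-int8.py and does not call coremltools; the function names are hypothetical, and it assumes one scale per output row and a clamped integer range of [-127, 127].

```python
def quantize_per_channel_symmetric(weight):
    """Quantize a 2-D weight (one row per output channel) to int8 values.

    Symmetric means there is no zero-point: 0.0 always maps to 0.
    Per-channel means each row gets its own scale.
    Returns (quantized_rows, scales).
    """
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(w) for w in row)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0  # avoid divide-by-zero
        q_rows.append([max(-127, min(127, round(w / scale))) for w in row])
        scales.append(scale)
    return q_rows, scales


def dequantize(q_rows, scales):
    """Recover approximate float weights: w is roughly q * scale."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]
```

A per-row scale keeps a channel with small weights from being crushed by another channel's large outlier. The rounding error per weight is at most half of that row's scale.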
Validation
- Fidelity: the worst-case cosine similarity between the fp16 and int8 pooled + L2-normalized embeddings was 0.99986 across EN/VI/TH probe sentences, so for retrieval the int8 model is essentially indistinguishable from fp16.
- On-device: Hark's gated cross-lingual and end-to-end retrieval tests pass on the Apple Neural Engine with this int8 artifact (EN↔VI/TH pairs score closer than unrelated pairs; full chunk → embed → index → retrieve pipeline).
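The fidelity figure is a minimum over per-sentence cosine similarities. A minimal sketch of that measurement follows, assuming two parallel lists of embeddings (one from the fp16 model, one from the int8 model) for the same probe sentences; the function names are hypothetical and this is not Hark's test code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def worst_case_cosine(reference_embeddings, quantized_embeddings):
    """Lowest similarity across probe sentences, i.e. the worst drift."""
    return min(
        cosine(ref, quant)
        for ref, quant in zip(reference_embeddings, quantized_embeddings)
    )
```

Reporting the minimum rather than the mean means a single badly degraded sentence cannot hide behind many good ones.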
License
MIT, inherited from intfloat/multilingual-e5-small. This is a format conversion
+ weight quantization of that model; all credit to the original authors.