punct_cap_seg_en — CoreML (INT8)

CoreML conversion of 1-800-BAD-CODE/punct_cap_seg_en (Apache 2.0), a 52M-parameter BERT-style token classifier for English text. For each subtoken it predicts post-punctuation (period, comma, question mark, acronym dotting), per-character true-casing, and sentence boundaries. All credit for the model itself goes to the original author.

Built for the Babble dictation app, where it provides on-device live punctuation alongside NVIDIA Nemotron streaming ASR.

Contents

punctuation.mlmodelc/: compiled CoreML model with INT8 weights and FP32 activations (per-block quantization, block size 32)
tokenizer.model: SentencePiece unigram tokenizer (32k lowercase English vocabulary; unk=0, bos=1, eos=2, pad=3), taken from the original repo (spe_32k_lc_en.model)

Model details

Input: input_ids, int32, shape [1, 256]. The subtoken ids are wrapped in BOS (1) and EOS (2) and padded to length 256 with pad id 3. The graph computes its own attention mask from the input ids, so input_ids is the only input.
Outputs (argmax baked into the graph, so each tensor holds predicted class ids rather than logits): pre_preds [1, 256], post_preds [1, 256] (labels: null, acronym, ".", ",", "?"), cap_preds [1, 256, 16] (per-character capitalization), seg_preds [1, 256] (sentence boundaries).
Input text must be lowercase with whitespace collapsed to single spaces — the vocabulary contains no uppercase characters.
Latency: ~6 ms per 256-token window on Apple Silicon (CoreML, all compute units).
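
The input and output conventions above can be sketched in plain Python. This is a minimal illustration, not the Babble implementation: the function names are hypothetical, right-padding is assumed, the class ids of post_preds are assumed to follow the order in which the labels are listed, and cap_preds is assumed to be indexed by character position within the raw SentencePiece piece (so the word-boundary marker sits at index 0). Each of these should be checked against the original repo before use. Tokenization and the CoreML call are left out; the functions work on ids and pieces that the tokenizer has already produced.

```python
import re

MAX_LEN = 256                       # window length stated in the model card
BOS_ID, EOS_ID, PAD_ID = 1, 2, 3    # tokenizer ids stated in the model card
WORD_MARK = "\u2581"                # SentencePiece word-boundary marker
# ASSUMPTION: class ids follow the order in which the card lists the labels.
POST_LABELS = ("<null>", "<acronym>", ".", ",", "?")


def normalize_text(text):
    """Lowercase the text and collapse whitespace runs to single spaces."""
    return re.sub(r"\s+", " ", text).strip().lower()


def build_input_ids(piece_ids):
    """Wrap subtoken ids in BOS/EOS and right-pad to MAX_LEN (padding side assumed)."""
    if len(piece_ids) > MAX_LEN - 2:
        raise ValueError("too many subtokens for one window; split the text first")
    ids = [BOS_ID, *piece_ids, EOS_ID]
    return ids + [PAD_ID] * (MAX_LEN - len(ids))


def apply_predictions(pieces, post_preds, cap_preds, seg_preds):
    """Rebuild punctuated, cased sentences from per-subtoken predictions.

    All four arguments must be aligned to the same subtokens, i.e. with the
    BOS, EOS and padding positions already dropped. pre_preds is ignored here.
    """
    sentences, current = [], []
    for piece, post, caps, seg in zip(pieces, post_preds, cap_preds, seg_preds):
        start = 0
        if piece.startswith(WORD_MARK):
            start = 1                      # skip the marker itself
            if current:
                current.append(" ")        # new word inside a sentence
        label = POST_LABELS[post]
        # ASSUMPTION: caps[i] refers to character i of the raw piece,
        # so the word-boundary marker occupies index 0.
        for i in range(start, len(piece)):
            char = piece[i]
            current.append(char.upper() if i < len(caps) and caps[i] else char)
            if label == "<acronym>":
                current.append(".")        # dot after every character
        if label in (".", ",", "?"):
            current.append(label)
        if seg:                            # sentence ends after this subtoken
            sentences.append("".join(current))
            current = []
    if current:
        sentences.append("".join(current))
    return sentences
```

In use, the text would be passed through normalize_text, encoded with tokenizer.model, packed with build_input_ids, run through the CoreML model, and the four output tensors sliced back to the real subtoken positions before calling apply_predictions.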

Conversion notes

Converted via ONNX → PyTorch (onnx2torch) → coremltools, validated for end-to-end text parity against the original ONNX model. INT8 weight quantization preserves parity except for rare near-tie boundary decisions.

Do not reconvert with FP16 activations: the graph's internal attention-mask constant overflows in half precision and silently degrades output quality. Use FP32 activations (weight-only quantization is fine).
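
To illustrate the failure mode, the sketch below checks whether a few candidate mask constants fit in IEEE 754 half precision, whose largest finite value is 65504. The candidate values are hypothetical: the actual constant used inside this graph is not documented here. Python's struct module reports the overflow as an exception, whereas in a converted graph the same overflow can pass unnoticed, which is consistent with the silent degradation described above.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_fp16(value):
    """Return True if a finite value can be stored as a finite half float."""
    if not math.isfinite(value):
        return False
    try:
        struct.pack("<e", value)   # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


# HYPOTHETICAL mask constants, chosen only to show the range limit.
for candidate in (-1e4, -FP16_MAX, -1e9, -3.4e38):
    print(candidate, fits_in_fp16(candidate))
```

Large negative constants of the kind often used to mask attention scores in FP32 fall well outside this range, which is why weight-only quantization with FP32 activations is the safer choice.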

License

Apache 2.0, inherited from the original model.

Model tree for soloish90/punct-cap-seg-en-coreml-int8

Base model: 1-800-BAD-CODE/punctuation_fullstop_truecase_english
Finetuned (1): this model
