punct_cap_seg_en - CoreML (INT8)

CoreML conversion of 1-800-BAD-CODE/punct_cap_seg_en (Apache 2.0), a 52M-parameter BERT-style token classifier for English text. For each subtoken it predicts post-punctuation (period, comma, question mark, acronym dotting), per-character true-casing, and sentence boundaries. All credit for the model itself goes to the original author.

Built for the Babble dictation app, where it provides on-device live punctuation alongside NVIDIA Nemotron streaming ASR.

Contents

  • punctuation.mlmodelc/ - compiled CoreML model, INT8 weights / FP32 activations (per-block quantization, block size 32)
  • tokenizer.model - SentencePiece unigram tokenizer (32k lowercase English vocabulary; bos=1, eos=2, pad=3, unk=0), from the original repo (spe_32k_lc_en.model)

Model details

  • Input: input_ids, int32 [1, 256]. The subtoken ids are wrapped in BOS (1) and EOS (2), then padded to length 256 with pad id 3. The graph computes its own attention mask from the input ids.
  • Outputs (argmax baked into the graph): pre_preds [1,256], post_preds [1,256] (labels: null, acronym, ".", ",", "?"), cap_preds [1,256,16] (per-character capitalization), seg_preds [1,256] (sentence boundaries).
  • Input text must be lowercase with whitespace collapsed to single spaces, because the vocabulary contains no uppercase characters.
  • Latency: ~6 ms per 256-token window on Apple Silicon (CoreML, all compute units).
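
As a rough sketch of the input contract above, the Python below normalizes text and packs subtoken ids into one 256-token window. The helper names (normalize, pack_input_ids), the right-side padding, and the trimming of leading and trailing whitespace are assumptions for illustration, not taken from the original repo. In practice the subtoken ids would come from tokenizer.model.

```python
import re

# Special-token ids and window length as stated in this card.
UNK_ID, BOS_ID, EOS_ID, PAD_ID = 0, 1, 2, 3
WINDOW = 256


def normalize(text: str) -> str:
    """Lowercase and collapse every whitespace run to a single space."""
    # Assumption: leading/trailing whitespace is also trimmed.
    return re.sub(r"\s+", " ", text).strip().lower()


def pack_input_ids(subtoken_ids: list[int]) -> list[int]:
    """Wrap subtoken ids in BOS/EOS and pad to one fixed-size window."""
    max_body = WINDOW - 2  # leave room for BOS and EOS
    if len(subtoken_ids) > max_body:
        raise ValueError(f"at most {max_body} subtokens fit in one window")
    ids = [BOS_ID, *subtoken_ids, EOS_ID]
    # Assumption: padding goes on the right.
    return ids + [PAD_ID] * (WINDOW - len(ids))
```

Text longer than 254 subtokens would have to be split into several windows by the caller; this sketch simply rejects it.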

Conversion notes

Converted via ONNX → PyTorch (onnx2torch) → coremltools, validated for end-to-end text parity against the original ONNX model. INT8 weight quantization preserves parity except for rare near-tie boundary decisions.
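
For readers who want to reproduce the weight-only quantization step, the following is a minimal sketch using the coremltools.optimize.coreml API. It is not the exact script used for this model: the function name quantize_weights_int8 is hypothetical, and the linear_symmetric mode is an assumption, since only INT8 weights with a block size of 32 are stated here.

```python
def quantize_weights_int8(mlmodel, block_size: int = 32):
    """Return a copy of `mlmodel` with INT8 per-block weights.

    Only weights are quantized; activation precision is left unchanged.
    """
    # Imported lazily so this module loads without coremltools installed.
    import coremltools.optimize.coreml as cto

    op_config = cto.OpLinearQuantizerConfig(
        mode="linear_symmetric",  # assumption: the mode is not stated in this card
        dtype="int8",
        granularity="per_block",
        block_size=block_size,
    )
    config = cto.OptimizationConfig(global_config=op_config)
    return cto.linear_quantize_weights(mlmodel, config=config)
```

The mlmodel argument would be an ML Program produced by ct.convert with compute_precision=ct.precision.FLOAT32. Per-block granularity is understood to need coremltools 8 or later and an iOS 18 / macOS 15 deployment target.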

Do not reconvert with FP16 activations: the graph's internal attention-mask constant overflows in half precision and silently degrades output quality. Use FP32 activations (weight-only quantization is fine).
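
To illustrate why such a constant cannot survive half precision, the sketch below checks whether a value fits in IEEE 754 binary16 using only the Python standard library. The actual constant in the graph is not stated here; the value -1e9 is a hypothetical stand-in for the large negative biases commonly used in additive attention masks.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_fp16(value: float) -> bool:
    """Return True if `value` can be packed as a half-precision float."""
    try:
        struct.pack("<e", value)  # "e" is the binary16 format code
        return True
    except OverflowError:
        # struct refuses finite values whose magnitude exceeds the FP16 range.
        return False


HYPOTHETICAL_MASK_BIAS = -1e9  # assumption: not the model's real constant

print(fits_in_fp16(FP16_MAX))                # largest finite half value
print(fits_in_fp16(HYPOTHETICAL_MASK_BIAS))  # far outside the FP16 range
```

A converter that casts an out-of-range constant to FP16 would typically turn it into infinity rather than raise an error, which is consistent with the silent degradation described above.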

License

Apache 2.0, inherited from the original model.
