Rampart PII NER - MLX

An MLX build of Rampart, a compact encoder-only BERT (MiniLM-L6, hidden 384, 6 layers, ~18.5M params) with a 35-label BIO token-classification head for detecting personally identifiable information (PII). Intended for on-device, client-side PII redaction on Apple Silicon.

This repository ships float (fp16) MLX weights in model.safetensors plus a small self-contained MLX implementation (rampart_mlx.py).

Provenance

This is an independent MLX conversion of the original nationaldesignstudio/rampart. The original is distributed as a 4-bit quantized ONNX export; the float weights here were recovered directly from that export (4-bit MatMulNBits linears and INT8 embeddings dequantized to float) and then stored in MLX safetensors.
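As a rough illustration of what dequantizing block-quantized 4-bit weights involves, the sketch below unpacks two 4-bit values per byte and applies a per-block scale and zero point. The nibble order (low nibble first), the default zero point of 8, and the function name `dequantize_4bit_block` are assumptions made for illustration; none of them is taken from the conversion script used for this repository.

```python
def dequantize_4bit_block(packed, scale, zero_point=8):
    """Dequantize one block of 4-bit weights packed two per byte.

    Assumptions (illustrative, not this repository's converter):
    - each byte holds two unsigned 4-bit values, low nibble first
    - w = (q - zero_point) * scale, with one scale and zero point per block
    """
    weights = []
    for byte in packed:
        low = byte & 0x0F          # first value stored in the byte
        high = (byte >> 4) & 0x0F  # second value stored in the byte
        weights.append((low - zero_point) * scale)
        weights.append((high - zero_point) * scale)
    return weights


# Toy block: 4 weights stored in 2 bytes (made-up values).
block = bytes([0x98, 0x07])
print(dequantize_4bit_block(block, scale=0.5))  # [0.0, 0.5, -0.5, -4.0]
```

A full converter would repeat this for every block of every linear layer and reshape the result to the layer's weight shape; INT8 embeddings follow the same scale-and-offset idea without the nibble unpacking.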

The conversion was verified against the original ONNX model: on the validation prompts, MLX and ONNX Runtime agree on 100% of token labels, with a maximum logit difference of ~1e-5 (floating-point rounding).
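The two figures quoted above (token-label agreement and maximum logit difference) can be computed along the following lines. This is a minimal sketch, not the verification script itself: the function name `compare_outputs` is hypothetical, the inputs are assumed to be plain nested lists of shape [tokens][labels], and the numbers in the example are made up.

```python
def compare_outputs(logits_a, logits_b):
    """Compare two [tokens][labels] logit matrices from different runtimes.

    Returns (label_agreement, max_abs_diff), where label_agreement is the
    fraction of tokens whose argmax label is the same in both matrices.
    """
    if not logits_a or len(logits_a) != len(logits_b):
        raise ValueError("inputs must be non-empty and have equal token counts")
    matches = 0
    max_abs_diff = 0.0
    for row_a, row_b in zip(logits_a, logits_b):
        # Argmax over the label dimension for each token.
        pred_a = max(range(len(row_a)), key=row_a.__getitem__)
        pred_b = max(range(len(row_b)), key=row_b.__getitem__)
        matches += pred_a == pred_b
        for a, b in zip(row_a, row_b):
            max_abs_diff = max(max_abs_diff, abs(a - b))
    return matches / len(logits_a), max_abs_diff


# Made-up logits for two tokens and three labels.
mlx_logits = [[2.0, -1.0, 0.5], [0.0, 3.0, -2.0]]
onnx_logits = [[2.0, -1.0, 0.5], [0.25, 3.0, -2.0]]
agreement, max_diff = compare_outputs(mlx_logits, onnx_logits)
print(agreement, max_diff)  # 1.0 0.25
```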

No third-party MLX port was used in producing these weights.

Labels

17 entity types in BIO format (35 classes incl. O): GIVEN_NAME, SURNAME, EMAIL, PHONE, URL, TAX_ID, BANK_ACCOUNT, ROUTING_NUMBER, GOVERNMENT_ID, PASSPORT, DRIVERS_LICENSE, BUILDING_NUMBER, STREET_NAME, SECONDARY_ADDRESS, CITY, STATE, ZIP_CODE.
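The class count follows from the BIO scheme: each entity type contributes a `B-` and an `I-` label, plus a single `O` label, so 17 × 2 + 1 = 35. The snippet below only illustrates that arithmetic; the label ordering it produces is an assumption, and the authoritative index-to-label mapping is the `id2label` table in config.json.

```python
ENTITY_TYPES = [
    "GIVEN_NAME", "SURNAME", "EMAIL", "PHONE", "URL", "TAX_ID",
    "BANK_ACCOUNT", "ROUTING_NUMBER", "GOVERNMENT_ID", "PASSPORT",
    "DRIVERS_LICENSE", "BUILDING_NUMBER", "STREET_NAME",
    "SECONDARY_ADDRESS", "CITY", "STATE", "ZIP_CODE",
]

# Assumed ordering, for illustration only; use config.json's id2label
# when mapping real model outputs to label names.
bio_labels = ["O"] + [f"{prefix}-{entity}"
                      for entity in ENTITY_TYPES
                      for prefix in ("B", "I")]

print(len(ENTITY_TYPES), len(bio_labels))  # 17 35
```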

Usage

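Install the dependencies and run the demo from the command line:

pip install mlx tokenizers
python demo.py "My name is John Smith, email john.smith@example.com"

To call the model directly from Python:

import mlx.core as mx
from tokenizers import Tokenizer
from rampart_mlx import load

# Load the weights and config from the repository directory.
model, cfg = load(".")
tok = Tokenizer.from_file("tokenizer.json")

# Tokenize one sentence and run it as a batch of size 1.
enc = tok.encode("Call me at (555) 123-4567")
logits = model(mx.array([enc.ids]), mx.array([enc.attention_mask]))

# Pick the highest-scoring label for each token.
label_ids = mx.argmax(logits[0], axis=-1).tolist()
labels = [cfg.id2label[i] for i in label_ids]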

See demo.py for BIO span decoding using the tokenizer's char offsets (needed to map predicted labels back onto the original text for redaction).
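For orientation, BIO span decoding with character offsets can be sketched as follows. This is not the code in demo.py: the helper names `decode_spans` and `redact` are hypothetical, special tokens are assumed to carry a (0, 0) offset, and the labels and offsets in the example are written by hand rather than produced by the model.

```python
def decode_spans(labels, offsets):
    """Merge per-token BIO labels into (entity_type, start, end) char spans.

    labels  : list of strings such as "O", "B-PHONE", "I-PHONE"
    offsets : list of (start, end) character offsets, one per token;
              special tokens are assumed to carry (0, 0)
    """
    spans = []
    current = None  # [entity_type, start, end] of the span being built
    for label, (start, end) in zip(labels, offsets):
        if start == end or label == "O":
            # A zero-width (special) token or an "O" token closes any open span.
            if current:
                spans.append(tuple(current))
                current = None
            continue
        prefix, _, entity = label.partition("-")
        if prefix == "I" and current and current[0] == entity:
            current[2] = end  # extend the open span
        else:
            # "B-" starts a new span; a stray "I-" is treated as a start too.
            if current:
                spans.append(tuple(current))
            current = [entity, start, end]
    if current:
        spans.append(tuple(current))
    return spans


def redact(text, spans):
    """Replace each span with a [TYPE] placeholder, working right to left."""
    for entity, start, end in sorted(spans, key=lambda s: s[1], reverse=True):
        text = text[:start] + f"[{entity}]" + text[end:]
    return text


# Hand-written example (not model output).
text = "Call me at (555) 123-4567"
offsets = [(0, 0), (0, 4), (5, 7), (8, 10), (11, 12), (12, 15),
           (15, 16), (17, 20), (20, 21), (21, 25), (0, 0)]
labels = ["O", "O", "O", "O", "B-PHONE", "I-PHONE",
          "I-PHONE", "I-PHONE", "I-PHONE", "I-PHONE", "O"]
spans = decode_spans(labels, offsets)
print(spans)                # [('PHONE', 11, 25)]
print(redact(text, spans))  # Call me at [PHONE]
```

Replacing spans from right to left keeps the earlier character offsets valid while the text is being edited.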

Files

| File | Purpose |
| --- | --- |
| model.safetensors | fp16 MLX weights (HuggingFace-style key names) |
| config.json | model architecture + id2label |
| rampart_mlx.py | self-contained MLX model + loader |
| demo.py | tokenize → infer → decode spans |
| tokenizer.json, vocab.txt, tokenizer_config.json, special_tokens_map.json | WordPiece tokenizer |

License & attribution

Released under CC-BY-4.0, the same license as the upstream model. Attribution:
