OpenMed-PII-ClinicalE5-Small-33M-v1 for OpenMed MLX
This repository contains an MLX packaging of OpenMed/OpenMed-PII-ClinicalE5-Small-33M-v1 for Apple Silicon inference with OpenMed.
At a Glance
- Source checkpoint: OpenMed/OpenMed-PII-ClinicalE5-Small-33M-v1
- Model family: bert (BertForTokenClassification)
- Primary language hint: English (en)
- Artifact layout: legacy-compatible MLX (config.json, id2label.json, MLX weight files)
- Weight format: safetensors
- Quantization: 8-bit (MLX affine, group size 64)
- Weights size: 37.6 MB (~38 MB full bundle incl. tokenizer) vs 133 MB fp32
- Python MLX: supported through openmed[mlx] on Apple Silicon Macs
Quantization & On-Device Footprint
This is the 8-bit quantized MLX build of the source checkpoint, intended for sub-50 MB on-device deployment (iPhone/iPad via OpenMedKit, or Apple Silicon Macs).
| Build | Weights | Notes |
|---|---|---|
| fp32 (source / -mlx) | 133 MB | full precision |
| this repo (-mlx-q8) | 37.6 MB | 8-bit affine, group size 64 |
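The weight sizes listed for the two builds are close to what a back-of-the-envelope estimate gives. The sketch below rests on assumptions that are not stated in this card: the parameter count is taken as the fp32 size divided by 4 bytes, every weight is treated as quantized, and each group of 64 weights is assumed to carry one scale and one bias of either 16 or 32 bits. The function name and its parameters are illustrative only.

```python
def quantized_megabytes(n_params, bits=8, group_size=64, meta_bits=16):
    """Estimate the size of group-wise affine-quantized weights in MB (10**6 bytes).

    meta_bits is the assumed width of each per-group scale and bias.
    """
    bits_per_weight = bits + 2 * meta_bits / group_size
    return n_params * bits_per_weight / 8 / 1e6

# Assumption: 133 MB of fp32 weights at 4 bytes per parameter.
n_params = 133e6 / 4

low = quantized_megabytes(n_params, meta_bits=16)   # scale/bias assumed to be 16-bit
high = quantized_megabytes(n_params, meta_bits=32)  # scale/bias assumed to be 32-bit
print(f"estimated 8-bit size: {low:.1f} to {high:.1f} MB")
```

Both estimates land slightly under the reported 37.6 MB. A gap of that size would be plausible if some tensors stay unquantized, but this card does not say which tensors are quantized.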
The companion 4-bit build (~21 MB) is intentionally not published: although its
aggregate F1 is within ~0.5 points of fp32, it regresses sharply on a few sensitive
fields (notably cvv: 83 → 7 F1), so 8-bit is the recommended sub-50 MB target.
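The cvv case shows why an aggregate score is not enough to gate a quantized build. A minimal sketch of a per-label check follows; the function name, the 2-point threshold, and the second label with its scores are hypothetical, and only the cvv figures come from the text above.

```python
def find_regressions(baseline_f1, candidate_f1, max_drop=2.0):
    """Return {label: (baseline, candidate)} for labels whose F1 fell by more than max_drop points."""
    regressions = {}
    for label, base in baseline_f1.items():
        cand = candidate_f1.get(label, 0.0)  # a label missing from the candidate counts as F1 = 0
        if base - cand > max_drop:
            regressions[label] = (base, cand)
    return regressions

# Only the cvv figures come from this card; "example_label" and its scores are placeholders.
fp32 = {"cvv": 83.0, "example_label": 90.0}
four_bit = {"cvv": 7.0, "example_label": 89.5}

print(find_regressions(fp32, four_bit))
```

With these inputs only cvv is flagged, even though the other label barely moves.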
Quality (real-data gate)
Span-level, type-aware evaluation on 1,000 documents from
nvidia/Nemotron-PII (test split),
run through the identical OpenMed extract_pii pipeline — only the weight precision differs:
| Build | Strict F1 | Relaxed F1 (IoU≥0.5) | Predictions identical to fp32 |
|---|---|---|---|
| fp32 | 85.89 | 87.84 | — |
| this repo (8-bit) | 85.90 | 87.87 | 99.8% |
8-bit quantization is effectively lossless here: predictions match the full-precision model on 99.8% of spans, with no per-label regressions.
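For readers unfamiliar with the two columns, here is one common way to compute strict and relaxed span-level F1. It is a sketch, not the evaluation code used for the table: spans are assumed to be (start, end, label) tuples with half-open character offsets, matching is greedy and one-to-one, and the example spans are synthetic.

```python
def iou(a, b):
    """Intersection over union of two half-open character ranges (start, end)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0

def span_f1(gold, pred, min_iou=1.0):
    """F1 over (start, end, label) spans; a match needs the same label and IoU >= min_iou."""
    unmatched = list(gold)
    true_pos = 0
    for p in pred:
        for g in unmatched:
            if p[2] == g[2] and iou(p, g) >= min_iou:
                unmatched.remove(g)  # each gold span can be matched only once
                true_pos += 1
                break
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)

# Synthetic spans: the second prediction overlaps its gold span with IoU 0.8.
gold = [(0, 10, "name"), (20, 30, "date")]
pred = [(0, 10, "name"), (22, 30, "date")]

strict = span_f1(gold, pred, min_iou=1.0)   # exact boundaries required
relaxed = span_f1(gold, pred, min_iou=0.5)  # IoU >= 0.5 is enough
print(strict, relaxed)
```

The type-aware part is the label comparison: a span with the right boundaries but the wrong label does not count as a match under either setting.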
Python Quick Start
Use the standard OpenMed API if you want OpenMed to choose the right runtime automatically:
pip install "openmed[mlx]"
from openmed import extract_pii
text = "<your clinical note here>"
result = extract_pii(
    text,
    model_name="OpenMed/OpenMed-PII-ClinicalE5-Small-33M-v1",
    use_smart_merging=True,
)
for entity in result.entities:
    print(entity.label, entity.text, round(entity.confidence, 4))
On Apple Silicon, OpenMed can use this preconverted MLX artifact when openmed[mlx] is installed. On other systems, OpenMed falls back to the Hugging Face / PyTorch backend.
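To check ahead of time which path a given machine is likely to take, something like the following can help. This is a sketch of the condition described above, not OpenMed's actual selection logic; the function names and the "hf" return value are hypothetical.

```python
import importlib.util
import platform

def is_apple_silicon(system=None, machine=None):
    """True on macOS running natively on an arm64 CPU."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"

def pick_backend(system=None, machine=None, mlx_installed=None):
    """Return "mlx" when MLX looks usable, otherwise "hf" (hypothetical backend names)."""
    if mlx_installed is None:
        # The mlx package is importable only if it was installed, e.g. via openmed[mlx].
        mlx_installed = importlib.util.find_spec("mlx") is not None
    return "mlx" if is_apple_silicon(system, machine) and mlx_installed else "hf"

print(pick_backend())
```

Note that a Python interpreter running under Rosetta reports x86_64, so this check would treat it as a non-Apple-Silicon environment.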
Use This Preconverted MLX Repo Directly
If you want to use this MLX snapshot explicitly, download it locally and point OpenMed at the directory:
pip install "openmed[mlx]"
hf download OpenMed/OpenMed-PII-ClinicalE5-Small-33M-v1-mlx-q8 --local-dir ./OpenMed-PII-ClinicalE5-Small-33M-v1-mlx-q8
If this repo is private in your environment, authenticate first with hf auth login or set HF_TOKEN.
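The same download can be done from Python with huggingface_hub's snapshot_download. The sketch below assumes huggingface_hub is installed; the helper names are illustrative, and passing token=None lets the library fall back to a cached login.

```python
import os

REPO_ID = "OpenMed/OpenMed-PII-ClinicalE5-Small-33M-v1-mlx-q8"

def resolve_token(env=None):
    """Return the HF_TOKEN value, or None so huggingface_hub uses any cached login."""
    env = os.environ if env is None else env
    return env.get("HF_TOKEN") or None

def download_snapshot(local_dir="./OpenMed-PII-ClinicalE5-Small-33M-v1-mlx-q8"):
    """Download the repo into local_dir and return the local path (needs network access)."""
    from huggingface_hub import snapshot_download  # imported lazily so the helpers load without it
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir, token=resolve_token())

# Usage (performs a real download):
# path = download_snapshot()
```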
from openmed import extract_pii
from openmed.core import OpenMedConfig
text = "<your clinical note here>"
result = extract_pii(
    text,
    model_name="./OpenMed-PII-ClinicalE5-Small-33M-v1-mlx-q8",
    config=OpenMedConfig(backend="mlx"),
    use_smart_merging=True,
)
print(result.entities)
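A typical next step is to redact the detected entities. The sketch below relies only on the label, text, and confidence fields shown in the quick start; the Entity class is a stand-in so the example runs without OpenMed, and the redact helper, its 0.5 threshold, and the sample note are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Entity:
    """Stand-in for an OpenMed entity, limited to the fields printed in the quick start."""
    label: str
    text: str
    confidence: float

def redact(text, entities, min_confidence=0.5):
    """Replace every occurrence of each confident entity's text with a [LABEL] placeholder."""
    # Longest strings first, so a short entity cannot split a longer one that contains it.
    for ent in sorted(entities, key=lambda e: len(e.text), reverse=True):
        if ent.confidence >= min_confidence and ent.text:
            text = text.replace(ent.text, f"[{ent.label.upper()}]")
    return text

# Synthetic example data, not model output.
note = "Patient Jane Roe was seen on 2024-01-05."
entities = [
    Entity("name", "Jane Roe", 0.99),
    Entity("date", "2024-01-05", 0.97),
    Entity("name", "seen", 0.10),  # below the threshold, left untouched
]
print(redact(note, entities))
```

Replacing by text rather than by offset redacts every occurrence of a detected string, which errs on the side of over-redaction. If the entity objects expose character offsets, offset-based replacement would be more precise.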
Swift Status
This repo is based on bert. Python MLX supports this artifact today, and this family is in the current OpenMedKit Swift MLX support matrix.
If you are building an Apple app today, the recommended paths for this model are:
- Python MLX for evaluation or local workflows on Apple Silicon
- CoreML in OpenMedKit if you already have a compatible bundled Apple export
- Track the current Swift support matrix in the OpenMedKit docs
Artifact Notes
This repo uses the current legacy-compatible MLX layout:
- config.json
- id2label.json
- MLX weight files (weights.safetensors and/or weights.npz)
Tokenizer assets are bundled in this repo.
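After downloading, a quick sanity check of the directory against this layout can catch incomplete copies. The helpers below are a sketch with illustrative names; they assume id2label.json is a JSON object mapping label ids to label names, and the demo files are empty placeholders rather than real model files.

```python
import json
import tempfile
from pathlib import Path

REQUIRED = ("config.json", "id2label.json")
WEIGHT_FILES = ("weights.safetensors", "weights.npz")

def check_mlx_layout(model_dir):
    """Return a list of problems found in model_dir; an empty list means the layout looks complete."""
    root = Path(model_dir)
    problems = [f"missing {name}" for name in REQUIRED if not (root / name).is_file()]
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        problems.append("missing weights.safetensors or weights.npz")
    return problems

def load_labels(model_dir):
    """Read id2label.json (assumed to map label ids to label names)."""
    with open(Path(model_dir) / "id2label.json", encoding="utf-8") as fh:
        return json.load(fh)

# Demo on a temporary directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    empty_report = check_mlx_layout(tmp)
    (Path(tmp) / "config.json").write_text("{}")
    (Path(tmp) / "id2label.json").write_text('{"0": "O"}')
    (Path(tmp) / "weights.safetensors").write_bytes(b"")
    full_report = check_mlx_layout(tmp)
    labels = load_labels(tmp)

print(empty_report, full_report, labels)
```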
Links
- Source checkpoint: OpenMed/OpenMed-PII-ClinicalE5-Small-33M-v1
- OpenMed GitHub: https://github.com/maziyarpanahi/openmed
- MLX backend docs: https://openmed.life/docs/mlx-backend/
- OpenMedKit docs: https://openmed.life/docs/swift-openmedkit/
Model tree for OpenMed/OpenMed-PII-ClinicalE5-Small-33M-v1-mlx-q8: base model intfloat/e5-small-v2