ph-eye-pii-en-small
A GLiNER person-name detector for English PII redaction, fine-tuned by
Philterd from
urchade/gliner_small-v2.1
on nvidia/Nemotron-PII.
It detects a single label, `name`, which is what PhEye
consumes for redaction.
This is the small, low-latency variant for on-device use. Sibling sizes share
the same recipe and data:
philterd/ph-eye-pii-en-medium
(mid-size, the recommended default) and
philterd/ph-eye-pii-en-large
(highest-capacity, for server-side use).
Intended use
Detecting personal names in English text so they can be redacted. The model is recall-leaning by design: in redaction, missing a name (a leak) is worse than flagging an extra span (over-redaction). We recommend a default threshold of 0.90, the operating point the evaluation below reports. (This model's scores run lower than the medium variant's, so it needs a higher cutoff for the same precision.) Lower it to push recall higher at the cost of more over-redaction.
Why names only? Names are the one PII type that genuinely needs a model: they have no fixed format, overlap with ordinary words, and depend on context. The other common types (emails, phone numbers, SSNs, credit cards, IP addresses) follow regular patterns and are caught more reliably (and far more cheaply) with regexes, checksums, or dictionary lookups. Spending a neural model on those would be overkill, so this model focuses on the hard part and leaves the structured types to the pattern-based detectors in PhEye.
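To make the regex-and-checksum point concrete, here is a minimal sketch of pattern-based detection for two structured types. It is not PhEye's implementation: the patterns, the function names (`luhn_valid`, `find_structured_pii`), and the output dict shape are illustrative assumptions, and production patterns would need much broader coverage.

```python
import re

# Illustrative patterns only (assumptions, not PhEye's actual rules).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_structured_pii(text: str) -> list[dict]:
    """Find emails and checksum-valid card numbers, sorted by position."""
    spans = []
    for m in EMAIL_RE.finditer(text):
        spans.append({"start": m.start(), "end": m.end(),
                      "text": m.group(), "label": "email"})
    for m in CARD_RE.finditer(text):
        # The checksum rejects digit runs that merely look like card numbers.
        if luhn_valid(m.group()):
            spans.append({"start": m.start(), "end": m.end(),
                          "text": m.group(), "label": "credit_card"})
    return sorted(spans, key=lambda s: s["start"])
```

The checksum step is what makes this cheaper and more reliable than a model for card numbers: a 16-digit reference number that fails Luhn is dropped without any learned scoring.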
Usage
from gliner import GLiNER
# Load the fine-tuned checkpoint from the Hugging Face Hub.
model = GLiNER.from_pretrained("philterd/ph-eye-pii-en-small")
text = "Please forward the invoice to Toni Levine and copy Maria Gonzalez."
# Request the single "name" label at the recommended threshold.
for ent in model.predict_entities(text, ["name"], threshold=0.9):
    print(ent["text"], "->", ent["label"], round(ent["score"], 2))
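Since the detected spans are meant for redaction, a minimal sketch of that step follows. It assumes each entity dict carries character offsets under `start` and `end` (end exclusive) and that spans do not overlap; the `redact` helper and the `[NAME]` placeholder are hypothetical, not part of GLiNER or PhEye. The entities here are built by hand so the example runs without downloading the model.

```python
def redact(text: str, entities: list[dict], placeholder: str = "[NAME]") -> str:
    """Replace each detected span with `placeholder`."""
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        out = out[: ent["start"]] + placeholder + out[ent["end"]:]
    return out


text = "Please forward the invoice to Toni Levine and copy Maria Gonzalez."

# Hand-built stand-ins for model output (assumed shape: start/end offsets).
entities = []
for name in ("Toni Levine", "Maria Gonzalez"):
    start = text.find(name)
    entities.append({"start": start, "end": start + len(name),
                     "text": name, "label": "name"})

print(redact(text, entities))
# Please forward the invoice to [NAME] and copy [NAME].
```

If your pipeline can produce overlapping spans, merge them before calling a helper like this, otherwise the offsets of the later replacement will be wrong.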
Evaluation
Scored on the nvidia/Nemotron-PII test split (131,105 windows, long docs
chunked at 1,200 chars), at the recommended threshold 0.90, name only
(support 97,012).
| Matching | Precision | Recall | F1 |
|---|---|---|---|
| Exact span + label | 0.96 | 0.99 | 0.98 |
| Overlapping span + label | 0.97 | 0.99 | 0.98 |
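The two rows differ only in when a predicted span counts as correct. The sketch below shows one common reading of each criterion: "exact" requires identical boundaries, "overlapping" requires at least one shared character, and both require the same label. The card does not publish its scoring script, so the `score` function and its matching rules are assumptions for illustration.

```python
def score(pred, gold, overlap=False):
    """Precision, recall, and F1 for spans given as (start, end, label).

    `end` is exclusive. With overlap=False a prediction must match a gold
    span exactly; with overlap=True any shared character is enough.
    """
    def hit(p, g):
        if p[2] != g[2]:
            return False
        if overlap:
            return p[0] < g[1] and g[0] < p[1]
        return p[0] == g[0] and p[1] == g[1]

    # Precision counts predictions that match some gold span;
    # recall counts gold spans that are matched by some prediction.
    matched_pred = sum(any(hit(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(hit(p, g) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


gold = [(0, 11, "name"), (20, 34, "name")]
pred = [(0, 11, "name"), (25, 34, "name"), (40, 45, "name")]

exact = score(pred, gold)                  # second prediction has wrong start
relaxed = score(pred, gold, overlap=True)  # second prediction now counts
```

For redaction, the overlapping row is often the more relevant one to watch alongside the exact row: a partially covered name still leaks characters, so a large gap between the two rows would signal boundary errors worth inspecting.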
Recall is 0.991 unrounded, the same as the medium model. At its recommended threshold this model nearly matches medium on precision too (~0.96), from a smaller, lower-latency base (deberta-v3-small; ~610 MB vs ~780 MB, and ~half the transformer layers). These numbers are in-distribution (trained and tested on the same synthetic Nemotron distribution) and are a ceiling, not a real-document production claim: accuracy on real (non-synthetic) English is likely lower.
Because the training data is synthetic, real-text recall is materially lower than these in-distribution figures, and the recommended threshold above is a synthetic operating point. For recall-first use on real-world documents, tune the threshold down: a lower cutoff accepts more over-redaction to recover recall, which is the right trade when a missed name is a leak. Always pick the operating point on a sample of your own data.
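One way to pick an operating point on your own sample is to run the model once at a low threshold, keep the scores, and then sweep cutoffs offline. The sketch below assumes you have hand-labeled gold spans per document and predictions as dicts with `start`, `end`, and `score`; `sweep` and `pick_threshold` are hypothetical helpers, exact-span matching is used for simplicity, and filtering by score is assumed to approximate rerunning the model at each cutoff.

```python
def sweep(samples, thresholds):
    """Compute precision/recall at each threshold.

    `samples` is a list of (predictions, gold) pairs, where predictions are
    dicts with "start", "end", "score" and gold is a set of (start, end).
    """
    rows = []
    for t in thresholds:
        tp = fp = fn = 0
        for preds, gold in samples:
            kept = {(p["start"], p["end"]) for p in preds if p["score"] >= t}
            tp += len(kept & gold)   # correctly flagged names
            fp += len(kept - gold)   # over-redaction
            fn += len(gold - kept)   # missed names (leaks)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        rows.append({"threshold": t, "precision": precision, "recall": recall})
    return rows


def pick_threshold(rows, min_recall):
    """Highest threshold whose recall still meets `min_recall`, else None."""
    ok = [r["threshold"] for r in rows if r["recall"] >= min_recall]
    return max(ok) if ok else None


# Toy sample: two documents, the second has no names at all.
samples = [
    ([{"start": 0, "end": 4, "score": 0.95},
      {"start": 10, "end": 14, "score": 0.60}], {(0, 4), (10, 14)}),
    ([{"start": 5, "end": 9, "score": 0.70}], set()),
]
rows = sweep(samples, [0.5, 0.65, 0.9])
best = pick_threshold(rows, min_recall=0.99)
```

Choosing the highest threshold that still meets a recall floor keeps over-redaction as low as the recall target allows, which matches the recall-first priority described above.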
Limitations
- English only, and detects the `name` label only; other PII types (emails, phone numbers, IDs, addresses) are out of scope and handled elsewhere in PhEye.
- Trained on synthetic data (Nemotron-PII), so real-document accuracy is lower than the in-distribution numbers above; validate on your own data.
- May miss atypical name forms (lowercase, nicknames, non-Western name structures). It is recall-leaning, so expect some over-redaction; tune the threshold for your precision/recall balance.
Training
- Base model: `urchade/gliner_small-v2.1` (deberta-v3-small, Apache-2.0).
- Dataset: `nvidia/Nemotron-PII` (English, 100k train / 100k test, 50+ industries).
- Labels: Nemotron's `first_name`/`last_name` are merged to `name` (`label_map`), and adjacent first/last spans are merged into one full-name span, matching standard NER benchmarks and the full-name span PhEye consumes.
- Long documents are windowed at 1,200 chars during training and evaluation so names past the 512-token window are still learned and scored.
- Recipe: `configs/nemotron-pii.yaml` with `base_model: urchade/gliner_small-v2.1`.
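The label merging and windowing steps in the list above can be sketched as follows. This is an illustration under stated assumptions, not the training code: the actual recipe lives in the config file, and the card does not say how "adjacent" is defined or whether windows overlap or snap to word boundaries. Here, adjacent means separated only by whitespace, windows are hard character cuts, and `merge_name_spans` and `window_text` are hypothetical names.

```python
# Mirrors the label_map described above; other labels are dropped (assumption).
LABEL_MAP = {"first_name": "name", "last_name": "name"}


def merge_name_spans(text, spans):
    """Map first/last name labels to "name" and join adjacent spans.

    `spans` is a list of (start, end, label) with end exclusive.
    """
    mapped = sorted(
        (s, e, LABEL_MAP[lab]) for s, e, lab in spans if lab in LABEL_MAP
    )
    merged = []
    for s, e, lab in mapped:
        if merged:
            prev_s, prev_e, _ = merged[-1]
            # Join when only whitespace (or nothing) separates the two spans.
            if text[prev_e:s].strip() == "":
                merged[-1] = (prev_s, max(prev_e, e), lab)
                continue
        merged.append((s, e, lab))
    return merged


def window_text(text, size=1200):
    """Split text into (offset, chunk) pairs of at most `size` characters."""
    return [(i, text[i:i + size]) for i in range(0, len(text), size)]


text = "Contact Toni Levine or Maria."
spans = [
    (text.find("Toni"), text.find("Toni") + 4, "first_name"),
    (text.find("Levine"), text.find("Levine") + 6, "last_name"),
    (text.find("Maria"), text.find("Maria") + 5, "first_name"),
]
full_names = merge_name_spans(text, spans)
windows = window_text("a" * 2500)
```

Span offsets found inside a window need the window's offset added back to map them onto the original document. A hard cut can also split a name across two windows, so a real pipeline may prefer overlapping or boundary-aware windows.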
License and attribution
Released under CC-BY-4.0, inherited from the training data. The NVIDIA Nemotron-PII dataset is CC-BY-4.0; commercial use is permitted with attribution to NVIDIA. Any use of this model must retain that attribution.
Disclaimer
This model makes no guarantees of accuracy or completeness. Name detection is probabilistic: it will miss some names and flag some non-names. Before relying on it, validate precision and recall on your own data. You remain the data controller for any personal data you process with this model and are responsible for your own compliance.