---
tags:
- flair
- token-classification
- sequence-tagger-model
language: en
widget:
- text: >-
    SELECT shipping FROM users WHERE shipping = '201 Thayer St Providence RI 02912'
license: mit
datasets:
- beki/privy
---

| Feature | Description |
| --- | --- |
| **Name** | `en_spacy_pii_distilbert` |
| **Version** | `0.0.0` |
| **spaCy** | `>=3.4.1,<3.5.0` |
| **Default Pipeline** | `transformer`, `ner` |
| **Components** | `transformer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | Trained on a new [dataset for structured PII](https://huggingface.co/datasets/beki/privy) generated by [Privy](https://github.com/pixie-io/pixie/tree/main/src/datagen/pii/privy). For more details, see this [blog post](https://blog.px.dev/detect-pii/). |
| **License** | MIT |
| **Author** | [Benjamin Kilimnik](https://www.linkedin.com/in/benkilimnik/) |

---

## English PII in Flair

This is the large 5-class NER model for English trained on protocol trace data generated by [Privy](https://github.com/pixie-io/pixie/tree/main/src/datagen/pii/privy/).

F1-Score: **0.9522**

Predicts 5 tags:

| **tag** | **meaning** |
|---------------------------------|-----------|
| PER | person name |
| LOC | location name |
| ORG | organization name |
| DATE_TIME | dates and times |
| NRP | nationalities, religious and political groups |

Uses DistilBERT embeddings.

---
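
As a rough illustration of how the five tags might be consumed downstream, the sketch below replaces tagged character spans with `<TAG>` placeholders. It does not load or call the model: the `redact` helper, the `Span` type, and the hand-written span are all hypothetical and exist only for this example. It also assumes the spans do not overlap.

```python
from typing import List, Tuple

# The five tags this model card lists.
PII_TAGS = {"PER", "LOC", "ORG", "DATE_TIME", "NRP"}

# Hypothetical span format: (start offset, end offset, tag).
Span = Tuple[int, int, str]


def redact(text: str, spans: List[Span]) -> str:
    """Replace each tagged span with a <TAG> placeholder.

    Assumes spans are character offsets into `text` and do not overlap.
    """
    out = text
    # Work from right to left so earlier offsets stay valid after each replacement.
    for start, end, tag in sorted(spans, reverse=True):
        if tag not in PII_TAGS:
            raise ValueError(f"unknown tag: {tag}")
        out = out[:start] + f"<{tag}>" + out[end:]
    return out


# Hand-written span for illustration only; this is not actual model output.
query = "SELECT shipping FROM users WHERE shipping = '201 Thayer St Providence RI 02912'"
start = query.index("201")
end = query.rindex("'")
redacted = redact(query, [(start, end, "LOC")])
print(redacted)
```

In practice the spans would come from whatever NER pipeline produces the predictions; only the tag names are taken from this card.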