unimelb-nlp/wikiann
How to use bohrariyanshi/pii-ner-extraction with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("token-classification", model="bohrariyanshi/pii-ner-extraction")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("bohrariyanshi/pii-ner-extraction")
model = AutoModelForTokenClassification.from_pretrained("bohrariyanshi/pii-ner-extraction")

The model was fine-tuned on the WikiANN dataset.
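WikiANN annotates tokens with BIO tags over three entity types (PER, ORG, LOC), so a model loaded directly yields one tag per token rather than whole entities. The sketch below shows one way such tags could be grouped into entity spans. The `group_bio_tags` helper and the sample tags are hypothetical, written by hand for illustration, and are not taken from the model's code or output.

```python
from typing import List, Tuple


def group_bio_tags(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-token BIO tags into (entity_type, text) spans.

    Hypothetical helper for illustration; assumes word-level tokens
    and tags such as "O", "B-PER", "I-PER".
    """
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_words: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        # Continue the open span only if this is an I- tag of the same type.
        if prefix == "I" and entity_type == current_type:
            current_words.append(token)
            continue
        # Otherwise close any open span before deciding what to do next.
        if current_type is not None:
            spans.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
        # B- starts a new span; a stray I- is also treated as a start.
        if prefix in ("B", "I") and entity_type:
            current_type, current_words = entity_type, [token]

    if current_type is not None:
        spans.append((current_type, " ".join(current_words)))
    return spans


# Hand-written sample tags, not actual model output.
tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii", "."]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC", "O"]
print(group_bio_tags(tokens, tags))
# [('PER', 'Barack Obama'), ('LOC', 'Hawaii')]
```

In practice the pipeline's `aggregation_strategy` argument handles this grouping, including subword merging, so a helper like this is mainly useful when working with raw model outputs.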
The model produces high-confidence predictions on standard NER tasks, as in the following example:
from transformers import pipeline
# Load the model
ner = pipeline("ner", model="bohrariyanshi/pii-ner-extraction", aggregation_strategy="simple")
# Example usage
text = "Barack Obama was born in Hawaii."
entities = ner(text)
print(entities)
# Output: [{'entity_group': 'PER', 'score': 0.968, 'word': 'Barack Obama', 'start': 0, 'end': 12}, ...]
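Since the aggregated output carries character offsets (`start`, `end`), it can be used to mask detected entities in the original text. The following is a minimal sketch under that assumption; the `redact` function and its `min_score` parameter are hypothetical names introduced here, not part of the model or of Transformers.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict], min_score: float = 0.5) -> str:
    """Replace each detected entity span with a [LABEL] placeholder.

    Hypothetical helper; expects dicts shaped like the aggregated
    pipeline output (entity_group, score, start, end).
    """
    kept = [e for e in entities if e["score"] >= min_score]
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(kept, key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


text = "Barack Obama was born in Hawaii."
# Only the first entity from the example output is reproduced here.
entities = [
    {"entity_group": "PER", "score": 0.968, "word": "Barack Obama", "start": 0, "end": 12},
]
print(redact(text, entities))
# [PER] was born in Hawaii.
```

Replacing from the last span backwards avoids recomputing offsets, and the score threshold gives a simple way to skip low-confidence detections.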
The model is reported to outperform base BERT on NER tasks.
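One common way to quantify a comparison like this is entity-level F1, where a prediction counts only if both its span and its type match the gold annotation exactly. The sketch below illustrates the metric; it is not the author's evaluation code, and `entity_f1` and the sample spans are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start, end, entity_type)


def entity_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Entity-level F1: a prediction counts only on an exact span and type match."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hand-written spans for illustration only, not results from either model.
gold = {(0, 12, "PER"), (25, 31, "LOC")}
predicted = {(0, 12, "PER"), (25, 31, "ORG")}
print(entity_f1(gold, predicted))
# 0.5
```

Computing this score for both models on the same held-out split would give a like-for-like comparison.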
Training was performed on a Google Colab T4 GPU for a short duration (fine-tuning only).
The overall environmental impact is minimal compared to large-scale pretraining runs.
If you use this model, please cite:
@misc{bohrariyanshi-pii-ner-extraction,
author = {bohrariyanshi},
title = {Multilingual NER Model for PII Detection},
year = {2025},
url = {https://huggingface.co/bohrariyanshi/pii-ner-extraction}
}
Base model
google-bert/bert-base-multilingual-cased