🚀 DistilBERT Urdu NER by BlaikHole

📌 Overview

This repository provides a model fine-tuned on private Urdu NER data, built on the fast yet efficient DistilBERT architecture. It can be used for Urdu named entity recognition with 8 classes: 7 entity types plus an Other class for tokens outside any entity.


🎨 Model Outputs & Labels

The model identifies the following labels:

| Label | Name | Description |
|---|---|---|
| 🟥 LABEL_0 | Date | A date in the text, e.g. 5 Feb. |
| 🟩 LABEL_1 | Designation | A person's designation, e.g. Doctor. |
| 🟦 LABEL_2 | Location | A location, e.g. an office or a city name. |
| 🟨 LABEL_3 | Number | Any number. |
| 🟪 LABEL_4 | Organization | The name of a company or other organization. |
| 🟧 LABEL_5 | Other | A token outside any entity. |
| ⬛ LABEL_6 | Person | The name of a person. |
| 🟫 LABEL_7 | Time | Time-related words. |
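
The label list above maps onto the `id2label` / `label2id` dictionaries that transformers model configs carry. As a minimal sketch, assuming the checkpoint's config holds only the generic `LABEL_n` names, passing these mappings to `from_pretrained` should make the pipeline report readable names directly. `ID2LABEL`, `LABEL2ID` and `load_named_pipeline` are illustrative names, not part of this repository.

```python
# Readable names for the eight classes, in the order listed above.
ID2LABEL = {
    0: "DATE",
    1: "DESIGNATION",
    2: "LOCATION",
    3: "NUMBER",
    4: "ORGANIZATION",
    5: "OTHER",
    6: "PERSON",
    7: "TIME",
}
# Reverse lookup: readable name -> class index.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_named_pipeline(model_name="blaikhole/distilbert-urdu-ner"):
    """Build an NER pipeline whose "entity" field holds readable names.

    Hypothetical helper: it overrides the label names stored in the
    checkpoint's config at load time. Downloads the model on first use.
    """
    # Imported lazily so the mappings above work without transformers installed.
    from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

    model = AutoModelForTokenClassification.from_pretrained(
        model_name, id2label=ID2LABEL, label2id=LABEL2ID
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return pipeline("ner", model=model, tokenizer=tokenizer)
```

With this approach no post-hoc parsing of `LABEL_n` strings is needed, at the cost of keeping the mapping in sync with the list above by hand.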

🚀 Quick Usage

You can easily load and use this model with transformers:

🔹 Named Entity Recognition (NER)

import torch
from transformers import pipeline, AutoModelForTokenClassification, AutoTokenizer

# Label Mapping
LABEL_MAP = {
    0: "DATE",
    1: "DESIGNATION",
    2: "LOCATION",
    3: "NUMBER",
    4: "ORGANIZATION",
    5: "OTHER",
    6: "PERSON",
    7: "TIME",
}

# Model Name
MODEL_NAME = "blaikhole/distilbert-urdu-ner"

# Load Model & Tokenizer
device = 0 if torch.cuda.is_available() else -1
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME).to("cuda" if device == 0 else "cpu")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Load NER Pipeline
ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, device=device)

def process_text(text):
    entities = ner_pipeline(text)
    results = [(entity["word"], LABEL_MAP.get(int(entity["entity"].split("_")[-1]), "OTHER")) for entity in entities]
    return results

# Example Usage
if __name__ == "__main__":
    sample_text = "پی ٹی آئی پاکستان میں ایک تنظیم ہے۔"
    print(process_text(sample_text))
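
Without an aggregation strategy, the pipeline returns one prediction per token, so a name that the tokenizer splits into several pieces appears as several entries. The sketch below merges adjacent predictions that share a label into one span, using character offsets. It assumes each prediction carries integer `start` and `end` fields (fast tokenizers normally provide them) and `LABEL_n` style names. `merge_entities` and `max_gap` are hypothetical names introduced here for illustration.

```python
LABEL_MAP = {
    0: "DATE", 1: "DESIGNATION", 2: "LOCATION", 3: "NUMBER",
    4: "ORGANIZATION", 5: "OTHER", 6: "PERSON", 7: "TIME",
}


def merge_entities(text, tokens, max_gap=1, drop=("OTHER",)):
    """Merge neighbouring token predictions with the same label.

    text    -- the string that was passed to the pipeline
    tokens  -- pipeline output: dicts with "entity", "start" and "end"
    max_gap -- largest character gap (e.g. one space) still merged
    drop    -- label names left out of the result
    """
    spans = []
    for tok in tokens:
        # "LABEL_4" -> 4 -> "ORGANIZATION"; unknown ids fall back to OTHER.
        label = LABEL_MAP.get(int(tok["entity"].rsplit("_", 1)[-1]), "OTHER")
        last = spans[-1] if spans else None
        if last and last["label"] == label and tok["start"] - last["end"] <= max_gap:
            # Same label and (almost) contiguous: extend the current span.
            last["end"] = tok["end"]
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    # Slice the original text so word-piece markers never leak into the output.
    return [
        (text[s["start"]:s["end"]], s["label"])
        for s in spans
        if s["label"] not in drop
    ]
```

It could be called as `merge_entities(sample_text, ner_pipeline(sample_text))`. The pipeline's built-in `aggregation_strategy="simple"` option is an alternative that groups tokens inside transformers itself.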

📦 Installation

To use this model, install the required dependencies:

pip install transformers torch

📜 License

MIT
