---
library_name: transformers
tags:
- Persian
- Named Entity Recognition
- NER
- Albert
---

# Model Card for Behpoyan-NER

Behpoyan-NER is a fine-tuned Albert model for Named Entity Recognition (NER) in the Persian language. It is based on the `HooshvareLab/albert-fa-zwnj-base-v2-ner` model and identifies ten types of entities: Date (DAT), Event (EVE), Facility (FAC), Location (LOC), Money (MON), Organization (ORG), Percent (PCT), Person (PER), Product (PRO), and Time (TIM).

## Model Details

### Model Description

Behpoyan-NER is designed to recognize named entities in Persian text, improving upon the capabilities of its base model, `HooshvareLab/albert-fa-zwnj-base-v2-ner`. It was fine-tuned on a combination of the ARMAN, PEYMA, and WikiANN datasets, which are widely used for Persian NER.

- **Developed by:** Behpoyan  
- **Model type:** Albert for Token Classification  
- **Language(s) (NLP):** Persian (fa)  
- **License:** MIT  

### Model Sources

- **Repository:** [Behpoyan/Behpoyan-NER](https://huggingface.co/Behpoyan/Behpoyan-NER)  
- **Base Model Repository:** [HooshvareLab/albert-fa-zwnj-base-v2-ner](https://huggingface.co/HooshvareLab/albert-fa-zwnj-base-v2-ner)  


## Uses

### Direct Use

This model can be directly used for Named Entity Recognition tasks in Persian text. Example applications include text analysis, information extraction, and Persian-language NLP applications.
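For information extraction, token-level predictions usually need to be merged into entity spans. The following is a minimal sketch, assuming the `transformers` NER pipeline returns one dict per tagged token with `entity`, `start`, and `end` keys, and assuming the labels follow a `B-`/`I-` prefix scheme such as `B-ORG`/`I-ORG` (this has not been verified against this checkpoint). The function `merge_entities` and the sample predictions are hypothetical and hand-written for illustration, not real model output.

```python
from typing import Dict, List


def merge_entities(predictions: List[Dict], text: str) -> List[Dict]:
    """Merge B-/I- tagged token predictions into labeled character spans."""
    spans: List[Dict] = []
    for pred in predictions:
        prefix, sep, label = pred["entity"].partition("-")
        if not sep:
            # Labels without a prefix (e.g. "O") carry no entity; skip them.
            continue
        # An I- tag extends the previous span only if the label matches.
        continues = prefix == "I" and bool(spans) and spans[-1]["label"] == label
        if continues:
            spans[-1]["end"] = pred["end"]
        else:
            spans.append({"label": label, "start": pred["start"], "end": pred["end"]})
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hypothetical predictions for a short phrase ("Bank Mellat in Tehran").
sample_text = "بانک ملت در تهران"
sample_predictions = [
    {"entity": "B-ORG", "start": 0, "end": 4},
    {"entity": "I-ORG", "start": 5, "end": 8},
    {"entity": "B-LOC", "start": 12, "end": 17},
]
spans = merge_entities(sample_predictions, sample_text)
for span in spans:
    print(span["label"], span["text"])
```

The pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that performs a similar grouping internally; the manual version above is only meant to make the grouping logic explicit.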

### Downstream Use

The model can be fine-tuned further for domain-specific NER tasks or combined with other models for complex NLP pipelines.
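A common preparatory step for further fine-tuning is aligning word-level labels with subword tokens. The sketch below assumes a fast tokenizer whose encoding exposes `word_ids()` (one word index per token, `None` for special tokens) and labels only the first subword of each word; `-100` is the value PyTorch's cross-entropy loss ignores by default. The helper `align_labels` and the sample inputs are hypothetical, and other alignment strategies are equally valid.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # skipped by PyTorch's cross-entropy loss by default


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Give each word's label to its first subword; mask everything else."""
    aligned: List[int] = []
    previous: Optional[int] = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens such as [CLS] and [SEP] have no word.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First subword of a new word carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Later subwords of the same word are masked out of the loss.
            aligned.append(IGNORE_INDEX)
        previous = word_id
    return aligned


# Hand-written example: three words, the first and last split into two subwords.
example_word_ids = [None, 0, 0, 1, 2, 2, None]
example_word_labels = [3, 0, 5]  # hypothetical label ids
aligned_labels = align_labels(example_word_ids, example_word_labels)
print(aligned_labels)
```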

### Out-of-Scope Use

The model is not designed for languages other than Persian or for tasks other than token classification. Misuse for generating biased or harmful content is discouraged.

### Recommendations

While the model performs well for general-purpose NER in Persian, users should validate its performance on their own datasets. Be aware of biases in the training data, which may particularly affect entity types that are under-represented in it.
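One way to carry out such a validation is to compare predicted entity spans against gold annotations with exact-match precision, recall, and F1, both overall and per entity type. The following is a minimal sketch under that assumption; the span representation `(label, start, end)`, the helper names, and the sample data are all hypothetical and do not reflect measured results for this model.

```python
from typing import Dict, Set, Tuple

Span = Tuple[str, int, int]  # (label, start offset, end offset)


def entity_prf(gold: Set[Span], predicted: Set[Span]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 over entity spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def per_label_prf(gold: Set[Span], predicted: Set[Span]) -> Dict[str, Dict[str, float]]:
    """Break the scores down by entity type to expose weak labels."""
    labels = sorted({span[0] for span in gold | predicted})
    return {
        label: entity_prf(
            {s for s in gold if s[0] == label},
            {s for s in predicted if s[0] == label},
        )
        for label in labels
    }


# Hand-written spans for illustration only.
gold_spans = {("ORG", 0, 8), ("LOC", 12, 17), ("DAT", 20, 24)}
predicted_spans = {("ORG", 0, 8), ("LOC", 12, 17), ("DAT", 20, 22), ("PER", 30, 35)}

overall = entity_prf(gold_spans, predicted_spans)
by_label = per_label_prf(gold_spans, predicted_spans)
print(overall)
print(by_label)
```

A per-label breakdown like this makes it easier to spot entity types on which the model underperforms for a given domain.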

## How to Get Started with the Model

The example below loads the model and tokenizer and runs NER on a short Persian passage:

```python
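from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub.
model_name = "Behpoyan/Behpoyan-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Build the NER pipeline; by default it returns one dict per tagged token.
nlp = pipeline("ner", model=model, tokenizer=tokenizer)

# Sample text. English gloss: "In 1401, Alibaba announced that, in cooperation
# with Bank Mellat, it will start a major project to develop e-commerce
# infrastructure in Iran. The project will run in Tehran and Isfahan and is
# expected to be completed by the end of 1402."
example = (
    "در سال ۱۴۰۱، شرکت علی‌بابا اعلام کرد که با همکاری بانک ملت، یک پروژه بزرگ برای توسعه زیرساخت‌های تجارت الکترونیک در ایران آغاز خواهد کرد. "
    "این پروژه در تهران و اصفهان اجرا می‌شود و پیش‌بینی می‌شود تا پایان سال ۱۴۰۲ تکمیل شود."
)
ner_results = nlp(example)

print(ner_results)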