ParsTwiNER: Transformer-based Model for Named Entity Recognition at Informal Persian

ParsTwiNER is an open, broad-coverage corpus and model for named entity recognition in informal Persian. The corpus was collected from Twitter. Paper presenting ParsTwiNER: 2021.wnut-1.16


The following table summarizes the F1 scores obtained on our corpus by ParsTwiNER, compared with ParsBERT, a state-of-the-art (SoTA) model for Persian NER.

Named Entity Recognition on Our Corpus

Entity Type   ParsTwiNER F1   ParsBERT F1
PER           91              80
LOC           82              68
ORG           69              55
EVE           41              12
POG           85              -
NAT           82.3            -
Total         81.5            69.5
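
To make the comparison concrete, the short sketch below computes the absolute F1 gain of ParsTwiNER over ParsBERT for each entity type. The scores are copied from the table; the names `f1_scores` and `f1_gain` are illustrative only and are not part of the released code. Entity types for which no ParsBERT score is reported (shown as "-") are skipped.

```python
# F1 scores copied from the table: (ParsTwiNER, ParsBERT).
# None marks entity types for which the table reports no ParsBERT score.
f1_scores = {
    "PER": (91, 80),
    "LOC": (82, 68),
    "ORG": (69, 55),
    "EVE": (41, 12),
    "POG": (85, None),
    "NAT": (82.3, None),
    "Total": (81.5, 69.5),
}


def f1_gain(scores):
    """Return ParsTwiNER F1 minus ParsBERT F1 for every type that has both scores."""
    return {
        etype: round(ours - base, 1)
        for etype, (ours, base) in scores.items()
        if base is not None
    }


gains = f1_gain(f1_scores)
for etype, gain in gains.items():
    print(f"{etype}: +{gain} F1")
```

On these numbers the largest gain is on the EVE type, and the overall gain is 12 F1 points.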

How to use

TensorFlow 2.0

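from transformers import AutoTokenizer, TFAutoModelForTokenClassification, pipeline

# Load the tokenizer and the TensorFlow token-classification model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("overfit/twiner-bert-base-mtl")
model = TFAutoModelForTokenClassification.from_pretrained("overfit/twiner-bert-base-mtl")
# ignore_labels=[] keeps every token in the output, including tokens outside any entity
twiner_mtl = pipeline('ner', model=model, tokenizer=tokenizer, ignore_labels=[])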
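
The pipeline returns one prediction per token, so multi-word names arrive in pieces. The following is a minimal post-processing sketch, not part of the released code. It assumes that the model emits IOB-style tags such as `B-PER` and `I-PER` (check `model.config.id2label` to confirm), that each prediction is a dict with `entity` and `word` keys, and that WordPiece continuations are prefixed with `##`. The helper name `group_entities` and the sample predictions are hypothetical and are not real model output.

```python
from typing import Dict, List, Optional, Tuple


def group_entities(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Merge token-level IOB predictions into (entity_type, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_words: List[str] = []

    def flush() -> None:
        # Close the open span (if any) and reset the accumulator.
        nonlocal current_type, current_words
        if current_type is not None:
            spans.append((current_type, " ".join(current_words)))
        current_type, current_words = None, []

    for pred in predictions:
        tag, word = pred["entity"], pred["word"]
        if word.startswith("##"):
            # WordPiece continuation: attach to the open span, if there is one.
            if current_words:
                current_words[-1] += word[2:]
            continue
        if tag == "O":
            flush()
            continue
        prefix, _, etype = tag.partition("-")
        if prefix == "B" or etype != current_type:
            # A B- tag or a change of type starts a new span.
            flush()
            current_type = etype
        current_words.append(word)
    flush()
    return spans


# Hypothetical predictions, hand-written for illustration (assumed tag set).
sample = [
    {"entity": "B-PER", "word": "علی"},
    {"entity": "I-PER", "word": "دایی"},
    {"entity": "O", "word": "در"},
    {"entity": "B-LOC", "word": "تهران"},
    {"entity": "O", "word": "است"},
]
entities = group_entities(sample)
print(entities)
```

With the real pipeline, the same helper would be called as `group_entities(twiner_mtl(text))`. Recent versions of transformers also accept an `aggregation_strategy` argument on the `ner` pipeline that performs a similar grouping, which may be preferable when available.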


The authors would like to thank Dr. Momtazi for her support. We would also like to acknowledge the assistance provided by Mohammad Mahdi Samiei and Abbas Maazallahi.



Release v1.0.0 (Aug 01, 2021)

This is the first release of ParsTwiNER.

