metadata
language: tr
datasets:
- SUNLP-NER-Twitter
bert-loodos-sunlp-ner-turkish
Introduction
bert-loodos-sunlp-ner-turkish is a named entity recognition (NER) model for Turkish, fine-tuned from the loodos/bert-base-turkish-cased model on the SUNLP-NER-Twitter dataset.
Training data
The model was trained on the SUNLP-NER-Twitter dataset (5,000 tweets), which is available at https://github.com/SU-NLP/SUNLP-Twitter-NER-Dataset. The dataset is annotated with seven named entity types: Person, Location, Organization, Time, Money, Product, and TV-Show.
How to use bert-loodos-sunlp-ner-turkish with Hugging Face Transformers
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("busecarik/bert-loodos-sunlp-ner-turkish")
model = AutoModelForTokenClassification.from_pretrained("busecarik/bert-loodos-sunlp-ner-turkish")
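The snippet above only loads the tokenizer and the model. As a minimal sketch of running inference, the example below wraps the same checkpoint in the Transformers `pipeline` API. The helper names `build_ner_pipeline` and `summarize_entities` and the sample sentence are illustrative assumptions, not part of the original model card, and the exact label strings depend on the model's configuration.

```python
def build_ner_pipeline(model_id="busecarik/bert-loodos-sunlp-ner-turkish"):
    """Create a token-classification pipeline (downloads the model on first call)."""
    # Imported lazily so that summarize_entities can be used without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def summarize_entities(predictions):
    """Reduce the pipeline's output dicts to (text, label, score) tuples."""
    return [
        (p["word"], p["entity_group"], round(float(p["score"]), 2))
        for p in predictions
    ]


# Example usage (hypothetical input; requires network access to fetch the model):
# ner = build_ner_pipeline()
# print(summarize_entities(ner("Mustafa Kemal Atatürk 1919'da Samsun'a çıktı.")))
```

With an aggregation strategy set, each prediction carries `entity_group`, `score`, `word`, `start`, and `end` keys, so the character offsets can be used to highlight entities in the original tweet.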
Model performance on the SUNLP-NER-Twitter test set (metric: seqeval)
Precision (%) | Recall (%) | F1 (%) |
---|---|---|
84.66 | 84.36 | 84.51 |
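The F1 score is the harmonic mean of precision and recall. As a small sanity check, assuming that standard definition, the sketch below recomputes the overall F1 from the precision and recall reported above. The function name `f1_from_precision_recall` is illustrative.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall scores from the table above, in percent.
overall_f1 = f1_from_precision_recall(84.66, 84.36)
print(round(overall_f1, 2))  # 84.51
```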
Classification Report
Entity | Precision | Recall | F1 |
---|---|---|---|
LOCATION | 0.74 | 0.78 | 0.76 |
MONEY | 0.93 | 0.82 | 0.87 |
ORGANIZATION | 0.83 | 0.81 | 0.82 |
PERSON | 0.90 | 0.92 | 0.91 |
PRODUCT | 0.55 | 0.50 | 0.52 |
TIME | 0.91 | 0.87 | 0.89 |
TVSHOW | 0.63 | 0.58 | 0.54 |
If you use this model, please cite the following paper:
@InProceedings{ark-yeniterzi:2022:LREC,
author = {\c{C}ar\i k, Buse and Yeniterzi, Reyyan},
title = {A Twitter Corpus for Named Entity Recognition in Turkish},
booktitle = {Proceedings of the Language Resources and Evaluation Conference},
month = {June},
year = {2022},
address = {Marseille, France},
publisher = {European Language Resources Association},
pages = {4546--4551},
url = {https://aclanthology.org/2022.lrec-1.484}
}