metadata
language: ky
datasets:
  - wikiann
examples: null
widget:
  - text: Бириккен Улуттар Уюму
    example_title: Sentence_1
  - text: Жусуп Мамай
    example_title: Sentence_2

Kyrgyz Named Entity Recognition

This model is bert-base-multilingual-cased fine-tuned on the Wikiann dataset for named entity recognition (NER) in the Kyrgyz language. WARNING: this model is not usable in its current state (see the metrics below). I'll update the model after cleaning up the Wikiann dataset and re-training.

Label ID and its corresponding label name

| Label ID | Label Name |
|----------|------------|
| 0 | O |
| 1 | B-PER |
| 2 | I-PER |
| 3 | B-ORG |
| 4 | I-ORG |
| 5 | B-LOC |
| 6 | I-LOC |
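
The table above can be turned into a lookup for decoding per-token predictions. The following is a minimal sketch, assuming one label ID per token in the usual BIO scheme. `ID_TO_LABEL` and `decode_entities` are illustrative names, not part of this model or of transformers, and the label IDs in the usage note are hand-written rather than real model output.

```python
# Label IDs and names copied from the table above.
ID_TO_LABEL = {
    0: "O",
    1: "B-PER",
    2: "I-PER",
    3: "B-ORG",
    4: "I-ORG",
    5: "B-LOC",
    6: "I-LOC",
}


def decode_entities(tokens, label_ids):
    """Group tokens into (entity_type, text) spans using BIO tags.

    Hypothetical helper: assumes exactly one label ID per token.
    """
    entities = []
    current_type, current_tokens = None, []
    for token, label_id in zip(tokens, label_ids):
        prefix, _, entity_type = ID_TO_LABEL[label_id].partition("-")
        if prefix == "I" and entity_type == current_type:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
            continue
        # Any other tag closes the open span, if there is one.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray I- tag is treated leniently as the start of a new span.
            current_type, current_tokens = entity_type, [token]
        else:
            current_type, current_tokens = None, []
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities
```

For example, `decode_entities(["Жусуп", "Мамай"], [1, 2])` returns `[("PER", "Жусуп Мамай")]`.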

Results

| Name | Overall F1 | LOC F1 | ORG F1 | PER F1 |
|------|------------|--------|--------|--------|
| Train set | 0.595683 | 0.570312 | 0.687179 | 0.549180 |
| Validation set | 0.461333 | 0.551181 | 0.401913 | 0.425087 |
| Test set | 0.442622 | 0.456852 | 0.469565 | 0.413114 |
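
This card does not state how the scores were computed. As an illustration only, the sketch below shows entity-level F1 as it is commonly defined for NER, where a predicted entity counts as correct only if both its type and its span match the gold annotation exactly. `entity_f1` is a hypothetical helper and is not the evaluation code used for this model.

```python
def entity_f1(gold, predicted):
    """Entity-level F1 over collections of (entity_type, start, end) tuples.

    Hypothetical helper: exact-match scoring is an assumption, since the
    card does not describe its evaluation procedure.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Also covers the cases where either collection is empty.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a prediction with the right span but the wrong type (for example ORG instead of LOC) counts as both a false positive and a false negative.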

Example

from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("murat/kyrgyz_language_NER")
model = AutoModelForTokenClassification.from_pretrained("murat/kyrgyz_language_NER")

# Build a token-classification (NER) pipeline
nlp = pipeline("ner", model=model, tokenizer=tokenizer)

example = "Жусуп Мамай"
ner_results = nlp(example)
print(ner_results)
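
Each item returned by the pipeline is a dictionary whose `entity` field holds the predicted label. If a model's configuration does not store label names, transformers falls back to generic names such as `LABEL_1`; whether that applies to this model is not stated here. A minimal sketch, assuming that fallback, maps such names back to the label table in this card. `rename_labels` is a hypothetical helper.

```python
# Label IDs and names copied from the label table in this card.
ID_TO_LABEL = {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-ORG",
               4: "I-ORG", 5: "B-LOC", 6: "I-LOC"}


def rename_labels(ner_results):
    """Return a copy of pipeline-style results with LABEL_n names replaced.

    Hypothetical helper: items whose "entity" is already a readable
    name (e.g. "B-PER") are passed through unchanged.
    """
    renamed = []
    for item in ner_results:
        entity = item["entity"]
        if entity.startswith("LABEL_"):
            label_id = int(entity.split("_", 1)[1])
            entity = ID_TO_LABEL[label_id]
        # Copy the item so the caller's list is not modified.
        renamed.append({**item, "entity": entity})
    return renamed
```

It can be applied directly to the pipeline output, for example `rename_labels(ner_results)`.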