murat committed on
Commit
4ada751
1 Parent(s): 51c1c7e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +50 -0
README.md ADDED
@@ -0,0 +1,50 @@
+
+ ---
+ language: ky
+ datasets:
+ - wikiann
+ examples:
+ widget:
+ - text: "Бириккен Улуттар Уюму"
+   example_title: "Sentence_1"
+ - text: "Жусуп Мамай"
+   example_title: "Sentence_2"
+ ---
+
+ <h1>Kyrgyz Named Entity Recognition</h1>
+ This model is bert-base-multilingual-cased fine-tuned on the WikiANN dataset for named entity recognition (NER) in Kyrgyz.
+ WARNING: this model is not usable yet; its overall F1 on the test set is only about 0.44 (see the metrics below). I'll update the model after cleaning up the WikiANN dataset and retraining.
+
+
+ ## Label ID and its corresponding label name
+
+ | Label ID | Label Name |
+ | -------- | ---------- |
+ | 0 | O |
+ | 1 | B-PER |
+ | 2 | I-PER |
+ | 3 | B-ORG |
+ | 4 | I-ORG |
+ | 5 | B-LOC |
+ | 6 | I-LOC |
+
+ <h1>Results</h1>
+
+ | Name | Overall F1 | LOC F1 | ORG F1 | PER F1 |
+ | ---- | ---------- | ------ | ------ | ------ |
+ | Train set | 0.595683 | 0.570312 | 0.687179 | 0.549180 |
+ | Validation set | 0.461333 | 0.551181 | 0.401913 | 0.425087 |
+ | Test set | 0.442622 | 0.456852 | 0.469565 | 0.413114 |
+
+
+ Example
+ ```py
+ from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
+
+ tokenizer = AutoTokenizer.from_pretrained("murat/kyrgyz_language_NER")  # load the tokenizer from the Hub
+ model = AutoModelForTokenClassification.from_pretrained("murat/kyrgyz_language_NER")  # load the fine-tuned weights
+ nlp = pipeline("ner", model=model, tokenizer=tokenizer)  # token-classification pipeline
+ example = "Жусуп Мамай"
+ ner_results = nlp(example)  # list of dicts, one per tagged token
+ print(ner_results)
+ ```
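
The label table in the README can be applied to the pipeline output when the predictions come back with generic tags. The following is a minimal sketch, not part of the commit. It assumes the hosted model config may not define `id2label`, in which case Transformers falls back to names of the form `LABEL_<id>`; whether that is true for this model has not been checked here. The helper names `readable_label` and `rename_entities` and the `sample` list are hypothetical and made up for illustration; they are not real model output.

```python
# Mapping copied from the "Label ID and its corresponding label name" table.
ID2LABEL = {
    0: "O",
    1: "B-PER",
    2: "I-PER",
    3: "B-ORG",
    4: "I-ORG",
    5: "B-LOC",
    6: "I-LOC",
}


def readable_label(entity: str) -> str:
    """Turn a generic 'LABEL_<id>' tag into its name; pass other tags through."""
    prefix = "LABEL_"
    suffix = entity[len(prefix):]
    if entity.startswith(prefix) and suffix.isdigit():
        # Unknown IDs are left unchanged rather than guessed.
        return ID2LABEL.get(int(suffix), entity)
    return entity


def rename_entities(ner_results):
    """Return a copy of the pipeline output with readable 'entity' values."""
    return [{**token, "entity": readable_label(token["entity"])} for token in ner_results]


# Hypothetical stand-in for pipeline output (assumption, not a real prediction).
sample = [
    {"word": "Жусуп", "entity": "LABEL_1"},
    {"word": "Мамай", "entity": "LABEL_2"},
]

for token in rename_entities(sample):
    print(token["word"], token["entity"])
```

With the real pipeline, `rename_entities(ner_results)` would be called on the list returned by `nlp(example)`. If the config already carries the label names, the helper leaves them unchanged.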