better readme
README.md
CHANGED
@@ -51,9 +51,74 @@ It achieves the following results on the evaluation set:
|
|
51 |
|
52 |
More information needed
|
53 |
|
54 |
-
##
55 |
|
56 |
-
More information needed
|
57 |
|
58 |
## Training and evaluation data
|
59 |
|
|
|
51 |
|
52 |
More information needed
|
53 |
|
54 |
+
## Code example

```python
from transformers import pipeline
from spacy import displacy

# Model ID on the Hugging Face Hub; a local checkpoint directory also works, e.g.:
# model_path = "/Users/juraj/slovakbert-conll2003-sk-ner/outputs/result/slovakbert-conll2003-sk-ner"
model_path = "ju-bezdek/slovakbert-conll2003-sk-ner"

# "max" merges sub-word tokens into words; each word takes the label of its highest-scoring token.
aggregation_strategy = "max"
ner_pipeline = pipeline(task="ner", model=model_path, aggregation_strategy=aggregation_strategy)

input_sentence = "Ruský premiér Viktor Černomyrdin v piatok povedal, že prezident Boris Jeľcin , ktorý je na dovolenke mimo Moskvy , podporil mierový plán šéfa bezpečnosti Alexandra Lebedu pre Čečensko, uviedla tlačová agentúra Interfax"
ner_ents = ner_pipeline(input_sentence)
print(ner_ents)

# Distinct entity groups with the B-/I- prefix stripped; id 0 is skipped (assumed to be the "O" tag).
id2label = ner_pipeline.model.config.id2label
ent_group_labels = sorted({id2label[i][2:] for i in id2label if i > 0})

options = {"ents": ent_group_labels}

# Convert the pipeline output to the dict format that displaCy accepts in manual mode.
displacy_ents = [{"start": ent["start"], "end": ent["end"], "label": ent["entity_group"]} for ent in ner_ents]
# jupyter=True renders inline in a notebook; manual=True takes plain dicts instead of a spaCy Doc.
displacy.render({"text": input_sentence, "ents": displacy_ents}, style="ent", options=options, jupyter=True, manual=True)
```
|
79 |
+
|
80 |
+
### Result:
<div>
<span class="tex2jax_ignore"><div class="entities" style="line-height: 2.5; direction: ltr">
<mark class="entity" style="background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;">
Ruský
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem">MISC</span>
</mark>
premiér
<mark class="entity" style="background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;">
Viktor Černomyrdin
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem">PER</span>
</mark>
v piatok povedal, že prezident
<mark class="entity" style="background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;">
Boris Jeľcin,
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem">PER</span>
</mark>
, ktorý je na dovolenke mimo
<mark class="entity" style="background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;">
Moskvy
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem">LOC</span>
</mark>
, podporil mierový plán šéfa bezpečnosti
<mark class="entity" style="background: #ddd; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;">
Alexandra Lebedu
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem">PER</span>
</mark>
pre
<mark class="entity" style="background: #ff9561; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;">
Čečensko,
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem">LOC</span>
</mark>
uviedla tlačová agentúra
<mark class="entity" style="background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;">
Interfax
<span style="font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem">ORG</span>
</mark>
</div></span>
</div>
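
The two pure-Python steps in the code example (deriving entity groups from the label map, and converting pipeline output to displaCy's manual format) can be checked without downloading the model. The sketch below is only an illustration: the helper names `entity_groups`, `to_displacy_doc`, and `make_ent` are hypothetical, the label map is an assumed CoNLL-2003-style layout rather than one read from the model config, and the sample entities are hand-built rather than real model output.

```python
def entity_groups(id2label):
    """Return the distinct entity types in an id-to-label map, without B-/I- prefixes."""
    # Labels without a "-" (such as "O") carry no entity type and are skipped.
    return sorted({label.split("-", 1)[1] for label in id2label.values() if "-" in label})


def to_displacy_doc(text, ner_ents):
    """Convert aggregated NER pipeline output to a displaCy manual-mode dict."""
    ents = [
        {"start": int(ent["start"]), "end": int(ent["end"]), "label": ent["entity_group"]}
        for ent in ner_ents
    ]
    # Sort by start offset so the spans are listed in reading order.
    ents.sort(key=lambda ent: ent["start"])
    return {"text": text, "ents": ents}


def make_ent(text, phrase, label):
    """Build a hand-made entity dict shaped like the pipeline's aggregated output."""
    start = text.index(phrase)
    return {"entity_group": label, "word": phrase, "score": 1.0,
            "start": start, "end": start + len(phrase)}


# Assumed CoNLL-2003-style label map, for illustration only.
sample_id2label = {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-ORG", 4: "I-ORG",
                   5: "B-LOC", 6: "I-LOC", 7: "B-MISC", 8: "I-MISC"}
labels = entity_groups(sample_id2label)

# Hand-built entities, deliberately out of order to exercise the sort.
sample_text = "Boris Jeľcin je mimo Moskvy"
sample_ents = [make_ent(sample_text, "Moskvy", "LOC"),
               make_ent(sample_text, "Boris Jeľcin", "PER")]
doc = to_displacy_doc(sample_text, sample_ents)

print(labels)
print(doc["ents"])
```

Outside a notebook, passing `doc` to `displacy.render(doc, style="ent", manual=True, jupyter=False)` should return the markup as a string instead of displaying it inline.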
|
119 |
+
|
120 |
+
|
121 |
|
|
|
122 |
|
123 |
## Training and evaluation data
|
124 |
|