---
tags:
- spacy
- token-classification
language:
- mk
license: CC-BY-SA-4.0
model-index:
- name: mk_core_news_md
  results:
  - tasks:
      name: NER
      type: token-classification
    metrics:
    - name: Precision
      type: precision
      value: 0.7577586207
    - name: Recall
      type: recall
      value: 0.7480851064
    - name: F Score
      type: f_score
      value: 0.7528907923
  - tasks:
      name: SENTER
      type: token-classification
    metrics:
    - name: Precision
      type: precision
      value: 0.768115942
    - name: Recall
      type: recall
      value: 0.6883116883
    - name: F Score
      type: f_score
      value: 0.7260273973
  - tasks:
      name: UNLABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.68633235
  - tasks:
      name: LABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.5329
---
Details: https://spacy.io/models/mk#mk_core_news_md
Macedonian pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.
Feature | Description |
---|---|
Name | mk_core_news_md |
Version | 3.1.0 |
spaCy | >=3.1.0,<3.2.0 |
Default Pipeline | morphologizer, parser, attribute_ruler, lemmatizer, ner |
Components | morphologizer, parser, senter, attribute_ruler, lemmatizer, ner |
Vectors | 274587 keys, 20000 unique vectors (300 dimensions) |
Sources | Macedonian Corpus (Damjan Zlatinov, Melanija Gerasimovska, Borijan Georgievski, Marija Todosovska); spaCy lookups data (Explosion); Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion) |
License | CC BY-SA 4.0 |
Author | Explosion |
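
A minimal usage sketch for the pipeline described above, assuming a spaCy release in the listed version range and the `mk_core_news_md` package are both installed. The `analyze` helper and its `use_senter` flag are illustrative names, not part of the model card. With `use_senter=True` the parser is excluded and the `senter` component, which ships with the package but is not in the default pipeline, provides sentence boundaries instead.

```python
def analyze(text, model_name="mk_core_news_md", use_senter=False):
    """Run the pipeline on `text` and return a few of its annotations.

    Hypothetical helper: the name and return format are assumptions
    made for illustration only.
    """
    import spacy  # imported lazily so the helper can be defined without spaCy

    if use_senter:
        # Drop the parser and let senter set sentence boundaries instead.
        nlp = spacy.load(model_name, exclude=["parser"])
        nlp.enable_pipe("senter")
    else:
        # Default pipeline: the parser sets sentence boundaries.
        nlp = spacy.load(model_name)

    doc = nlp(text)
    return {
        "pipeline": list(nlp.pipe_names),
        "sentences": [sent.text for sent in doc.sents],
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
        # dep_ is empty when the parser has been excluded.
        "tokens": [(tok.text, tok.pos_, tok.dep_) for tok in doc],
    }
```

Calling `analyze("Скопје е главен град на Северна Македонија.")` returns a dictionary of pipeline names, sentences, entities, and per-token tags. The exact annotations depend on the installed model version.
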
Label Scheme
View label scheme (55 labels for 4 components)
Component | Labels |
---|---|
morphologizer | POS=PROPN, POS=AUX, POS=ADJ, POS=NOUN, POS=ADP, POS=PUNCT, POS=CONJ, POS=NUM, POS=VERB, POS=PRON, POS=ADV, POS=SCONJ, POS=PART, POS=SYM, POS=X, _, POS=INTJ |
parser | ROOT, advmod, att, aux, cc, dep, det, dobj, iobj, neg, nsubj, pobj, poss, pozm, pozv, prep, punct, relcl |
senter | I, S |
ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART |
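
The two `senter` labels can be read as per-token tags. The sketch below assumes `S` marks the first token of a sentence and `I` marks every other token, and shows how such tags translate into sentence groupings. The `split_sentences` helper, the sample tokens, and the hand-written tags are hypothetical and are not output from the model.

```python
def split_sentences(tokens, tags):
    """Group tokens into sentences from per-token senter-style tags.

    Assumption: "S" starts a new sentence, "I" continues the current one.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    sentences = []
    for token, tag in zip(tokens, tags):
        # Open a new sentence on "S", or when nothing has been opened yet.
        if tag == "S" or not sentences:
            sentences.append([token])
        else:
            sentences[-1].append(token)
    return sentences


# Hand-written example tags, not model predictions.
tokens = ["Ова", "е", "тест", ".", "Здраво", "!"]
tags = ["S", "I", "I", "I", "S", "I"]
sentences = split_sentences(tokens, tags)
```

Here `sentences` holds two token lists, one per `S` tag.
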
Accuracy
Type | Score |
---|---|
TOKEN_ACC | 100.00 |
POS_ACC | 93.15 |
SENTS_P | 76.81 |
SENTS_R | 68.83 |
SENTS_F | 72.60 |
DEP_UAS | 68.63 |
DEP_LAS | 53.29 |
ENTS_P | 75.78 |
ENTS_R | 74.81 |
ENTS_F | 75.29 |
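
The F-scores above are consistent with the harmonic mean of the corresponding precision and recall, F = 2PR / (P + R). The sketch below recomputes them from the reported values as a sanity check. The `f_score` helper is an illustrative name and is not taken from spaCy's evaluation code.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1).

    Works on any common scale (fractions or percentages) and
    returns 0.0 when both inputs are zero.
    """
    total = precision + recall
    if total == 0:
        return 0.0
    return 2 * precision * recall / total


# Values copied from the accuracy table (percentages).
sents_f = round(f_score(76.81, 68.83), 2)
ents_f = round(f_score(75.78, 74.81), 2)
```

Rounded to two decimals, `sents_f` and `ents_f` match the SENTS_F and ENTS_F rows.
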