---
tags:
- spacy
- token-classification
language:
- id
model-index:
- name: id_core_news_sm
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.6744068652
    - name: NER Recall
      type: recall
      value: 0.7046413502
    - name: NER F Score
      type: f_score
      value: 0.6891926747
  - task:
      name: TAG
      type: token-classification
    metrics:
    - name: TAG (XPOS) Accuracy
      type: accuracy
      value: 0.9230464416
  - task:
      name: POS
      type: token-classification
    metrics:
    - name: POS (UPOS) Accuracy
      type: accuracy
      value: 0.9235158913
  - task:
      name: MORPH
      type: token-classification
    metrics:
    - name: Morph (UFeats) Accuracy
      type: accuracy
      value: 0.9448694067
  - task:
      name: LEMMA
      type: token-classification
    metrics:
    - name: Lemma Accuracy
      type: accuracy
      value: 0.970604548
  - task:
      name: UNLABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Unlabeled Attachment Score (UAS)
      type: f_score
      value: 0.8197906398
  - task:
      name: LABELED_DEPENDENCIES
      type: token-classification
    metrics:
    - name: Labeled Attachment Score (LAS)
      type: f_score
      value: 0.7328201277
  - task:
      name: SENTS
      type: token-classification
    metrics:
    - name: Sentences F-Score
      type: f_score
      value: 0.859929078
---
| Feature | Description |
| --- | --- |
| **Name** | `id_core_news_sm` |
| **Version** | `0.0.4` |
| **spaCy** | `>=3.7.0,<3.8.0` |
| **Default Pipeline** | `tok2vec`, `ner`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser` |
| **Components** | `tok2vec`, `ner`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | n/a |
| **License** | n/a |
| **Author** | n/a |
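
The **spaCy** row pins this pipeline to `>=3.7.0,<3.8.0`. As a minimal sketch, the check below tests whether a version string falls inside that range. The helper names `parse_release` and `is_compatible` are hypothetical, and the parser assumes plain numeric release strings such as `3.7.4`; it is an illustration, not how spaCy itself validates compatibility.

```python
import re

# Bounds copied from the "spaCy" row of the table above.
LOWER = (3, 7, 0)  # inclusive
UPPER = (3, 8, 0)  # exclusive


def parse_release(version: str) -> tuple:
    """Return the leading major.minor.patch numbers as a tuple of ints.

    Assumption: the string starts with up to three dot-separated integers;
    anything after them (for example a pre-release suffix) is ignored.
    """
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    # Missing minor/patch parts are treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_compatible(version: str) -> bool:
    """True if LOWER <= version < UPPER."""
    return LOWER <= parse_release(version) < UPPER


if __name__ == "__main__":
    for candidate in ("3.6.1", "3.7.0", "3.7.5", "3.8.0"):
        print(candidate, is_compatible(candidate))
```

To check an installed copy of spaCy, pass `spacy.__version__` to `is_compatible`.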
### Label Scheme
<details>

<summary>View label scheme (321 labels for 4 components)</summary>

| Component | Labels |
| --- | --- |
| **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
| **`tagger`** | `APP`, `ASP`, `ASP+PS2`, `ASP+PS3`, `ASP+T--`, `ASS`, `ASS+PS3`, `B--`, `B--+PS3`, `B--+T--`, `CC-`, `CC-+PS3`, `CC-+T--`, `CD-`, `CD-+PS3`, `CO-`, `CO-+PS3`, `D--`, `D--+PS2`, `D--+PS3`, `D--+T--`, `F--`, `F--+PS1`, `F--+PS2`, `F--+PS3`, `F--+T--`, `G--`, `G--+PS3`, `G--+T--`, `H--`, `H--+T--`, `I--`, `M--`, `M--+PS3`, `M--+T--`, `NOUN`, `NPD`, `NPD+PS2`, `NPD+PS3`, `NSD`, `NSD+PS1`, `NSD+PS2`, `NSD+PS3`, `NSD+T--`, `NSF`, `NSM`, `NSM+PS3`, `NUM`, `O--`, `PP1`, `PP1+T--`, `PP2`, `PP3`, `PP3+T--`, `PROPN`, `PS1`, `PS1+VSA`, `PS1+VSA+T--`, `PS2`, `PS2+VSA`, `PS3`, `PUNCT`, `R--`, `R--+PS1`, `R--+PS2`, `R--+PS3`, `S--`, `S--+PS3`, `T--`, `VERB`, `VPA`, `VSA`, `VSA+PS1`, `VSA+PS2`, `VSA+PS3`, `VSA+T--`, `VSP`, `VSP+PS3`, `VSP+T--`, `W--`, `W--+T--`, `X`, `X--`, `Z--` |
| **`morphologizer`** | `POS=PROPN`, `POS=AUX`, `POS=DET\|PronType=Ind`, `Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=VERB\|Voice=Pass`, `POS=ADP`, `POS=PUNCT`, `Number=Sing\|POS=PROPN`, `POS=NOUN`, `POS=ADV`, `POS=CCONJ`, `Number=Sing\|POS=VERB\|Voice=Act`, `POS=VERB`, `POS=DET\|PronType=Tot`, `Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3`, `POS=SCONJ`, `Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `POS=DET\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Degree=Pos\|Number=Sing\|POS=NOUN`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `NumType=Card\|POS=DET\|PronType=Ind`, `Degree=Pos\|Number=Sing\|POS=ADP`, `Number[psor]=Sing\|POS=NOUN\|Person[psor]=3`, `Number=Sing\|POS=VERB`, `POS=PRON\|PronType=Int`, `Number=Sing\|POS=ADV\|Voice=Act`, `Number=Sing\|Number[psor]=Sing\|POS=VERB\|Person[psor]=3\|Voice=Act`, `Number=Sing\|POS=ADP\|Voice=Act`, `POS=ADJ`, `Number[psor]=Sing\|POS=ADP\|Person[psor]=3`, `Degree=Pos\|Number=Sing\|POS=DET`, `Degree=Pos\|Number=Sing\|POS=VERB`, `POS=PRON\|PronType=Dem`, `POS=PART\|Polarity=Neg`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3`, `Number=Sing\|POS=PRON\|Person=1\|Polite=Form\|PronType=Prs`, `Number=Sing\|POS=ADJ`, `Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `POS=SYM`, `POS=ADV\|PronType=Int`, `Clusivity=In\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Sing\|POS=ADJ\|Voice=Act`, `Degree=Pos\|Number=Sing\|POS=PROPN`, `Degree=Pos\|Number=Sing\|POS=ADV`, `Number=Sing\|Number[psor]=Sing\|POS=VERB\|Person[psor]=3\|Voice=Pass`, `Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3\|Voice=Act`, `Number=Sing\|POS=PROPN\|Voice=Act`, `Number=Sing\|POS=NOUN\|Voice=Act`, `POS=DET`, `Number=Sing\|POS=DET\|Voice=Act`, `NumType=Card\|POS=PRON\|PronType=Ind`, `Number=Sing\|Number[psor]=Sing\|POS=ADV\|Person[psor]=3`, `Number=Sing\|POS=DET`, `Number=Sing\|POS=ADJ\|Voice=Pass`, `POS=CCONJ\|PronType=Dem`, `Number=Sing\|POS=ADP`, `Number=Sing\|POS=ADV`, `Number=Sing\|POS=PRON\|Person=2\|Polite=Infm\|PronType=Prs`, `Number[psor]=Sing\|POS=NOUN\|Person[psor]=2`, `Number=Plur\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=2`, `Number=Sing\|POS=PRON`, `POS=PRON`, `NumType=Card\|POS=ADV\|PronType=Ind`, `NumType=Card\|Number[psor]=Sing\|POS=NUM\|Person[psor]=3`, `Number=Sing\|POS=PRON\|Person=3\|Polite=Form\|PronType=Prs`, `POS=DET\|PronType=Int`, `Number=Sing\|Number[psor]=Sing\|POS=PROPN\|Person[psor]=3`, `Number=Sing\|Number[psor]=Sing\|POS=PROPN\|Person[psor]=1`, `Degree=Pos\|Number=Sing\|POS=SCONJ`, `POS=PRON\|PronType=Ind`, `Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3\|Voice=Pass`, `POS=VERB\|PronType=Ind`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=ADJ\|Person[psor]=3`, `Number=Sing\|POS=SCONJ`, `Degree=Sup\|Number=Sing\|Number[psor]=Sing\|POS=ADJ\|Person[psor]=3`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=ADP\|Person[psor]=3`, `Number=Plur\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3`, `Number=Plur\|POS=NOUN`, `POS=ADV\|PronType=Dem`, `Number=Sing\|POS=VERB\|Person=1\|Voice=Act`, `Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Sing\|POS=ADP\|Voice=Pass`, `Number[psor]=Sing\|POS=PART\|Person[psor]=3`, `Number=Sing\|POS=NOUN\|Voice=Pass`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=CCONJ\|Person[psor]=3`, `POS=PART`, `Number=Sing\|Number[psor]=Sing\|POS=PART\|Person[psor]=3\|Voice=Pass`, `Degree=Sup\|Number=Sing\|POS=ADV`, `Number=Sing\|POS=PRON\|Voice=Act`, `Number=Sing\|Number[psor]=Sing\|POS=PROPN\|Person[psor]=3\|Voice=Act`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number[psor]=Sing\|POS=PRON\|Person[psor]=3\|PronType=Tot`, `Degree=Pos\|Number=Sing\|POS=X`, `POS=PRON\|PronType=Tot`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=ADV\|Person[psor]=3`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=VERB\|Person[psor]=3`, `Number=Sing\|Number[psor]=Sing\|POS=ADP\|Person[psor]=3`, `Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=2`, `POS=SCONJ\|PronType=Int`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Number=Sing\|Number[psor]=Sing\|POS=VERB\|Person[psor]=1\|Voice=Act`, `Number[psor]=Sing\|POS=DET\|Person[psor]=3`, `Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person[psor]=3`, `Clusivity=Ex\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=VERB\|Voice=Act`, `Number=Sing\|Number[psor]=Sing\|POS=ADV\|Person[psor]=3\|Voice=Act`, `Degree=Pos\|Number=Sing\|POS=NOUN\|Polarity=Neg`, `POS=X`, `Number[psor]=Sing\|POS=ADJ\|Person[psor]=3`, `Number=Sing\|Number[psor]=Sing\|POS=VERB\|Person[psor]=3`, `Number=Sing\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|Polite=Infm\|PronType=Prs`, `Number=Sing\|POS=PROPN\|Voice=Pass`, `POS=ADV\|Polarity=Neg`, `NumType=Card\|Number=Sing\|POS=NUM`, `Number[psor]=Sing\|POS=ADV\|Person[psor]=2`, `Number[psor]=Sing\|POS=ADV\|Person[psor]=3`, `Degree=Sup\|Number=Sing\|POS=PROPN`, `POS=PROPN\|Polarity=Neg`, `Number=Sing\|Number[psor]=Sing\|POS=VERB\|Person[psor]=2\|Voice=Act`, `Number=Sing\|POS=PROPN\|Person=1\|Voice=Act`, `POS=SCONJ\|PronType=Dem`, `Number=Sing\|Number[psor]=Sing\|POS=ADV\|Person[psor]=2\|Voice=Act`, `Number=Sing\|POS=CCONJ`, `Degree=Sup\|Number=Sing\|POS=VERB`, `Number=Sing\|Number[psor]=Sing\|POS=ADJ\|Person[psor]=3`, `Number=Sing\|Number[psor]=Sing\|POS=ADJ\|Person[psor]=3\|Voice=Act`, `Degree=Pos\|Number=Sing\|POS=PRON`, `Number=Sing\|POS=ADV\|Voice=Pass`, `Number[psor]=Sing\|POS=ADP\|Person[psor]=2`, `Number=Sing\|POS=SYM`, `POS=ADJ\|Polarity=Neg`, `Degree=Pos\|NumType=Card\|Number=Sing\|POS=NUM`, `Number=Sing\|Number[psor]=Sing\|POS=SCONJ\|Person[psor]=3`, `Degree=Pos\|Number=Sing\|POS=CCONJ`, `Number[psor]=Sing\|POS=NOUN\|Person[psor]=1`, `Number=Sing\|POS=CCONJ\|Voice=Act`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `Number=Sing\|Number[psor]=Sing\|POS=ADP\|Person[psor]=3\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `POS=VERB\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3`, `Number=Sing\|POS=PART\|Voice=Act`, `Degree=Sup\|Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3`, `POS=ADP\|PronType=Int`, `Number[psor]=Sing\|POS=VERB\|Person[psor]=3`, `Number[psor]=Sing\|POS=PRON\|Person[psor]=3\|PronType=Rel`, `Degree=Pos\|Number=Sing\|POS=AUX`, `Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=1`, `Number=Sing\|POS=SCONJ\|Voice=Pass`, `Degree=Sup\|Number=Sing\|POS=ADP`, `Number=Sing\|POS=SCONJ\|Voice=Act`, `NumType=Card\|POS=DET\|PronType=Int`, `Degree=Pos\|Number=Sing\|POS=PART\|Polarity=Neg`, `Degree=Sup\|Number=Sing\|POS=SCONJ`, `Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=1\|Voice=Act`, `Number=Plur\|POS=ADJ`, `POS=VERB\|PronType=Int`, `Number=Sing\|POS=VERB\|Person=2\|Voice=Act`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=ADJ\|Person[psor]=2`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Number[psor]=Sing\|POS=ADV\|Person[psor]=3\|PronType=Tot`, `POS=DET\|PronType=Rel`, `Number=Sing\|POS=NOUN\|Polarity=Neg`, `Number=Sing\|Number[psor]=Sing\|POS=PROPN\|Person[psor]=2`, `NumType=Card\|Number=Sing\|POS=NUM\|Voice=Act`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Number[psor]=Sing\|POS=DET\|Person[psor]=3\|PronType=Tot`, `Number[psor]=Sing\|POS=PROPN\|Person[psor]=1`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Sing\|POS=VERB\|Person=1`, `Degree=Pos\|Number=Sing\|Number[psor]=Sing\|POS=PROPN\|Person[psor]=3`, `NumType=Card\|Number[psor]=Sing\|POS=DET\|Person[psor]=3\|PronType=Ind`, `POS=ADV\|PronType=Tot`, `Degree=Pos\|Number=Plur\|POS=ADV`, `Number=Plur\|POS=ADV\|Voice=Act`, `POS=CCONJ\|PronType=Int`, `Degree=Pos\|Number=Sing\|POS=PART`, `Number[psor]=Sing\|POS=PRON\|Person[psor]=2`, `Number=Plur\|POS=VERB`, `Number=Sing\|Number[psor]=Sing\|POS=ADJ\|Person[psor]=3\|Voice=Pass`, `Degree=Pos\|Number=Sing\|POS=PUNCT`, `Number[psor]=Sing\|POS=ADP\|Person[psor]=1`, `Degree=Sup\|Number=Sing\|POS=NOUN`, `Number[psor]=Sing\|POS=PART\|Person[psor]=3\|Polarity=Neg`, `Number=Sing\|Number[psor]=Sing\|POS=ADP\|Person[psor]=3\|Voice=Act`, `POS=NOUN\|Polarity=Neg`, `Number[psor]=Sing\|POS=PROPN\|Person[psor]=2`, `Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=2\|Voice=Act` |
| **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `appos`, `case`, `cc`, `ccomp`, `compound`, `compound:plur`, `conj`, `cop`, `dep`, `det`, `fixed`, `flat`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `parataxis`, `punct`, `xcomp` |
</details>
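
Each `morphologizer` label above packs a coarse part-of-speech tag and UFeats-style features into a single string of `Feature=Value` pairs joined by `|`. The sketch below shows one way to unpack such a label into a dictionary; `parse_morph_label` is a hypothetical helper written for this README, not part of spaCy.

```python
def parse_morph_label(label: str) -> dict:
    """Split a 'Feat=Val|Feat=Val' label into a {feature: value} dict."""
    # The table above escapes the pipe as '\|' so that it does not end the
    # Markdown cell; undo that escaping before splitting.
    label = label.replace("\\|", "|")
    features = {}
    for pair in label.split("|"):
        if not pair:
            continue  # tolerate an empty label or a stray separator
        name, _, value = pair.partition("=")
        features[name] = value
    return features


example = parse_morph_label(
    r"Number=Sing\|Number[psor]=Sing\|POS=NOUN\|Person[psor]=3"
)
print(example)
# {'Number': 'Sing', 'Number[psor]': 'Sing', 'POS': 'NOUN', 'Person[psor]': '3'}
```

All values come back as strings, so `Person[psor]` is `'3'` rather than the integer `3`.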
### Accuracy
| Type | Score |
| --- | --- |
| `ENTS_F` | 68.92 |
| `ENTS_P` | 67.44 |
| `ENTS_R` | 70.46 |
| `TAG_ACC` | 92.30 |
| `POS_ACC` | 92.35 |
| `MORPH_ACC` | 94.49 |
| `LEMMA_ACC` | 97.06 |
| `DEP_UAS` | 81.98 |
| `DEP_LAS` | 73.28 |
| `SENTS_P` | 85.24 |
| `SENTS_R` | 86.76 |
| `SENTS_F` | 85.99 |
| `TOK2VEC_LOSS` | 984087.67 |
| `NER_LOSS` | 123734.71 |
| `TAGGER_LOSS` | 65883.76 |
| `MORPHOLOGIZER_LOSS` | 200589.63 |
| `TRAINABLE_LEMMATIZER_LOSS` | 23717.64 |
| `PARSER_LOSS` | 1145740.96 |
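
The `*_F` rows can be cross-checked against their precision and recall rows. The sketch below assumes the F-scores are the balanced harmonic mean, F = 2PR / (P + R); the reported values are consistent with that assumption. The function name `f_score` and the `reported` dictionary are introduced here for illustration only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are 0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) taken from the Accuracy table above.
reported = {
    "ENTS": (67.44, 70.46, 68.92),
    "SENTS": (85.24, 86.76, 85.99),
}

for name, (p, r, f) in reported.items():
    computed = f_score(p, r)
    print(f"{name}: computed {computed:.2f}, reported {f:.2f}")
```

The inputs are already rounded to two decimals, so a recomputed score may differ from the reported one by a small rounding error.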