EC2 Default User committed on
Commit
2f5c7b2
1 Parent(s): b53d61c

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1758,37 +1758,6 @@ Creative Commons may be contacted at creativecommons.org.
1758
 
1759
 
1760
 
1761
- # spaCy lookups data
1762
-
1763
- * Author: Explosion
1764
- * URL: https://github.com/explosion/spacy-lookups-data
1765
- * License: MIT
1766
-
1767
- ```
1768
- Copyright 2019-2021 ExplosionAI GmbH
1769
-
1770
- Permission is hereby granted, free of charge, to any person obtaining a copy of
1771
- this software and associated documentation files (the "Software"), to deal in
1772
- the Software without restriction, including without limitation the rights to
1773
- use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
1774
- of the Software, and to permit persons to whom the Software is furnished to do
1775
- so, subject to the following conditions:
1776
-
1777
- The above copyright notice and this permission notice shall be included in all
1778
- copies or substantial portions of the Software.
1779
-
1780
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
1781
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
1782
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
1783
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
1784
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
1785
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
1786
- SOFTWARE.
1787
- ```
1788
-
1789
-
1790
-
1791
-
1792
  # Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)
1793
 
1794
  * Author: Explosion
1758
 
1759
 
1760
 
1761
  # Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)
1762
 
1763
  * Author: Explosion
README.md CHANGED
@@ -14,61 +14,76 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.7772241993
18
  - name: NER Recall
19
  type: recall
20
- value: 0.755186722
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.7660470011
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
- - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9543293348
31
  - task:
32
- name: SENTER
33
  type: token-classification
34
  metrics:
35
- - name: SENTER Precision
36
- type: precision
37
- value: 0.866340098
38
- - name: SENTER Recall
39
- type: recall
40
- value: 0.8880918221
41
- - name: SENTER F Score
42
- type: f_score
43
- value: 0.8770811194
44
  - task:
45
- name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
- - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.8701736595
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
- - name: Labeled Dependencies Accuracy
56
- type: accuracy
57
- value: 0.8701736595
58
  ---
59
  ### Details: https://spacy.io/models/nl#nl_core_news_lg
60
 
61
- Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, senter, ner, attribute_ruler, lemmatizer.
62
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `nl_core_news_lg` |
66
- | **Version** | `3.2.0` |
67
- | **spaCy** | `>=3.2.0,<3.3.0` |
68
- | **Default Pipeline** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
- | **Components** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
71
- | **Sources** | [UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[Dutch NER Annotations for UD LassySmall](https://nlp.town) (NLP Town)<br />[UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[UD Dutch Alpino v2.8](https://github.com/UniversalDependencies/UD_Dutch-Alpino) (Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -76,14 +91,13 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
76
 
77
  <details>
78
 
79
- <summary>View label scheme (323 labels for 5 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
  | **`morphologizer`** | `POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=ADV`, `POS=VERB\|VerbForm=Part`, `POS=PUNCT`, `Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `POS=ADP`, `POS=NUM`, `Number=Plur\|POS=NOUN`, `POS=VERB\|VerbForm=Inf`, `POS=SCONJ`, `Definite=Def\|POS=DET`, `Gender=Com\|Number=Sing\|POS=NOUN`, `Number=Sing\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Degree=Pos\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=PROPN`, `Gender=Com\|Number=Sing\|POS=PROPN`, `POS=AUX\|VerbForm=Inf`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `POS=DET`, `Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|Person=3\|PronType=Prs`, `POS=CCONJ`, `Number=Plur\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|Person=3\|PronType=Ind`, `Degree=Cmp\|POS=ADJ`, `Case=Nom\|POS=PRON\|Person=1\|PronType=Prs`, `Definite=Ind\|POS=DET`, `Case=Nom\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Number=Plur\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Case=Acc\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `Gender=Com,Neut\|Number=Sing\|POS=NOUN`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PROPN`, `POS=PRON\|PronType=Ind`, `POS=PRON\|Person=3\|PronType=Int`, `Case=Acc\|POS=PRON\|PronType=Rcp`, `Number=Plur\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `Number=Sing\|POS=NOUN`, `POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `POS=SYM`, `Abbr=Yes\|POS=X`, `Gender=Com,Neut\|Number=Sing\|POS=PROPN`, `Degree=Sup\|POS=ADJ`, `POS=ADJ`, `Number=Sing\|POS=PROPN`, `POS=PRON\|PronType=Dem`, `POS=AUX\|VerbForm=Part`, `POS=PRON\|Person=3\|PronType=Rel`, `Number=Plur\|POS=PROPN`, `POS=PRON\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Dat\|POS=PRON\|PronType=Dem`, `Case=Nom\|POS=PRON\|Person=2\|PronType=Prs`, `POS=INTJ`, `Case=Acc\|POS=PRON\|Person=2\|PronType=Prs`, `Case=Gen\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, 
`POS=PRON\|PronType=Int`, `POS=PRON\|Person=2\|PronType=Prs`, `POS=PRON\|Person=3`, `Case=Gen\|POS=PRON\|Person=2\|PronType=Prs`, `POS=X` |
84
  | **`tagger`** | `ADJ\|nom\|basis\|met-e\|mv-n`, `ADJ\|nom\|basis\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|basis\|met-e\|zonder-n\|stan`, `ADJ\|nom\|basis\|zonder\|mv-n`, `ADJ\|nom\|basis\|zonder\|zonder-n`, `ADJ\|nom\|comp\|met-e\|mv-n`, `ADJ\|nom\|comp\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|met-e\|mv-n`, `ADJ\|nom\|sup\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|sup\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|zonder\|zonder-n`, `ADJ\|postnom\|basis\|met-s`, `ADJ\|postnom\|basis\|zonder`, `ADJ\|postnom\|comp\|met-s`, `ADJ\|prenom\|basis\|met-e\|bijz`, `ADJ\|prenom\|basis\|met-e\|stan`, `ADJ\|prenom\|basis\|zonder`, `ADJ\|prenom\|comp\|met-e\|stan`, `ADJ\|prenom\|comp\|zonder`, `ADJ\|prenom\|sup\|met-e\|stan`, `ADJ\|prenom\|sup\|zonder`, `ADJ\|vrij\|basis\|zonder`, `ADJ\|vrij\|comp\|zonder`, `ADJ\|vrij\|dim\|zonder`, `ADJ\|vrij\|sup\|zonder`, `BW`, `LET`, `LID\|bep\|dat\|evmo`, `LID\|bep\|gen\|evmo`, `LID\|bep\|gen\|rest3`, `LID\|bep\|stan\|evon`, `LID\|bep\|stan\|rest`, `LID\|onbep\|stan\|agr`, `N\|eigen\|ev\|basis\|gen`, `N\|eigen\|ev\|basis\|genus\|stan`, `N\|eigen\|ev\|basis\|onz\|stan`, `N\|eigen\|ev\|basis\|zijd\|stan`, `N\|eigen\|ev\|dim\|onz\|stan`, `N\|eigen\|mv\|basis`, `N\|soort\|ev\|basis\|dat`, `N\|soort\|ev\|basis\|gen`, `N\|soort\|ev\|basis\|genus\|stan`, `N\|soort\|ev\|basis\|onz\|stan`, `N\|soort\|ev\|basis\|zijd\|stan`, `N\|soort\|ev\|dim\|onz\|stan`, `N\|soort\|mv\|basis`, `N\|soort\|mv\|dim`, `SPEC\|afgebr`, `SPEC\|afk`, `SPEC\|deeleigen`, `SPEC\|enof`, `SPEC\|meta`, `SPEC\|symb`, `SPEC\|vreemd`, `TSW`, `TW\|hoofd\|nom\|mv-n\|basis`, `TW\|hoofd\|nom\|mv-n\|dim`, `TW\|hoofd\|nom\|zonder-n\|basis`, `TW\|hoofd\|nom\|zonder-n\|dim`, `TW\|hoofd\|prenom\|stan`, `TW\|hoofd\|vrij`, `TW\|rang\|nom\|mv-n`, `TW\|rang\|nom\|zonder-n`, `TW\|rang\|prenom\|stan`, `VG\|neven`, `VG\|onder`, `VNW\|aanw\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|aanw\|adv-pron\|stan\|red\|3\|getal`, `VNW\|aanw\|det\|dat\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|dat\|prenom\|met-e\|evmo`, 
`VNW\|aanw\|det\|gen\|prenom\|met-e\|rest3`, `VNW\|aanw\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|aanw\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|stan\|prenom\|met-e\|rest`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|agr`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|evon`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|rest`, `VNW\|aanw\|det\|stan\|vrij\|zonder`, `VNW\|aanw\|pron\|gen\|vol\|3m\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3o\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3\|getal`, `VNW\|betr\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|betr\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|betr\|pron\|stan\|vol\|3\|ev`, `VNW\|betr\|pron\|stan\|vol\|persoon\|getal`, `VNW\|bez\|det\|gen\|vol\|3\|ev\|prenom\|met-e\|rest3`, `VNW\|bez\|det\|stan\|nadr\|2v\|mv\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|zonder\|evon`, `VNW\|bez\|det\|stan\|vol\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|2\|getal\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3p\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3\|mv\|prenom\|zonder\|agr`, `VNW\|excl\|pron\|stan\|vol\|3\|getal`, `VNW\|onbep\|adv-pron\|gen\|red\|3\|getal`, `VNW\|onbep\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|onbep\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|onbep\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|onbep\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|agr`, 
`VNW\|onbep\|det\|stan\|prenom\|met-e\|evz`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|mv`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|rest`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|agr`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|evon`, `VNW\|onbep\|det\|stan\|vrij\|zonder`, `VNW\|onbep\|grad\|gen\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|sup`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|comp`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|mv\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|basis`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|sup`, `VNW\|onbep\|pron\|gen\|vol\|3p\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3o\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3p\|ev`, `VNW\|pers\|pron\|gen\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|nomin\|nadr\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|red\|1\|mv`, `VNW\|pers\|pron\|nomin\|red\|2v\|ev`, `VNW\|pers\|pron\|nomin\|red\|2\|getal`, `VNW\|pers\|pron\|nomin\|red\|3p\|ev\|masc`, `VNW\|pers\|pron\|nomin\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|nomin\|vol\|1\|ev`, `VNW\|pers\|pron\|nomin\|vol\|1\|mv`, `VNW\|pers\|pron\|nomin\|vol\|2b\|getal`, `VNW\|pers\|pron\|nomin\|vol\|2v\|ev`, `VNW\|pers\|pron\|nomin\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|vol\|3p\|mv`, `VNW\|pers\|pron\|nomin\|vol\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|obl\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|2v\|ev`, `VNW\|pers\|pron\|obl\|vol\|3p\|mv`, `VNW\|pers\|pron\|obl\|vol\|3\|ev\|masc`, 
`VNW\|pers\|pron\|obl\|vol\|3\|getal\|fem`, `VNW\|pers\|pron\|stan\|nadr\|2v\|mv`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|fem`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|onz`, `VNW\|pers\|pron\|stan\|red\|3\|mv`, `VNW\|pr\|pron\|obl\|nadr\|1\|ev`, `VNW\|pr\|pron\|obl\|nadr\|2v\|getal`, `VNW\|pr\|pron\|obl\|nadr\|2\|getal`, `VNW\|pr\|pron\|obl\|red\|1\|ev`, `VNW\|pr\|pron\|obl\|red\|2v\|getal`, `VNW\|pr\|pron\|obl\|vol\|1\|ev`, `VNW\|pr\|pron\|obl\|vol\|1\|mv`, `VNW\|pr\|pron\|obl\|vol\|2\|getal`, `VNW\|recip\|pron\|gen\|vol\|persoon\|mv`, `VNW\|recip\|pron\|obl\|vol\|persoon\|mv`, `VNW\|refl\|pron\|obl\|nadr\|3\|getal`, `VNW\|refl\|pron\|obl\|red\|3\|getal`, `VNW\|vb\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|vb\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|vb\|det\|stan\|prenom\|met-e\|rest`, `VNW\|vb\|det\|stan\|prenom\|zonder\|evon`, `VNW\|vb\|pron\|gen\|vol\|3m\|ev`, `VNW\|vb\|pron\|gen\|vol\|3p\|mv`, `VNW\|vb\|pron\|gen\|vol\|3v\|ev`, `VNW\|vb\|pron\|stan\|vol\|3o\|ev`, `VNW\|vb\|pron\|stan\|vol\|3p\|getal`, `VZ\|fin`, `VZ\|init`, `VZ\|versm`, `WW\|inf\|nom\|zonder\|zonder-n`, `WW\|inf\|prenom\|met-e`, `WW\|inf\|vrij\|zonder`, `WW\|od\|nom\|met-e\|mv-n`, `WW\|od\|nom\|met-e\|zonder-n`, `WW\|od\|prenom\|met-e`, `WW\|od\|prenom\|zonder`, `WW\|od\|vrij\|zonder`, `WW\|pv\|conj\|ev`, `WW\|pv\|tgw\|ev`, `WW\|pv\|tgw\|met-t`, `WW\|pv\|tgw\|mv`, `WW\|pv\|verl\|ev`, `WW\|pv\|verl\|mv`, `WW\|vd\|nom\|met-e\|mv-n`, `WW\|vd\|nom\|met-e\|zonder-n`, `WW\|vd\|prenom\|met-e`, `WW\|vd\|prenom\|zonder`, `WW\|vd\|vrij\|zonder` |
85
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `csubj`, `dep`, `det`, `expl`, `expl:pv`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `orphan`, `parataxis`, `punct`, `xcomp` |
86
- | **`senter`** | `I`, `S` |
87
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
88
 
89
  </details>
@@ -92,22 +106,22 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
92
 
93
  | Type | Score |
94
  | --- | --- |
95
  | `TOKEN_ACC` | 99.97 |
96
  | `TOKEN_P` | 99.74 |
97
  | `TOKEN_R` | 99.76 |
98
  | `TOKEN_F` | 99.75 |
99
- | `POS_ACC` | 96.67 |
100
- | `MORPH_ACC` | 96.49 |
101
- | `MORPH_MICRO_P` | 97.44 |
102
- | `MORPH_MICRO_R` | 95.59 |
103
- | `MORPH_MICRO_F` | 96.50 |
104
- | `TAG_ACC` | 95.43 |
105
- | `SENTS_P` | 86.63 |
106
- | `SENTS_R` | 88.81 |
107
- | `SENTS_F` | 87.71 |
108
- | `DEP_UAS` | 87.02 |
109
- | `DEP_LAS` | 82.61 |
110
- | `LEMMA_ACC` | 81.59 |
111
- | `ENTS_P` | 77.72 |
112
- | `ENTS_R` | 75.52 |
113
- | `ENTS_F` | 76.60 |
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.7652916074
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.7441217151
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.7545582048
24
+ - task:
25
+ name: TAG
26
+ type: token-classification
27
+ metrics:
28
+ - name: TAG (XPOS) Accuracy
29
+ type: accuracy
30
+ value: 0.9534133043
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
+ - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9661941112
38
  - task:
39
+ name: MORPH
40
  type: token-classification
41
  metrics:
42
+ - name: Morph (UFeats) Accuracy
43
+ type: accuracy
44
+ value: 0.9635947213
45
  - task:
46
+ name: LEMMA
47
  type: token-classification
48
  metrics:
49
+ - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9417537126
52
+ - task:
53
+ name: UNLABELED_DEPENDENCIES
54
+ type: token-classification
55
+ metrics:
56
+ - name: Unlabeled Attachment Score (UAS)
57
+ type: f_score
58
+ value: 0.8698053923
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
+ - name: Labeled Attachment Score (LAS)
64
+ type: f_score
65
+ value: 0.8235860531
66
+ - task:
67
+ name: SENTS
68
+ type: token-classification
69
+ metrics:
70
+ - name: Sentences F-Score
71
+ type: f_score
72
+ value: 0.8749559704
73
  ---
74
  ### Details: https://spacy.io/models/nl#nl_core_news_lg
75
 
76
+ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.
77
 
78
  | Feature | Description |
79
  | --- | --- |
80
  | **Name** | `nl_core_news_lg` |
81
+ | **Version** | `3.3.0` |
82
+ | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
83
+ | **Default Pipeline** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
84
+ | **Components** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `lemmatizer`, `senter`, `attribute_ruler`, `ner` |
85
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
86
+ | **Sources** | [UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[Dutch NER Annotations for UD LassySmall](https://nlp.town) (NLP Town)<br />[UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[UD Dutch Alpino v2.8](https://github.com/UniversalDependencies/UD_Dutch-Alpino) (Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
87
  | **License** | `CC BY-SA 4.0` |
88
  | **Author** | [Explosion](https://explosion.ai) |
89
 
91
 
92
  <details>
93
 
94
+ <summary>View label scheme (321 labels for 4 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
  | **`morphologizer`** | `POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=ADV`, `POS=VERB\|VerbForm=Part`, `POS=PUNCT`, `Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `POS=ADP`, `POS=NUM`, `Number=Plur\|POS=NOUN`, `POS=VERB\|VerbForm=Inf`, `POS=SCONJ`, `Definite=Def\|POS=DET`, `Gender=Com\|Number=Sing\|POS=NOUN`, `Number=Sing\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Degree=Pos\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=PROPN`, `Gender=Com\|Number=Sing\|POS=PROPN`, `POS=AUX\|VerbForm=Inf`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `POS=DET`, `Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|Person=3\|PronType=Prs`, `POS=CCONJ`, `Number=Plur\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|Person=3\|PronType=Ind`, `Degree=Cmp\|POS=ADJ`, `Case=Nom\|POS=PRON\|Person=1\|PronType=Prs`, `Definite=Ind\|POS=DET`, `Case=Nom\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Number=Plur\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Case=Acc\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `Gender=Com,Neut\|Number=Sing\|POS=NOUN`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PROPN`, `POS=PRON\|PronType=Ind`, `POS=PRON\|Person=3\|PronType=Int`, `Case=Acc\|POS=PRON\|PronType=Rcp`, `Number=Plur\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `Number=Sing\|POS=NOUN`, `POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `POS=SYM`, `Abbr=Yes\|POS=X`, `Gender=Com,Neut\|Number=Sing\|POS=PROPN`, `Degree=Sup\|POS=ADJ`, `POS=ADJ`, `Number=Sing\|POS=PROPN`, `POS=PRON\|PronType=Dem`, `POS=AUX\|VerbForm=Part`, `POS=PRON\|Person=3\|PronType=Rel`, `Number=Plur\|POS=PROPN`, `POS=PRON\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Dat\|POS=PRON\|PronType=Dem`, `Case=Nom\|POS=PRON\|Person=2\|PronType=Prs`, `POS=INTJ`, `Case=Acc\|POS=PRON\|Person=2\|PronType=Prs`, `Case=Gen\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, 
`POS=PRON\|PronType=Int`, `POS=PRON\|Person=2\|PronType=Prs`, `POS=PRON\|Person=3`, `Case=Gen\|POS=PRON\|Person=2\|PronType=Prs`, `POS=X` |
99
  | **`tagger`** | `ADJ\|nom\|basis\|met-e\|mv-n`, `ADJ\|nom\|basis\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|basis\|met-e\|zonder-n\|stan`, `ADJ\|nom\|basis\|zonder\|mv-n`, `ADJ\|nom\|basis\|zonder\|zonder-n`, `ADJ\|nom\|comp\|met-e\|mv-n`, `ADJ\|nom\|comp\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|met-e\|mv-n`, `ADJ\|nom\|sup\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|sup\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|zonder\|zonder-n`, `ADJ\|postnom\|basis\|met-s`, `ADJ\|postnom\|basis\|zonder`, `ADJ\|postnom\|comp\|met-s`, `ADJ\|prenom\|basis\|met-e\|bijz`, `ADJ\|prenom\|basis\|met-e\|stan`, `ADJ\|prenom\|basis\|zonder`, `ADJ\|prenom\|comp\|met-e\|stan`, `ADJ\|prenom\|comp\|zonder`, `ADJ\|prenom\|sup\|met-e\|stan`, `ADJ\|prenom\|sup\|zonder`, `ADJ\|vrij\|basis\|zonder`, `ADJ\|vrij\|comp\|zonder`, `ADJ\|vrij\|dim\|zonder`, `ADJ\|vrij\|sup\|zonder`, `BW`, `LET`, `LID\|bep\|dat\|evmo`, `LID\|bep\|gen\|evmo`, `LID\|bep\|gen\|rest3`, `LID\|bep\|stan\|evon`, `LID\|bep\|stan\|rest`, `LID\|onbep\|stan\|agr`, `N\|eigen\|ev\|basis\|gen`, `N\|eigen\|ev\|basis\|genus\|stan`, `N\|eigen\|ev\|basis\|onz\|stan`, `N\|eigen\|ev\|basis\|zijd\|stan`, `N\|eigen\|ev\|dim\|onz\|stan`, `N\|eigen\|mv\|basis`, `N\|soort\|ev\|basis\|dat`, `N\|soort\|ev\|basis\|gen`, `N\|soort\|ev\|basis\|genus\|stan`, `N\|soort\|ev\|basis\|onz\|stan`, `N\|soort\|ev\|basis\|zijd\|stan`, `N\|soort\|ev\|dim\|onz\|stan`, `N\|soort\|mv\|basis`, `N\|soort\|mv\|dim`, `SPEC\|afgebr`, `SPEC\|afk`, `SPEC\|deeleigen`, `SPEC\|enof`, `SPEC\|meta`, `SPEC\|symb`, `SPEC\|vreemd`, `TSW`, `TW\|hoofd\|nom\|mv-n\|basis`, `TW\|hoofd\|nom\|mv-n\|dim`, `TW\|hoofd\|nom\|zonder-n\|basis`, `TW\|hoofd\|nom\|zonder-n\|dim`, `TW\|hoofd\|prenom\|stan`, `TW\|hoofd\|vrij`, `TW\|rang\|nom\|mv-n`, `TW\|rang\|nom\|zonder-n`, `TW\|rang\|prenom\|stan`, `VG\|neven`, `VG\|onder`, `VNW\|aanw\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|aanw\|adv-pron\|stan\|red\|3\|getal`, `VNW\|aanw\|det\|dat\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|dat\|prenom\|met-e\|evmo`, 
`VNW\|aanw\|det\|gen\|prenom\|met-e\|rest3`, `VNW\|aanw\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|aanw\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|stan\|prenom\|met-e\|rest`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|agr`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|evon`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|rest`, `VNW\|aanw\|det\|stan\|vrij\|zonder`, `VNW\|aanw\|pron\|gen\|vol\|3m\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3o\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3\|getal`, `VNW\|betr\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|betr\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|betr\|pron\|stan\|vol\|3\|ev`, `VNW\|betr\|pron\|stan\|vol\|persoon\|getal`, `VNW\|bez\|det\|gen\|vol\|3\|ev\|prenom\|met-e\|rest3`, `VNW\|bez\|det\|stan\|nadr\|2v\|mv\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|zonder\|evon`, `VNW\|bez\|det\|stan\|vol\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|2\|getal\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3p\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3\|mv\|prenom\|zonder\|agr`, `VNW\|excl\|pron\|stan\|vol\|3\|getal`, `VNW\|onbep\|adv-pron\|gen\|red\|3\|getal`, `VNW\|onbep\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|onbep\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|onbep\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|onbep\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|agr`, 
`VNW\|onbep\|det\|stan\|prenom\|met-e\|evz`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|mv`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|rest`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|agr`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|evon`, `VNW\|onbep\|det\|stan\|vrij\|zonder`, `VNW\|onbep\|grad\|gen\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|sup`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|comp`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|mv\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|basis`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|sup`, `VNW\|onbep\|pron\|gen\|vol\|3p\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3o\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3p\|ev`, `VNW\|pers\|pron\|gen\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|nomin\|nadr\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|red\|1\|mv`, `VNW\|pers\|pron\|nomin\|red\|2v\|ev`, `VNW\|pers\|pron\|nomin\|red\|2\|getal`, `VNW\|pers\|pron\|nomin\|red\|3p\|ev\|masc`, `VNW\|pers\|pron\|nomin\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|nomin\|vol\|1\|ev`, `VNW\|pers\|pron\|nomin\|vol\|1\|mv`, `VNW\|pers\|pron\|nomin\|vol\|2b\|getal`, `VNW\|pers\|pron\|nomin\|vol\|2v\|ev`, `VNW\|pers\|pron\|nomin\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|vol\|3p\|mv`, `VNW\|pers\|pron\|nomin\|vol\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|obl\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|2v\|ev`, `VNW\|pers\|pron\|obl\|vol\|3p\|mv`, `VNW\|pers\|pron\|obl\|vol\|3\|ev\|masc`, 
`VNW\|pers\|pron\|obl\|vol\|3\|getal\|fem`, `VNW\|pers\|pron\|stan\|nadr\|2v\|mv`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|fem`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|onz`, `VNW\|pers\|pron\|stan\|red\|3\|mv`, `VNW\|pr\|pron\|obl\|nadr\|1\|ev`, `VNW\|pr\|pron\|obl\|nadr\|2v\|getal`, `VNW\|pr\|pron\|obl\|nadr\|2\|getal`, `VNW\|pr\|pron\|obl\|red\|1\|ev`, `VNW\|pr\|pron\|obl\|red\|2v\|getal`, `VNW\|pr\|pron\|obl\|vol\|1\|ev`, `VNW\|pr\|pron\|obl\|vol\|1\|mv`, `VNW\|pr\|pron\|obl\|vol\|2\|getal`, `VNW\|recip\|pron\|gen\|vol\|persoon\|mv`, `VNW\|recip\|pron\|obl\|vol\|persoon\|mv`, `VNW\|refl\|pron\|obl\|nadr\|3\|getal`, `VNW\|refl\|pron\|obl\|red\|3\|getal`, `VNW\|vb\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|vb\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|vb\|det\|stan\|prenom\|met-e\|rest`, `VNW\|vb\|det\|stan\|prenom\|zonder\|evon`, `VNW\|vb\|pron\|gen\|vol\|3m\|ev`, `VNW\|vb\|pron\|gen\|vol\|3p\|mv`, `VNW\|vb\|pron\|gen\|vol\|3v\|ev`, `VNW\|vb\|pron\|stan\|vol\|3o\|ev`, `VNW\|vb\|pron\|stan\|vol\|3p\|getal`, `VZ\|fin`, `VZ\|init`, `VZ\|versm`, `WW\|inf\|nom\|zonder\|zonder-n`, `WW\|inf\|prenom\|met-e`, `WW\|inf\|vrij\|zonder`, `WW\|od\|nom\|met-e\|mv-n`, `WW\|od\|nom\|met-e\|zonder-n`, `WW\|od\|prenom\|met-e`, `WW\|od\|prenom\|zonder`, `WW\|od\|vrij\|zonder`, `WW\|pv\|conj\|ev`, `WW\|pv\|tgw\|ev`, `WW\|pv\|tgw\|met-t`, `WW\|pv\|tgw\|mv`, `WW\|pv\|verl\|ev`, `WW\|pv\|verl\|mv`, `WW\|vd\|nom\|met-e\|mv-n`, `WW\|vd\|nom\|met-e\|zonder-n`, `WW\|vd\|prenom\|met-e`, `WW\|vd\|prenom\|zonder`, `WW\|vd\|vrij\|zonder` |
100
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `csubj`, `dep`, `det`, `expl`, `expl:pv`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `orphan`, `parataxis`, `punct`, `xcomp` |
 
101
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
102
 
103
  </details>
106
 
107
  | Type | Score |
108
  | --- | --- |
109
+ | `TAG_ACC` | 95.34 |
110
+ | `SENTS_P` | 85.95 |
111
+ | `SENTS_R` | 89.10 |
112
+ | `SENTS_F` | 87.50 |
113
+ | `DEP_UAS` | 86.98 |
114
+ | `DEP_LAS` | 82.36 |
115
+ | `ENTS_P` | 76.53 |
116
+ | `ENTS_R` | 74.41 |
117
+ | `ENTS_F` | 75.46 |
118
  | `TOKEN_ACC` | 99.97 |
119
  | `TOKEN_P` | 99.74 |
120
  | `TOKEN_R` | 99.76 |
121
  | `TOKEN_F` | 99.75 |
122
+ | `POS_ACC` | 96.62 |
123
+ | `MORPH_ACC` | 96.36 |
124
+ | `MORPH_MICRO_P` | 97.22 |
125
+ | `MORPH_MICRO_R` | 95.52 |
126
+ | `MORPH_MICRO_F` | 96.36 |
127
+ | `LEMMA_ACC` | 94.18 |
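
The F-score rows in the updated tables can be cross-checked against their precision and recall rows. The sketch below is a minimal check, assuming the reported F score is the usual harmonic mean of precision and recall. The helper name `f_score` is hypothetical and is not taken from this repository. The three input values are copied from the new NER metrics in the README.md metadata of this commit.

```python
# Values copied from the updated README.md metadata in this commit.
ner_precision = 0.7652916074
ner_recall = 0.7441217151
reported_f = 0.7545582048


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


computed_f = f_score(ner_precision, ner_recall)

# The table shows scores as percentages rounded to two decimals.
print(round(computed_f * 100, 2))  # 75.46
```

Under that assumption the computed value agrees with the reported `ENTS_F` of 75.46, so the precision, recall, and F rows of the new table are mutually consistent.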
accuracy.json CHANGED
@@ -1,339 +1,262 @@
1
  {
2
- "token_acc": 0.9997165842,
3
- "token_p": 0.9974281853,
4
- "token_r": 0.9975586363,
5
- "token_f": 0.9974934066,
6
- "pos_acc": 0.9667175573,
7
- "morph_acc": 0.9649471044,
8
- "morph_micro_p": 0.9743803954,
9
- "morph_micro_r": 0.9558803442,
10
- "morph_micro_f": 0.9650417155,
11
- "morph_per_feat": {
12
- "Person": {
13
- "p": 0.9921875,
14
- "r": 0.9713193117,
15
- "f": 0.9816425121
16
- },
17
- "Poss": {
18
- "p": 0.9885496183,
19
- "r": 0.9923371648,
20
- "f": 0.9904397706
21
- },
22
- "PronType": {
23
- "p": 0.9914529915,
24
- "r": 0.9642560266,
25
- "f": 0.9776654024
26
- },
27
- "Gender": {
28
- "p": 0.9408033827,
29
- "r": 0.9056219791,
30
- "f": 0.9228775113
31
- },
32
- "Number": {
33
- "p": 0.9837895703,
34
- "r": 0.9628798084,
35
- "f": 0.9732223903
36
- },
37
- "Tense": {
38
- "p": 0.9777901166,
39
- "r": 0.9681143485,
40
- "f": 0.9729281768
41
- },
42
- "VerbForm": {
43
- "p": 0.9640653358,
44
- "r": 0.9557394746,
45
- "f": 0.9598843513
46
- },
47
- "Degree": {
48
- "p": 0.9628550619,
49
- "r": 0.9497126437,
50
- "f": 0.956238698
51
- },
52
- "Definite": {
53
- "p": 0.9955869373,
54
- "r": 0.9929577465,
55
- "f": 0.9942706038
56
- },
57
- "Case": {
58
- "p": 0.998003992,
59
- "r": 0.9960159363,
60
- "f": 0.9970089731
61
- },
62
- "Reflex": {
63
- "p": 1.0,
64
- "r": 1.0,
65
- "f": 1.0
66
- },
67
- "Abbr": {
68
- "p": 1.0,
69
- "r": 0.6666666667,
70
- "f": 0.8
71
- }
72
- },
73
- "tag_acc": 0.9543293348,
74
- "sents_p": 0.866340098,
75
- "sents_r": 0.8880918221,
76
- "sents_f": 0.8770811194,
77
- "dep_uas": 0.8701736595,
78
- "dep_las": 0.8260704236,
79
  "dep_las_per_type": {
80
- "det": {
81
- "p": 0.8869644485,
82
- "r": 0.959566075,
83
- "f": 0.9218379915
84
  },
85
  "nsubj": {
86
- "p": 0.7900466563,
87
- "r": 0.8246753247,
88
- "f": 0.8069896743
89
  },
90
- "root": {
91
- "p": 0.7489932886,
92
- "r": 0.8328358209,
93
- "f": 0.7886925795
94
  },
95
- "case": {
96
- "p": 0.8836805556,
97
- "r": 0.9382488479,
98
- "f": 0.910147519
99
  },
100
- "obl": {
101
- "p": 0.7313195548,
102
- "r": 0.7290015848,
103
- "f": 0.7301587302
104
  },
105
- "nmod": {
106
- "p": 0.596651446,
107
- "r": 0.6841186736,
108
- "f": 0.637398374
109
  },
110
- "advmod": {
111
- "p": 0.75,
112
- "r": 0.7696969697,
113
- "f": 0.7597208375
114
  },
115
- "obj": {
116
- "p": 0.7902621723,
117
- "r": 0.778597786,
118
- "f": 0.7843866171
119
  },
120
  "mark": {
121
- "p": 0.8445378151,
122
- "r": 0.8305785124,
123
- "f": 0.8375
124
- },
125
- "advcl": {
126
- "p": 0.5137614679,
127
- "r": 0.4628099174,
128
- "f": 0.4869565217
129
  },
130
- "amod": {
131
- "p": 0.7834710744,
132
- "r": 0.8649635036,
133
- "f": 0.8222029488
134
  },
135
- "acl:relcl": {
136
- "p": 0.6352941176,
137
- "r": 0.6585365854,
138
- "f": 0.6467065868
139
  },
140
- "cop": {
141
- "p": 0.7862068966,
142
- "r": 0.6263736264,
143
- "f": 0.6972477064
144
  },
145
- "cc": {
146
- "p": 0.8,
147
- "r": 0.8384879725,
148
- "f": 0.8187919463
149
  },
150
- "conj": {
151
- "p": 0.5825242718,
152
- "r": 0.5309734513,
153
- "f": 0.5555555556
154
  },
155
- "fixed": {
156
- "p": 0.6690647482,
157
- "r": 0.2520325203,
158
- "f": 0.3661417323
159
  },
160
  "flat": {
161
- "p": 0.7995991984,
162
- "r": 0.6797274276,
163
- "f": 0.7348066298
164
  },
165
- "csubj": {
166
- "p": 0.5,
167
- "r": 0.1666666667,
168
- "f": 0.25
169
  },
170
- "aux": {
171
- "p": 0.7714285714,
172
- "r": 0.786407767,
173
- "f": 0.7788461538
174
  },
175
- "compound:prt": {
176
- "p": 0.776119403,
177
- "r": 0.6753246753,
178
- "f": 0.7222222222
179
  },
180
  "nummod": {
181
- "p": 0.59375,
182
- "r": 0.6506849315,
183
- "f": 0.6209150327
184
  },
185
- "acl": {
186
- "p": 0.5098039216,
187
- "r": 0.4406779661,
188
- "f": 0.4727272727
189
  },
190
- "expl": {
191
- "p": 0.4,
192
- "r": 0.3333333333,
193
- "f": 0.3636363636
194
  },
195
- "appos": {
196
- "p": 0.5625,
197
- "r": 0.4682080925,
198
- "f": 0.5110410095
199
  },
200
  "nsubj:pass": {
201
- "p": 0.8023255814,
202
- "r": 0.8023255814,
203
- "f": 0.8023255814
204
  },
205
  "aux:pass": {
206
- "p": 0.8823529412,
207
- "r": 0.9183673469,
208
- "f": 0.9
209
  },
210
- "ccomp": {
211
- "p": 0.6666666667,
212
- "r": 0.5294117647,
213
- "f": 0.5901639344
214
  },
215
- "xcomp": {
216
- "p": 0.4285714286,
217
- "r": 0.698630137,
218
- "f": 0.53125
219
  },
220
  "parataxis": {
221
- "p": 0.3644067797,
222
- "r": 0.288590604,
223
- "f": 0.3220973783
224
  },
225
- "expl:pv": {
226
- "p": 0.7894736842,
227
- "r": 0.7894736842,
228
- "f": 0.7894736842
229
  },
230
- "iobj": {
231
- "p": 0.4444444444,
232
- "r": 0.4,
233
- "f": 0.4210526316
234
  },
235
- "nmod:poss": {
236
- "p": 0.8616352201,
237
- "r": 0.8954248366,
238
- "f": 0.8782051282
239
  },
240
  "dep": {
241
  "p": 0.0,
242
  "r": 0.0,
243
  "f": 0.0
244
  },
245
- "obl:agent": {
246
- "p": 0.8461538462,
247
- "r": 0.7857142857,
248
- "f": 0.8148148148
249
- },
250
  "orphan": {
251
  "p": 0.0,
252
  "r": 0.0,
253
  "f": 0.0
254
  }
255
  },
256
- "lemma_acc": 0.8159277755,
257
- "ents_p": 0.7772241993,
258
- "ents_r": 0.755186722,
259
- "ents_f": 0.7660470011,
260
  "ents_per_type": {
261
- "DATE": {
262
- "p": 0.931372549,
263
- "r": 0.9253246753,
264
- "f": 0.9283387622
265
- },
266
- "NORP": {
267
- "p": 0.8181818182,
268
- "r": 0.8674698795,
269
- "f": 0.8421052632
270
- },
271
  "ORG": {
272
- "p": 0.6959459459,
273
- "r": 0.6094674556,
274
- "f": 0.6498422713
275
- },
276
- "CARDINAL": {
277
- "p": 0.8607594937,
278
- "r": 0.9714285714,
279
- "f": 0.9127516779
280
  },
281
- "GPE": {
282
- "p": 0.785046729,
283
- "r": 0.9230769231,
284
- "f": 0.8484848485
285
  },
286
  "QUANTITY": {
287
- "p": 0.8571428571,
288
- "r": 1.0,
289
- "f": 0.9230769231
290
  },
291
- "PERCENT": {
292
- "p": 1.0,
293
- "r": 0.8333333333,
294
- "f": 0.9090909091
295
  },
296
- "PERSON": {
297
- "p": 0.7788778878,
298
- "r": 0.7637540453,
299
- "f": 0.7712418301
300
  },
301
- "LAW": {
302
- "p": 1.0,
303
- "r": 0.3333333333,
304
- "f": 0.5
305
  },
306
  "EVENT": {
307
- "p": 0.4761904762,
308
- "r": 0.4347826087,
309
- "f": 0.4545454545
310
  },
311
- "WORK_OF_ART": {
312
- "p": 0.5882352941,
313
- "r": 0.4444444444,
314
- "f": 0.5063291139
315
  },
316
  "ORDINAL": {
317
- "p": 0.96875,
318
- "r": 0.9393939394,
319
- "f": 0.9538461538
320
  },
321
  "LANGUAGE": {
322
- "p": 0.75,
323
- "r": 0.8181818182,
324
- "f": 0.7826086957
325
  },
326
  "LOC": {
327
- "p": 0.5333333333,
328
- "r": 0.2352941176,
329
- "f": 0.3265306122
330
  },
331
- "FAC": {
332
- "p": 0.1,
333
- "r": 0.2142857143,
334
- "f": 0.1363636364
335
  },
336
- "PRODUCT": {
337
  "p": 0.0,
338
  "r": 0.0,
339
  "f": 0.0
@@ -343,11 +266,88 @@
343
  "r": 0.0,
344
  "f": 0.0
345
  },
346
- "TIME": {
347
  "p": 1.0,
348
  "r": 1.0,
349
  "f": 1.0
350
  }
351
  },
352
- "speed": 3053.5292904657
353
  }
1
  {
2
+ "tag_acc": 0.9534133043,
3
+ "sents_p": 0.8595155709,
4
+ "sents_r": 0.8909612626,
5
+ "sents_f": 0.8749559704,
6
+ "dep_uas": 0.8698053923,
7
+ "dep_las": 0.8235860531,
8
  "dep_las_per_type": {
9
+ "nmod:poss": {
10
+ "p": 0.9456521739,
11
+ "r": 0.9525547445,
12
+ "f": 0.9490909091
13
  },
14
  "nsubj": {
15
+ "p": 0.8466494845,
16
+ "r": 0.8639053254,
17
+ "f": 0.8551903677
18
  },
19
+ "aux": {
20
+ "p": 0.9102990033,
21
+ "r": 0.9013157895,
22
+ "f": 0.905785124
23
  },
24
+ "advmod": {
25
+ "p": 0.7967269595,
26
+ "r": 0.8258928571,
27
+ "f": 0.8110477861
28
  },
29
+ "root": {
30
+ "p": 0.8636678201,
31
+ "r": 0.8952654232,
32
+ "f": 0.8791828108
33
  },
34
+ "det": {
35
+ "p": 0.946263644,
36
+ "r": 0.9753353527,
37
+ "f": 0.9605795866
38
  },
39
+ "amod": {
40
+ "p": 0.8720420684,
41
+ "r": 0.8915770609,
42
+ "f": 0.8817013735
43
  },
44
+ "obl": {
45
+ "p": 0.7456021651,
46
+ "r": 0.7461069736,
47
+ "f": 0.7458544839
48
  },
49
  "mark": {
50
+ "p": 0.8880866426,
51
+ "r": 0.8945454545,
52
+ "f": 0.8913043478
53
  },
54
+ "ccomp": {
55
+ "p": 0.6727272727,
56
+ "r": 0.691588785,
57
+ "f": 0.6820276498
58
  },
59
+ "case": {
60
+ "p": 0.9379652605,
61
+ "r": 0.9606099111,
62
+ "f": 0.9491525424
63
  },
64
+ "appos": {
65
+ "p": 0.7060702875,
66
+ "r": 0.6696969697,
67
+ "f": 0.6874027994
68
  },
69
+ "obj": {
70
+ "p": 0.7875968992,
71
+ "r": 0.7755725191,
72
+ "f": 0.7815384615
73
  },
74
+ "compound:prt": {
75
+ "p": 0.7755102041,
76
+ "r": 0.7136150235,
77
+ "f": 0.7432762836
78
  },
79
+ "xcomp": {
80
+ "p": 0.6765799257,
81
+ "r": 0.6618181818,
82
+ "f": 0.6691176471
83
  },
84
  "flat": {
85
+ "p": 0.8115124153,
86
+ "r": 0.7624602333,
87
+ "f": 0.7862219792
88
  },
89
+ "expl:pv": {
90
+ "p": 0.7674418605,
91
+ "r": 0.75,
92
+ "f": 0.7586206897
93
  },
94
+ "acl": {
95
+ "p": 0.4615384615,
96
+ "r": 0.3673469388,
97
+ "f": 0.4090909091
98
  },
99
+ "advcl": {
100
+ "p": 0.5024875622,
101
+ "r": 0.454954955,
102
+ "f": 0.4775413712
103
  },
104
  "nummod": {
105
+ "p": 0.8141025641,
106
+ "r": 0.8466666667,
107
+ "f": 0.8300653595
108
  },
109
+ "nmod": {
110
+ "p": 0.7278742763,
111
+ "r": 0.7652173913,
112
+ "f": 0.746078847
113
  },
114
+ "cc": {
115
+ "p": 0.8544776119,
116
+ "r": 0.8674242424,
117
+ "f": 0.8609022556
118
  },
119
+ "conj": {
120
+ "p": 0.6463245492,
121
+ "r": 0.6331521739,
122
+ "f": 0.6396705559
123
  },
124
  "nsubj:pass": {
125
+ "p": 0.8083832335,
126
+ "r": 0.8490566038,
127
+ "f": 0.8282208589
128
  },
129
  "aux:pass": {
130
+ "p": 0.8871794872,
131
+ "r": 0.9611111111,
132
+ "f": 0.9226666667
133
  },
134
+ "iobj": {
135
+ "p": 0.5652173913,
136
+ "r": 0.3939393939,
137
+ "f": 0.4642857143
138
  },
139
+ "cop": {
140
+ "p": 0.7789473684,
141
+ "r": 0.8131868132,
142
+ "f": 0.7956989247
143
  },
144
  "parataxis": {
145
+ "p": 0.3663366337,
146
+ "r": 0.268115942,
147
+ "f": 0.309623431
148
  },
149
+ "acl:relcl": {
150
+ "p": 0.6956521739,
151
+ "r": 0.7044025157,
152
+ "f": 0.7
153
  },
154
+ "fixed": {
155
+ "p": 0.721448468,
156
+ "r": 0.4683544304,
157
+ "f": 0.5679824561
158
  },
159
+ "obl:agent": {
160
+ "p": 0.9615384615,
161
+ "r": 0.8620689655,
162
+ "f": 0.9090909091
163
+ },
164
+ "expl": {
165
+ "p": 0.4,
166
+ "r": 0.4761904762,
167
+ "f": 0.4347826087
168
+ },
169
+ "csubj": {
170
+ "p": 0.6111111111,
171
+ "r": 0.55,
172
+ "f": 0.5789473684
173
  },
174
  "dep": {
175
  "p": 0.0,
176
  "r": 0.0,
177
  "f": 0.0
178
  },
179
  "orphan": {
180
  "p": 0.0,
181
  "r": 0.0,
182
  "f": 0.0
183
  }
184
  },
185
+ "ents_p": 0.7652916074,
186
+ "ents_r": 0.7441217151,
187
+ "ents_f": 0.7545582048,
188
  "ents_per_type": {
189
  "ORG": {
190
+ "p": 0.0,
191
+ "r": 0.0,
192
+ "f": 0.0
193
  },
194
+ "PERSON": {
195
+ "p": 0.0,
196
+ "r": 0.0,
197
+ "f": 0.0
198
  },
199
  "QUANTITY": {
200
+ "p": 0.0,
201
+ "r": 0.0,
202
+ "f": 0.0
203
  },
204
+ "CARDINAL": {
205
+ "p": 0.0,
206
+ "r": 0.0,
207
+ "f": 0.0
208
  },
209
+ "NORP": {
210
+ "p": 0.0,
211
+ "r": 0.0,
212
+ "f": 0.0
213
  },
214
+ "DATE": {
215
+ "p": 0.0,
216
+ "r": 0.0,
217
+ "f": 0.0
218
  },
219
  "EVENT": {
220
+ "p": 0.0,
221
+ "r": 0.0,
222
+ "f": 0.0
223
  },
224
+ "PRODUCT": {
225
+ "p": 0.0,
226
+ "r": 0.0,
227
+ "f": 0.0
228
+ },
229
+ "GPE": {
230
+ "p": 0.0,
231
+ "r": 0.0,
232
+ "f": 0.0
233
  },
234
  "ORDINAL": {
235
+ "p": 0.0,
236
+ "r": 0.0,
237
+ "f": 0.0
238
  },
239
  "LANGUAGE": {
240
+ "p": 0.0,
241
+ "r": 0.0,
242
+ "f": 0.0
243
+ },
244
+ "TIME": {
245
+ "p": 0.0,
246
+ "r": 0.0,
247
+ "f": 0.0
248
  },
249
  "LOC": {
250
+ "p": 0.0,
251
+ "r": 0.0,
252
+ "f": 0.0
253
  },
254
+ "WORK_OF_ART": {
255
+ "p": 0.0,
256
+ "r": 0.0,
257
+ "f": 0.0
258
  },
259
+ "FAC": {
260
  "p": 0.0,
261
  "r": 0.0,
262
  "f": 0.0
266
  "r": 0.0,
267
  "f": 0.0
268
  },
269
+ "PERCENT": {
270
+ "p": 0.0,
271
+ "r": 0.0,
272
+ "f": 0.0
273
+ },
274
+ "LAW": {
275
+ "p": 0.0,
276
+ "r": 0.0,
277
+ "f": 0.0
278
+ }
279
+ },
280
+ "speed": 10256.4374458498,
281
+ "token_acc": 0.9997165842,
282
+ "token_p": 0.9974281853,
283
+ "token_r": 0.9975586363,
284
+ "token_f": 0.9974934066,
285
+ "pos_acc": 0.9661941112,
286
+ "morph_acc": 0.9635947213,
287
+ "morph_micro_p": 0.9722389581,
288
+ "morph_micro_r": 0.9551518463,
289
+ "morph_micro_f": 0.9636196601,
290
+ "morph_per_feat": {
291
+ "Person": {
292
+ "p": 0.9892891918,
293
+ "r": 0.9713193117,
294
+ "f": 0.9802219006
295
+ },
296
+ "Poss": {
297
+ "p": 0.9811320755,
298
+ "r": 0.9961685824,
299
+ "f": 0.9885931559
300
+ },
301
+ "PronType": {
302
+ "p": 0.9880647911,
303
+ "r": 0.9634247714,
304
+ "f": 0.9755892256
305
+ },
306
+ "Gender": {
307
+ "p": 0.9345253747,
308
+ "r": 0.90409565,
309
+ "f": 0.9190587018
310
+ },
311
+ "Number": {
312
+ "p": 0.9816793893,
313
+ "r": 0.9624307738,
314
+ "f": 0.9719597914
315
+ },
316
+ "Tense": {
317
+ "p": 0.9805339266,
318
+ "r": 0.9692138538,
319
+ "f": 0.9748410285
320
+ },
321
+ "VerbForm": {
322
+ "p": 0.9623188406,
323
+ "r": 0.9557394746,
324
+ "f": 0.9590178733
325
+ },
326
+ "Degree": {
327
+ "p": 0.9655677656,
328
+ "r": 0.9468390805,
329
+ "f": 0.9561117156
330
+ },
331
+ "Definite": {
332
+ "p": 0.9942680776,
333
+ "r": 0.9925176056,
334
+ "f": 0.9933920705
335
+ },
336
+ "Case": {
337
+ "p": 0.998,
338
+ "r": 0.9940239044,
339
+ "f": 0.996007984
340
+ },
341
+ "Reflex": {
342
  "p": 1.0,
343
  "r": 1.0,
344
  "f": 1.0
345
+ },
346
+ "Abbr": {
347
+ "p": 1.0,
348
+ "r": 0.5,
349
+ "f": 0.6666666667
350
  }
351
  },
352
+ "lemma_acc": 0.9417537126
353
  }
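A note for reading the per-type scores above: each `f` value appears to be the harmonic mean (F1) of the `p` and `r` values beside it. The following is a minimal sketch, assuming that standard definition; the helper name `f1` and the tolerance are illustrative and not part of spaCy. The triples are copied from added lines of this diff.

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (p, r, f) triples copied from the added ("+") lines of this diff.
reported = {
    "Abbr": (1.0, 0.5, 0.6666666667),
    "csubj": (0.6111111111, 0.55, 0.5789473684),
    "acl:relcl": (0.6956521739, 0.7044025157, 0.7),
    "dep": (0.0, 0.0, 0.0),
}

# Collect any label whose reported f differs from the recomputed value.
mismatches = {
    label: (f, f1(p, r))
    for label, (p, r, f) in reported.items()
    if abs(f1(p, r) - f) > 1e-6
}

print(mismatches)
```

An empty result means the reported values are consistent with that definition for the sampled labels; it says nothing about how the scorer treats labels absent from the evaluation data.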
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
config.cfg CHANGED
@@ -10,7 +10,7 @@ seed = 0
10
 
11
  [nlp]
12
  lang = "nl"
13
- pipeline = ["tok2vec","morphologizer","tagger","parser","senter","attribute_ruler","lemmatizer","ner"]
14
  disabled = ["senter"]
15
  before_creation = null
16
  after_creation = null
@@ -26,11 +26,22 @@ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
29
- factory = "lemmatizer"
30
- mode = "rule"
31
- model = null
32
  overwrite = false
33
  scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
@@ -39,8 +50,9 @@ overwrite = true
39
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
- @architectures = "spacy.Tagger.v1"
43
  nO = null
 
44
 
45
  [components.morphologizer.model.tok2vec]
46
  @architectures = "spacy.Tok2VecListener.v1"
@@ -70,7 +82,7 @@ nO = null
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
- rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = true
75
 
76
  [components.ner.model.tok2vec.encode]
@@ -108,8 +120,9 @@ overwrite = false
108
  scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
- @architectures = "spacy.Tagger.v1"
112
  nO = null
 
113
 
114
  [components.senter.model.tok2vec]
115
  @architectures = "spacy.Tok2Vec.v2"
@@ -130,12 +143,14 @@ maxout_pieces = 2
130
 
131
  [components.tagger]
132
  factory = "tagger"
 
133
  overwrite = false
134
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
135
 
136
  [components.tagger.model]
137
- @architectures = "spacy.Tagger.v1"
138
  nO = null
 
139
 
140
  [components.tagger.model.tok2vec]
141
  @architectures = "spacy.Tok2VecListener.v1"
@@ -152,7 +167,7 @@ factory = "tok2vec"
152
  @architectures = "spacy.MultiHashEmbed.v2"
153
  width = ${components.tok2vec.model.encode:width}
154
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
155
- rows = [5000,2500,2500,2500,100]
156
  include_static_vectors = true
157
 
158
  [components.tok2vec.model.encode]
@@ -189,7 +204,7 @@ dropout = 0.1
189
  accumulate_gradient = 1
190
  patience = 5000
191
  max_epochs = 0
192
- max_steps = 0
193
  eval_frequency = 1000
194
  frozen_components = []
195
  before_to_disk = null
@@ -224,18 +239,18 @@ eps = 0.00000001
224
  learn_rate = 0.001
225
 
226
  [training.score_weights]
227
- pos_acc = 0.06
228
- morph_acc = 0.05
229
  morph_per_feat = null
230
- tag_acc = 0.06
231
  dep_uas = 0.0
232
- dep_las = 0.16
233
  dep_las_per_type = null
234
  sents_p = null
235
  sents_r = null
236
- sents_f = 0.02
237
- lemma_acc = 0.5
238
- ents_f = 0.16
239
  ents_p = 0.0
240
  ents_r = 0.0
241
  ents_per_type = null
@@ -252,6 +267,13 @@ after_init = null
252
 
253
  [initialize.components]
254
 
 
 
 
 
 
 
 
255
  [initialize.components.morphologizer]
256
 
257
  [initialize.components.morphologizer.labels]
10
 
11
  [nlp]
12
  lang = "nl"
13
+ pipeline = ["tok2vec","morphologizer","tagger","parser","lemmatizer","senter","attribute_ruler","ner"]
14
  disabled = ["senter"]
15
  before_creation = null
16
  after_creation = null
26
  validate = false
27
 
28
  [components.lemmatizer]
29
+ factory = "trainable_lemmatizer"
30
+ backoff = "orth"
31
+ min_tree_freq = 3
32
  overwrite = false
33
  scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
+ top_k = 1
35
+
36
+ [components.lemmatizer.model]
37
+ @architectures = "spacy.Tagger.v2"
38
+ nO = null
39
+ normalize = false
40
+
41
+ [components.lemmatizer.model.tok2vec]
42
+ @architectures = "spacy.Tok2VecListener.v1"
43
+ width = ${components.tok2vec.model.encode:width}
44
+ upstream = "tok2vec"
45
 
46
  [components.morphologizer]
47
  factory = "morphologizer"
50
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
51
 
52
  [components.morphologizer.model]
53
+ @architectures = "spacy.Tagger.v2"
54
  nO = null
55
+ normalize = false
56
 
57
  [components.morphologizer.model.tok2vec]
58
  @architectures = "spacy.Tok2VecListener.v1"
82
  @architectures = "spacy.MultiHashEmbed.v2"
83
  width = 96
84
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
85
+ rows = [5000,1000,2500,2500,50]
86
  include_static_vectors = true
87
 
88
  [components.ner.model.tok2vec.encode]
120
  scorer = {"@scorers":"spacy.senter_scorer.v1"}
121
 
122
  [components.senter.model]
123
+ @architectures = "spacy.Tagger.v2"
124
  nO = null
125
+ normalize = false
126
 
127
  [components.senter.model.tok2vec]
128
  @architectures = "spacy.Tok2Vec.v2"
143
 
144
  [components.tagger]
145
  factory = "tagger"
146
+ neg_prefix = "!"
147
  overwrite = false
148
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
149
 
150
  [components.tagger.model]
151
+ @architectures = "spacy.Tagger.v2"
152
  nO = null
153
+ normalize = false
154
 
155
  [components.tagger.model.tok2vec]
156
  @architectures = "spacy.Tok2VecListener.v1"
167
  @architectures = "spacy.MultiHashEmbed.v2"
168
  width = ${components.tok2vec.model.encode:width}
169
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
170
+ rows = [5000,1000,2500,2500,50]
171
  include_static_vectors = true
172
 
173
  [components.tok2vec.model.encode]
204
  accumulate_gradient = 1
205
  patience = 5000
206
  max_epochs = 0
207
+ max_steps = 100000
208
  eval_frequency = 1000
209
  frozen_components = []
210
  before_to_disk = null
239
  learn_rate = 0.001
240
 
241
  [training.score_weights]
242
+ pos_acc = 0.1
243
+ morph_acc = 0.09
244
  morph_per_feat = null
245
+ tag_acc = 0.1
246
  dep_uas = 0.0
247
+ dep_las = 0.29
248
  dep_las_per_type = null
249
  sents_p = null
250
  sents_r = null
251
+ sents_f = 0.04
252
+ lemma_acc = 0.1
253
+ ents_f = 0.29
254
  ents_p = 0.0
255
  ents_r = 0.0
256
  ents_per_type = null
267
 
268
  [initialize.components]
269
 
270
+ [initialize.components.lemmatizer]
271
+
272
+ [initialize.components.lemmatizer.labels]
273
+ @readers = "spacy.read_labels.v1"
274
+ path = "corpus/labels/trainable_lemmatizer.json"
275
+ require = false
276
+
277
  [initialize.components.morphologizer]
278
 
279
  [initialize.components.morphologizer.labels]
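The `[training.score_weights]` change above shifts weight away from `lemma_acc` (0.5 to 0.1) and toward the parser and NER metrics. The sketch below assumes the overall training score is a plain weighted sum of the listed metrics, with `null` weights ignored; that is an assumption about how the weights are applied, and `weighted_score` is a hypothetical helper rather than a spaCy function. The weights and metric values are copied from the added lines of this commit.

```python
from typing import Dict, Optional


def weighted_score(weights: Dict[str, Optional[float]],
                   scores: Dict[str, float]) -> float:
    """Sum weight * score over metrics with a non-null weight.

    Assumption: metrics missing from `scores` count as 0.0.
    """
    return sum(
        w * scores.get(name, 0.0)
        for name, w in weights.items()
        if w is not None
    )


# New [training.score_weights] values from config.cfg (null -> None).
new_weights: Dict[str, Optional[float]] = {
    "pos_acc": 0.1,
    "morph_acc": 0.09,
    "morph_per_feat": None,
    "tag_acc": 0.1,
    "dep_uas": 0.0,
    "dep_las": 0.29,
    "dep_las_per_type": None,
    "sents_p": None,
    "sents_r": None,
    "sents_f": 0.04,
    "lemma_acc": 0.1,
    "ents_f": 0.29,
    "ents_p": 0.0,
    "ents_r": 0.0,
    "ents_per_type": None,
}

# New evaluation values from the accuracy diff in this commit.
new_scores = {
    "pos_acc": 0.9661941112,
    "morph_acc": 0.9635947213,
    "tag_acc": 0.9534133043,
    "dep_uas": 0.8698053923,
    "dep_las": 0.8235860531,
    "sents_f": 0.8749559704,
    "lemma_acc": 0.9417537126,
    "ents_f": 0.7545582048,
    "ents_p": 0.7652916074,
    "ents_r": 0.7441217151,
}

score = weighted_score(new_weights, new_scores)
print(round(score, 4))
```

Both the old and new non-null weights in this diff add up to 1.01 rather than 1.0, which is consistent with each weight having been rounded to two decimals.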
lemmatizer/cfg ADDED
@@ -0,0 +1,1198 @@
1
+ {
2
+ "labels":[
3
+ 1,
4
+ 4,
5
+ 5,
6
+ 10,
7
+ 12,
8
+ 14,
9
+ 18,
10
+ 22,
11
+ 26,
12
+ 27,
13
+ 30,
14
+ 32,
15
+ 34,
16
+ 36,
17
+ 39,
18
+ 43,
19
+ 45,
20
+ 47,
21
+ 51,
22
+ 53,
23
+ 55,
24
+ 57,
25
+ 58,
26
+ 61,
27
+ 63,
28
+ 64,
29
+ 66,
30
+ 67,
31
+ 69,
32
+ 71,
33
+ 75,
34
+ 77,
35
+ 79,
36
+ 81,
37
+ 83,
38
+ 85,
39
+ 87,
40
+ 89,
41
+ 91,
42
+ 95,
43
+ 97,
44
+ 98,
45
+ 101,
46
+ 103,
47
+ 104,
48
+ 106,
49
+ 109,
50
+ 110,
51
+ 112,
52
+ 113,
53
+ 117,
54
+ 120,
55
+ 122,
56
+ 125,
57
+ 127,
58
+ 130,
59
+ 135,
60
+ 136,
61
+ 138,
62
+ 139,
63
+ 140,
64
+ 142,
65
+ 143,
66
+ 145,
67
+ 147,
68
+ 151,
69
+ 154,
70
+ 155,
71
+ 156,
72
+ 158,
73
+ 160,
74
+ 164,
75
+ 165,
76
+ 167,
77
+ 168,
78
+ 170,
79
+ 172,
80
+ 173,
81
+ 174,
82
+ 176,
83
+ 179,
84
+ 181,
85
+ 184,
86
+ 188,
87
+ 190,
88
+ 193,
89
+ 195,
90
+ 197,
91
+ 199,
92
+ 119,
93
+ 202,
94
+ 203,
95
+ 205,
96
+ 208,
97
+ 211,
98
+ 213,
99
+ 216,
100
+ 220,
101
+ 221,
102
+ 223,
103
+ 225,
104
+ 227,
105
+ 231,
106
+ 234,
107
+ 236,
108
+ 239,
109
+ 243,
110
+ 246,
111
+ 248,
112
+ 250,
113
+ 252,
114
+ 254,
115
+ 255,
116
+ 258,
117
+ 73,
118
+ 260,
119
+ 262,
120
+ 264,
121
+ 266,
122
+ 269,
123
+ 271,
124
+ 273,
125
+ 276,
126
+ 278,
127
+ 280,
128
+ 281,
129
+ 283,
130
+ 285,
131
+ 287,
132
+ 289,
133
+ 291,
134
+ 293,
135
+ 294,
136
+ 296,
137
+ 298,
138
+ 300,
139
+ 301,
140
+ 303,
141
+ 307,
142
+ 309,
143
+ 312,
144
+ 313,
145
+ 315,
146
+ 317,
147
+ 318,
148
+ 320,
149
+ 321,
150
+ 325,
151
+ 327,
152
+ 329,
153
+ 330,
154
+ 332,
155
+ 335,
156
+ 337,
157
+ 338,
158
+ 339,
159
+ 341,
160
+ 343,
161
+ 348,
162
+ 350,
163
+ 352,
164
+ 355,
165
+ 357,
166
+ 360,
167
+ 362,
168
+ 363,
169
+ 364,
170
+ 366,
171
+ 368,
172
+ 369,
173
+ 373,
174
+ 374,
175
+ 376,
176
+ 378,
177
+ 380,
178
+ 382,
179
+ 384,
180
+ 386,
181
+ 388,
182
+ 389,
183
+ 392,
184
+ 394,
185
+ 398,
186
+ 400,
187
+ 401,
188
+ 404,
189
+ 407,
190
+ 409,
191
+ 412,
192
+ 414,
193
+ 415,
194
+ 417,
195
+ 418,
196
+ 422,
197
+ 425,
198
+ 428,
199
+ 429,
200
+ 430,
201
+ 431,
202
+ 432,
203
+ 434,
204
+ 435,
205
+ 436,
206
+ 438,
207
+ 441,
208
+ 444,
209
+ 446,
210
+ 448,
211
+ 449,
212
+ 450,
213
+ 452,
214
+ 454,
215
+ 456,
216
+ 459,
217
+ 461,
218
+ 463,
219
+ 465,
220
+ 466,
221
+ 468,
222
+ 471,
223
+ 473,
224
+ 475,
225
+ 476,
226
+ 480,
227
+ 482,
228
+ 484,
229
+ 486,
230
+ 489,
231
+ 491,
232
+ 494,
233
+ 495,
234
+ 497,
235
+ 498,
236
+ 500,
237
+ 501,
238
+ 502,
239
+ 504,
240
+ 505,
241
+ 506,
242
+ 509,
243
+ 512,
244
+ 516,
245
+ 517,
246
+ 519,
247
+ 522,
248
+ 526,
249
+ 527,
250
+ 529,
251
+ 530,
252
+ 531,
253
+ 533,
254
+ 535,
255
+ 538,
256
+ 540,
257
+ 542,
258
+ 546,
259
+ 547,
260
+ 548,
261
+ 552,
262
+ 553,
263
+ 554,
264
+ 555,
265
+ 557,
266
+ 559,
267
+ 563,
268
+ 566,
269
+ 568,
270
+ 570,
271
+ 572,
272
+ 575,
273
+ 576,
274
+ 578,
275
+ 581,
276
+ 583,
277
+ 586,
278
+ 587,
279
+ 589,
280
+ 592,
281
+ 593,
282
+ 594,
283
+ 595,
284
+ 596,
285
+ 599,
286
+ 601,
287
+ 603,
288
+ 606,
289
+ 607,
290
+ 609,
291
+ 610,
292
+ 613,
293
+ 615,
294
+ 616,
295
+ 619,
296
+ 620,
297
+ 622,
298
+ 623,
299
+ 625,
300
+ 626,
301
+ 627,
302
+ 629,
303
+ 631,
304
+ 634,
305
+ 635,
306
+ 638,
307
+ 640,
308
+ 643,
309
+ 644,
310
+ 645,
311
+ 647,
312
+ 649,
313
+ 653,
314
+ 657,
315
+ 659,
316
+ 660,
317
+ 661,
318
+ 662,
319
+ 663,
320
+ 665,
321
+ 666,
322
+ 667,
323
+ 671,
324
+ 673,
325
+ 674,
326
+ 675,
327
+ 677,
328
+ 678,
329
+ 679,
330
+ 681,
331
+ 683,
332
+ 684,
333
+ 686,
334
+ 688,
335
+ 689,
336
+ 691,
337
+ 692,
338
+ 693,
339
+ 696,
340
+ 697,
341
+ 698,
342
+ 700,
343
+ 702,
344
+ 704,
345
+ 707,
346
+ 709,
347
+ 710,
348
+ 712,
349
+ 713,
350
+ 715,
351
+ 717,
352
+ 718,
353
+ 719,
354
+ 720,
355
+ 722,
356
+ 724,
357
+ 727,
358
+ 729,
359
+ 730,
360
+ 731,
361
+ 735,
362
+ 736,
363
+ 737,
364
+ 738,
365
+ 739,
366
+ 740,
367
+ 741,
368
+ 742,
369
+ 744,
370
+ 746,
371
+ 747,
372
+ 748,
373
+ 751,
374
+ 754,
375
+ 756,
376
+ 758,
377
+ 759,
378
+ 760,
379
+ 762,
380
+ 764,
381
+ 765,
382
+ 767,
383
+ 768,
384
+ 771,
385
+ 772,
386
+ 775,
387
+ 777,
388
+ 778,
389
+ 779,
390
+ 782,
391
+ 783,
392
+ 785,
393
+ 787,
394
+ 789,
395
+ 792,
396
+ 794,
397
+ 795,
398
+ 797,
399
+ 799,
400
+ 802,
401
+ 803,
402
+ 806,
403
+ 809,
404
+ 811,
405
+ 813,
406
+ 815,
407
+ 35,
408
+ 818,
409
+ 819,
410
+ 821,
411
+ 824,
412
+ 825,
413
+ 828,
414
+ 830,
415
+ 832,
416
+ 835,
417
+ 836,
418
+ 838,
419
+ 841,
420
+ 844,
421
+ 845,
422
+ 847,
423
+ 848,
424
+ 849,
425
+ 850,
426
+ 851,
427
+ 853,
428
+ 855,
429
+ 856,
430
+ 857,
431
+ 858,
432
+ 860,
433
+ 861,
434
+ 864,
435
+ 867,
436
+ 869,
437
+ 872,
438
+ 874,
439
+ 875,
440
+ 877,
441
+ 880,
442
+ 881,
443
+ 883,
444
+ 884,
445
+ 886,
446
+ 888,
447
+ 889,
448
+ 890,
449
+ 893,
450
+ 894,
451
+ 897,
452
+ 898,
453
+ 900,
454
+ 902,
455
+ 903,
456
+ 906,
457
+ 907,
458
+ 909,
459
+ 910,
460
+ 912,
461
+ 913,
462
+ 914,
463
+ 915,
464
+ 916,
465
+ 918,
466
+ 919,
467
+ 920,
468
+ 922,
469
+ 925,
470
+ 929,
471
+ 933,
472
+ 934,
473
+ 935,
474
+ 937,
475
+ 938,
476
+ 940,
477
+ 941,
478
+ 943,
479
+ 944,
480
+ 945,
481
+ 946,
482
+ 949,
483
+ 950,
484
+ 952,
485
+ 955,
486
+ 956,
487
+ 957,
488
+ 958,
489
+ 959,
490
+ 961,
491
+ 962,
492
+ 963,
493
+ 965,
494
+ 968,
495
+ 969,
496
+ 971,
497
+ 973,
498
+ 976,
499
+ 979,
500
+ 980,
501
+ 981,
502
+ 983,
503
+ 986,
504
+ 987,
505
+ 990,
506
+ 992,
507
+ 995,
508
+ 997,
509
+ 999,
510
+ 1001,
511
+ 1004,
512
+ 1005,
513
+ 1007,
514
+ 1009,
515
+ 1011,
516
+ 1012,
517
+ 1015,
518
+ 1018,
519
+ 1019,
520
+ 1020,
521
+ 1021,
522
+ 1024,
523
+ 1027,
524
+ 1029,
525
+ 1030,
526
+ 1031,
527
+ 1033,
528
+ 1035,
529
+ 1039,
530
+ 1041,
531
+ 1042,
532
+ 1043,
533
+ 1047,
534
+ 1048,
535
+ 1053,
536
+ 1055,
537
+ 1056,
538
+ 1057,
539
+ 1058,
540
+ 1059,
541
+ 1060,
542
+ 1062,
543
+ 1064,
544
+ 1067,
545
+ 1069,
546
+ 1072,
547
+ 1075,
548
+ 1076,
549
+ 1078,
550
+ 1079,
551
+ 1081,
552
+ 1084,
553
+ 1085,
554
+ 1087,
555
+ 1089,
556
+ 1091,
557
+ 1092,
558
+ 1093,
559
+ 1094,
560
+ 1095,
561
+ 1096,
562
+ 1097,
563
+ 1102,
564
+ 1103,
565
+ 1106,
566
+ 1107,
567
+ 1108,
568
+ 1110,
569
+ 1111,
570
+ 1112,
571
+ 1113,
572
+ 1116,
573
+ 1118,
574
+ 1120,
575
+ 1123,
576
+ 1124,
577
+ 1126,
578
+ 1128,
579
+ 1130,
580
+ 1131,
581
+ 1132,
582
+ 1133,
583
+ 1136,
584
+ 1137,
585
+ 1138,
586
+ 1140,
587
+ 1142,
588
+ 1144,
589
+ 1146,
590
+ 1147,
591
+ 1148,
592
+ 1152,
593
+ 1153,
594
+ 1154,
595
+ 1155,
596
+ 1158,
597
+ 1159,
598
+ 1160,
599
+ 1162,
600
+ 1164,
601
+ 1165,
602
+ 88,
603
+ 1166,
604
+ 1169,
605
+ 1171,
606
+ 1172,
607
+ 1175,
608
+ 1177,
609
+ 1179,
610
+ 1181,
611
+ 1183,
612
+ 1185,
613
+ 1186,
614
+ 1188,
615
+ 1191,
616
+ 1192,
617
+ 1193,
618
+ 1194,
619
+ 1195,
620
+ 1196,
621
+ 1200,
622
+ 1204,
623
+ 1205,
624
+ 1206,
625
+ 1207,
626
+ 1210,
627
+ 1212,
628
+ 311,
629
+ 1213,
630
+ 1215,
631
+ 1217,
632
+ 1218,
633
+ 1221,
634
+ 1222,
635
+ 1224,
636
+ 1225,
637
+ 1226,
638
+ 1227,
639
+ 1228,
640
+ 1229,
641
+ 1230,
642
+ 1231,
643
+ 1232,
644
+ 1233,
645
+ 1235,
646
+ 1236,
647
+ 1239,
648
+ 1241,
649
+ 1242,
650
+ 1244,
651
+ 1245,
652
+ 1247,
653
+ 1249,
654
+ 1251,
655
+ 1252,
656
+ 1253,
657
+ 1257,
658
+ 1258,
659
+ 1262,
660
+ 1263,
661
+ 1265,
662
+ 1266,
663
+ 1267,
664
+ 1270,
665
+ 1272,
666
+ 1274,
667
+ 1275,
668
+ 1276,
669
+ 1277,
670
+ 1280,
671
+ 1282,
672
+ 1283,
673
+ 1286,
674
+ 1287,
675
+ 1290,
676
+ 1292,
677
+ 1293,
678
+ 1295,
679
+ 1299,
680
+ 1300,
681
+ 1302,
682
+ 1303,
683
+ 1306,
684
+ 1308,
685
+ 1309,
686
+ 1311,
687
+ 1314,
688
+ 1318,
689
+ 1319,
690
+ 1320,
691
+ 1324,
692
+ 1325,
693
+ 1327,
694
+ 1328,
695
+ 1329,
696
+ 1330,
697
+ 1331,
698
+ 1332,
699
+ 1333,
700
+ 1334,
701
+ 1335,
702
+ 1337,
703
+ 1338,
704
+ 1341,
705
+ 1343,
706
+ 1344,
707
+ 1345,
708
+ 1346,
709
+ 1347,
710
+ 1348,
711
+ 1350,
712
+ 1351,
713
+ 1355,
714
+ 1356,
715
+ 1358,
716
+ 1359,
717
+ 1361,
718
+ 1362,
719
+ 1365,
720
+ 1368,
721
+ 1369,
722
+ 1371,
723
+ 1373,
724
+ 1375,
725
+ 1377,
726
+ 1379,
727
+ 1381,
728
+ 1382,
729
+ 1383,
730
+ 1384,
731
+ 1385,
732
+ 1387,
733
+ 1390,
734
+ 90,
735
+ 1391,
736
+ 1393,
737
+ 1394,
738
+ 1395,
739
+ 1397,
740
+ 1398,
741
+ 1399,
742
+ 1400,
743
+ 1401,
744
+ 1402,
745
+ 1405,
746
+ 1407,
747
+ 1409,
748
+ 1411,
749
+ 1412,
750
+ 1413,
751
+ 1415,
752
+ 1416,
753
+ 1417,
754
+ 1418,
755
+ 1419,
756
+ 1423,
757
+ 1425,
758
+ 1426,
759
+ 1427,
760
+ 1428,
761
+ 1429,
762
+ 1431,
763
+ 1432,
764
+ 1433,
765
+ 1436,
766
+ 1437,
767
+ 1438,
768
+ 1439,
769
+ 1441,
770
+ 1443,
771
+ 1444,
772
+ 1445,
773
+ 1446,
774
+ 1447,
775
+ 1449,
776
+ 1450,
777
+ 1451,
778
+ 1452,
779
+ 1453,
780
+ 1454,
781
+ 1456,
782
+ 1460,
783
+ 1461,
784
+ 1464,
785
+ 1465,
786
+ 1466,
787
+ 1467,
788
+ 1468,
789
+ 1469,
790
+ 1470,
791
+ 1471,
792
+ 1472,
793
+ 1473,
794
+ 1476,
795
+ 1477,
796
+ 1479,
797
+ 1481,
798
+ 1483,
799
+ 1486,
800
+ 1488,
801
+ 1489,
802
+ 1490,
803
+ 1492,
804
+ 1493,
805
+ 1494,
806
+ 1496,
807
+ 1497,
808
+ 1498,
809
+ 1499,
810
+ 1502,
811
+ 1503,
812
+ 1504,
813
+ 1506,
814
+ 1508,
815
+ 1509,
816
+ 1512,
817
+ 1514,
818
+ 1517,
819
+ 1518,
820
+ 1519,
821
+ 1521,
822
+ 1523,
823
+ 1524,
824
+ 1525,
825
+ 1528,
826
+ 1529,
827
+ 1531,
828
+ 1532,
829
+ 1534,
830
+ 1535,
831
+ 1538,
832
+ 1540,
833
+ 1541,
834
+ 1542,
835
+ 1543,
836
+ 1544,
837
+ 1545,
838
+ 1547,
839
+ 1548,
840
+ 1549,
841
+ 1551,
842
+ 1552,
843
+ 1553,
844
+ 1554,
845
+ 1555,
846
+ 1556,
847
+ 1558,
848
+ 1559,
849
+ 1561,
850
+ 1562,
851
+ 1564,
852
+ 1565,
853
+ 1566,
854
+ 1570,
855
+ 1571,
856
+ 1574,
857
+ 1575,
858
+ 1576,
859
+ 1577,
860
+ 1579,
861
+ 1580,
862
+ 1581,
863
+ 1583,
864
+ 1584,
865
+ 1588,
866
+ 1590,
867
+ 1592,
868
+ 1594,
869
+ 1595,
870
+ 1596,
871
+ 1598,
872
+ 1599,
873
+ 1600,
874
+ 1602,
875
+ 1603,
876
+ 1604,
877
+ 1606,
878
+ 1607,
879
+ 1608,
880
+ 1612,
881
+ 1614,
882
+ 1615,
883
+ 1618,
884
+ 1619,
885
+ 1621,
886
+ 1624,
887
+ 1625,
888
+ 1628,
889
+ 1629,
890
+ 1632,
891
+ 1633,
892
+ 1635,
893
+ 1636,
894
+ 1637,
895
+ 1638,
896
+ 1640,
897
+ 1641,
898
+ 1644,
899
+ 1645,
900
+ 1646,
901
+ 1647,
902
+ 1649,
903
+ 1651,
904
+ 1653,
905
+ 1656,
906
+ 1658,
907
+ 1659,
908
+ 1660,
909
+ 1662,
910
+ 1663,
911
+ 1664,
912
+ 1665,
913
+ 1666,
914
+ 1668,
915
+ 1669,
916
+ 1671,
917
+ 1672,
918
+ 1673,
919
+ 1676,
920
+ 1677,
921
+ 1678,
922
+ 1680,
923
+ 1681,
924
+ 1684,
925
+ 1685,
926
+ 1687,
927
+ 1688,
928
+ 1690,
929
+ 1692,
930
+ 1694,
931
+ 1697,
932
+ 1698,
933
+ 1699,
934
+ 1700,
935
+ 1701,
936
+ 1702,
937
+ 1703,
938
+ 1704,
939
+ 1706,
940
+ 1707,
941
+ 1709,
942
+ 1710,
943
+ 1711,
944
+ 1712,
945
+ 1716,
946
+ 1718,
947
+ 1719,
948
+ 1720,
949
+ 1721,
950
+ 1722,
951
+ 1725,
952
+ 1726,
953
+ 1727,
954
+ 1729,
955
+ 1730,
956
+ 1732,
957
+ 1734,
958
+ 1736,
959
+ 1738,
960
+ 1739,
961
+ 1740,
962
+ 1742,
963
+ 1743,
964
+ 1744,
965
+ 1745,
966
+ 1746,
967
+ 1749,
968
+ 1752,
969
+ 1753,
970
+ 1754,
971
+ 1755,
972
+ 1756,
973
+ 1758,
974
+ 1761,
975
+ 1762,
976
+ 1763,
977
+ 1765,
978
+ 1766,
979
+ 1768,
980
+ 1769,
981
+ 1771,
982
+ 1772,
983
+ 1773,
984
+ 1774,
985
+ 1778,
986
+ 1779,
987
+ 1780,
988
+ 1782,
989
+ 1783,
990
+ 1784,
991
+ 1785,
992
+ 1786,
993
+ 1787,
994
+ 1789,
995
+ 1790,
996
+ 1792,
997
+ 1793,
998
+ 1795,
999
+ 1796,
1000
+ 1799,
1001
+ 1801,
1002
+ 1802,
1003
+ 1804,
1004
+ 1806,
1005
+ 1808,
1006
+ 1809,
1007
+ 1810,
1008
+ 1811,
1009
+ 1814,
1010
+ 1816,
1011
+ 1817,
1012
+ 1820,
1013
+ 1823,
1014
+ 1824,
1015
+ 1825,
1016
+ 1827,
1017
+ 1830,
1018
+ 1831,
1019
+ 1833,
1020
+ 1834,
1021
+ 1836,
1022
+ 1837,
1023
+ 1838,
1024
+ 1840,
1025
+ 1841,
1026
+ 1842,
1027
+ 1844,
1028
+ 1846,
1029
+ 1849,
1030
+ 1850,
1031
+ 1852,
1032
+ 1853,
1033
+ 1855,
1034
+ 1857,
1035
+ 1858,
1036
+ 1859,
1037
+ 1860,
1038
+ 1861,
1039
+ 1862,
1040
+ 1863,
1041
+ 1864,
1042
+ 1867,
1043
+ 1868,
1044
+ 1870,
1045
+ 1871,
1046
+ 1872,
1047
+ 1873,
1048
+ 1874,
1049
+ 1875,
1050
+ 1876,
1051
+ 1878,
1052
+ 1880,
1053
+ 1882,
1054
+ 1883,
1055
+ 1884,
1056
+ 1886,
1057
+ 1887,
1058
+ 1889,
1059
+ 1893,
1060
+ 1894,
1061
+ 1896,
1062
+ 1899,
1063
+ 1900,
1064
+ 1902,
1065
+ 1903,
1066
+ 1904,
1067
+ 1905,
1068
+ 1906,
1069
+ 1907,
1070
+ 1909,
1071
+ 1910,
1072
+ 1911,
1073
+ 1912,
1074
+ 1913,
1075
+ 1916,
1076
+ 295,
1077
+ 1917,
1078
+ 1918,
1079
+ 1919,
1080
+ 1921,
1081
+ 1923,
1082
+ 1925,
1083
+ 1926,
1084
+ 1929,
1085
+ 1931,
1086
+ 1932,
1087
+ 1933,
1088
+ 1935,
1089
+ 1936,
1090
+ 1937,
1091
+ 1939,
1092
+ 1940,
1093
+ 1941,
1094
+ 1942,
1095
+ 1943,
1096
+ 1944,
1097
+ 1946,
1098
+ 1948,
1099
+ 1950,
1100
+ 1951,
1101
+ 1952,
1102
+ 1954,
1103
+ 1956,
1104
+ 1957,
1105
+ 1960,
1106
+ 1961,
1107
+ 1964,
1108
+ 1966,
1109
+ 1967,
1110
+ 1968,
1111
+ 1970,
1112
+ 1971,
1113
+ 1972,
1114
+ 1975,
1115
+ 1978,
1116
+ 1981,
1117
+ 1982,
1118
+ 1983,
1119
+ 1984,
1120
+ 1987,
1121
+ 1988,
1122
+ 44,
1123
+ 1990,
1124
+ 1991,
1125
+ 1994,
1126
+ 1995,
1127
+ 1996,
1128
+ 2000,
1129
+ 2001,
1130
+ 2002,
1131
+ 2005,
1132
+ 2007,
1133
+ 2008,
1134
+ 2009,
1135
+ 2011,
1136
+ 2012,
1137
+ 2013,
1138
+ 2014,
1139
+ 2015,
1140
+ 2016,
1141
+ 2017,
1142
+ 2019,
1143
+ 2020,
1144
+ 2021,
1145
+ 2023,
1146
+ 2025,
1147
+ 2028,
1148
+ 2029,
1149
+ 2030,
1150
+ 2031,
1151
+ 2032,
1152
+ 2034,
1153
+ 2035,
1154
+ 2036,
1155
+ 2037,
1156
+ 2038,
1157
+ 2040,
1158
+ 2042,
1159
+ 2045,
1160
+ 2046,
1161
+ 2047,
1162
+ 2048,
1163
+ 2050,
1164
+ 2054,
1165
+ 2056,
1166
+ 2057,
1167
+ 2058,
1168
+ 2060,
1169
+ 2061,
1170
+ 2062,
1171
+ 2063,
1172
+ 2064,
1173
+ 2066,
1174
+ 2067,
1175
+ 2068,
1176
+ 2070,
1177
+ 2072,
1178
+ 2074,
1179
+ 2075,
1180
+ 2077,
1181
+ 2081,
1182
+ 852,
1183
+ 2082,
1184
+ 2083,
1185
+ 2084,
1186
+ 2086,
1187
+ 2087,
1188
+ 2089,
1189
+ 2091,
1190
+ 2092,
1191
+ 2094,
1192
+ 2096,
1193
+ 2097,
1194
+ 2098,
1195
+ 2099,
1196
+ 2101
1197
+ ]
1198
+ }
lemmatizer/{lookups/lookups.bin → model} RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:509c45ce6556e5253e8d77bd3af683619f92723e4df1028bc1a17f9aec1a47ea
- size 6953602
+ oid sha256:a930ffab83ea245ace02a91e5b332dff291f73bdf2833e42950d3be6b47e65db
+ size 463714
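The three-line files shown for the model weights are Git LFS pointer files, not the binaries themselves. A minimal Python sketch of reading one, assuming the `key value` layout visible in the diff (the helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space only
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-algorithm>:<hex digest>"
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the new lemmatizer/model entry
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a930ffab83ea245ace02a91e5b332dff291f73bdf2833e42950d3be6b47e65db
size 463714
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Read this way, the pointer records that the renamed `lemmatizer/model` blob is 463714 bytes, compared with 6953602 bytes for the former `lookups.bin`.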
lemmatizer/trees ADDED
Binary file (288 kB). View file
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"nl",
3
  "name":"core_news_lg",
4
- "version":"3.2.0",
5
- "description":"Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.2.0,<3.3.0",
11
- "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
@@ -328,15 +328,8 @@
328
  "punct",
329
  "xcomp"
330
  ],
331
- "senter":[
332
- "I",
333
- "S"
334
- ],
335
  "attribute_ruler":[
336
 
337
- ],
338
- "lemmatizer":[
339
-
340
  ],
341
  "ner":[
342
  "CARDINAL",
@@ -364,8 +357,8 @@
364
  "morphologizer",
365
  "tagger",
366
  "parser",
367
- "attribute_ruler",
368
  "lemmatizer",
369
  "ner"
370
  ],
371
  "components":[
@@ -373,350 +366,273 @@
373
  "morphologizer",
374
  "tagger",
375
  "parser",
376
  "senter",
377
  "attribute_ruler",
378
- "lemmatizer",
379
  "ner"
380
  ],
381
  "disabled":[
382
  "senter"
383
  ],
384
  "performance":{
385
- "token_acc":0.9997165842,
386
- "token_p":0.9974281853,
387
- "token_r":0.9975586363,
388
- "token_f":0.9974934066,
389
- "pos_acc":0.9667175573,
390
- "morph_acc":0.9649471044,
391
- "morph_micro_p":0.9743803954,
392
- "morph_micro_r":0.9558803442,
393
- "morph_micro_f":0.9650417155,
394
- "morph_per_feat":{
395
- "Person":{
396
- "p":0.9921875,
397
- "r":0.9713193117,
398
- "f":0.9816425121
399
- },
400
- "Poss":{
401
- "p":0.9885496183,
402
- "r":0.9923371648,
403
- "f":0.9904397706
404
- },
405
- "PronType":{
406
- "p":0.9914529915,
407
- "r":0.9642560266,
408
- "f":0.9776654024
409
- },
410
- "Gender":{
411
- "p":0.9408033827,
412
- "r":0.9056219791,
413
- "f":0.9228775113
414
- },
415
- "Number":{
416
- "p":0.9837895703,
417
- "r":0.9628798084,
418
- "f":0.9732223903
419
- },
420
- "Tense":{
421
- "p":0.9777901166,
422
- "r":0.9681143485,
423
- "f":0.9729281768
424
- },
425
- "VerbForm":{
426
- "p":0.9640653358,
427
- "r":0.9557394746,
428
- "f":0.9598843513
429
- },
430
- "Degree":{
431
- "p":0.9628550619,
432
- "r":0.9497126437,
433
- "f":0.956238698
434
- },
435
- "Definite":{
436
- "p":0.9955869373,
437
- "r":0.9929577465,
438
- "f":0.9942706038
439
- },
440
- "Case":{
441
- "p":0.998003992,
442
- "r":0.9960159363,
443
- "f":0.9970089731
444
- },
445
- "Reflex":{
446
- "p":1.0,
447
- "r":1.0,
448
- "f":1.0
449
- },
450
- "Abbr":{
451
- "p":1.0,
452
- "r":0.6666666667,
453
- "f":0.8
454
- }
455
- },
456
- "tag_acc":0.9543293348,
457
- "sents_p":0.866340098,
458
- "sents_r":0.8880918221,
459
- "sents_f":0.8770811194,
460
- "dep_uas":0.8701736595,
461
- "dep_las":0.8260704236,
462
  "dep_las_per_type":{
463
- "det":{
464
- "p":0.8869644485,
465
- "r":0.959566075,
466
- "f":0.9218379915
467
  },
468
  "nsubj":{
469
- "p":0.7900466563,
470
- "r":0.8246753247,
471
- "f":0.8069896743
472
  },
473
- "root":{
474
- "p":0.7489932886,
475
- "r":0.8328358209,
476
- "f":0.7886925795
477
  },
478
- "case":{
479
- "p":0.8836805556,
480
- "r":0.9382488479,
481
- "f":0.910147519
482
  },
483
- "obl":{
484
- "p":0.7313195548,
485
- "r":0.7290015848,
486
- "f":0.7301587302
487
  },
488
- "nmod":{
489
- "p":0.596651446,
490
- "r":0.6841186736,
491
- "f":0.637398374
492
  },
493
- "advmod":{
494
- "p":0.75,
495
- "r":0.7696969697,
496
- "f":0.7597208375
497
  },
498
- "obj":{
499
- "p":0.7902621723,
500
- "r":0.778597786,
501
- "f":0.7843866171
502
  },
503
  "mark":{
504
- "p":0.8445378151,
505
- "r":0.8305785124,
506
- "f":0.8375
507
- },
508
- "advcl":{
509
- "p":0.5137614679,
510
- "r":0.4628099174,
511
- "f":0.4869565217
512
  },
513
- "amod":{
514
- "p":0.7834710744,
515
- "r":0.8649635036,
516
- "f":0.8222029488
517
  },
518
- "acl:relcl":{
519
- "p":0.6352941176,
520
- "r":0.6585365854,
521
- "f":0.6467065868
522
  },
523
- "cop":{
524
- "p":0.7862068966,
525
- "r":0.6263736264,
526
- "f":0.6972477064
527
  },
528
- "cc":{
529
- "p":0.8,
530
- "r":0.8384879725,
531
- "f":0.8187919463
532
  },
533
- "conj":{
534
- "p":0.5825242718,
535
- "r":0.5309734513,
536
- "f":0.5555555556
537
  },
538
- "fixed":{
539
- "p":0.6690647482,
540
- "r":0.2520325203,
541
- "f":0.3661417323
542
  },
543
  "flat":{
544
- "p":0.7995991984,
545
- "r":0.6797274276,
546
- "f":0.7348066298
547
  },
548
- "csubj":{
549
- "p":0.5,
550
- "r":0.1666666667,
551
- "f":0.25
552
  },
553
- "aux":{
554
- "p":0.7714285714,
555
- "r":0.786407767,
556
- "f":0.7788461538
557
  },
558
- "compound:prt":{
559
- "p":0.776119403,
560
- "r":0.6753246753,
561
- "f":0.7222222222
562
  },
563
  "nummod":{
564
- "p":0.59375,
565
- "r":0.6506849315,
566
- "f":0.6209150327
567
  },
568
- "acl":{
569
- "p":0.5098039216,
570
- "r":0.4406779661,
571
- "f":0.4727272727
572
  },
573
- "expl":{
574
- "p":0.4,
575
- "r":0.3333333333,
576
- "f":0.3636363636
577
  },
578
- "appos":{
579
- "p":0.5625,
580
- "r":0.4682080925,
581
- "f":0.5110410095
582
  },
583
  "nsubj:pass":{
584
- "p":0.8023255814,
585
- "r":0.8023255814,
586
- "f":0.8023255814
587
  },
588
  "aux:pass":{
589
- "p":0.8823529412,
590
- "r":0.9183673469,
591
- "f":0.9
592
  },
593
- "ccomp":{
594
- "p":0.6666666667,
595
- "r":0.5294117647,
596
- "f":0.5901639344
597
  },
598
- "xcomp":{
599
- "p":0.4285714286,
600
- "r":0.698630137,
601
- "f":0.53125
602
  },
603
  "parataxis":{
604
- "p":0.3644067797,
605
- "r":0.288590604,
606
- "f":0.3220973783
607
  },
608
- "expl:pv":{
609
- "p":0.7894736842,
610
- "r":0.7894736842,
611
- "f":0.7894736842
612
  },
613
- "iobj":{
614
- "p":0.4444444444,
615
- "r":0.4,
616
- "f":0.4210526316
617
  },
618
- "nmod:poss":{
619
- "p":0.8616352201,
620
- "r":0.8954248366,
621
- "f":0.8782051282
622
  },
623
  "dep":{
624
  "p":0.0,
625
  "r":0.0,
626
  "f":0.0
627
  },
628
- "obl:agent":{
629
- "p":0.8461538462,
630
- "r":0.7857142857,
631
- "f":0.8148148148
632
- },
633
  "orphan":{
634
  "p":0.0,
635
  "r":0.0,
636
  "f":0.0
637
  }
638
  },
639
- "lemma_acc":0.8159277755,
640
- "ents_p":0.7772241993,
641
- "ents_r":0.755186722,
642
- "ents_f":0.7660470011,
643
  "ents_per_type":{
644
- "DATE":{
645
- "p":0.931372549,
646
- "r":0.9253246753,
647
- "f":0.9283387622
648
- },
649
- "NORP":{
650
- "p":0.8181818182,
651
- "r":0.8674698795,
652
- "f":0.8421052632
653
- },
654
  "ORG":{
655
- "p":0.6959459459,
656
- "r":0.6094674556,
657
- "f":0.6498422713
658
- },
659
- "CARDINAL":{
660
- "p":0.8607594937,
661
- "r":0.9714285714,
662
- "f":0.9127516779
663
  },
664
- "GPE":{
665
- "p":0.785046729,
666
- "r":0.9230769231,
667
- "f":0.8484848485
668
  },
669
  "QUANTITY":{
670
- "p":0.8571428571,
671
- "r":1.0,
672
- "f":0.9230769231
673
  },
674
- "PERCENT":{
675
- "p":1.0,
676
- "r":0.8333333333,
677
- "f":0.9090909091
678
  },
679
- "PERSON":{
680
- "p":0.7788778878,
681
- "r":0.7637540453,
682
- "f":0.7712418301
683
  },
684
- "LAW":{
685
- "p":1.0,
686
- "r":0.3333333333,
687
- "f":0.5
688
  },
689
  "EVENT":{
690
- "p":0.4761904762,
691
- "r":0.4347826087,
692
- "f":0.4545454545
693
  },
694
- "WORK_OF_ART":{
695
- "p":0.5882352941,
696
- "r":0.4444444444,
697
- "f":0.5063291139
698
  },
699
  "ORDINAL":{
700
- "p":0.96875,
701
- "r":0.9393939394,
702
- "f":0.9538461538
703
  },
704
  "LANGUAGE":{
705
- "p":0.75,
706
- "r":0.8181818182,
707
- "f":0.7826086957
708
  },
709
  "LOC":{
710
- "p":0.5333333333,
711
- "r":0.2352941176,
712
- "f":0.3265306122
713
  },
714
- "FAC":{
715
- "p":0.1,
716
- "r":0.2142857143,
717
- "f":0.1363636364
718
  },
719
- "PRODUCT":{
720
  "p":0.0,
721
  "r":0.0,
722
  "f":0.0
@@ -726,13 +642,90 @@
726
  "r":0.0,
727
  "f":0.0
728
  },
729
- "TIME":{
730
  "p":1.0,
731
  "r":1.0,
732
  "f":1.0
733
  }
734
  },
735
- "speed":3053.5292904657
736
  },
737
  "sources":[
738
  {
@@ -759,12 +752,6 @@
759
  "license":"CC BY-SA 4.0",
760
  "author":"Zeman, Daniel; \u017dabokrtsk\u00fd, Zden\u011bk; Bouma, Gosse; van Noord, Gertjan"
761
  },
762
- {
763
- "name":"spaCy lookups data",
764
- "author":"Explosion",
765
- "url":"https://github.com/explosion/spacy-lookups-data",
766
- "license":"MIT"
767
- },
768
  {
769
  "name":"Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)",
770
  "url":"https://spacy.io",
1
  {
2
  "lang":"nl",
3
  "name":"core_news_lg",
4
+ "version":"3.3.0",
5
+ "description":"Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.3.0.dev0,<3.4.0",
11
+ "spacy_git_version":"849bef2de",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
328
  "punct",
329
  "xcomp"
330
  ],
331
  "attribute_ruler":[
332
 
333
  ],
334
  "ner":[
335
  "CARDINAL",
357
  "morphologizer",
358
  "tagger",
359
  "parser",
360
  "lemmatizer",
361
+ "attribute_ruler",
362
  "ner"
363
  ],
364
  "components":[
366
  "morphologizer",
367
  "tagger",
368
  "parser",
369
+ "lemmatizer",
370
  "senter",
371
  "attribute_ruler",
372
  "ner"
373
  ],
374
  "disabled":[
375
  "senter"
376
  ],
377
  "performance":{
378
+ "tag_acc":0.9534133043,
379
+ "sents_p":0.8595155709,
380
+ "sents_r":0.8909612626,
381
+ "sents_f":0.8749559704,
382
+ "dep_uas":0.8698053923,
383
+ "dep_las":0.8235860531,
384
  "dep_las_per_type":{
385
+ "nmod:poss":{
386
+ "p":0.9456521739,
387
+ "r":0.9525547445,
388
+ "f":0.9490909091
389
  },
390
  "nsubj":{
391
+ "p":0.8466494845,
392
+ "r":0.8639053254,
393
+ "f":0.8551903677
394
  },
395
+ "aux":{
396
+ "p":0.9102990033,
397
+ "r":0.9013157895,
398
+ "f":0.905785124
399
  },
400
+ "advmod":{
401
+ "p":0.7967269595,
402
+ "r":0.8258928571,
403
+ "f":0.8110477861
404
  },
405
+ "root":{
406
+ "p":0.8636678201,
407
+ "r":0.8952654232,
408
+ "f":0.8791828108
409
  },
410
+ "det":{
411
+ "p":0.946263644,
412
+ "r":0.9753353527,
413
+ "f":0.9605795866
414
  },
415
+ "amod":{
416
+ "p":0.8720420684,
417
+ "r":0.8915770609,
418
+ "f":0.8817013735
419
  },
420
+ "obl":{
421
+ "p":0.7456021651,
422
+ "r":0.7461069736,
423
+ "f":0.7458544839
424
  },
425
  "mark":{
426
+ "p":0.8880866426,
427
+ "r":0.8945454545,
428
+ "f":0.8913043478
429
  },
430
+ "ccomp":{
431
+ "p":0.6727272727,
432
+ "r":0.691588785,
433
+ "f":0.6820276498
434
  },
435
+ "case":{
436
+ "p":0.9379652605,
437
+ "r":0.9606099111,
438
+ "f":0.9491525424
439
  },
440
+ "appos":{
441
+ "p":0.7060702875,
442
+ "r":0.6696969697,
443
+ "f":0.6874027994
444
  },
445
+ "obj":{
446
+ "p":0.7875968992,
447
+ "r":0.7755725191,
448
+ "f":0.7815384615
449
  },
450
+ "compound:prt":{
451
+ "p":0.7755102041,
452
+ "r":0.7136150235,
453
+ "f":0.7432762836
454
  },
455
+ "xcomp":{
456
+ "p":0.6765799257,
457
+ "r":0.6618181818,
458
+ "f":0.6691176471
459
  },
460
  "flat":{
461
+ "p":0.8115124153,
462
+ "r":0.7624602333,
463
+ "f":0.7862219792
464
  },
465
+ "expl:pv":{
466
+ "p":0.7674418605,
467
+ "r":0.75,
468
+ "f":0.7586206897
469
  },
470
+ "acl":{
471
+ "p":0.4615384615,
472
+ "r":0.3673469388,
473
+ "f":0.4090909091
474
  },
475
+ "advcl":{
476
+ "p":0.5024875622,
477
+ "r":0.454954955,
478
+ "f":0.4775413712
479
  },
480
  "nummod":{
481
+ "p":0.8141025641,
482
+ "r":0.8466666667,
483
+ "f":0.8300653595
484
  },
485
+ "nmod":{
486
+ "p":0.7278742763,
487
+ "r":0.7652173913,
488
+ "f":0.746078847
489
  },
490
+ "cc":{
491
+ "p":0.8544776119,
492
+ "r":0.8674242424,
493
+ "f":0.8609022556
494
  },
495
+ "conj":{
496
+ "p":0.6463245492,
497
+ "r":0.6331521739,
498
+ "f":0.6396705559
499
  },
500
  "nsubj:pass":{
501
+ "p":0.8083832335,
502
+ "r":0.8490566038,
503
+ "f":0.8282208589
504
  },
505
  "aux:pass":{
506
+ "p":0.8871794872,
507
+ "r":0.9611111111,
508
+ "f":0.9226666667
509
  },
510
+ "iobj":{
511
+ "p":0.5652173913,
512
+ "r":0.3939393939,
513
+ "f":0.4642857143
514
  },
515
+ "cop":{
516
+ "p":0.7789473684,
517
+ "r":0.8131868132,
518
+ "f":0.7956989247
519
  },
520
  "parataxis":{
521
+ "p":0.3663366337,
522
+ "r":0.268115942,
523
+ "f":0.309623431
524
  },
525
+ "acl:relcl":{
526
+ "p":0.6956521739,
527
+ "r":0.7044025157,
528
+ "f":0.7
529
  },
530
+ "fixed":{
531
+ "p":0.721448468,
532
+ "r":0.4683544304,
533
+ "f":0.5679824561
534
  },
535
+ "obl:agent":{
536
+ "p":0.9615384615,
537
+ "r":0.8620689655,
538
+ "f":0.9090909091
539
+ },
540
+ "expl":{
541
+ "p":0.4,
542
+ "r":0.4761904762,
543
+ "f":0.4347826087
544
+ },
545
+ "csubj":{
546
+ "p":0.6111111111,
547
+ "r":0.55,
548
+ "f":0.5789473684
549
  },
550
  "dep":{
551
  "p":0.0,
552
  "r":0.0,
553
  "f":0.0
554
  },
555
  "orphan":{
556
  "p":0.0,
557
  "r":0.0,
558
  "f":0.0
559
  }
560
  },
561
+ "ents_p":0.7652916074,
562
+ "ents_r":0.7441217151,
563
+ "ents_f":0.7545582048,
564
  "ents_per_type":{
565
  "ORG":{
566
+ "p":0.0,
567
+ "r":0.0,
568
+ "f":0.0
569
  },
570
+ "PERSON":{
571
+ "p":0.0,
572
+ "r":0.0,
573
+ "f":0.0
574
  },
575
  "QUANTITY":{
576
+ "p":0.0,
577
+ "r":0.0,
578
+ "f":0.0
579
  },
580
+ "CARDINAL":{
581
+ "p":0.0,
582
+ "r":0.0,
583
+ "f":0.0
584
  },
585
+ "NORP":{
586
+ "p":0.0,
587
+ "r":0.0,
588
+ "f":0.0
589
  },
590
+ "DATE":{
591
+ "p":0.0,
592
+ "r":0.0,
593
+ "f":0.0
594
  },
595
  "EVENT":{
596
+ "p":0.0,
597
+ "r":0.0,
598
+ "f":0.0
599
  },
600
+ "PRODUCT":{
601
+ "p":0.0,
602
+ "r":0.0,
603
+ "f":0.0
604
+ },
605
+ "GPE":{
606
+ "p":0.0,
607
+ "r":0.0,
608
+ "f":0.0
609
  },
610
  "ORDINAL":{
611
+ "p":0.0,
612
+ "r":0.0,
613
+ "f":0.0
614
  },
615
  "LANGUAGE":{
616
+ "p":0.0,
617
+ "r":0.0,
618
+ "f":0.0
619
+ },
620
+ "TIME":{
621
+ "p":0.0,
622
+ "r":0.0,
623
+ "f":0.0
624
  },
625
  "LOC":{
626
+ "p":0.0,
627
+ "r":0.0,
628
+ "f":0.0
629
  },
630
+ "WORK_OF_ART":{
631
+ "p":0.0,
632
+ "r":0.0,
633
+ "f":0.0
634
  },
635
+ "FAC":{
636
  "p":0.0,
637
  "r":0.0,
638
  "f":0.0
642
  "r":0.0,
643
  "f":0.0
644
  },
645
+ "PERCENT":{
646
+ "p":0.0,
647
+ "r":0.0,
648
+ "f":0.0
649
+ },
650
+ "LAW":{
651
+ "p":0.0,
652
+ "r":0.0,
653
+ "f":0.0
654
+ }
655
+ },
656
+ "speed":10256.4374458498,
657
+ "token_acc":0.9997165842,
658
+ "token_p":0.9974281853,
659
+ "token_r":0.9975586363,
660
+ "token_f":0.9974934066,
661
+ "pos_acc":0.9661941112,
662
+ "morph_acc":0.9635947213,
663
+ "morph_micro_p":0.9722389581,
664
+ "morph_micro_r":0.9551518463,
665
+ "morph_micro_f":0.9636196601,
666
+ "morph_per_feat":{
667
+ "Person":{
668
+ "p":0.9892891918,
669
+ "r":0.9713193117,
670
+ "f":0.9802219006
671
+ },
672
+ "Poss":{
673
+ "p":0.9811320755,
674
+ "r":0.9961685824,
675
+ "f":0.9885931559
676
+ },
677
+ "PronType":{
678
+ "p":0.9880647911,
679
+ "r":0.9634247714,
680
+ "f":0.9755892256
681
+ },
682
+ "Gender":{
683
+ "p":0.9345253747,
684
+ "r":0.90409565,
685
+ "f":0.9190587018
686
+ },
687
+ "Number":{
688
+ "p":0.9816793893,
689
+ "r":0.9624307738,
690
+ "f":0.9719597914
691
+ },
692
+ "Tense":{
693
+ "p":0.9805339266,
694
+ "r":0.9692138538,
695
+ "f":0.9748410285
696
+ },
697
+ "VerbForm":{
698
+ "p":0.9623188406,
699
+ "r":0.9557394746,
700
+ "f":0.9590178733
701
+ },
702
+ "Degree":{
703
+ "p":0.9655677656,
704
+ "r":0.9468390805,
705
+ "f":0.9561117156
706
+ },
707
+ "Definite":{
708
+ "p":0.9942680776,
709
+ "r":0.9925176056,
710
+ "f":0.9933920705
711
+ },
712
+ "Case":{
713
+ "p":0.998,
714
+ "r":0.9940239044,
715
+ "f":0.996007984
716
+ },
717
+ "Reflex":{
718
  "p":1.0,
719
  "r":1.0,
720
  "f":1.0
721
+ },
722
+ "Abbr":{
723
+ "p":1.0,
724
+ "r":0.5,
725
+ "f":0.6666666667
726
  }
727
  },
728
+ "lemma_acc":0.9417537126
729
  },
730
  "sources":[
731
  {
752
  "license":"CC BY-SA 4.0",
753
  "author":"Zeman, Daniel; \u017dabokrtsk\u00fd, Zden\u011bk; Bouma, Gosse; van Noord, Gertjan"
754
  },
755
  {
756
  "name":"Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)",
757
  "url":"https://spacy.io",
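Each `p`/`r`/`f` triple in the `performance` block of `meta.json` appears consistent with `f` being the harmonic mean of precision and recall. A short check, assuming that standard F1 definition (the function name `f1` is illustrative, not taken from spaCy):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


# Values copied from the new "nmod:poss" entry in dep_las_per_type
score = f1(0.9456521739, 0.9525547445)
print(round(score, 6))
```

The result can be compared against the `"f"` value reported for the same entry, 0.9490909091. The zero guard matters here because entries such as `dep` and `orphan` report 0.0 for both precision and recall.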
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:56e7ac008019bd6e097c49720cedc515db44b718ade953ad79bb5ae323f6a2e6
- size 25598
+ oid sha256:002fb75f4ff6e09c75716386d3f603669f6b229fad7ba1f7b6a26620e998b0d5
+ size 25650
ner/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2a480ff5f7320c293d2115cd54cc9ba86ba4f271d4f9204c515e3cb489feafd2
- size 7106353
+ oid sha256:377530352be09f8af817499145d3dd8a3abc32bd7d952084a390046ee59660f8
+ size 6511153
nl_core_news_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e6c8b7d8f97177893cd644cca554b562be309f7173167254bbde419959d5fc56
- size 572644286
+ oid sha256:1f2d97ecb9d50259b0b25ebc57a3e2f79a7b6260a9e238c26792ed1854d9830d
+ size 568149198
parser/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:03903a7b1c04f033d818c6a61f36211f31ebbf6f31614098ed3796a2895a0fcb
+ oid sha256:8a641079ec81300f2485528a68139797e6bccea9122ae8c014303aa65a075bfc
  size 315229
parser/moves CHANGED
@@ -1 +1 @@
- ��moves��{"0":{"":151558},"1":{"":91349},"2":{"det":29810,"case":26215,"nsubj":13579,"amod":12918,"punct":11737,"advmod":9702,"obl":8128,"mark":6683,"cc":5438,"obj":4515,"aux":4218,"nsubj:pass":2513,"aux:pass":2468,"cop":2077,"nummod":2050,"nmod:poss":2023,"nmod":1255,"xcomp":1160,"compound:prt":839,"advcl":643,"acl":505,"parataxis":416,"iobj":307,"expl":273,"advmod||xcomp":266,"expl:pv":261,"obl||xcomp":259,"obl:agent":227,"obj||xcomp":200,"case||obl":162,"ccomp":108,"expl||advcl":60,"case||advcl":51,"obl||ccomp":50,"csubj":50,"advmod||ccomp":49,"obj||ccomp":47,"obl||obj":42,"advcl||xcomp":31,"dep":0},"3":{"punct":19438,"nmod":13028,"flat":9160,"conj":7136,"obl":6802,"fixed":4623,"nsubj":4273,"appos":3320,"obj":3142,"advmod":3090,"parataxis":2280,"xcomp":2095,"acl:relcl":2032,"advcl":1595,"compound:prt":1376,"cop":1281,"ccomp":1230,"acl":774,"amod":490,"aux:pass":398,"csubj":395,"nummod":365,"aux":355,"iobj":229,"expl:pv":225,"obl:agent":221,"nmod||obj":178,"advcl||advmod":152,"case":147,"acl:relcl||obj":135,"case||obl":132,"acl:relcl||nsubj":98,"acl||obj":88,"expl":83,"mark":69,"orphan":68,"acl:relcl||nsubj:pass":55,"obl||xcomp":53,"expl||advcl":47,"cc":35,"advcl||amod":35,"advcl||nmod":34,"obl||obj":32,"nmod||nsubj":31,"dep":0},"4":{"ROOT":18070}}�cfg��neg_key�
+ ��moves��{"0":{"":151638},"1":{"":91446},"2":{"det":29834,"case":26244,"nsubj":13574,"amod":12930,"punct":11769,"advmod":9701,"obl":8122,"mark":6683,"cc":5442,"obj":4514,"aux":4218,"nsubj:pass":2513,"aux:pass":2468,"cop":2077,"nummod":2054,"nmod:poss":2019,"nmod":1249,"xcomp":1160,"compound:prt":839,"advcl":643,"acl":507,"parataxis":413,"iobj":307,"expl":273,"advmod||xcomp":266,"expl:pv":261,"obl||xcomp":259,"obl:agent":227,"obj||xcomp":200,"case||obl":162,"ccomp":108,"expl||advcl":60,"case||advcl":51,"obl||ccomp":50,"csubj":50,"advmod||ccomp":49,"obj||ccomp":47,"obl||obj":42,"advcl||xcomp":31,"dep":0},"3":{"punct":19558,"nmod":12995,"flat":9165,"conj":7139,"obl":6798,"fixed":4636,"nsubj":4269,"appos":3301,"obj":3144,"advmod":3092,"parataxis":2279,"xcomp":2098,"acl:relcl":2035,"advcl":1596,"compound:prt":1376,"cop":1282,"ccomp":1231,"acl":775,"amod":490,"aux:pass":398,"csubj":396,"nummod":365,"aux":355,"iobj":230,"expl:pv":225,"obl:agent":221,"nmod||obj":178,"advcl||advmod":152,"case":147,"acl:relcl||obj":135,"case||obl":132,"acl:relcl||nsubj":98,"acl||obj":88,"expl":83,"mark":69,"orphan":68,"acl:relcl||nsubj:pass":55,"obl||xcomp":53,"expl||advcl":47,"cc":35,"advcl||amod":35,"advcl||nmod":34,"obl||obj":32,"nmod||nsubj":31,"dep":0},"4":{"ROOT":18078}}�cfg��neg_key�
senter/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:45e4c3f9dce8981c0fd28d0508a8bf86b17ddb83519b98866d711ca08c9e6794
- size 219901
+ oid sha256:9b950cda3c8d2d290ae31b1e6c39157efa215b65c09f0f4560fb0b5e773bb2e3
+ size 219953
tagger/cfg CHANGED
@@ -203,5 +203,6 @@
  "WW|vd|prenom|zonder",
  "WW|vd|vrij|zonder"
  ],
+ "neg_prefix":"!",
  "overwrite":false
  }
tagger/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:143980c7cff527855fc0886d694869737e9e739e9209b5e1f0c73cbe374c001e
- size 78761
+ oid sha256:c241bc6b1c22b0c4a305e5f647a903b6bb1f79520bc8ad0ea1917ed7484fa5e4
+ size 78813
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:86e1ef8452c6fd1c5e44fe4d220e0fe65539f0893e4366563213eb901edbaaec
- size 6960804
+ oid sha256:a746887c36b1681fcca5f0cd12c159c61ab1ea33aac78377e59672be1ee74d19
+ size 6365604
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a81d5d5168ce264f0ff117a04099a8d6cce7b7366419e368481c07dd253127e3
- size 10135659
+ oid sha256:d1a5fc70ba2fd5c9823bd960f3b3f6c9b84ef776f0b4668f997cb871be8c859c
+ size 10172588