Commit 80a1e7f (1 parent: 5316f78), committed by osanseviero (HF staff)

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
- # UD Dutch LassySmall v2.5
+ # UD Dutch LassySmall v2.8
 
  * Author: Bouma, Gosse; van Noord, Gertjan
  * URL: https://github.com/UniversalDependencies/UD_Dutch-LassySmall
@@ -878,7 +878,7 @@ Creative Commons may be contacted at creativecommons.org.
 
 
 
- # UD Dutch LassySmall v2.5
+ # UD Dutch LassySmall v2.8
 
  * Author: Bouma, Gosse; van Noord, Gertjan
  * URL: https://github.com/UniversalDependencies/UD_Dutch-LassySmall
@@ -1318,7 +1318,7 @@ Creative Commons may be contacted at creativecommons.org.
 
 
 
- # UD Dutch Alpino v2.5
+ # UD Dutch Alpino v2.8
 
  * Author: Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan
  * URL: https://github.com/UniversalDependencies/UD_Dutch-Alpino
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - token-classification
5
  language:
6
  - nl
7
- license: CC-BY-SA-4.0
8
  model-index:
9
  - name: nl_core_news_sm
10
  results:
@@ -14,47 +14,47 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.7305475504
18
  - name: NER Recall
19
  type: recall
20
- value: 0.7012448133
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.7155963303
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9349210503
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
- value: 0.8511659808
38
  - name: SENTER Recall
39
  type: recall
40
- value: 0.8902439024
41
  - name: SENTER F Score
42
  type: f_score
43
- value: 0.8702664797
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.8552811724
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
- value: 0.8552811724
58
  ---
59
  ### Details: https://spacy.io/models/nl#nl_core_news_sm
60
 
@@ -63,12 +63,12 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `nl_core_news_sm` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
71
- | **Sources** | [UD Dutch LassySmall v2.5](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[Dutch NER Annotations for UD LassySmall](https://nlp.town) (NLP Town)<br />[UD Dutch LassySmall v2.5](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[UD Dutch Alpino v2.5](https://github.com/UniversalDependencies/UD_Dutch-Alpino) (Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -76,12 +76,12 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
76
 
77
  <details>
78
 
79
- <summary>View label scheme (318 labels for 5 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
- | **`morphologizer`** | `POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=ADV`, `POS=VERB\|VerbForm=Part`, `POS=PUNCT`, `Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `POS=ADP`, `POS=NUM`, `Number=Plur\|POS=NOUN`, `POS=VERB\|VerbForm=Inf`, `POS=SCONJ`, `Definite=Def\|POS=DET`, `Gender=Com\|Number=Sing\|POS=NOUN`, `Number=Sing\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Degree=Pos\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=PROPN`, `Gender=Com\|Number=Sing\|POS=PROPN`, `POS=AUX\|VerbForm=Inf`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `POS=DET`, `Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|Person=3\|PronType=Prs`, `POS=CCONJ`, `Number=Plur\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|Person=3\|PronType=Ind`, `Degree=Cmp\|POS=ADJ`, `Case=Nom\|POS=PRON\|Person=1\|PronType=Prs`, `Definite=Ind\|POS=DET`, `Case=Nom\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Number=Plur\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Case=Acc\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `Gender=Com,Neut\|Number=Sing\|POS=NOUN`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PROPN`, `POS=PRON\|PronType=Ind`, `POS=PRON\|Person=3\|PronType=Int`, `Case=Acc\|POS=PRON\|PronType=Rcp`, `Number=Plur\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `Number=Sing\|POS=NOUN`, `POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `POS=SYM`, `Abbr=Yes\|POS=X`, `Gender=Com,Neut\|Number=Sing\|POS=PROPN`, `Degree=Sup\|POS=ADJ`, `Foreign=Yes\|POS=X`, `POS=ADJ`, `Number=Sing\|POS=PROPN`, `POS=PRON\|PronType=Dem`, `POS=AUX\|VerbForm=Part`, `POS=PRON\|Person=3\|PronType=Rel`, `Number=Plur\|POS=PROPN`, `POS=PRON\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Dat\|POS=PRON\|PronType=Dem`, `Case=Nom\|POS=PRON\|Person=2\|PronType=Prs`, `POS=X`, `POS=INTJ`, `Case=Gen\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, 
`POS=PRON\|PronType=Int`, `Case=Acc\|POS=PRON\|Person=2\|PronType=Prs`, `POS=PRON\|Person=2\|PronType=Prs`, `Case=Gen\|POS=PRON\|Person=2\|PronType=Prs` |
84
- | **`tagger`** | `ADJ\|nom\|basis\|met-e\|mv-n`, `ADJ\|nom\|basis\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|basis\|met-e\|zonder-n\|stan`, `ADJ\|nom\|basis\|zonder\|mv-n`, `ADJ\|nom\|basis\|zonder\|zonder-n`, `ADJ\|nom\|comp\|met-e\|mv-n`, `ADJ\|nom\|comp\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|met-e\|mv-n`, `ADJ\|nom\|sup\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|zonder\|zonder-n`, `ADJ\|postnom\|basis\|met-s`, `ADJ\|postnom\|basis\|zonder`, `ADJ\|postnom\|comp\|met-s`, `ADJ\|prenom\|basis\|met-e\|bijz`, `ADJ\|prenom\|basis\|met-e\|stan`, `ADJ\|prenom\|basis\|zonder`, `ADJ\|prenom\|comp\|met-e\|stan`, `ADJ\|prenom\|comp\|zonder`, `ADJ\|prenom\|sup\|met-e\|stan`, `ADJ\|vrij\|basis\|zonder`, `ADJ\|vrij\|comp\|zonder`, `ADJ\|vrij\|dim\|zonder`, `ADJ\|vrij\|sup\|zonder`, `BW`, `LET`, `LID\|bep\|dat\|evmo`, `LID\|bep\|gen\|evmo`, `LID\|bep\|gen\|rest3`, `LID\|bep\|stan\|evon`, `LID\|bep\|stan\|rest`, `LID\|onbep\|stan\|agr`, `N\|eigen\|ev\|basis\|gen`, `N\|eigen\|ev\|basis\|genus\|stan`, `N\|eigen\|ev\|basis\|onz\|stan`, `N\|eigen\|ev\|basis\|zijd\|stan`, `N\|eigen\|ev\|dim\|onz\|stan`, `N\|eigen\|mv\|basis`, `N\|soort\|ev\|basis\|dat`, `N\|soort\|ev\|basis\|gen`, `N\|soort\|ev\|basis\|genus\|stan`, `N\|soort\|ev\|basis\|onz\|stan`, `N\|soort\|ev\|basis\|zijd\|stan`, `N\|soort\|ev\|dim\|onz\|stan`, `N\|soort\|mv\|basis`, `N\|soort\|mv\|dim`, `SPEC\|afgebr`, `SPEC\|afk`, `SPEC\|deeleigen`, `SPEC\|enof`, `SPEC\|meta`, `SPEC\|symb`, `SPEC\|vreemd`, `TSW`, `TW\|hoofd\|nom\|mv-n\|basis`, `TW\|hoofd\|nom\|mv-n\|dim`, `TW\|hoofd\|nom\|zonder-n\|basis`, `TW\|hoofd\|nom\|zonder-n\|dim`, `TW\|hoofd\|prenom\|stan`, `TW\|hoofd\|vrij`, `TW\|rang\|nom\|mv-n`, `TW\|rang\|nom\|zonder-n`, `TW\|rang\|prenom\|stan`, `VG\|neven`, `VG\|onder`, `VNW\|aanw\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|aanw\|adv-pron\|stan\|red\|3\|getal`, `VNW\|aanw\|det\|dat\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|dat\|prenom\|met-e\|evmo`, `VNW\|aanw\|det\|gen\|prenom\|met-e\|rest3`, 
`VNW\|aanw\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|aanw\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|stan\|prenom\|met-e\|rest`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|agr`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|evon`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|rest`, `VNW\|aanw\|det\|stan\|vrij\|zonder`, `VNW\|aanw\|pron\|gen\|vol\|3m\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3o\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3\|getal`, `VNW\|betr\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|betr\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|betr\|pron\|stan\|vol\|3\|ev`, `VNW\|betr\|pron\|stan\|vol\|persoon\|getal`, `VNW\|bez\|det\|gen\|vol\|3\|ev\|prenom\|met-e\|rest3`, `VNW\|bez\|det\|stan\|nadr\|2v\|mv\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|zonder\|evon`, `VNW\|bez\|det\|stan\|vol\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|2\|getal\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3p\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3\|mv\|prenom\|zonder\|agr`, `VNW\|onbep\|adv-pron\|gen\|red\|3\|getal`, `VNW\|onbep\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|onbep\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|onbep\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|agr`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|evz`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|mv`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|rest`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|agr`, 
`VNW\|onbep\|det\|stan\|prenom\|zonder\|evon`, `VNW\|onbep\|det\|stan\|vrij\|zonder`, `VNW\|onbep\|grad\|gen\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|sup`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|comp`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|mv\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|basis`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|sup`, `VNW\|onbep\|pron\|gen\|vol\|3p\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3o\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3p\|ev`, `VNW\|pers\|pron\|gen\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|nomin\|nadr\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|red\|1\|mv`, `VNW\|pers\|pron\|nomin\|red\|2v\|ev`, `VNW\|pers\|pron\|nomin\|red\|2\|getal`, `VNW\|pers\|pron\|nomin\|red\|3p\|ev\|masc`, `VNW\|pers\|pron\|nomin\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|nomin\|vol\|1\|ev`, `VNW\|pers\|pron\|nomin\|vol\|1\|mv`, `VNW\|pers\|pron\|nomin\|vol\|2b\|getal`, `VNW\|pers\|pron\|nomin\|vol\|2v\|ev`, `VNW\|pers\|pron\|nomin\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|vol\|3p\|mv`, `VNW\|pers\|pron\|nomin\|vol\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|obl\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|2v\|ev`, `VNW\|pers\|pron\|obl\|vol\|3p\|mv`, `VNW\|pers\|pron\|obl\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|3\|getal\|fem`, `VNW\|pers\|pron\|stan\|nadr\|2v\|mv`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|fem`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|onz`, `VNW\|pers\|pron\|stan\|red\|3\|mv`, 
`VNW\|pr\|pron\|obl\|nadr\|1\|ev`, `VNW\|pr\|pron\|obl\|nadr\|2v\|getal`, `VNW\|pr\|pron\|obl\|nadr\|2\|getal`, `VNW\|pr\|pron\|obl\|red\|1\|ev`, `VNW\|pr\|pron\|obl\|red\|2v\|getal`, `VNW\|pr\|pron\|obl\|vol\|1\|ev`, `VNW\|pr\|pron\|obl\|vol\|1\|mv`, `VNW\|pr\|pron\|obl\|vol\|2\|getal`, `VNW\|recip\|pron\|gen\|vol\|persoon\|mv`, `VNW\|recip\|pron\|obl\|vol\|persoon\|mv`, `VNW\|refl\|pron\|obl\|nadr\|3\|getal`, `VNW\|refl\|pron\|obl\|red\|3\|getal`, `VNW\|vb\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|vb\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|vb\|det\|stan\|prenom\|met-e\|rest`, `VNW\|vb\|det\|stan\|prenom\|zonder\|evon`, `VNW\|vb\|pron\|gen\|vol\|3m\|ev`, `VNW\|vb\|pron\|gen\|vol\|3p\|mv`, `VNW\|vb\|pron\|gen\|vol\|3v\|ev`, `VNW\|vb\|pron\|stan\|vol\|3o\|ev`, `VNW\|vb\|pron\|stan\|vol\|3p\|getal`, `VZ\|fin`, `VZ\|init`, `VZ\|versm`, `WW\|inf\|nom\|zonder\|zonder-n`, `WW\|inf\|prenom\|met-e`, `WW\|inf\|vrij\|zonder`, `WW\|od\|nom\|met-e\|mv-n`, `WW\|od\|nom\|met-e\|zonder-n`, `WW\|od\|prenom\|met-e`, `WW\|od\|prenom\|zonder`, `WW\|od\|vrij\|zonder`, `WW\|pv\|conj\|ev`, `WW\|pv\|tgw\|ev`, `WW\|pv\|tgw\|met-t`, `WW\|pv\|tgw\|mv`, `WW\|pv\|verl\|ev`, `WW\|pv\|verl\|mv`, `WW\|vd\|nom\|met-e\|mv-n`, `WW\|vd\|nom\|met-e\|zonder-n`, `WW\|vd\|prenom\|met-e`, `WW\|vd\|prenom\|zonder`, `WW\|vd\|vrij\|zonder` |
85
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `csubj`, `dep`, `det`, `expl`, `expl:pv`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `orphan`, `parataxis`, `punct`, `xcomp` |
86
  | **`senter`** | `I`, `S` |
87
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
@@ -92,16 +92,22 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
92
 
93
  | Type | Score |
94
  | --- | --- |
95
- | `TAG_ACC` | 93.49 |
96
- | `DEP_UAS` | 85.53 |
97
- | `DEP_LAS` | 80.57 |
98
- | `ENTS_P` | 73.05 |
99
- | `ENTS_R` | 70.12 |
100
- | `ENTS_F` | 71.56 |
101
- | `SENTS_P` | 85.12 |
102
- | `SENTS_R` | 89.02 |
103
- | `SENTS_F` | 87.03 |
104
  | `TOKEN_ACC` | 99.97 |
105
- | `POS_ACC` | 95.46 |
106
- | `MORPH_ACC` | 94.58 |
107
- | `LEMMA_ACC` | 85.21 |
4
  - token-classification
5
  language:
6
  - nl
7
+ license: cc-by-sa-4.0
8
  model-index:
9
  - name: nl_core_news_sm
10
  results:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.7446964155
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.704011065
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.7237824387
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
+ value: 0.9392161567
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
+ value: 0.8339006127
38
  - name: SENTER Recall
39
  type: recall
40
+ value: 0.8787661406
41
  - name: SENTER F Score
42
  type: f_score
43
+ value: 0.8557457213
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
+ value: 0.8546007605
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
+ value: 0.8546007605
58
  ---
59
  ### Details: https://spacy.io/models/nl#nl_core_news_sm
60
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `nl_core_news_sm` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
71
+ | **Sources** | [UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[Dutch NER Annotations for UD LassySmall](https://nlp.town) (NLP Town)<br />[UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[UD Dutch Alpino v2.8](https://github.com/UniversalDependencies/UD_Dutch-Alpino) (Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
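
The updated **spaCy** row pins the pipeline to `>=3.2.0,<3.3.0`. As a rough illustration of what that range admits, here is a minimal sketch that evaluates such a specifier with the standard library only. The helper names `parse_version` and `satisfies` are hypothetical, and the parsing is a simplification: it handles plain dotted numeric versions and ignores pre-release or local version suffixes.

```python
import operator

# Comparison operators supported by this simplified sketch.
# Two-character operators are listed first so ">=" is not mistaken for ">".
_OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Turn a dotted numeric version such as '3.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(candidate, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


# Range taken from the updated table row; the candidate versions are examples.
SPACY_RANGE = ">=3.2.0,<3.3.0"
for candidate in ("3.1.0", "3.2.1", "3.3.0"):
    print(candidate, satisfies(candidate, SPACY_RANGE))
```

For real dependency checks, a full PEP 440 implementation is the safer choice; this sketch only illustrates the meaning of the range.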
74
 
76
 
77
  <details>
78
 
79
+ <summary>View label scheme (323 labels for 5 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
+ | **`morphologizer`** | `POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=ADV`, `POS=VERB\|VerbForm=Part`, `POS=PUNCT`, `Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `POS=ADP`, `POS=NUM`, `Number=Plur\|POS=NOUN`, `POS=VERB\|VerbForm=Inf`, `POS=SCONJ`, `Definite=Def\|POS=DET`, `Gender=Com\|Number=Sing\|POS=NOUN`, `Number=Sing\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Degree=Pos\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=PROPN`, `Gender=Com\|Number=Sing\|POS=PROPN`, `POS=AUX\|VerbForm=Inf`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `POS=DET`, `Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|Person=3\|PronType=Prs`, `POS=CCONJ`, `Number=Plur\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|Person=3\|PronType=Ind`, `Degree=Cmp\|POS=ADJ`, `Case=Nom\|POS=PRON\|Person=1\|PronType=Prs`, `Definite=Ind\|POS=DET`, `Case=Nom\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Number=Plur\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Case=Acc\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `Gender=Com,Neut\|Number=Sing\|POS=NOUN`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PROPN`, `POS=PRON\|PronType=Ind`, `POS=PRON\|Person=3\|PronType=Int`, `Case=Acc\|POS=PRON\|PronType=Rcp`, `Number=Plur\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `Number=Sing\|POS=NOUN`, `POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `POS=SYM`, `Abbr=Yes\|POS=X`, `Gender=Com,Neut\|Number=Sing\|POS=PROPN`, `Degree=Sup\|POS=ADJ`, `POS=ADJ`, `Number=Sing\|POS=PROPN`, `POS=PRON\|PronType=Dem`, `POS=AUX\|VerbForm=Part`, `POS=PRON\|Person=3\|PronType=Rel`, `Number=Plur\|POS=PROPN`, `POS=PRON\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Dat\|POS=PRON\|PronType=Dem`, `Case=Nom\|POS=PRON\|Person=2\|PronType=Prs`, `POS=INTJ`, `Case=Acc\|POS=PRON\|Person=2\|PronType=Prs`, `Case=Gen\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, 
`POS=PRON\|PronType=Int`, `POS=PRON\|Person=2\|PronType=Prs`, `POS=PRON\|Person=3`, `Case=Gen\|POS=PRON\|Person=2\|PronType=Prs`, `POS=X` |
84
+ | **`tagger`** | `ADJ\|nom\|basis\|met-e\|mv-n`, `ADJ\|nom\|basis\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|basis\|met-e\|zonder-n\|stan`, `ADJ\|nom\|basis\|zonder\|mv-n`, `ADJ\|nom\|basis\|zonder\|zonder-n`, `ADJ\|nom\|comp\|met-e\|mv-n`, `ADJ\|nom\|comp\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|met-e\|mv-n`, `ADJ\|nom\|sup\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|sup\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|zonder\|zonder-n`, `ADJ\|postnom\|basis\|met-s`, `ADJ\|postnom\|basis\|zonder`, `ADJ\|postnom\|comp\|met-s`, `ADJ\|prenom\|basis\|met-e\|bijz`, `ADJ\|prenom\|basis\|met-e\|stan`, `ADJ\|prenom\|basis\|zonder`, `ADJ\|prenom\|comp\|met-e\|stan`, `ADJ\|prenom\|comp\|zonder`, `ADJ\|prenom\|sup\|met-e\|stan`, `ADJ\|prenom\|sup\|zonder`, `ADJ\|vrij\|basis\|zonder`, `ADJ\|vrij\|comp\|zonder`, `ADJ\|vrij\|dim\|zonder`, `ADJ\|vrij\|sup\|zonder`, `BW`, `LET`, `LID\|bep\|dat\|evmo`, `LID\|bep\|gen\|evmo`, `LID\|bep\|gen\|rest3`, `LID\|bep\|stan\|evon`, `LID\|bep\|stan\|rest`, `LID\|onbep\|stan\|agr`, `N\|eigen\|ev\|basis\|gen`, `N\|eigen\|ev\|basis\|genus\|stan`, `N\|eigen\|ev\|basis\|onz\|stan`, `N\|eigen\|ev\|basis\|zijd\|stan`, `N\|eigen\|ev\|dim\|onz\|stan`, `N\|eigen\|mv\|basis`, `N\|soort\|ev\|basis\|dat`, `N\|soort\|ev\|basis\|gen`, `N\|soort\|ev\|basis\|genus\|stan`, `N\|soort\|ev\|basis\|onz\|stan`, `N\|soort\|ev\|basis\|zijd\|stan`, `N\|soort\|ev\|dim\|onz\|stan`, `N\|soort\|mv\|basis`, `N\|soort\|mv\|dim`, `SPEC\|afgebr`, `SPEC\|afk`, `SPEC\|deeleigen`, `SPEC\|enof`, `SPEC\|meta`, `SPEC\|symb`, `SPEC\|vreemd`, `TSW`, `TW\|hoofd\|nom\|mv-n\|basis`, `TW\|hoofd\|nom\|mv-n\|dim`, `TW\|hoofd\|nom\|zonder-n\|basis`, `TW\|hoofd\|nom\|zonder-n\|dim`, `TW\|hoofd\|prenom\|stan`, `TW\|hoofd\|vrij`, `TW\|rang\|nom\|mv-n`, `TW\|rang\|nom\|zonder-n`, `TW\|rang\|prenom\|stan`, `VG\|neven`, `VG\|onder`, `VNW\|aanw\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|aanw\|adv-pron\|stan\|red\|3\|getal`, `VNW\|aanw\|det\|dat\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|dat\|prenom\|met-e\|evmo`, 
`VNW\|aanw\|det\|gen\|prenom\|met-e\|rest3`, `VNW\|aanw\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|aanw\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|stan\|prenom\|met-e\|rest`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|agr`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|evon`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|rest`, `VNW\|aanw\|det\|stan\|vrij\|zonder`, `VNW\|aanw\|pron\|gen\|vol\|3m\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3o\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3\|getal`, `VNW\|betr\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|betr\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|betr\|pron\|stan\|vol\|3\|ev`, `VNW\|betr\|pron\|stan\|vol\|persoon\|getal`, `VNW\|bez\|det\|gen\|vol\|3\|ev\|prenom\|met-e\|rest3`, `VNW\|bez\|det\|stan\|nadr\|2v\|mv\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|zonder\|evon`, `VNW\|bez\|det\|stan\|vol\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|2\|getal\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3p\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3\|mv\|prenom\|zonder\|agr`, `VNW\|excl\|pron\|stan\|vol\|3\|getal`, `VNW\|onbep\|adv-pron\|gen\|red\|3\|getal`, `VNW\|onbep\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|onbep\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|onbep\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|onbep\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|agr`, 
`VNW\|onbep\|det\|stan\|prenom\|met-e\|evz`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|mv`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|rest`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|agr`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|evon`, `VNW\|onbep\|det\|stan\|vrij\|zonder`, `VNW\|onbep\|grad\|gen\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|sup`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|comp`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|mv\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|basis`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|sup`, `VNW\|onbep\|pron\|gen\|vol\|3p\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3o\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3p\|ev`, `VNW\|pers\|pron\|gen\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|nomin\|nadr\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|red\|1\|mv`, `VNW\|pers\|pron\|nomin\|red\|2v\|ev`, `VNW\|pers\|pron\|nomin\|red\|2\|getal`, `VNW\|pers\|pron\|nomin\|red\|3p\|ev\|masc`, `VNW\|pers\|pron\|nomin\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|nomin\|vol\|1\|ev`, `VNW\|pers\|pron\|nomin\|vol\|1\|mv`, `VNW\|pers\|pron\|nomin\|vol\|2b\|getal`, `VNW\|pers\|pron\|nomin\|vol\|2v\|ev`, `VNW\|pers\|pron\|nomin\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|vol\|3p\|mv`, `VNW\|pers\|pron\|nomin\|vol\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|obl\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|2v\|ev`, `VNW\|pers\|pron\|obl\|vol\|3p\|mv`, `VNW\|pers\|pron\|obl\|vol\|3\|ev\|masc`, 
`VNW\|pers\|pron\|obl\|vol\|3\|getal\|fem`, `VNW\|pers\|pron\|stan\|nadr\|2v\|mv`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|fem`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|onz`, `VNW\|pers\|pron\|stan\|red\|3\|mv`, `VNW\|pr\|pron\|obl\|nadr\|1\|ev`, `VNW\|pr\|pron\|obl\|nadr\|2v\|getal`, `VNW\|pr\|pron\|obl\|nadr\|2\|getal`, `VNW\|pr\|pron\|obl\|red\|1\|ev`, `VNW\|pr\|pron\|obl\|red\|2v\|getal`, `VNW\|pr\|pron\|obl\|vol\|1\|ev`, `VNW\|pr\|pron\|obl\|vol\|1\|mv`, `VNW\|pr\|pron\|obl\|vol\|2\|getal`, `VNW\|recip\|pron\|gen\|vol\|persoon\|mv`, `VNW\|recip\|pron\|obl\|vol\|persoon\|mv`, `VNW\|refl\|pron\|obl\|nadr\|3\|getal`, `VNW\|refl\|pron\|obl\|red\|3\|getal`, `VNW\|vb\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|vb\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|vb\|det\|stan\|prenom\|met-e\|rest`, `VNW\|vb\|det\|stan\|prenom\|zonder\|evon`, `VNW\|vb\|pron\|gen\|vol\|3m\|ev`, `VNW\|vb\|pron\|gen\|vol\|3p\|mv`, `VNW\|vb\|pron\|gen\|vol\|3v\|ev`, `VNW\|vb\|pron\|stan\|vol\|3o\|ev`, `VNW\|vb\|pron\|stan\|vol\|3p\|getal`, `VZ\|fin`, `VZ\|init`, `VZ\|versm`, `WW\|inf\|nom\|zonder\|zonder-n`, `WW\|inf\|prenom\|met-e`, `WW\|inf\|vrij\|zonder`, `WW\|od\|nom\|met-e\|mv-n`, `WW\|od\|nom\|met-e\|zonder-n`, `WW\|od\|prenom\|met-e`, `WW\|od\|prenom\|zonder`, `WW\|od\|vrij\|zonder`, `WW\|pv\|conj\|ev`, `WW\|pv\|tgw\|ev`, `WW\|pv\|tgw\|met-t`, `WW\|pv\|tgw\|mv`, `WW\|pv\|verl\|ev`, `WW\|pv\|verl\|mv`, `WW\|vd\|nom\|met-e\|mv-n`, `WW\|vd\|nom\|met-e\|zonder-n`, `WW\|vd\|prenom\|met-e`, `WW\|vd\|prenom\|zonder`, `WW\|vd\|vrij\|zonder` |
85
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `csubj`, `dep`, `det`, `expl`, `expl:pv`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `orphan`, `parataxis`, `punct`, `xcomp` |
86
  | **`senter`** | `I`, `S` |
87
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
92
 
93
  | Type | Score |
94
  | --- | --- |
95
  | `TOKEN_ACC` | 99.97 |
96
+ | `TOKEN_P` | 99.74 |
97
+ | `TOKEN_R` | 99.76 |
98
+ | `TOKEN_F` | 99.75 |
99
+ | `POS_ACC` | 95.81 |
100
+ | `MORPH_ACC` | 94.99 |
101
+ | `MORPH_MICRO_P` | 95.89 |
102
+ | `MORPH_MICRO_R` | 93.84 |
103
+ | `MORPH_MICRO_F` | 94.85 |
104
+ | `TAG_ACC` | 93.92 |
105
+ | `SENTS_P` | 83.39 |
106
+ | `SENTS_R` | 87.88 |
107
+ | `SENTS_F` | 85.57 |
108
+ | `DEP_UAS` | 85.46 |
109
+ | `DEP_LAS` | 80.63 |
110
+ | `LEMMA_ACC` | 81.33 |
111
+ | `ENTS_P` | 74.47 |
112
+ | `ENTS_R` | 70.40 |
113
+ | `ENTS_F` | 72.38 |
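
The F scores in the updated metadata can be cross-checked against their precision and recall values, assuming they follow the usual harmonic-mean definition F = 2PR / (P + R). The sketch below uses only the new NER and SENTER values from this commit; the function name `f_score` and the `reported` dictionary are illustrative, not part of the repository.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the updated model-index values.
reported = {
    "NER": (0.7446964155, 0.704011065, 0.7237824387),
    "SENTER": (0.8339006127, 0.8787661406, 0.8557457213),
}

for name, (p, r, f) in reported.items():
    computed = f_score(p, r)
    # Allow a small tolerance because the published values are rounded.
    consistent = abs(computed - f) < 1e-6
    print(f"{name}: computed={computed:.10f} reported={f:.10f} consistent={consistent}")
```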
accuracy.json CHANGED
@@ -1,352 +1,353 @@
1
  {
2
- "tag_acc": 0.9349210503,
3
- "dep_uas": 0.8552811724,
4
- "dep_las": 0.8057499239,
5
- "ents_p": 0.7305475504,
6
- "ents_r": 0.7012448133,
7
- "ents_f": 0.7155963303,
8
- "sents_p": 0.8511659808,
9
- "sents_r": 0.8902439024,
10
- "sents_f": 0.8702664797,
11
- "speed": 3825.7629872661,
12
- "dep_las_per_type": {
13
- "nmod:poss": {
14
- "p": 0.9553903346,
15
- "r": 0.9379562044,
16
- "f": 0.9465930018
17
  },
18
- "nsubj": {
19
- "p": 0.828229028,
20
- "r": 0.8178829717,
21
- "f": 0.8230234866
22
  },
23
- "aux": {
24
- "p": 0.8929765886,
25
- "r": 0.8782894737,
26
- "f": 0.8855721393
27
  },
28
- "advmod": {
29
- "p": 0.7687016337,
30
- "r": 0.786971831,
31
- "f": 0.7777294476
32
  },
33
- "root": {
34
- "p": 0.8504801097,
35
- "r": 0.8895265423,
36
- "f": 0.8695652174
37
  },
38
- "det": {
39
- "p": 0.93700126,
40
- "r": 0.9645482058,
41
- "f": 0.9505752024
42
  },
43
- "amod": {
44
- "p": 0.8596646072,
45
- "r": 0.8727598566,
46
- "f": 0.866162739
47
  },
48
- "obl": {
49
- "p": 0.737449118,
50
- "r": 0.7409679618,
51
- "f": 0.7392043523
52
  },
53
- "mark": {
54
- "p": 0.887477314,
55
- "r": 0.8907103825,
56
- "f": 0.8890909091
57
  },
58
- "ccomp": {
59
- "p": 0.6146788991,
60
- "r": 0.6261682243,
61
- "f": 0.6203703704
62
  },
63
- "case": {
64
- "p": 0.9337199669,
65
- "r": 0.9550847458,
66
- "f": 0.9442815249
67
  },
68
- "appos": {
69
- "p": 0.7269736842,
70
- "r": 0.6696969697,
71
- "f": 0.6971608833
72
  },
73
- "obj": {
74
- "p": 0.7103347889,
75
- "r": 0.743902439,
76
- "f": 0.7267311988
77
  },
78
- "compound:prt": {
79
- "p": 0.7638190955,
80
- "r": 0.7004608295,
81
- "f": 0.7307692308
82
  },
83
- "xcomp": {
84
- "p": 0.6356877323,
85
- "r": 0.6286764706,
86
- "f": 0.6321626617
87
  },
88
- "flat": {
89
- "p": 0.8104046243,
90
- "r": 0.7619565217,
91
- "f": 0.7854341737
92
  },
93
- "expl:pv": {
94
- "p": 0.7021276596,
95
- "r": 0.75,
96
- "f": 0.7252747253
97
  },
98
- "acl": {
99
- "p": 0.5064935065,
100
- "r": 0.4020618557,
101
- "f": 0.4482758621
102
  },
103
  "advcl": {
104
- "p": 0.4837209302,
105
- "r": 0.4684684685,
106
- "f": 0.47597254
107
  },
108
- "nummod": {
109
- "p": 0.8301282051,
110
- "r": 0.8633333333,
111
- "f": 0.8464052288
112
  },
113
- "nmod": {
114
- "p": 0.7062240664,
115
- "r": 0.7438811189,
116
- "f": 0.7245636441
117
  },
118
  "cc": {
119
- "p": 0.8563327032,
120
- "r": 0.8595825427,
121
- "f": 0.8579545455
122
  },
123
  "conj": {
124
- "p": 0.5967957276,
125
- "r": 0.6081632653,
126
- "f": 0.602425876
127
- },
128
- "nsubj:pass": {
129
- "p": 0.753164557,
130
- "r": 0.748427673,
131
- "f": 0.7507886435
132
  },
133
- "aux:pass": {
134
- "p": 0.8681318681,
135
- "r": 0.8777777778,
136
- "f": 0.8729281768
137
  },
138
- "cop": {
139
- "p": 0.7167235495,
140
- "r": 0.7720588235,
141
- "f": 0.7433628319
142
  },
143
- "parataxis": {
144
- "p": 0.3440366972,
145
- "r": 0.2737226277,
146
- "f": 0.3048780488
147
  },
148
- "iobj": {
149
- "p": 0.6315789474,
150
- "r": 0.3636363636,
151
- "f": 0.4615384615
152
  },
153
- "acl:relcl": {
154
- "p": 0.5914634146,
155
- "r": 0.6100628931,
156
- "f": 0.600619195
157
  },
158
- "expl": {
159
- "p": 0.5,
160
- "r": 0.5714285714,
161
- "f": 0.5333333333
162
  },
163
- "fixed": {
164
- "p": 0.7220630372,
165
- "r": 0.4337349398,
166
- "f": 0.5419354839
167
  },
168
- "obl:agent": {
169
- "p": 1.0,
170
- "r": 0.8620689655,
171
- "f": 0.9259259259
172
  },
173
- "csubj": {
174
- "p": 0.5333333333,
175
- "r": 0.4,
176
- "f": 0.4571428571
177
  },
178
  "dep": {
179
  "p": 0.0,
180
  "r": 0.0,
181
  "f": 0.0
182
  },
183
  "orphan": {
184
  "p": 0.0,
185
  "r": 0.0,
186
  "f": 0.0
187
  }
188
  },
189
  "ents_per_type": {
190
  "DATE": {
191
- "p": 0.9459459459,
192
- "r": 0.9090909091,
193
- "f": 0.9271523179
194
  },
195
  "NORP": {
196
- "p": 0.6698113208,
197
- "r": 0.8554216867,
198
- "f": 0.7513227513
199
  },
200
  "ORG": {
201
- "p": 0.728,
202
- "r": 0.5384615385,
203
- "f": 0.619047619
204
  },
205
  "CARDINAL": {
206
- "p": 0.8815789474,
207
  "r": 0.9571428571,
208
- "f": 0.9178082192
209
  },
210
  "GPE": {
211
- "p": 0.6724137931,
212
- "r": 0.8571428571,
213
- "f": 0.7536231884
214
  },
215
  "MONEY": {
216
- "p": 0.2,
217
- "r": 0.3333333333,
218
- "f": 0.25
219
  },
220
  "PERCENT": {
221
- "p": 1.0,
222
  "r": 1.0,
223
- "f": 1.0
224
  },
225
  "PERSON": {
226
- "p": 0.6907894737,
227
- "r": 0.6796116505,
228
- "f": 0.6851549755
229
  },
230
  "LAW": {
231
  "p": 1.0,
232
  "r": 1.0,
233
  "f": 1.0
234
  },
235
- "ORDINAL": {
236
- "p": 0.9090909091,
237
- "r": 0.9090909091,
238
- "f": 0.9090909091
239
  },
240
- "WORK_OF_ART": {
241
- "p": 0.6203703704,
242
- "r": 0.3722222222,
243
- "f": 0.4652777778
244
  },
245
  "LANGUAGE": {
246
- "p": 0.5833333333,
247
  "r": 0.6363636364,
248
- "f": 0.6086956522
249
  },
250
  "QUANTITY": {
251
- "p": 0.8461538462,
252
  "r": 0.9166666667,
253
- "f": 0.88
254
  },
255
  "LOC": {
256
- "p": 0.2631578947,
257
- "r": 0.1470588235,
258
- "f": 0.1886792453
259
  },
260
  "FAC": {
261
- "p": 0.037037037,
262
  "r": 0.0714285714,
263
- "f": 0.0487804878
264
  },
265
  "EVENT": {
266
- "p": 0.3684210526,
267
- "r": 0.3043478261,
268
- "f": 0.3333333333
269
- },
270
- "PRODUCT": {
271
- "p": 0.0,
272
- "r": 0.0,
273
- "f": 0.0
274
  },
275
  "TIME": {
276
- "p": 1.0,
277
  "r": 1.0,
278
- "f": 1.0
279
  }
280
  },
281
- "token_acc": 0.9997165842,
282
- "pos_acc": 0.9546138579,
283
- "morph_acc": 0.9458392409,
284
- "lemma_acc": 0.8520836514,
285
- "morph_per_feat": {
286
- "Person": {
287
- "p": 0.986328125,
288
- "r": 0.964660936,
289
- "f": 0.9753742154
290
- },
291
- "Poss": {
292
- "p": 0.9846153846,
293
- "r": 0.9733840304,
294
- "f": 0.9789674952
295
- },
296
- "PronType": {
297
- "p": 0.9872013652,
298
- "r": 0.9609634551,
299
- "f": 0.9739057239
300
- },
301
- "Gender": {
302
- "p": 0.8911903513,
303
- "r": 0.860724234,
304
- "f": 0.875692387
305
- },
306
- "Number": {
307
- "p": 0.9758520556,
308
- "r": 0.9529850746,
309
- "f": 0.9642830174
310
- },
311
- "Tense": {
312
- "p": 0.9692565679,
313
- "r": 0.9527472527,
314
- "f": 0.9609310058
315
- },
316
- "VerbForm": {
317
- "p": 0.9379733141,
318
- "r": 0.9359481828,
319
- "f": 0.9369596542
320
- },
321
- "Degree": {
322
- "p": 0.9393491124,
323
- "r": 0.9058487874,
324
- "f": 0.9222948439
325
- },
326
- "Definite": {
327
- "p": 0.9951584507,
328
- "r": 0.9947206335,
329
- "f": 0.9949394939
330
- },
331
- "Case": {
332
- "p": 0.9940239044,
333
- "r": 0.9940239044,
334
- "f": 0.9940239044
335
- },
336
- "Reflex": {
337
- "p": 1.0,
338
- "r": 1.0,
339
- "f": 1.0
340
- },
341
- "Foreign": {
342
- "p": 0.68,
343
- "r": 0.2615384615,
344
- "f": 0.3777777778
345
- },
346
- "Abbr": {
347
- "p": 0.8333333333,
348
- "r": 0.8333333333,
349
- "f": 0.8333333333
350
- }
351
- }
352
  }
1
  {
2
+ "token_acc": 0.9997165842,
3
+ "token_p": 0.9974281853,
4
+ "token_r": 0.9975586363,
5
+ "token_f": 0.9974934066,
6
+ "pos_acc": 0.9581243184,
7
+ "morph_acc": 0.9498527648,
8
+ "morph_micro_p": 0.9588741568,
9
+ "morph_micro_r": 0.9384419251,
10
+ "morph_micro_f": 0.9485480234,
11
+ "morph_per_feat": {
12
+ "Person": {
13
+ "p": 0.991202346,
14
+ "r": 0.9694072658,
15
+ "f": 0.9801836636
 
16
  },
17
+ "Poss": {
18
+ "p": 0.9923076923,
19
+ "r": 0.9885057471,
20
+ "f": 0.990403071
21
  },
22
+ "PronType": {
23
+ "p": 0.9914310197,
24
+ "r": 0.961762261,
25
+ "f": 0.976371308
26
  },
27
+ "Gender": {
28
+ "p": 0.9029177719,
29
+ "r": 0.8659374205,
30
+ "f": 0.8840410336
31
  },
32
+ "Number": {
33
+ "p": 0.9755422243,
34
+ "r": 0.949259093,
35
+ "f": 0.9622212107
36
  },
37
+ "Tense": {
38
+ "p": 0.9620958751,
39
+ "r": 0.9488730071,
40
+ "f": 0.9554386936
41
  },
42
+ "VerbForm": {
43
+ "p": 0.9396676301,
44
+ "r": 0.9359481828,
45
+ "f": 0.9378042185
46
  },
47
+ "Degree": {
48
+ "p": 0.9345930233,
49
+ "r": 0.9238505747,
50
+ "f": 0.9291907514
51
  },
52
+ "Definite": {
53
+ "p": 0.995584989,
54
+ "r": 0.9925176056,
55
+ "f": 0.994048931
56
  },
57
+ "Case": {
58
+ "p": 0.9960159363,
59
+ "r": 0.9960159363,
60
+ "f": 0.9960159363
61
  },
62
+ "Reflex": {
63
+ "p": 1.0,
64
+ "r": 1.0,
65
+ "f": 1.0
66
  },
67
+ "Abbr": {
68
+ "p": 0.9,
69
+ "r": 0.5,
70
+ "f": 0.6428571429
71
+ }
72
+ },
73
+ "tag_acc": 0.9392161567,
74
+ "sents_p": 0.8339006127,
75
+ "sents_r": 0.8787661406,
76
+ "sents_f": 0.8557457213,
77
+ "dep_uas": 0.8546007605,
78
+ "dep_las": 0.8062660009,
79
+ "dep_las_per_type": {
80
+ "det": {
81
+ "p": 0.8796718323,
82
+ "r": 0.9516765286,
83
+ "f": 0.9142586452
84
  },
85
+ "nsubj": {
86
+ "p": 0.7457627119,
87
+ "r": 0.7857142857,
88
+ "f": 0.7652173913
89
  },
90
+ "root": {
91
+ "p": 0.7325428195,
92
+ "r": 0.8298507463,
93
+ "f": 0.77816655
94
  },
95
+ "case": {
96
+ "p": 0.8851174935,
97
+ "r": 0.9373271889,
98
+ "f": 0.9104744852
99
  },
100
+ "obl": {
101
+ "p": 0.7081339713,
102
+ "r": 0.7036450079,
103
+ "f": 0.7058823529
104
  },
105
+ "nmod": {
106
+ "p": 0.6162420382,
107
+ "r": 0.6753926702,
108
+ "f": 0.6444629475
109
  },
110
+ "advmod": {
111
+ "p": 0.6918604651,
112
+ "r": 0.7212121212,
113
+ "f": 0.706231454
114
+ },
115
+ "obj": {
116
+ "p": 0.717472119,
117
+ "r": 0.7121771218,
118
+ "f": 0.7148148148
119
+ },
120
+ "mark": {
121
+ "p": 0.8205128205,
122
+ "r": 0.7933884298,
123
+ "f": 0.8067226891
124
  },
125
  "advcl": {
126
+ "p": 0.52,
127
+ "r": 0.4297520661,
128
+ "f": 0.4705882353
129
  },
130
+ "amod": {
131
+ "p": 0.8108581436,
132
+ "r": 0.8448905109,
133
+ "f": 0.8275245755
134
  },
135
+ "acl:relcl": {
136
+ "p": 0.6329113924,
137
+ "r": 0.6097560976,
138
+ "f": 0.6211180124
139
+ },
140
+ "cop": {
141
+ "p": 0.7094594595,
142
+ "r": 0.5769230769,
143
+ "f": 0.6363636364
144
  },
145
  "cc": {
146
+ "p": 0.798013245,
147
+ "r": 0.8281786942,
148
+ "f": 0.8128161889
149
  },
150
  "conj": {
151
+ "p": 0.54784689,
152
+ "r": 0.5066371681,
153
+ "f": 0.5264367816
154
  },
155
+ "fixed": {
156
+ "p": 0.7045454545,
157
+ "r": 0.2520325203,
158
+ "f": 0.371257485
159
  },
160
+ "flat": {
161
+ "p": 0.7467889908,
162
+ "r": 0.6933560477,
163
+ "f": 0.7190812721
164
  },
165
+ "csubj": {
166
+ "p": 0.75,
167
+ "r": 0.5,
168
+ "f": 0.6
169
  },
170
+ "aux": {
171
+ "p": 0.7647058824,
172
+ "r": 0.7572815534,
173
+ "f": 0.7609756098
174
  },
175
+ "compound:prt": {
176
+ "p": 0.7727272727,
177
+ "r": 0.6623376623,
178
+ "f": 0.7132867133
179
  },
180
+ "nummod": {
181
+ "p": 0.5641025641,
182
+ "r": 0.602739726,
183
+ "f": 0.582781457
184
  },
185
+ "acl": {
186
+ "p": 0.4375,
187
+ "r": 0.3559322034,
188
+ "f": 0.3925233645
189
  },
190
+ "expl": {
191
+ "p": 0.4,
192
+ "r": 0.3333333333,
193
+ "f": 0.3636363636
194
  },
195
+ "appos": {
196
+ "p": 0.6092715232,
197
+ "r": 0.5317919075,
198
+ "f": 0.5679012346
199
  },
200
  "dep": {
201
  "p": 0.0,
202
  "r": 0.0,
203
  "f": 0.0
204
  },
205
+ "nsubj:pass": {
206
+ "p": 0.7882352941,
207
+ "r": 0.7790697674,
208
+ "f": 0.783625731
209
+ },
210
+ "aux:pass": {
211
+ "p": 0.8252427184,
212
+ "r": 0.8673469388,
213
+ "f": 0.8457711443
214
+ },
215
+ "ccomp": {
216
+ "p": 0.5882352941,
217
+ "r": 0.5882352941,
218
+ "f": 0.5882352941
219
+ },
220
+ "xcomp": {
221
+ "p": 0.36,
222
+ "r": 0.6164383562,
223
+ "f": 0.4545454545
224
+ },
225
+ "parataxis": {
226
+ "p": 0.350877193,
227
+ "r": 0.2684563758,
228
+ "f": 0.3041825095
229
+ },
230
+ "expl:pv": {
231
+ "p": 0.8095238095,
232
+ "r": 0.8947368421,
233
+ "f": 0.85
234
+ },
235
+ "iobj": {
236
+ "p": 0.3636363636,
237
+ "r": 0.4,
238
+ "f": 0.380952381
239
+ },
240
+ "nmod:poss": {
241
+ "p": 0.8940397351,
242
+ "r": 0.8823529412,
243
+ "f": 0.8881578947
244
+ },
245
+ "obl:agent": {
246
+ "p": 0.6875,
247
+ "r": 0.7857142857,
248
+ "f": 0.7333333333
249
+ },
250
  "orphan": {
251
  "p": 0.0,
252
  "r": 0.0,
253
  "f": 0.0
254
  }
255
  },
256
+ "lemma_acc": 0.8133109449,
257
+ "ents_p": 0.7446964155,
258
+ "ents_r": 0.704011065,
259
+ "ents_f": 0.7237824387,
260
  "ents_per_type": {
261
  "DATE": {
262
+ "p": 0.9317406143,
263
+ "r": 0.8863636364,
264
+ "f": 0.9084858569
265
  },
266
  "NORP": {
267
+ "p": 0.693877551,
268
+ "r": 0.8192771084,
269
+ "f": 0.7513812155
270
  },
271
  "ORG": {
272
+ "p": 0.762295082,
273
+ "r": 0.550295858,
274
+ "f": 0.6391752577
275
  },
276
  "CARDINAL": {
277
+ "p": 0.858974359,
278
  "r": 0.9571428571,
279
+ "f": 0.9054054054
280
  },
281
  "GPE": {
282
+ "p": 0.6722689076,
283
+ "r": 0.8791208791,
284
+ "f": 0.7619047619
285
  },
286
  "MONEY": {
287
+ "p": 0.0,
288
+ "r": 0.0,
289
+ "f": 0.0
290
  },
291
  "PERCENT": {
292
+ "p": 0.6666666667,
293
  "r": 1.0,
294
+ "f": 0.8
295
  },
296
  "PERSON": {
297
+ "p": 0.7281879195,
298
+ "r": 0.7022653722,
299
+ "f": 0.7149917628
300
+ },
301
+ "WORK_OF_ART": {
302
+ "p": 0.6146788991,
303
+ "r": 0.3722222222,
304
+ "f": 0.4636678201
305
  },
306
  "LAW": {
307
  "p": 1.0,
308
  "r": 1.0,
309
  "f": 1.0
310
  },
311
+ "PRODUCT": {
312
+ "p": 0.0,
313
+ "r": 0.0,
314
+ "f": 0.0
315
  },
316
+ "ORDINAL": {
317
+ "p": 0.90625,
318
+ "r": 0.8787878788,
319
+ "f": 0.8923076923
320
  },
321
  "LANGUAGE": {
322
+ "p": 0.4375,
323
  "r": 0.6363636364,
324
+ "f": 0.5185185185
325
  },
326
  "QUANTITY": {
327
+ "p": 0.9166666667,
328
  "r": 0.9166666667,
329
+ "f": 0.9166666667
330
  },
331
  "LOC": {
332
+ "p": 0.6428571429,
333
+ "r": 0.2647058824,
334
+ "f": 0.375
335
  },
336
  "FAC": {
337
+ "p": 0.0625,
338
  "r": 0.0714285714,
339
+ "f": 0.0666666667
340
  },
341
  "EVENT": {
342
+ "p": 0.3157894737,
343
+ "r": 0.2608695652,
344
+ "f": 0.2857142857
345
  },
346
  "TIME": {
347
+ "p": 0.5,
348
  "r": 1.0,
349
+ "f": 0.6666666667
350
  }
351
  },
352
+ "speed": 3273.3086753398
353
  }
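Each per-type entry in the scores above reports precision (`p`), recall (`r`), and F-score (`f`). The reported values are consistent with `f` being the harmonic mean of `p` and `r`. The following is a minimal Python sketch under that assumption; the helper name `f_score` is illustrative and is not part of spaCy. It checks three entries copied from the updated file:

```python
import math


def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (p, r, f) triples copied from the updated scores above.
reported = {
    "ents_per_type.TIME": (0.5, 1.0, 0.6666666667),
    "ents_per_type.PERCENT": (0.6666666667, 1.0, 0.8),
    "morph_per_feat.Abbr": (0.9, 0.5, 0.6428571429),
}

for name, (p, r, f) in reported.items():
    # The file stores ten decimal places, so compare with a small tolerance.
    assert math.isclose(f_score(p, r), f, abs_tol=1e-9), name
```

The zero guard matters for entries such as `MONEY` and `PRODUCT` in the new scores, where `p` and `r` are both 0.0 and the plain formula would divide by zero.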
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/nl-dep-news/train.spacy"
3
- dev = "corpus/nl-dep-news/dev.spacy"
4
  vectors = null
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "rule"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
 
 
 
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -48,6 +51,7 @@ upstream = "tok2vec"
48
  factory = "ner"
49
  incorrect_spans_key = null
50
  moves = null
 
51
  update_with_oracle_cut_size = 100
52
 
53
  [components.ner.model]
@@ -65,8 +69,8 @@ nO = null
65
  [components.ner.model.tok2vec.embed]
66
  @architectures = "spacy.MultiHashEmbed.v2"
67
  width = 96
68
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
69
- rows = [5000,2500,2500,2500]
70
  include_static_vectors = false
71
 
72
  [components.ner.model.tok2vec.encode]
@@ -81,6 +85,7 @@ factory = "parser"
81
  learn_tokens = false
82
  min_action_freq = 30
83
  moves = null
 
84
  update_with_oracle_cut_size = 100
85
 
86
  [components.parser.model]
@@ -99,6 +104,8 @@ upstream = "tok2vec"
99
 
100
  [components.senter]
101
  factory = "senter"
 
 
102
 
103
  [components.senter.model]
104
  @architectures = "spacy.Tagger.v1"
@@ -110,8 +117,8 @@ nO = null
110
  [components.senter.model.tok2vec.embed]
111
  @architectures = "spacy.MultiHashEmbed.v2"
112
  width = 16
113
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
114
- rows = [1000,500,500,500]
115
  include_static_vectors = false
116
 
117
  [components.senter.model.tok2vec.encode]
@@ -123,6 +130,8 @@ maxout_pieces = 2
123
 
124
  [components.tagger]
125
  factory = "tagger"
 
 
126
 
127
  [components.tagger.model]
128
  @architectures = "spacy.Tagger.v1"
@@ -142,8 +151,8 @@ factory = "tok2vec"
142
  [components.tok2vec.model.embed]
143
  @architectures = "spacy.MultiHashEmbed.v2"
144
  width = ${components.tok2vec.model.encode:width}
145
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
146
- rows = [5000,2500,2500,2500]
147
  include_static_vectors = false
148
 
149
  [components.tok2vec.model.encode]
@@ -157,22 +166,19 @@ maxout_pieces = 3
157
 
158
  [corpora.dev]
159
  @readers = "spacy.Corpus.v1"
160
- limit = 0
161
- max_length = 0
162
- path = ${paths:dev}
163
  gold_preproc = false
 
 
164
  augmenter = null
165
 
166
  [corpora.train]
167
  @readers = "spacy.Corpus.v1"
168
- path = ${paths:train}
169
- max_length = 5000
170
  gold_preproc = false
 
171
  limit = 0
172
-
173
- [corpora.train.augmenter]
174
- @augmenters = "spacy.lower_case.v1"
175
- level = 0.1
176
 
177
  [training]
178
  train_corpus = "corpora.train"
@@ -203,9 +209,8 @@ compound = 1.001
203
  t = 0.0
204
 
205
  [training.logger]
206
- @loggers = "spacy.WandbLogger.v1"
207
- project_name = "spacy-v3.0.0a2"
208
- remove_config_values = []
209
 
210
  [training.optimizer]
211
  @optimizers = "Adam.v1"
@@ -219,26 +224,27 @@ eps = 0.00000001
219
  learn_rate = 0.001
220
 
221
  [training.score_weights]
222
- pos_acc = 0.05
223
  morph_acc = 0.05
224
  morph_per_feat = null
225
- tag_acc = 0.05
226
  dep_uas = 0.0
227
  dep_las = 0.16
228
  dep_las_per_type = null
229
  sents_p = null
230
  sents_r = null
231
  sents_f = 0.02
232
- lemma_acc = 0.33
233
- ents_f = 0.33
234
  ents_p = 0.0
235
  ents_r = 0.0
236
  ents_per_type = null
 
237
 
238
  [pretraining]
239
 
240
  [initialize]
241
- vocab_data = ${paths.vocab_data}
242
  vectors = ${paths.vectors}
243
  init_tok2vec = ${paths.init_tok2vec}
244
  before_init = null
1
  [paths]
2
+ train = null
3
+ dev = null
4
  vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = null
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
30
  mode = "rule"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
51
  factory = "ner"
52
  incorrect_spans_key = null
53
  moves = null
54
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
55
  update_with_oracle_cut_size = 100
56
 
57
  [components.ner.model]
69
  [components.ner.model.tok2vec.embed]
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
+ rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = false
75
 
76
  [components.ner.model.tok2vec.encode]
85
  learn_tokens = false
86
  min_action_freq = 30
87
  moves = null
88
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
89
  update_with_oracle_cut_size = 100
90
 
91
  [components.parser.model]
104
 
105
  [components.senter]
106
  factory = "senter"
107
+ overwrite = false
108
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
  @architectures = "spacy.Tagger.v1"
117
  [components.senter.model.tok2vec.embed]
118
  @architectures = "spacy.MultiHashEmbed.v2"
119
  width = 16
120
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
121
+ rows = [1000,500,500,500,50]
122
  include_static_vectors = false
123
 
124
  [components.senter.model.tok2vec.encode]
130
 
131
  [components.tagger]
132
  factory = "tagger"
133
+ overwrite = false
134
+ scorer = {"@scorers":"spacy.tagger_scorer.v1"}
135
 
136
  [components.tagger.model]
137
  @architectures = "spacy.Tagger.v1"
151
  [components.tok2vec.model.embed]
152
  @architectures = "spacy.MultiHashEmbed.v2"
153
  width = ${components.tok2vec.model.encode:width}
154
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
155
+ rows = [5000,2500,2500,2500,100]
156
  include_static_vectors = false
157
 
158
  [components.tok2vec.model.encode]
166
 
167
  [corpora.dev]
168
  @readers = "spacy.Corpus.v1"
169
+ path = ${paths.dev}
 
 
170
  gold_preproc = false
171
+ max_length = 0
172
+ limit = 0
173
  augmenter = null
174
 
175
  [corpora.train]
176
  @readers = "spacy.Corpus.v1"
177
+ path = ${paths.train}
 
178
  gold_preproc = false
179
+ max_length = 0
180
  limit = 0
181
+ augmenter = null
 
 
 
182
 
183
  [training]
184
  train_corpus = "corpora.train"
209
  t = 0.0
210
 
211
  [training.logger]
212
+ @loggers = "spacy.ConsoleLogger.v1"
213
+ progress_bar = false
 
214
 
215
  [training.optimizer]
216
  @optimizers = "Adam.v1"
224
  learn_rate = 0.001
225
 
226
  [training.score_weights]
227
+ pos_acc = 0.06
228
  morph_acc = 0.05
229
  morph_per_feat = null
230
+ tag_acc = 0.06
231
  dep_uas = 0.0
232
  dep_las = 0.16
233
  dep_las_per_type = null
234
  sents_p = null
235
  sents_r = null
236
  sents_f = 0.02
237
+ lemma_acc = 0.5
238
+ ents_f = 0.16
239
  ents_p = 0.0
240
  ents_r = 0.0
241
  ents_per_type = null
242
+ speed = 0.0
243
 
244
  [pretraining]
245
 
246
  [initialize]
247
+ vocab_data = null
248
  vectors = ${paths.vectors}
249
  init_tok2vec = ${paths.init_tok2vec}
250
  before_init = null
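The `[training.score_weights]` block controls how individual metrics are combined into the single score used to select the best model during training. The following is a minimal Python sketch, assuming the combined score is a plain weighted sum in which `null` weights are skipped. The function name `weighted_score` is hypothetical and this is not spaCy's own code. It applies the new weights to the scores reported in this commit:

```python
def weighted_score(scores, weights):
    """Weighted sum over metrics; None weights and missing scores are skipped."""
    total = 0.0
    for name, weight in weights.items():
        value = scores.get(name)
        if weight is None or value is None:
            continue
        total += weight * value
    return total


# New weights from [training.score_weights] in this commit (null -> None).
weights = {
    "pos_acc": 0.06, "morph_acc": 0.05, "morph_per_feat": None,
    "tag_acc": 0.06, "dep_uas": 0.0, "dep_las": 0.16,
    "dep_las_per_type": None, "sents_p": None, "sents_r": None,
    "sents_f": 0.02, "lemma_acc": 0.5, "ents_f": 0.16,
    "ents_p": 0.0, "ents_r": 0.0, "ents_per_type": None, "speed": 0.0,
}

# Top-level scores from the updated accuracy file in this commit.
scores = {
    "pos_acc": 0.9581243184, "morph_acc": 0.9498527648,
    "tag_acc": 0.9392161567, "dep_uas": 0.8546007605,
    "dep_las": 0.8062660009, "sents_f": 0.8557457213,
    "lemma_acc": 0.8133109449, "ents_f": 0.7237824387,
    "ents_p": 0.7446964155, "ents_r": 0.704011065,
    "speed": 3273.3086753398,
}

print(round(weighted_score(scores, weights), 4))
```

Under this reading, raising `lemma_acc` from 0.33 to 0.5 and lowering `ents_f` from 0.33 to 0.16 makes lemmatizer accuracy the dominant term, while `speed` is tracked with a weight of 0.0 and does not affect the result.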
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"nl",
3
  "name":"core_news_sm",
4
- "version":"3.1.0",
5
  "description":"Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -68,7 +68,6 @@
68
  "Abbr=Yes|POS=X",
69
  "Gender=Com,Neut|Number=Sing|POS=PROPN",
70
  "Degree=Sup|POS=ADJ",
71
- "Foreign=Yes|POS=X",
72
  "POS=ADJ",
73
  "Number=Sing|POS=PROPN",
74
  "POS=PRON|PronType=Dem",
@@ -78,13 +77,14 @@
78
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs",
79
  "Case=Dat|POS=PRON|PronType=Dem",
80
  "Case=Nom|POS=PRON|Person=2|PronType=Prs",
81
- "POS=X",
82
  "POS=INTJ",
 
83
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
84
  "POS=PRON|PronType=Int",
85
- "Case=Acc|POS=PRON|Person=2|PronType=Prs",
86
  "POS=PRON|Person=2|PronType=Prs",
87
- "Case=Gen|POS=PRON|Person=2|PronType=Prs"
 
 
88
  ],
89
  "tagger":[
90
  "ADJ|nom|basis|met-e|mv-n",
@@ -95,6 +95,7 @@
95
  "ADJ|nom|comp|met-e|mv-n",
96
  "ADJ|nom|comp|met-e|zonder-n|stan",
97
  "ADJ|nom|sup|met-e|mv-n",
 
98
  "ADJ|nom|sup|met-e|zonder-n|stan",
99
  "ADJ|nom|sup|zonder|zonder-n",
100
  "ADJ|postnom|basis|met-s",
@@ -106,6 +107,7 @@
106
  "ADJ|prenom|comp|met-e|stan",
107
  "ADJ|prenom|comp|zonder",
108
  "ADJ|prenom|sup|met-e|stan",
 
109
  "ADJ|vrij|basis|zonder",
110
  "ADJ|vrij|comp|zonder",
111
  "ADJ|vrij|dim|zonder",
@@ -175,6 +177,7 @@
175
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
176
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
177
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
 
178
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
179
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
180
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
@@ -187,10 +190,12 @@
187
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
188
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
189
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
 
190
  "VNW|onbep|adv-pron|gen|red|3|getal",
191
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
192
  "VNW|onbep|det|stan|nom|met-e|mv-n",
193
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
 
194
  "VNW|onbep|det|stan|prenom|met-e|agr",
195
  "VNW|onbep|det|stan|prenom|met-e|evz",
196
  "VNW|onbep|det|stan|prenom|met-e|mv",
@@ -377,360 +382,361 @@
377
  "senter"
378
  ],
379
  "performance":{
380
- "tag_acc":0.9349210503,
381
- "dep_uas":0.8552811724,
382
- "dep_las":0.8057499239,
383
- "ents_p":0.7305475504,
384
- "ents_r":0.7012448133,
385
- "ents_f":0.7155963303,
386
- "sents_p":0.8511659808,
387
- "sents_r":0.8902439024,
388
- "sents_f":0.8702664797,
389
- "speed":3825.7629872661,
390
- "dep_las_per_type":{
391
- "nmod:poss":{
392
- "p":0.9553903346,
393
- "r":0.9379562044,
394
- "f":0.9465930018
395
  },
396
- "nsubj":{
397
- "p":0.828229028,
398
- "r":0.8178829717,
399
- "f":0.8230234866
400
  },
401
- "aux":{
402
- "p":0.8929765886,
403
- "r":0.8782894737,
404
- "f":0.8855721393
405
  },
406
- "advmod":{
407
- "p":0.7687016337,
408
- "r":0.786971831,
409
- "f":0.7777294476
410
  },
411
- "root":{
412
- "p":0.8504801097,
413
- "r":0.8895265423,
414
- "f":0.8695652174
415
  },
416
- "det":{
417
- "p":0.93700126,
418
- "r":0.9645482058,
419
- "f":0.9505752024
420
  },
421
- "amod":{
422
- "p":0.8596646072,
423
- "r":0.8727598566,
424
- "f":0.866162739
425
  },
426
- "obl":{
427
- "p":0.737449118,
428
- "r":0.7409679618,
429
- "f":0.7392043523
430
  },
431
- "mark":{
432
- "p":0.887477314,
433
- "r":0.8907103825,
434
- "f":0.8890909091
435
  },
436
- "ccomp":{
437
- "p":0.6146788991,
438
- "r":0.6261682243,
439
- "f":0.6203703704
440
  },
441
- "case":{
442
- "p":0.9337199669,
443
- "r":0.9550847458,
444
- "f":0.9442815249
445
  },
446
- "appos":{
447
- "p":0.7269736842,
448
- "r":0.6696969697,
449
- "f":0.6971608833
450
  },
451
- "obj":{
452
- "p":0.7103347889,
453
- "r":0.743902439,
454
- "f":0.7267311988
455
  },
456
- "compound:prt":{
457
- "p":0.7638190955,
458
- "r":0.7004608295,
459
- "f":0.7307692308
460
  },
461
- "xcomp":{
462
- "p":0.6356877323,
463
- "r":0.6286764706,
464
- "f":0.6321626617
465
  },
466
- "flat":{
467
- "p":0.8104046243,
468
- "r":0.7619565217,
469
- "f":0.7854341737
470
  },
471
- "expl:pv":{
472
- "p":0.7021276596,
473
- "r":0.75,
474
- "f":0.7252747253
475
  },
476
- "acl":{
477
- "p":0.5064935065,
478
- "r":0.4020618557,
479
- "f":0.4482758621
480
  },
481
  "advcl":{
482
- "p":0.4837209302,
483
- "r":0.4684684685,
484
- "f":0.47597254
485
  },
486
- "nummod":{
487
- "p":0.8301282051,
488
- "r":0.8633333333,
489
- "f":0.8464052288
490
  },
491
- "nmod":{
492
- "p":0.7062240664,
493
- "r":0.7438811189,
494
- "f":0.7245636441
495
  },
496
  "cc":{
497
- "p":0.8563327032,
498
- "r":0.8595825427,
499
- "f":0.8579545455
500
  },
501
  "conj":{
502
- "p":0.5967957276,
503
- "r":0.6081632653,
504
- "f":0.602425876
505
- },
506
- "nsubj:pass":{
507
- "p":0.753164557,
508
- "r":0.748427673,
509
- "f":0.7507886435
510
  },
511
- "aux:pass":{
512
- "p":0.8681318681,
513
- "r":0.8777777778,
514
- "f":0.8729281768
515
  },
516
- "cop":{
517
- "p":0.7167235495,
518
- "r":0.7720588235,
519
- "f":0.7433628319
520
  },
521
- "parataxis":{
522
- "p":0.3440366972,
523
- "r":0.2737226277,
524
- "f":0.3048780488
525
  },
526
- "iobj":{
527
- "p":0.6315789474,
528
- "r":0.3636363636,
529
- "f":0.4615384615
530
  },
531
- "acl:relcl":{
532
- "p":0.5914634146,
533
- "r":0.6100628931,
534
- "f":0.600619195
535
  },
536
- "expl":{
537
- "p":0.5,
538
- "r":0.5714285714,
539
- "f":0.5333333333
540
  },
541
- "fixed":{
542
- "p":0.7220630372,
543
- "r":0.4337349398,
544
- "f":0.5419354839
545
  },
546
- "obl:agent":{
547
- "p":1.0,
548
- "r":0.8620689655,
549
- "f":0.9259259259
550
  },
551
- "csubj":{
552
- "p":0.5333333333,
553
- "r":0.4,
554
- "f":0.4571428571
555
  },
556
  "dep":{
557
  "p":0.0,
558
  "r":0.0,
559
  "f":0.0
560
  },
561
  "orphan":{
562
  "p":0.0,
563
  "r":0.0,
564
  "f":0.0
565
  }
566
  },
567
  "ents_per_type":{
568
  "DATE":{
569
- "p":0.9459459459,
570
- "r":0.9090909091,
571
- "f":0.9271523179
572
  },
573
  "NORP":{
574
- "p":0.6698113208,
575
- "r":0.8554216867,
576
- "f":0.7513227513
577
  },
578
  "ORG":{
579
- "p":0.728,
580
- "r":0.5384615385,
581
- "f":0.619047619
582
  },
583
  "CARDINAL":{
584
- "p":0.8815789474,
585
  "r":0.9571428571,
586
- "f":0.9178082192
587
  },
588
  "GPE":{
589
- "p":0.6724137931,
590
- "r":0.8571428571,
591
- "f":0.7536231884
592
  },
593
  "MONEY":{
594
- "p":0.2,
595
- "r":0.3333333333,
596
- "f":0.25
597
  },
598
  "PERCENT":{
599
- "p":1.0,
600
  "r":1.0,
601
- "f":1.0
602
  },
603
  "PERSON":{
604
- "p":0.6907894737,
605
- "r":0.6796116505,
606
- "f":0.6851549755
607
  },
608
  "LAW":{
609
  "p":1.0,
610
  "r":1.0,
611
  "f":1.0
612
  },
613
- "ORDINAL":{
614
- "p":0.9090909091,
615
- "r":0.9090909091,
616
- "f":0.9090909091
617
  },
618
- "WORK_OF_ART":{
619
- "p":0.6203703704,
620
- "r":0.3722222222,
621
- "f":0.4652777778
622
  },
623
  "LANGUAGE":{
624
- "p":0.5833333333,
625
  "r":0.6363636364,
626
- "f":0.6086956522
627
  },
628
  "QUANTITY":{
629
- "p":0.8461538462,
630
  "r":0.9166666667,
631
- "f":0.88
632
  },
633
  "LOC":{
634
- "p":0.2631578947,
635
- "r":0.1470588235,
636
- "f":0.1886792453
637
  },
638
  "FAC":{
639
- "p":0.037037037,
640
  "r":0.0714285714,
641
- "f":0.0487804878
642
  },
643
  "EVENT":{
644
- "p":0.3684210526,
645
- "r":0.3043478261,
646
- "f":0.3333333333
647
- },
648
- "PRODUCT":{
649
- "p":0.0,
650
- "r":0.0,
651
- "f":0.0
652
  },
653
  "TIME":{
654
- "p":1.0,
655
  "r":1.0,
656
- "f":1.0
657
  }
658
  },
659
- "token_acc":0.9997165842,
660
- "pos_acc":0.9546138579,
661
- "morph_acc":0.9458392409,
662
- "lemma_acc":0.8520836514,
663
- "morph_per_feat":{
664
- "Person":{
665
- "p":0.986328125,
666
- "r":0.964660936,
667
- "f":0.9753742154
668
- },
669
- "Poss":{
670
- "p":0.9846153846,
671
- "r":0.9733840304,
672
- "f":0.9789674952
673
- },
674
- "PronType":{
675
- "p":0.9872013652,
676
- "r":0.9609634551,
677
- "f":0.9739057239
678
- },
679
- "Gender":{
680
- "p":0.8911903513,
681
- "r":0.860724234,
682
- "f":0.875692387
683
- },
684
- "Number":{
685
- "p":0.9758520556,
686
- "r":0.9529850746,
687
- "f":0.9642830174
688
- },
689
- "Tense":{
690
- "p":0.9692565679,
691
- "r":0.9527472527,
692
- "f":0.9609310058
693
- },
694
- "VerbForm":{
695
- "p":0.9379733141,
696
- "r":0.9359481828,
697
- "f":0.9369596542
698
- },
699
- "Degree":{
700
- "p":0.9393491124,
701
- "r":0.9058487874,
702
- "f":0.9222948439
703
- },
704
- "Definite":{
705
- "p":0.9951584507,
706
- "r":0.9947206335,
707
- "f":0.9949394939
708
- },
709
- "Case":{
710
- "p":0.9940239044,
711
- "r":0.9940239044,
712
- "f":0.9940239044
713
- },
714
- "Reflex":{
715
- "p":1.0,
716
- "r":1.0,
717
- "f":1.0
718
- },
719
- "Foreign":{
720
- "p":0.68,
721
- "r":0.2615384615,
722
- "f":0.3777777778
723
- },
724
- "Abbr":{
725
- "p":0.8333333333,
726
- "r":0.8333333333,
727
- "f":0.8333333333
728
- }
729
- }
730
  },
731
  "sources":[
732
  {
733
- "name":"UD Dutch LassySmall v2.5",
734
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
735
  "license":"CC BY-SA 4.0",
736
  "author":"Bouma, Gosse; van Noord, Gertjan"
@@ -742,13 +748,13 @@
742
  "author":"NLP Town"
743
  },
744
  {
745
- "name":"UD Dutch LassySmall v2.5",
746
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
747
  "license":"CC BY-SA 4.0",
748
  "author":"Bouma, Gosse; van Noord, Gertjan"
749
  },
750
  {
751
- "name":"UD Dutch Alpino v2.5",
752
  "url":"https://github.com/UniversalDependencies/UD_Dutch-Alpino",
753
  "license":"CC BY-SA 4.0",
754
  "author":"Zeman, Daniel; \u017dabokrtsk\u00fd, Zden\u011bk; Bouma, Gosse; van Noord, Gertjan"
1
  {
2
  "lang":"nl",
3
  "name":"core_news_sm",
4
+ "version":"3.2.0",
5
  "description":"Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
68
  "Abbr=Yes|POS=X",
69
  "Gender=Com,Neut|Number=Sing|POS=PROPN",
70
  "Degree=Sup|POS=ADJ",
 
71
  "POS=ADJ",
72
  "Number=Sing|POS=PROPN",
73
  "POS=PRON|PronType=Dem",
77
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs",
78
  "Case=Dat|POS=PRON|PronType=Dem",
79
  "Case=Nom|POS=PRON|Person=2|PronType=Prs",
 
80
  "POS=INTJ",
81
+ "Case=Acc|POS=PRON|Person=2|PronType=Prs",
82
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
83
  "POS=PRON|PronType=Int",
 
84
  "POS=PRON|Person=2|PronType=Prs",
85
+ "POS=PRON|Person=3",
86
+ "Case=Gen|POS=PRON|Person=2|PronType=Prs",
87
+ "POS=X"
88
  ],
89
  "tagger":[
90
  "ADJ|nom|basis|met-e|mv-n",
95
  "ADJ|nom|comp|met-e|mv-n",
96
  "ADJ|nom|comp|met-e|zonder-n|stan",
97
  "ADJ|nom|sup|met-e|mv-n",
98
+ "ADJ|nom|sup|met-e|zonder-n|bijz",
99
  "ADJ|nom|sup|met-e|zonder-n|stan",
100
  "ADJ|nom|sup|zonder|zonder-n",
101
  "ADJ|postnom|basis|met-s",
107
  "ADJ|prenom|comp|met-e|stan",
108
  "ADJ|prenom|comp|zonder",
109
  "ADJ|prenom|sup|met-e|stan",
110
+ "ADJ|prenom|sup|zonder",
111
  "ADJ|vrij|basis|zonder",
112
  "ADJ|vrij|comp|zonder",
113
  "ADJ|vrij|dim|zonder",
177
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
178
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
179
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
180
+ "VNW|bez|det|stan|vol|1|ev|prenom|met-e|rest",
181
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
182
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
183
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
190
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
191
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
192
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
193
+ "VNW|excl|pron|stan|vol|3|getal",
194
  "VNW|onbep|adv-pron|gen|red|3|getal",
195
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
196
  "VNW|onbep|det|stan|nom|met-e|mv-n",
197
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
198
+ "VNW|onbep|det|stan|nom|zonder|zonder-n",
199
  "VNW|onbep|det|stan|prenom|met-e|agr",
200
  "VNW|onbep|det|stan|prenom|met-e|evz",
201
  "VNW|onbep|det|stan|prenom|met-e|mv",
382
  "senter"
383
  ],
384
  "performance":{
385
+ "token_acc":0.9997165842,
386
+ "token_p":0.9974281853,
387
+ "token_r":0.9975586363,
388
+ "token_f":0.9974934066,
389
+ "pos_acc":0.9581243184,
390
+ "morph_acc":0.9498527648,
391
+ "morph_micro_p":0.9588741568,
392
+ "morph_micro_r":0.9384419251,
393
+ "morph_micro_f":0.9485480234,
394
+ "morph_per_feat":{
395
+ "Person":{
396
+ "p":0.991202346,
397
+ "r":0.9694072658,
398
+ "f":0.9801836636
 
399
  },
400
+ "Poss":{
401
+ "p":0.9923076923,
402
+ "r":0.9885057471,
403
+ "f":0.990403071
404
  },
405
+ "PronType":{
406
+ "p":0.9914310197,
407
+ "r":0.961762261,
408
+ "f":0.976371308
409
  },
410
+ "Gender":{
411
+ "p":0.9029177719,
412
+ "r":0.8659374205,
413
+ "f":0.8840410336
414
  },
415
+ "Number":{
416
+ "p":0.9755422243,
417
+ "r":0.949259093,
418
+ "f":0.9622212107
419
  },
420
+ "Tense":{
421
+ "p":0.9620958751,
422
+ "r":0.9488730071,
423
+ "f":0.9554386936
424
  },
425
+ "VerbForm":{
426
+ "p":0.9396676301,
427
+ "r":0.9359481828,
428
+ "f":0.9378042185
429
  },
430
+ "Degree":{
431
+ "p":0.9345930233,
432
+ "r":0.9238505747,
433
+ "f":0.9291907514
434
  },
435
+ "Definite":{
436
+ "p":0.995584989,
437
+ "r":0.9925176056,
438
+ "f":0.994048931
439
  },
440
+ "Case":{
441
+ "p":0.9960159363,
442
+ "r":0.9960159363,
443
+ "f":0.9960159363
444
  },
445
+ "Reflex":{
446
+ "p":1.0,
447
+ "r":1.0,
448
+ "f":1.0
449
  },
450
+ "Abbr":{
451
+ "p":0.9,
452
+ "r":0.5,
453
+ "f":0.6428571429
454
+ }
455
+ },
456
+ "tag_acc":0.9392161567,
457
+ "sents_p":0.8339006127,
458
+ "sents_r":0.8787661406,
459
+ "sents_f":0.8557457213,
460
+ "dep_uas":0.8546007605,
461
+ "dep_las":0.8062660009,
462
+ "dep_las_per_type":{
463
+ "det":{
464
+ "p":0.8796718323,
465
+ "r":0.9516765286,
466
+ "f":0.9142586452
467
  },
468
+ "nsubj":{
469
+ "p":0.7457627119,
470
+ "r":0.7857142857,
471
+ "f":0.7652173913
472
  },
473
+ "root":{
474
+ "p":0.7325428195,
475
+ "r":0.8298507463,
476
+ "f":0.77816655
477
  },
478
+ "case":{
479
+ "p":0.8851174935,
480
+ "r":0.9373271889,
481
+ "f":0.9104744852
482
  },
483
+ "obl":{
484
+ "p":0.7081339713,
485
+ "r":0.7036450079,
486
+ "f":0.7058823529
487
  },
488
+ "nmod":{
489
+ "p":0.6162420382,
490
+ "r":0.6753926702,
491
+ "f":0.6444629475
492
  },
493
+ "advmod":{
494
+ "p":0.6918604651,
495
+ "r":0.7212121212,
496
+ "f":0.706231454
497
+ },
498
+ "obj":{
499
+ "p":0.717472119,
500
+ "r":0.7121771218,
501
+ "f":0.7148148148
502
+ },
503
+ "mark":{
504
+ "p":0.8205128205,
505
+ "r":0.7933884298,
506
+ "f":0.8067226891
507
  },
508
  "advcl":{
509
+ "p":0.52,
510
+ "r":0.4297520661,
511
+ "f":0.4705882353
512
  },
513
+ "amod":{
514
+ "p":0.8108581436,
515
+ "r":0.8448905109,
516
+ "f":0.8275245755
517
  },
518
+ "acl:relcl":{
519
+ "p":0.6329113924,
520
+ "r":0.6097560976,
521
+ "f":0.6211180124
522
+ },
523
+ "cop":{
524
+ "p":0.7094594595,
525
+ "r":0.5769230769,
526
+ "f":0.6363636364
527
  },
528
  "cc":{
529
+ "p":0.798013245,
530
+ "r":0.8281786942,
531
+ "f":0.8128161889
532
  },
533
  "conj":{
534
+ "p":0.54784689,
535
+ "r":0.5066371681,
536
+ "f":0.5264367816
537
  },
538
+ "fixed":{
539
+ "p":0.7045454545,
540
+ "r":0.2520325203,
541
+ "f":0.371257485
542
  },
543
+ "flat":{
544
+ "p":0.7467889908,
545
+ "r":0.6933560477,
546
+ "f":0.7190812721
547
  },
548
+ "csubj":{
549
+ "p":0.75,
550
+ "r":0.5,
551
+ "f":0.6
552
  },
553
+ "aux":{
554
+ "p":0.7647058824,
555
+ "r":0.7572815534,
556
+ "f":0.7609756098
557
  },
558
+ "compound:prt":{
559
+ "p":0.7727272727,
560
+ "r":0.6623376623,
561
+ "f":0.7132867133
562
  },
563
+ "nummod":{
564
+ "p":0.5641025641,
565
+ "r":0.602739726,
566
+ "f":0.582781457
567
  },
568
+ "acl":{
569
+ "p":0.4375,
570
+ "r":0.3559322034,
571
+ "f":0.3925233645
572
  },
573
+ "expl":{
574
+ "p":0.4,
575
+ "r":0.3333333333,
576
+ "f":0.3636363636
577
  },
578
+ "appos":{
579
+ "p":0.6092715232,
580
+ "r":0.5317919075,
581
+ "f":0.5679012346
582
  },
583
  "dep":{
584
  "p":0.0,
585
  "r":0.0,
586
  "f":0.0
587
  },
588
+ "nsubj:pass":{
589
+ "p":0.7882352941,
590
+ "r":0.7790697674,
591
+ "f":0.783625731
592
+ },
593
+ "aux:pass":{
594
+ "p":0.8252427184,
595
+ "r":0.8673469388,
596
+ "f":0.8457711443
597
+ },
598
+ "ccomp":{
599
+ "p":0.5882352941,
600
+ "r":0.5882352941,
601
+ "f":0.5882352941
602
+ },
603
+ "xcomp":{
604
+ "p":0.36,
605
+ "r":0.6164383562,
606
+ "f":0.4545454545
607
+ },
608
+ "parataxis":{
609
+ "p":0.350877193,
610
+ "r":0.2684563758,
611
+ "f":0.3041825095
612
+ },
613
+ "expl:pv":{
614
+ "p":0.8095238095,
615
+ "r":0.8947368421,
616
+ "f":0.85
617
+ },
618
+ "iobj":{
619
+ "p":0.3636363636,
620
+ "r":0.4,
621
+ "f":0.380952381
622
+ },
623
+ "nmod:poss":{
624
+ "p":0.8940397351,
625
+ "r":0.8823529412,
626
+ "f":0.8881578947
627
+ },
628
+ "obl:agent":{
629
+ "p":0.6875,
630
+ "r":0.7857142857,
631
+ "f":0.7333333333
632
+ },
633
  "orphan":{
634
  "p":0.0,
635
  "r":0.0,
636
  "f":0.0
637
  }
638
  },
639
+ "lemma_acc":0.8133109449,
640
+ "ents_p":0.7446964155,
641
+ "ents_r":0.704011065,
642
+ "ents_f":0.7237824387,
643
  "ents_per_type":{
644
  "DATE":{
645
+ "p":0.9317406143,
646
+ "r":0.8863636364,
647
+ "f":0.9084858569
648
  },
649
  "NORP":{
650
+ "p":0.693877551,
651
+ "r":0.8192771084,
652
+ "f":0.7513812155
653
  },
654
  "ORG":{
655
+ "p":0.762295082,
656
+ "r":0.550295858,
657
+ "f":0.6391752577
658
  },
659
  "CARDINAL":{
660
+ "p":0.858974359,
661
  "r":0.9571428571,
662
+ "f":0.9054054054
663
  },
664
  "GPE":{
665
+ "p":0.6722689076,
666
+ "r":0.8791208791,
667
+ "f":0.7619047619
668
  },
669
  "MONEY":{
670
+ "p":0.0,
671
+ "r":0.0,
672
+ "f":0.0
673
  },
674
  "PERCENT":{
675
+ "p":0.6666666667,
676
  "r":1.0,
677
+ "f":0.8
678
  },
679
  "PERSON":{
680
+ "p":0.7281879195,
681
+ "r":0.7022653722,
682
+ "f":0.7149917628
683
+ },
684
+ "WORK_OF_ART":{
685
+ "p":0.6146788991,
686
+ "r":0.3722222222,
687
+ "f":0.4636678201
688
  },
689
  "LAW":{
690
  "p":1.0,
691
  "r":1.0,
692
  "f":1.0
693
  },
694
+ "PRODUCT":{
695
+ "p":0.0,
696
+ "r":0.0,
697
+ "f":0.0
698
  },
699
+ "ORDINAL":{
700
+ "p":0.90625,
701
+ "r":0.8787878788,
702
+ "f":0.8923076923
703
  },
704
  "LANGUAGE":{
705
+ "p":0.4375,
706
  "r":0.6363636364,
707
+ "f":0.5185185185
708
  },
709
  "QUANTITY":{
710
+ "p":0.9166666667,
711
  "r":0.9166666667,
712
+ "f":0.9166666667
713
  },
714
  "LOC":{
715
+ "p":0.6428571429,
716
+ "r":0.2647058824,
717
+ "f":0.375
718
  },
719
  "FAC":{
720
+ "p":0.0625,
721
  "r":0.0714285714,
722
+ "f":0.0666666667
723
  },
724
  "EVENT":{
725
+ "p":0.3157894737,
726
+ "r":0.2608695652,
727
+ "f":0.2857142857
728
  },
729
  "TIME":{
730
+ "p":0.5,
731
  "r":1.0,
732
+ "f":0.6666666667
733
  }
734
  },
735
+ "speed":3273.3086753398
736
  },
737
  "sources":[
738
  {
739
+ "name":"UD Dutch LassySmall v2.8",
740
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
741
  "license":"CC BY-SA 4.0",
742
  "author":"Bouma, Gosse; van Noord, Gertjan"
748
  "author":"NLP Town"
749
  },
750
  {
751
+ "name":"UD Dutch LassySmall v2.8",
752
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
753
  "license":"CC BY-SA 4.0",
754
  "author":"Bouma, Gosse; van Noord, Gertjan"
755
  },
756
  {
757
+ "name":"UD Dutch Alpino v2.8",
758
  "url":"https://github.com/UniversalDependencies/UD_Dutch-Alpino",
759
  "license":"CC BY-SA 4.0",
760
  "author":"Zeman, Daniel; \u017dabokrtsk\u00fd, Zden\u011bk; Bouma, Gosse; van Noord, Gertjan"
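The `p`, `r` and `f` triples reported under `performance` in the meta.json diff above are consistent with `f` being the harmonic mean of precision and recall. A minimal sketch, assuming that standard definition applies here (the helper name `f_score` is hypothetical and is not a spaCy API), checks a few of the reported triples:

```python
import math


def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (precision, recall, reported f) copied from the performance block of the diff.
reported = {
    "Abbr": (0.9, 0.5, 0.6428571429),
    "csubj": (0.75, 0.5, 0.6),
    "TIME": (0.5, 1.0, 0.6666666667),
    "MONEY": (0.0, 0.0, 0.0),
}

for name, (p, r, f) in reported.items():
    # The reported values are rounded to 10 decimal places.
    assert math.isclose(f_score(p, r), f, abs_tol=1e-9), name
```

The zero case is handled explicitly because several labels in the diff (`MONEY`, `PRODUCT`, `dep`, `orphan`) report 0.0 for all three values.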
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "POS=PRON|Person=3|PronType=Dem":"Person=3|PronType=Dem",
4
  "Number=Sing|POS=AUX|Tense=Pres|VerbForm=Fin":"Number=Sing|Tense=Pres|VerbForm=Fin",
@@ -48,7 +49,6 @@
48
  "Abbr=Yes|POS=X":"Abbr=Yes",
49
  "Gender=Com,Neut|Number=Sing|POS=PROPN":"Gender=Com,Neut|Number=Sing",
50
  "Degree=Sup|POS=ADJ":"Degree=Sup",
51
- "Foreign=Yes|POS=X":"Foreign=Yes",
52
  "POS=ADJ":"",
53
  "Number=Sing|POS=PROPN":"Number=Sing",
54
  "POS=PRON|PronType=Dem":"PronType=Dem",
@@ -58,13 +58,14 @@
58
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":"Person=2|Poss=Yes|PronType=Prs",
59
  "Case=Dat|POS=PRON|PronType=Dem":"Case=Dat|PronType=Dem",
60
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Person=2|PronType=Prs",
61
- "POS=X":"",
62
  "POS=INTJ":"",
 
63
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Case=Gen|Person=3|Poss=Yes|PronType=Prs",
64
  "POS=PRON|PronType=Int":"PronType=Int",
65
- "Case=Acc|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Person=2|PronType=Prs",
66
  "POS=PRON|Person=2|PronType=Prs":"Person=2|PronType=Prs",
67
- "Case=Gen|POS=PRON|Person=2|PronType=Prs":"Case=Gen|Person=2|PronType=Prs"
 
 
68
  },
69
  "labels_pos":{
70
  "POS=PRON|Person=3|PronType=Dem":95,
@@ -115,7 +116,6 @@
115
  "Abbr=Yes|POS=X":101,
116
  "Gender=Com,Neut|Number=Sing|POS=PROPN":96,
117
  "Degree=Sup|POS=ADJ":84,
118
- "Foreign=Yes|POS=X":101,
119
  "POS=ADJ":84,
120
  "Number=Sing|POS=PROPN":96,
121
  "POS=PRON|PronType=Dem":95,
@@ -125,12 +125,14 @@
125
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":95,
126
  "Case=Dat|POS=PRON|PronType=Dem":95,
127
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":95,
128
- "POS=X":101,
129
  "POS=INTJ":91,
 
130
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
131
  "POS=PRON|PronType=Int":95,
132
- "Case=Acc|POS=PRON|Person=2|PronType=Prs":95,
133
  "POS=PRON|Person=2|PronType=Prs":95,
134
- "Case=Gen|POS=PRON|Person=2|PronType=Prs":95
135
- }
 
 
 
136
  }
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "POS=PRON|Person=3|PronType=Dem":"Person=3|PronType=Dem",
5
  "Number=Sing|POS=AUX|Tense=Pres|VerbForm=Fin":"Number=Sing|Tense=Pres|VerbForm=Fin",
49
  "Abbr=Yes|POS=X":"Abbr=Yes",
50
  "Gender=Com,Neut|Number=Sing|POS=PROPN":"Gender=Com,Neut|Number=Sing",
51
  "Degree=Sup|POS=ADJ":"Degree=Sup",
 
52
  "POS=ADJ":"",
53
  "Number=Sing|POS=PROPN":"Number=Sing",
54
  "POS=PRON|PronType=Dem":"PronType=Dem",
58
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":"Person=2|Poss=Yes|PronType=Prs",
59
  "Case=Dat|POS=PRON|PronType=Dem":"Case=Dat|PronType=Dem",
60
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Person=2|PronType=Prs",
 
61
  "POS=INTJ":"",
62
+ "Case=Acc|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Person=2|PronType=Prs",
63
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Case=Gen|Person=3|Poss=Yes|PronType=Prs",
64
  "POS=PRON|PronType=Int":"PronType=Int",
 
65
  "POS=PRON|Person=2|PronType=Prs":"Person=2|PronType=Prs",
66
+ "POS=PRON|Person=3":"Person=3",
67
+ "Case=Gen|POS=PRON|Person=2|PronType=Prs":"Case=Gen|Person=2|PronType=Prs",
68
+ "POS=X":""
69
  },
70
  "labels_pos":{
71
  "POS=PRON|Person=3|PronType=Dem":95,
116
  "Abbr=Yes|POS=X":101,
117
  "Gender=Com,Neut|Number=Sing|POS=PROPN":96,
118
  "Degree=Sup|POS=ADJ":84,
 
119
  "POS=ADJ":84,
120
  "Number=Sing|POS=PROPN":96,
121
  "POS=PRON|PronType=Dem":95,
125
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":95,
126
  "Case=Dat|POS=PRON|PronType=Dem":95,
127
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":95,
 
128
  "POS=INTJ":91,
129
+ "Case=Acc|POS=PRON|Person=2|PronType=Prs":95,
130
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
131
  "POS=PRON|PronType=Int":95,
 
132
  "POS=PRON|Person=2|PronType=Prs":95,
133
+ "POS=PRON|Person=3":95,
134
+ "Case=Gen|POS=PRON|Person=2|PronType=Prs":95,
135
+ "POS=X":101
136
+ },
137
+ "overwrite":true
138
  }
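In the `labels_morph` entries visible in this diff, each value is the key with its `POS=...` feature removed. A minimal sketch of that relationship, assuming it holds for every entry and not only the ones shown (the function name `strip_pos` is hypothetical, not part of spaCy):

```python
def strip_pos(label: str) -> str:
    """Drop the POS=... feature from a '|'-separated morphology label."""
    feats = [feat for feat in label.split("|") if not feat.startswith("POS=")]
    return "|".join(feats)


# Key/value pairs copied from labels_morph in the diff.
examples = {
    "POS=PRON|Person=3|PronType=Dem": "Person=3|PronType=Dem",
    "Abbr=Yes|POS=X": "Abbr=Yes",
    "POS=PRON|Person=3": "Person=3",
    "POS=X": "",
}

for key, value in examples.items():
    assert strip_pos(key) == value
```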
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
nl_core_news_sm-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:702d52cc3abb24f4cee4ed7343aa3efd6588cb15a1aaff307e79250bc23db016
3
- size 17109227
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4eec449886e97da4fedbf9fa8d25ee9893ef2df3d655eba0d53124b5d32efc88
3
+ size 17426437
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves��{"0":{"":151388},"1":{"":91342},"2":{"det":29783,"case":26176,"nsubj":13581,"amod":12961,"punct":11732,"advmod":9779,"obl":8002,"mark":6673,"cc":5427,"obj":4504,"aux":4192,"nsubj:pass":2504,"aux:pass":2464,"cop":2079,"nummod":2057,"nmod:poss":2023,"nmod":1226,"xcomp":1155,"compound:prt":842,"advcl":643,"acl":491,"parataxis":419,"iobj":306,"expl":273,"advmod||xcomp":271,"obl||xcomp":266,"expl:pv":262,"obl:agent":227,"obj||xcomp":204,"case||obl":161,"ccomp":108,"expl||advcl":60,"case||advcl":51,"csubj":50,"advmod||ccomp":50,"obl||ccomp":48,"obj||ccomp":47,"obl||obj":38,"advcl||xcomp":30,"dep":0},"3":{"punct":19438,"nmod":13005,"flat":9123,"conj":7122,"obl":6774,"fixed":4655,"nsubj":4274,"appos":3317,"obj":3143,"advmod":3111,"parataxis":2300,"xcomp":2115,"acl:relcl":2029,"advcl":1591,"compound:prt":1377,"cop":1278,"ccomp":1228,"acl":767,"amod":504,"aux:pass":396,"csubj":394,"nummod":367,"aux":353,"iobj":229,"expl:pv":225,"obl:agent":220,"nmod||obj":179,"advcl||advmod":154,"case":148,"acl:relcl||obj":135,"case||obl":133,"acl:relcl||nsubj":98,"acl||obj":88,"expl":83,"orphan":72,"mark":69,"acl:relcl||nsubj:pass":56,"obl||xcomp":54,"expl||advcl":47,"cc":38,"advcl||amod":35,"advcl||nmod":32,"obl||obj":31,"nmod||nsubj":31,"dep":0},"4":{"ROOT":18043}}�cfg��neg_key�
1
+ ��moves��{"0":{"":151558},"1":{"":91349},"2":{"det":29810,"case":26215,"nsubj":13579,"amod":12918,"punct":11737,"advmod":9702,"obl":8128,"mark":6683,"cc":5438,"obj":4515,"aux":4218,"nsubj:pass":2513,"aux:pass":2468,"cop":2077,"nummod":2050,"nmod:poss":2023,"nmod":1255,"xcomp":1160,"compound:prt":839,"advcl":643,"acl":505,"parataxis":416,"iobj":307,"expl":273,"advmod||xcomp":266,"expl:pv":261,"obl||xcomp":259,"obl:agent":227,"obj||xcomp":200,"case||obl":162,"ccomp":108,"expl||advcl":60,"case||advcl":51,"obl||ccomp":50,"csubj":50,"advmod||ccomp":49,"obj||ccomp":47,"obl||obj":42,"advcl||xcomp":31,"dep":0},"3":{"punct":19438,"nmod":13028,"flat":9160,"conj":7136,"obl":6802,"fixed":4623,"nsubj":4273,"appos":3320,"obj":3142,"advmod":3090,"parataxis":2280,"xcomp":2095,"acl:relcl":2032,"advcl":1595,"compound:prt":1376,"cop":1281,"ccomp":1230,"acl":774,"amod":490,"aux:pass":398,"csubj":395,"nummod":365,"aux":355,"iobj":229,"expl:pv":225,"obl:agent":221,"nmod||obj":178,"advcl||advmod":152,"case":147,"acl:relcl||obj":135,"case||obl":132,"acl:relcl||nsubj":98,"acl||obj":88,"expl":83,"mark":69,"orphan":68,"acl:relcl||nsubj:pass":55,"obl||xcomp":53,"expl||advcl":47,"cc":35,"advcl||amod":35,"advcl||nmod":34,"obl||obj":32,"nmod||nsubj":31,"dep":0},"4":{"ROOT":18070}}�cfg��neg_key�
senter/cfg CHANGED
@@ -1,3 +1,3 @@
1
  {
2
-
3
  }
1
  {
2
+ "overwrite":false
3
  }
senter/model CHANGED
Binary files a/senter/model and b/senter/model differ
tagger/cfg CHANGED
@@ -8,6 +8,7 @@
8
  "ADJ|nom|comp|met-e|mv-n",
9
  "ADJ|nom|comp|met-e|zonder-n|stan",
10
  "ADJ|nom|sup|met-e|mv-n",
 
11
  "ADJ|nom|sup|met-e|zonder-n|stan",
12
  "ADJ|nom|sup|zonder|zonder-n",
13
  "ADJ|postnom|basis|met-s",
@@ -19,6 +20,7 @@
19
  "ADJ|prenom|comp|met-e|stan",
20
  "ADJ|prenom|comp|zonder",
21
  "ADJ|prenom|sup|met-e|stan",
 
22
  "ADJ|vrij|basis|zonder",
23
  "ADJ|vrij|comp|zonder",
24
  "ADJ|vrij|dim|zonder",
@@ -88,6 +90,7 @@
88
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
89
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
90
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
 
91
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
92
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
93
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
@@ -100,10 +103,12 @@
100
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
101
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
102
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
 
103
  "VNW|onbep|adv-pron|gen|red|3|getal",
104
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
105
  "VNW|onbep|det|stan|nom|met-e|mv-n",
106
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
 
107
  "VNW|onbep|det|stan|prenom|met-e|agr",
108
  "VNW|onbep|det|stan|prenom|met-e|evz",
109
  "VNW|onbep|det|stan|prenom|met-e|mv",
@@ -197,5 +202,6 @@
197
  "WW|vd|prenom|met-e",
198
  "WW|vd|prenom|zonder",
199
  "WW|vd|vrij|zonder"
200
- ]
 
201
  }
8
  "ADJ|nom|comp|met-e|mv-n",
9
  "ADJ|nom|comp|met-e|zonder-n|stan",
10
  "ADJ|nom|sup|met-e|mv-n",
11
+ "ADJ|nom|sup|met-e|zonder-n|bijz",
12
  "ADJ|nom|sup|met-e|zonder-n|stan",
13
  "ADJ|nom|sup|zonder|zonder-n",
14
  "ADJ|postnom|basis|met-s",
20
  "ADJ|prenom|comp|met-e|stan",
21
  "ADJ|prenom|comp|zonder",
22
  "ADJ|prenom|sup|met-e|stan",
23
+ "ADJ|prenom|sup|zonder",
24
  "ADJ|vrij|basis|zonder",
25
  "ADJ|vrij|comp|zonder",
26
  "ADJ|vrij|dim|zonder",
90
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
91
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
92
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
93
+ "VNW|bez|det|stan|vol|1|ev|prenom|met-e|rest",
94
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
95
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
96
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
103
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
104
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
105
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
106
+ "VNW|excl|pron|stan|vol|3|getal",
107
  "VNW|onbep|adv-pron|gen|red|3|getal",
108
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
109
  "VNW|onbep|det|stan|nom|met-e|mv-n",
110
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
111
+ "VNW|onbep|det|stan|nom|zonder|zonder-n",
112
  "VNW|onbep|det|stan|prenom|met-e|agr",
113
  "VNW|onbep|det|stan|prenom|met-e|evz",
114
  "VNW|onbep|det|stan|prenom|met-e|mv",
202
  "WW|vd|prenom|met-e",
203
  "WW|vd|prenom|zonder",
204
  "WW|vd|vrij|zonder"
205
+ ],
206
+ "overwrite":false
207
  }
tagger/model CHANGED
Binary files a/tagger/model and b/tagger/model differ
tok2vec/model CHANGED
Binary files a/tok2vec/model and b/tok2vec/model differ
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b604a5e176f59b212c7504edc48f210c9d54707c8bce8892ec021ce80127098e
3
- size 959164
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:02a5913787e6f6e6b9f9eac52871bad95981a880502167ef38e39048c4c929e2
3
+ size 1055293
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
 
 
 
1
+ {
2
+ "mode":"default"
3
+ }