osanseviero HF staff committed on
Commit
0ffa29b
1 Parent(s): 604474b

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
- # UD Dutch LassySmall v2.5
+ # UD Dutch LassySmall v2.8

  * Author: Bouma, Gosse; van Noord, Gertjan
  * URL: https://github.com/UniversalDependencies/UD_Dutch-LassySmall
@@ -878,7 +878,7 @@ Creative Commons may be contacted at creativecommons.org.



- # UD Dutch LassySmall v2.5
+ # UD Dutch LassySmall v2.8

  * Author: Bouma, Gosse; van Noord, Gertjan
  * URL: https://github.com/UniversalDependencies/UD_Dutch-LassySmall
@@ -1318,7 +1318,7 @@ Creative Commons may be contacted at creativecommons.org.



- # UD Dutch Alpino v2.5
+ # UD Dutch Alpino v2.8

  * Author: Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan
  * URL: https://github.com/UniversalDependencies/UD_Dutch-Alpino
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
  - token-classification
  language:
  - nl
- license: CC-BY-SA-4.0
  model-index:
  - name: nl_core_news_md
  results:
@@ -14,47 +14,47 @@ model-index:
  metrics:
  - name: NER Precision
  type: precision
- value: 0.7817002882
  - name: NER Recall
  type: recall
- value: 0.7503457815
  - name: NER F Score
  type: f_score
- value: 0.7657021877
  - task:
  name: POS
  type: token-classification
  metrics:
  - name: POS Accuracy
  type: accuracy
- value: 0.9429007634
  - task:
  name: SENTER
  type: token-classification
  metrics:
  - name: SENTER Precision
  type: precision
- value: 0.8534067447
  - name: SENTER Recall
  type: recall
- value: 0.8895265423
  - name: SENTER F Score
  type: f_score
- value: 0.8710923779
  - task:
  name: UNLABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Unlabeled Dependencies Accuracy
  type: accuracy
- value: 0.8651950224
  - task:
  name: LABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Labeled Dependencies Accuracy
  type: accuracy
- value: 0.8651950224
  ---
  ### Details: https://spacy.io/models/nl#nl_core_news_md

@@ -63,12 +63,12 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `nl_core_news_md` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 20000 unique vectors (300 dimensions) |
71
- | **Sources** | [UD Dutch LassySmall v2.5](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[Dutch NER Annotations for UD LassySmall](https://nlp.town) (NLP Town)<br />[UD Dutch LassySmall v2.5](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[UD Dutch Alpino v2.5](https://github.com/UniversalDependencies/UD_Dutch-Alpino) (Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -76,12 +76,12 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
76
 
77
  <details>
78
 
79
- <summary>View label scheme (318 labels for 5 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
- | **`morphologizer`** | `POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=ADV`, `POS=VERB\|VerbForm=Part`, `POS=PUNCT`, `Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `POS=ADP`, `POS=NUM`, `Number=Plur\|POS=NOUN`, `POS=VERB\|VerbForm=Inf`, `POS=SCONJ`, `Definite=Def\|POS=DET`, `Gender=Com\|Number=Sing\|POS=NOUN`, `Number=Sing\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Degree=Pos\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=PROPN`, `Gender=Com\|Number=Sing\|POS=PROPN`, `POS=AUX\|VerbForm=Inf`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `POS=DET`, `Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|Person=3\|PronType=Prs`, `POS=CCONJ`, `Number=Plur\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|Person=3\|PronType=Ind`, `Degree=Cmp\|POS=ADJ`, `Case=Nom\|POS=PRON\|Person=1\|PronType=Prs`, `Definite=Ind\|POS=DET`, `Case=Nom\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Number=Plur\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Case=Acc\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `Gender=Com,Neut\|Number=Sing\|POS=NOUN`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PROPN`, `POS=PRON\|PronType=Ind`, `POS=PRON\|Person=3\|PronType=Int`, `Case=Acc\|POS=PRON\|PronType=Rcp`, `Number=Plur\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `Number=Sing\|POS=NOUN`, `POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `POS=SYM`, `Abbr=Yes\|POS=X`, `Gender=Com,Neut\|Number=Sing\|POS=PROPN`, `Degree=Sup\|POS=ADJ`, `Foreign=Yes\|POS=X`, `POS=ADJ`, `Number=Sing\|POS=PROPN`, `POS=PRON\|PronType=Dem`, `POS=AUX\|VerbForm=Part`, `POS=PRON\|Person=3\|PronType=Rel`, `Number=Plur\|POS=PROPN`, `POS=PRON\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Dat\|POS=PRON\|PronType=Dem`, `Case=Nom\|POS=PRON\|Person=2\|PronType=Prs`, `POS=X`, `POS=INTJ`, `Case=Gen\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, 
`POS=PRON\|PronType=Int`, `Case=Acc\|POS=PRON\|Person=2\|PronType=Prs`, `POS=PRON\|Person=2\|PronType=Prs`, `Case=Gen\|POS=PRON\|Person=2\|PronType=Prs` |
84
- | **`tagger`** | `ADJ\|nom\|basis\|met-e\|mv-n`, `ADJ\|nom\|basis\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|basis\|met-e\|zonder-n\|stan`, `ADJ\|nom\|basis\|zonder\|mv-n`, `ADJ\|nom\|basis\|zonder\|zonder-n`, `ADJ\|nom\|comp\|met-e\|mv-n`, `ADJ\|nom\|comp\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|met-e\|mv-n`, `ADJ\|nom\|sup\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|zonder\|zonder-n`, `ADJ\|postnom\|basis\|met-s`, `ADJ\|postnom\|basis\|zonder`, `ADJ\|postnom\|comp\|met-s`, `ADJ\|prenom\|basis\|met-e\|bijz`, `ADJ\|prenom\|basis\|met-e\|stan`, `ADJ\|prenom\|basis\|zonder`, `ADJ\|prenom\|comp\|met-e\|stan`, `ADJ\|prenom\|comp\|zonder`, `ADJ\|prenom\|sup\|met-e\|stan`, `ADJ\|vrij\|basis\|zonder`, `ADJ\|vrij\|comp\|zonder`, `ADJ\|vrij\|dim\|zonder`, `ADJ\|vrij\|sup\|zonder`, `BW`, `LET`, `LID\|bep\|dat\|evmo`, `LID\|bep\|gen\|evmo`, `LID\|bep\|gen\|rest3`, `LID\|bep\|stan\|evon`, `LID\|bep\|stan\|rest`, `LID\|onbep\|stan\|agr`, `N\|eigen\|ev\|basis\|gen`, `N\|eigen\|ev\|basis\|genus\|stan`, `N\|eigen\|ev\|basis\|onz\|stan`, `N\|eigen\|ev\|basis\|zijd\|stan`, `N\|eigen\|ev\|dim\|onz\|stan`, `N\|eigen\|mv\|basis`, `N\|soort\|ev\|basis\|dat`, `N\|soort\|ev\|basis\|gen`, `N\|soort\|ev\|basis\|genus\|stan`, `N\|soort\|ev\|basis\|onz\|stan`, `N\|soort\|ev\|basis\|zijd\|stan`, `N\|soort\|ev\|dim\|onz\|stan`, `N\|soort\|mv\|basis`, `N\|soort\|mv\|dim`, `SPEC\|afgebr`, `SPEC\|afk`, `SPEC\|deeleigen`, `SPEC\|enof`, `SPEC\|meta`, `SPEC\|symb`, `SPEC\|vreemd`, `TSW`, `TW\|hoofd\|nom\|mv-n\|basis`, `TW\|hoofd\|nom\|mv-n\|dim`, `TW\|hoofd\|nom\|zonder-n\|basis`, `TW\|hoofd\|nom\|zonder-n\|dim`, `TW\|hoofd\|prenom\|stan`, `TW\|hoofd\|vrij`, `TW\|rang\|nom\|mv-n`, `TW\|rang\|nom\|zonder-n`, `TW\|rang\|prenom\|stan`, `VG\|neven`, `VG\|onder`, `VNW\|aanw\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|aanw\|adv-pron\|stan\|red\|3\|getal`, `VNW\|aanw\|det\|dat\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|dat\|prenom\|met-e\|evmo`, `VNW\|aanw\|det\|gen\|prenom\|met-e\|rest3`, 
`VNW\|aanw\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|aanw\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|stan\|prenom\|met-e\|rest`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|agr`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|evon`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|rest`, `VNW\|aanw\|det\|stan\|vrij\|zonder`, `VNW\|aanw\|pron\|gen\|vol\|3m\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3o\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3\|getal`, `VNW\|betr\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|betr\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|betr\|pron\|stan\|vol\|3\|ev`, `VNW\|betr\|pron\|stan\|vol\|persoon\|getal`, `VNW\|bez\|det\|gen\|vol\|3\|ev\|prenom\|met-e\|rest3`, `VNW\|bez\|det\|stan\|nadr\|2v\|mv\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|zonder\|evon`, `VNW\|bez\|det\|stan\|vol\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|2\|getal\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3p\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3\|mv\|prenom\|zonder\|agr`, `VNW\|onbep\|adv-pron\|gen\|red\|3\|getal`, `VNW\|onbep\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|onbep\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|onbep\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|agr`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|evz`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|mv`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|rest`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|agr`, 
`VNW\|onbep\|det\|stan\|prenom\|zonder\|evon`, `VNW\|onbep\|det\|stan\|vrij\|zonder`, `VNW\|onbep\|grad\|gen\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|sup`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|comp`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|mv\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|basis`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|sup`, `VNW\|onbep\|pron\|gen\|vol\|3p\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3o\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3p\|ev`, `VNW\|pers\|pron\|gen\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|nomin\|nadr\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|red\|1\|mv`, `VNW\|pers\|pron\|nomin\|red\|2v\|ev`, `VNW\|pers\|pron\|nomin\|red\|2\|getal`, `VNW\|pers\|pron\|nomin\|red\|3p\|ev\|masc`, `VNW\|pers\|pron\|nomin\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|nomin\|vol\|1\|ev`, `VNW\|pers\|pron\|nomin\|vol\|1\|mv`, `VNW\|pers\|pron\|nomin\|vol\|2b\|getal`, `VNW\|pers\|pron\|nomin\|vol\|2v\|ev`, `VNW\|pers\|pron\|nomin\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|vol\|3p\|mv`, `VNW\|pers\|pron\|nomin\|vol\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|obl\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|2v\|ev`, `VNW\|pers\|pron\|obl\|vol\|3p\|mv`, `VNW\|pers\|pron\|obl\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|3\|getal\|fem`, `VNW\|pers\|pron\|stan\|nadr\|2v\|mv`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|fem`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|onz`, `VNW\|pers\|pron\|stan\|red\|3\|mv`, 
`VNW\|pr\|pron\|obl\|nadr\|1\|ev`, `VNW\|pr\|pron\|obl\|nadr\|2v\|getal`, `VNW\|pr\|pron\|obl\|nadr\|2\|getal`, `VNW\|pr\|pron\|obl\|red\|1\|ev`, `VNW\|pr\|pron\|obl\|red\|2v\|getal`, `VNW\|pr\|pron\|obl\|vol\|1\|ev`, `VNW\|pr\|pron\|obl\|vol\|1\|mv`, `VNW\|pr\|pron\|obl\|vol\|2\|getal`, `VNW\|recip\|pron\|gen\|vol\|persoon\|mv`, `VNW\|recip\|pron\|obl\|vol\|persoon\|mv`, `VNW\|refl\|pron\|obl\|nadr\|3\|getal`, `VNW\|refl\|pron\|obl\|red\|3\|getal`, `VNW\|vb\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|vb\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|vb\|det\|stan\|prenom\|met-e\|rest`, `VNW\|vb\|det\|stan\|prenom\|zonder\|evon`, `VNW\|vb\|pron\|gen\|vol\|3m\|ev`, `VNW\|vb\|pron\|gen\|vol\|3p\|mv`, `VNW\|vb\|pron\|gen\|vol\|3v\|ev`, `VNW\|vb\|pron\|stan\|vol\|3o\|ev`, `VNW\|vb\|pron\|stan\|vol\|3p\|getal`, `VZ\|fin`, `VZ\|init`, `VZ\|versm`, `WW\|inf\|nom\|zonder\|zonder-n`, `WW\|inf\|prenom\|met-e`, `WW\|inf\|vrij\|zonder`, `WW\|od\|nom\|met-e\|mv-n`, `WW\|od\|nom\|met-e\|zonder-n`, `WW\|od\|prenom\|met-e`, `WW\|od\|prenom\|zonder`, `WW\|od\|vrij\|zonder`, `WW\|pv\|conj\|ev`, `WW\|pv\|tgw\|ev`, `WW\|pv\|tgw\|met-t`, `WW\|pv\|tgw\|mv`, `WW\|pv\|verl\|ev`, `WW\|pv\|verl\|mv`, `WW\|vd\|nom\|met-e\|mv-n`, `WW\|vd\|nom\|met-e\|zonder-n`, `WW\|vd\|prenom\|met-e`, `WW\|vd\|prenom\|zonder`, `WW\|vd\|vrij\|zonder` |
85
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `csubj`, `dep`, `det`, `expl`, `expl:pv`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `orphan`, `parataxis`, `punct`, `xcomp` |
86
  | **`senter`** | `I`, `S` |
87
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
@@ -92,16 +92,22 @@ Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, pa
  | Type | Score |
  | --- | --- |
  | `TOKEN_ACC` | 99.97 |
- | `TAG_ACC` | 94.29 |
- | `POS_ACC` | 96.04 |
- | `MORPH_ACC` | 95.41 |
- | `LEMMA_ACC` | 85.31 |
- | `DEP_UAS` | 86.52 |
- | `DEP_LAS` | 82.02 |
- | `SENTS_P` | 85.34 |
- | `SENTS_R` | 88.95 |
- | `SENTS_F` | 87.11 |
- | `ENTS_P` | 78.17 |
- | `ENTS_R` | 75.03 |
- | `ENTS_F` | 76.57 |
 
  - token-classification
  language:
  - nl
+ license: cc-by-sa-4.0
  model-index:
  - name: nl_core_news_md
  results:

  metrics:
  - name: NER Precision
  type: precision
+ value: 0.775862069
  - name: NER Recall
  type: recall
+ value: 0.7468879668
  - name: NER F Score
  type: f_score
+ value: 0.7610993658
  - task:
  name: POS
  type: token-classification
  metrics:
  - name: POS Accuracy
  type: accuracy
+ value: 0.9482224646
  - task:
  name: SENTER
  type: token-classification
  metrics:
  - name: SENTER Precision
  type: precision
+ value: 0.8472032742
  - name: SENTER Recall
  type: recall
+ value: 0.8909612626
  - name: SENTER F Score
  type: f_score
+ value: 0.8685314685
  - task:
  name: UNLABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Unlabeled Dependencies Accuracy
  type: accuracy
+ value: 0.8688732323
  - task:
  name: LABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Labeled Dependencies Accuracy
  type: accuracy
+ value: 0.8688732323
  ---
  ### Details: https://spacy.io/models/nl#nl_core_news_md

 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `nl_core_news_md` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 20000 unique vectors (300 dimensions) |
71
+ | **Sources** | [UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[Dutch NER Annotations for UD LassySmall](https://nlp.town) (NLP Town)<br />[UD Dutch LassySmall v2.8](https://github.com/UniversalDependencies/UD_Dutch-LassySmall) (Bouma, Gosse; van Noord, Gertjan)<br />[UD Dutch Alpino v2.8](https://github.com/UniversalDependencies/UD_Dutch-Alpino) (Zeman, Daniel; Žabokrtský, Zdeněk; Bouma, Gosse; van Noord, Gertjan)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
 
76
 
77
  <details>
78
 
79
+ <summary>View label scheme (323 labels for 5 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
+ | **`morphologizer`** | `POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=ADV`, `POS=VERB\|VerbForm=Part`, `POS=PUNCT`, `Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `POS=ADP`, `POS=NUM`, `Number=Plur\|POS=NOUN`, `POS=VERB\|VerbForm=Inf`, `POS=SCONJ`, `Definite=Def\|POS=DET`, `Gender=Com\|Number=Sing\|POS=NOUN`, `Number=Sing\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Degree=Pos\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=PROPN`, `Gender=Com\|Number=Sing\|POS=PROPN`, `POS=AUX\|VerbForm=Inf`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `POS=DET`, `Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|Person=3\|PronType=Prs`, `POS=CCONJ`, `Number=Plur\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|Person=3\|PronType=Ind`, `Degree=Cmp\|POS=ADJ`, `Case=Nom\|POS=PRON\|Person=1\|PronType=Prs`, `Definite=Ind\|POS=DET`, `Case=Nom\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Number=Plur\|POS=AUX\|Tense=Pres\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Case=Acc\|POS=PRON\|Person=1\|PronType=Prs`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Fin`, `Gender=Com,Neut\|Number=Sing\|POS=NOUN`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs`, `POS=PROPN`, `POS=PRON\|PronType=Ind`, `POS=PRON\|Person=3\|PronType=Int`, `Case=Acc\|POS=PRON\|PronType=Rcp`, `Number=Plur\|POS=AUX\|Tense=Past\|VerbForm=Fin`, `Number=Sing\|POS=NOUN`, `POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `POS=SYM`, `Abbr=Yes\|POS=X`, `Gender=Com,Neut\|Number=Sing\|POS=PROPN`, `Degree=Sup\|POS=ADJ`, `POS=ADJ`, `Number=Sing\|POS=PROPN`, `POS=PRON\|PronType=Dem`, `POS=AUX\|VerbForm=Part`, `POS=PRON\|Person=3\|PronType=Rel`, `Number=Plur\|POS=PROPN`, `POS=PRON\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Dat\|POS=PRON\|PronType=Dem`, `Case=Nom\|POS=PRON\|Person=2\|PronType=Prs`, `POS=INTJ`, `Case=Acc\|POS=PRON\|Person=2\|PronType=Prs`, `Case=Gen\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, 
`POS=PRON\|PronType=Int`, `POS=PRON\|Person=2\|PronType=Prs`, `POS=PRON\|Person=3`, `Case=Gen\|POS=PRON\|Person=2\|PronType=Prs`, `POS=X` |
84
+ | **`tagger`** | `ADJ\|nom\|basis\|met-e\|mv-n`, `ADJ\|nom\|basis\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|basis\|met-e\|zonder-n\|stan`, `ADJ\|nom\|basis\|zonder\|mv-n`, `ADJ\|nom\|basis\|zonder\|zonder-n`, `ADJ\|nom\|comp\|met-e\|mv-n`, `ADJ\|nom\|comp\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|met-e\|mv-n`, `ADJ\|nom\|sup\|met-e\|zonder-n\|bijz`, `ADJ\|nom\|sup\|met-e\|zonder-n\|stan`, `ADJ\|nom\|sup\|zonder\|zonder-n`, `ADJ\|postnom\|basis\|met-s`, `ADJ\|postnom\|basis\|zonder`, `ADJ\|postnom\|comp\|met-s`, `ADJ\|prenom\|basis\|met-e\|bijz`, `ADJ\|prenom\|basis\|met-e\|stan`, `ADJ\|prenom\|basis\|zonder`, `ADJ\|prenom\|comp\|met-e\|stan`, `ADJ\|prenom\|comp\|zonder`, `ADJ\|prenom\|sup\|met-e\|stan`, `ADJ\|prenom\|sup\|zonder`, `ADJ\|vrij\|basis\|zonder`, `ADJ\|vrij\|comp\|zonder`, `ADJ\|vrij\|dim\|zonder`, `ADJ\|vrij\|sup\|zonder`, `BW`, `LET`, `LID\|bep\|dat\|evmo`, `LID\|bep\|gen\|evmo`, `LID\|bep\|gen\|rest3`, `LID\|bep\|stan\|evon`, `LID\|bep\|stan\|rest`, `LID\|onbep\|stan\|agr`, `N\|eigen\|ev\|basis\|gen`, `N\|eigen\|ev\|basis\|genus\|stan`, `N\|eigen\|ev\|basis\|onz\|stan`, `N\|eigen\|ev\|basis\|zijd\|stan`, `N\|eigen\|ev\|dim\|onz\|stan`, `N\|eigen\|mv\|basis`, `N\|soort\|ev\|basis\|dat`, `N\|soort\|ev\|basis\|gen`, `N\|soort\|ev\|basis\|genus\|stan`, `N\|soort\|ev\|basis\|onz\|stan`, `N\|soort\|ev\|basis\|zijd\|stan`, `N\|soort\|ev\|dim\|onz\|stan`, `N\|soort\|mv\|basis`, `N\|soort\|mv\|dim`, `SPEC\|afgebr`, `SPEC\|afk`, `SPEC\|deeleigen`, `SPEC\|enof`, `SPEC\|meta`, `SPEC\|symb`, `SPEC\|vreemd`, `TSW`, `TW\|hoofd\|nom\|mv-n\|basis`, `TW\|hoofd\|nom\|mv-n\|dim`, `TW\|hoofd\|nom\|zonder-n\|basis`, `TW\|hoofd\|nom\|zonder-n\|dim`, `TW\|hoofd\|prenom\|stan`, `TW\|hoofd\|vrij`, `TW\|rang\|nom\|mv-n`, `TW\|rang\|nom\|zonder-n`, `TW\|rang\|prenom\|stan`, `VG\|neven`, `VG\|onder`, `VNW\|aanw\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|aanw\|adv-pron\|stan\|red\|3\|getal`, `VNW\|aanw\|det\|dat\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|dat\|prenom\|met-e\|evmo`, 
`VNW\|aanw\|det\|gen\|prenom\|met-e\|rest3`, `VNW\|aanw\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|aanw\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|aanw\|det\|stan\|prenom\|met-e\|rest`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|agr`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|evon`, `VNW\|aanw\|det\|stan\|prenom\|zonder\|rest`, `VNW\|aanw\|det\|stan\|vrij\|zonder`, `VNW\|aanw\|pron\|gen\|vol\|3m\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3o\|ev`, `VNW\|aanw\|pron\|stan\|vol\|3\|getal`, `VNW\|betr\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|betr\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|betr\|pron\|stan\|vol\|3\|ev`, `VNW\|betr\|pron\|stan\|vol\|persoon\|getal`, `VNW\|bez\|det\|gen\|vol\|3\|ev\|prenom\|met-e\|rest3`, `VNW\|bez\|det\|stan\|nadr\|2v\|mv\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|red\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|1\|mv\|prenom\|zonder\|evon`, `VNW\|bez\|det\|stan\|vol\|2v\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|2\|getal\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3m\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3p\|mv\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|nom\|met-e\|zonder-n`, `VNW\|bez\|det\|stan\|vol\|3v\|ev\|prenom\|met-e\|rest`, `VNW\|bez\|det\|stan\|vol\|3\|ev\|prenom\|zonder\|agr`, `VNW\|bez\|det\|stan\|vol\|3\|mv\|prenom\|zonder\|agr`, `VNW\|excl\|pron\|stan\|vol\|3\|getal`, `VNW\|onbep\|adv-pron\|gen\|red\|3\|getal`, `VNW\|onbep\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|onbep\|det\|stan\|nom\|met-e\|mv-n`, `VNW\|onbep\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|onbep\|det\|stan\|nom\|zonder\|zonder-n`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|agr`, 
`VNW\|onbep\|det\|stan\|prenom\|met-e\|evz`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|mv`, `VNW\|onbep\|det\|stan\|prenom\|met-e\|rest`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|agr`, `VNW\|onbep\|det\|stan\|prenom\|zonder\|evon`, `VNW\|onbep\|det\|stan\|vrij\|zonder`, `VNW\|onbep\|grad\|gen\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|mv-n\|sup`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|basis`, `VNW\|onbep\|grad\|stan\|nom\|met-e\|zonder-n\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|comp`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|agr\|sup`, `VNW\|onbep\|grad\|stan\|prenom\|met-e\|mv\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|basis`, `VNW\|onbep\|grad\|stan\|prenom\|zonder\|agr\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|basis`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|comp`, `VNW\|onbep\|grad\|stan\|vrij\|zonder\|sup`, `VNW\|onbep\|pron\|gen\|vol\|3p\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3o\|ev`, `VNW\|onbep\|pron\|stan\|vol\|3p\|ev`, `VNW\|pers\|pron\|gen\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|nomin\|nadr\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|red\|1\|mv`, `VNW\|pers\|pron\|nomin\|red\|2v\|ev`, `VNW\|pers\|pron\|nomin\|red\|2\|getal`, `VNW\|pers\|pron\|nomin\|red\|3p\|ev\|masc`, `VNW\|pers\|pron\|nomin\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|nomin\|vol\|1\|ev`, `VNW\|pers\|pron\|nomin\|vol\|1\|mv`, `VNW\|pers\|pron\|nomin\|vol\|2b\|getal`, `VNW\|pers\|pron\|nomin\|vol\|2v\|ev`, `VNW\|pers\|pron\|nomin\|vol\|2\|getal`, `VNW\|pers\|pron\|nomin\|vol\|3p\|mv`, `VNW\|pers\|pron\|nomin\|vol\|3v\|ev\|fem`, `VNW\|pers\|pron\|nomin\|vol\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|nadr\|3m\|ev\|masc`, `VNW\|pers\|pron\|obl\|red\|3\|ev\|masc`, `VNW\|pers\|pron\|obl\|vol\|2v\|ev`, `VNW\|pers\|pron\|obl\|vol\|3p\|mv`, `VNW\|pers\|pron\|obl\|vol\|3\|ev\|masc`, 
`VNW\|pers\|pron\|obl\|vol\|3\|getal\|fem`, `VNW\|pers\|pron\|stan\|nadr\|2v\|mv`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|fem`, `VNW\|pers\|pron\|stan\|red\|3\|ev\|onz`, `VNW\|pers\|pron\|stan\|red\|3\|mv`, `VNW\|pr\|pron\|obl\|nadr\|1\|ev`, `VNW\|pr\|pron\|obl\|nadr\|2v\|getal`, `VNW\|pr\|pron\|obl\|nadr\|2\|getal`, `VNW\|pr\|pron\|obl\|red\|1\|ev`, `VNW\|pr\|pron\|obl\|red\|2v\|getal`, `VNW\|pr\|pron\|obl\|vol\|1\|ev`, `VNW\|pr\|pron\|obl\|vol\|1\|mv`, `VNW\|pr\|pron\|obl\|vol\|2\|getal`, `VNW\|recip\|pron\|gen\|vol\|persoon\|mv`, `VNW\|recip\|pron\|obl\|vol\|persoon\|mv`, `VNW\|refl\|pron\|obl\|nadr\|3\|getal`, `VNW\|refl\|pron\|obl\|red\|3\|getal`, `VNW\|vb\|adv-pron\|obl\|vol\|3o\|getal`, `VNW\|vb\|det\|stan\|nom\|met-e\|zonder-n`, `VNW\|vb\|det\|stan\|prenom\|met-e\|rest`, `VNW\|vb\|det\|stan\|prenom\|zonder\|evon`, `VNW\|vb\|pron\|gen\|vol\|3m\|ev`, `VNW\|vb\|pron\|gen\|vol\|3p\|mv`, `VNW\|vb\|pron\|gen\|vol\|3v\|ev`, `VNW\|vb\|pron\|stan\|vol\|3o\|ev`, `VNW\|vb\|pron\|stan\|vol\|3p\|getal`, `VZ\|fin`, `VZ\|init`, `VZ\|versm`, `WW\|inf\|nom\|zonder\|zonder-n`, `WW\|inf\|prenom\|met-e`, `WW\|inf\|vrij\|zonder`, `WW\|od\|nom\|met-e\|mv-n`, `WW\|od\|nom\|met-e\|zonder-n`, `WW\|od\|prenom\|met-e`, `WW\|od\|prenom\|zonder`, `WW\|od\|vrij\|zonder`, `WW\|pv\|conj\|ev`, `WW\|pv\|tgw\|ev`, `WW\|pv\|tgw\|met-t`, `WW\|pv\|tgw\|mv`, `WW\|pv\|verl\|ev`, `WW\|pv\|verl\|mv`, `WW\|vd\|nom\|met-e\|mv-n`, `WW\|vd\|nom\|met-e\|zonder-n`, `WW\|vd\|prenom\|met-e`, `WW\|vd\|prenom\|zonder`, `WW\|vd\|vrij\|zonder` |
85
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `csubj`, `dep`, `det`, `expl`, `expl:pv`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `orphan`, `parataxis`, `punct`, `xcomp` |
86
  | **`senter`** | `I`, `S` |
87
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |
 
  | Type | Score |
  | --- | --- |
+ | `TAG_ACC` | 94.82 |
+ | `SENTS_P` | 84.72 |
+ | `SENTS_R` | 89.10 |
+ | `SENTS_F` | 86.85 |
+ | `DEP_UAS` | 86.89 |
+ | `DEP_LAS` | 82.12 |
+ | `ENTS_P` | 77.59 |
+ | `ENTS_R` | 74.69 |
+ | `ENTS_F` | 76.11 |
  | `TOKEN_ACC` | 99.97 |
+ | `TOKEN_P` | 99.74 |
+ | `TOKEN_R` | 99.76 |
+ | `TOKEN_F` | 99.75 |
+ | `POS_ACC` | 96.47 |
+ | `MORPH_ACC` | 96.00 |
+ | `MORPH_MICRO_P` | 96.98 |
+ | `MORPH_MICRO_R` | 95.13 |
+ | `MORPH_MICRO_F` | 96.04 |
+ | `LEMMA_ACC` | 81.51 |
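
The F-scores recorded in the updated README metadata can be cross-checked against their precision and recall values. The following is a minimal sketch, assuming the scores use the standard F1 definition (harmonic mean of precision and recall). The helper name `f_score` is hypothetical and is not part of spaCy's API; the input numbers are copied from the added lines of this commit.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1, assumed here)."""
    if precision + recall == 0.0:
        # Avoid division by zero when both scores are 0, as for "orphan".
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Updated NER and SENTER precision/recall from the README metadata.
ner_f = f_score(0.775862069, 0.7468879668)
senter_f = f_score(0.8472032742, 0.8909612626)

print(f"NER F: {ner_f:.10f}")
print(f"SENTER F: {senter_f:.10f}")
```

Under that assumption the computed values should agree with the recorded `NER F Score` (0.7610993658) and `SENTER F Score` (0.8685314685) up to rounding of the published inputs.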
 
 
 
accuracy.json CHANGED
@@ -1,252 +1,180 @@
1
  {
2
- "token_acc": 0.9997165842,
3
- "tag_acc": 0.9429007634,
4
- "pos_acc": 0.960436205,
5
- "morph_acc": 0.9541280401,
6
- "lemma_acc": 0.8531303836,
7
- "dep_uas": 0.8651950224,
8
- "dep_las": 0.8202138983,
9
- "sents_p": 0.8534067447,
10
- "sents_r": 0.8895265423,
11
- "sents_f": 0.8710923779,
12
- "speed": 4346.394666307,
13
- "morph_per_feat": {
14
- "Person": {
15
- "p": 0.9960784314,
16
- "r": 0.970391595,
17
- "f": 0.9830672472
18
- },
19
- "Poss": {
20
- "p": 0.9923664122,
21
- "r": 0.9885931559,
22
- "f": 0.9904761905
23
- },
24
- "PronType": {
25
- "p": 0.994017094,
26
- "r": 0.9659468439,
27
- "f": 0.9797809604
28
- },
29
- "Gender": {
30
- "p": 0.9102564103,
31
- "r": 0.8809825272,
32
- "f": 0.8953802599
33
- },
34
- "Number": {
35
- "p": 0.9794238683,
36
- "r": 0.9591044776,
37
- "f": 0.9691576804
38
- },
39
- "Tense": {
40
- "p": 0.9738015608,
41
- "r": 0.9598901099,
42
- "f": 0.9667957941
43
- },
44
- "VerbForm": {
45
- "p": 0.9548269581,
46
- "r": 0.9431450162,
47
- "f": 0.9489500362
48
- },
49
- "Degree": {
50
- "p": 0.9421547361,
51
- "r": 0.9293865906,
52
- "f": 0.9357271095
53
- },
54
- "Definite": {
55
- "p": 0.9969217238,
56
- "r": 0.9973603168,
57
- "f": 0.9971409721
58
- },
59
- "Case": {
60
- "p": 0.998,
61
- "r": 0.9940239044,
62
- "f": 0.996007984
63
- },
64
- "Reflex": {
65
- "p": 1.0,
66
- "r": 1.0,
67
- "f": 1.0
68
- },
69
- "Foreign": {
70
- "p": 0.6451612903,
71
- "r": 0.3076923077,
72
- "f": 0.4166666667
73
- },
74
- "Abbr": {
75
- "p": 1.0,
76
- "r": 1.0,
77
- "f": 1.0
78
- }
79
- },
80
  "dep_las_per_type": {
81
- "det": {
82
- "p": 0.8819255223,
83
- "r": 0.9575936884,
84
- "f": 0.9182033097
85
  },
86
  "nsubj": {
87
- "p": 0.7484939759,
88
- "r": 0.8068181818,
89
- "f": 0.7765625
90
  },
91
- "root": {
92
- "p": 0.7513586957,
93
- "r": 0.8253731343,
94
- "f": 0.786628734
95
  },
96
- "case": {
97
- "p": 0.8877284595,
98
- "r": 0.9400921659,
99
- "f": 0.9131602507
100
  },
101
- "obl": {
102
- "p": 0.7350565428,
103
- "r": 0.7210776545,
104
- "f": 0.728
105
  },
106
- "nmod": {
107
- "p": 0.6371100164,
108
- "r": 0.6771378709,
109
- "f": 0.6565143824
110
  },
111
- "advmod": {
112
- "p": 0.7519379845,
113
- "r": 0.7838383838,
114
- "f": 0.7675568744
115
  },
116
- "obj": {
117
- "p": 0.7448979592,
118
- "r": 0.8081180812,
119
- "f": 0.7752212389
120
  },
121
  "mark": {
122
- "p": 0.8242677824,
123
- "r": 0.8140495868,
124
- "f": 0.8191268191
125
- },
126
- "advcl": {
127
- "p": 0.5,
128
- "r": 0.4545454545,
129
- "f": 0.4761904762
130
  },
131
- "amod": {
132
- "p": 0.8095238095,
133
- "r": 0.8686131387,
134
- "f": 0.838028169
135
  },
136
- "acl:relcl": {
137
- "p": 0.6470588235,
138
- "r": 0.6707317073,
139
- "f": 0.6586826347
140
  },
141
- "cop": {
142
- "p": 0.7364864865,
143
- "r": 0.5989010989,
144
- "f": 0.6606060606
145
  },
146
- "cc": {
147
- "p": 0.7948717949,
148
- "r": 0.852233677,
149
- "f": 0.8225538972
150
  },
151
- "conj": {
152
- "p": 0.6227848101,
153
- "r": 0.5442477876,
154
- "f": 0.5808736718
155
  },
156
- "fixed": {
157
- "p": 0.7132867133,
158
- "r": 0.2764227642,
159
- "f": 0.3984375
160
  },
161
  "flat": {
162
- "p": 0.8263473054,
163
- "r": 0.7052810903,
164
- "f": 0.7610294118
165
  },
166
- "csubj": {
167
- "p": 0.3333333333,
168
- "r": 0.3333333333,
169
- "f": 0.3333333333
170
  },
171
- "aux": {
172
- "p": 0.7676767677,
173
- "r": 0.7378640777,
174
- "f": 0.7524752475
175
  },
176
- "compound:prt": {
177
- "p": 0.7361111111,
178
- "r": 0.6883116883,
179
- "f": 0.711409396
180
  },
181
  "nummod": {
182
- "p": 0.5792682927,
183
- "r": 0.6506849315,
184
- "f": 0.6129032258
185
- },
186
- "acl": {
187
- "p": 0.4042553191,
188
- "r": 0.3220338983,
189
- "f": 0.358490566
190
  },
191
- "expl": {
192
- "p": 0.1428571429,
193
- "r": 0.1666666667,
194
- "f": 0.1538461538
195
  },
196
- "appos": {
197
- "p": 0.527607362,
198
- "r": 0.4971098266,
199
- "f": 0.5119047619
200
  },
201
- "dep": {
202
- "p": 0.0,
203
- "r": 0.0,
204
- "f": 0.0
205
  },
206
  "nsubj:pass": {
207
- "p": 0.8414634146,
208
- "r": 0.8023255814,
209
- "f": 0.8214285714
210
  },
211
  "aux:pass": {
212
- "p": 0.8921568627,
213
- "r": 0.9285714286,
214
- "f": 0.91
215
  },
216
- "xcomp": {
217
- "p": 0.3382352941,
218
- "r": 0.6301369863,
219
- "f": 0.4401913876
220
  },
221
- "ccomp": {
222
- "p": 0.5333333333,
223
- "r": 0.4705882353,
224
- "f": 0.5
225
  },
226
  "parataxis": {
227
- "p": 0.3333333333,
228
- "r": 0.2751677852,
229
- "f": 0.3014705882
230
  },
231
- "expl:pv": {
232
- "p": 0.9375,
233
- "r": 0.7894736842,
234
- "f": 0.8571428571
235
  },
236
  "iobj": {
237
- "p": 0.3333333333,
238
- "r": 0.3,
239
- "f": 0.3157894737
240
- },
241
- "nmod:poss": {
242
- "p": 0.8851351351,
243
- "r": 0.8562091503,
244
- "f": 0.8704318937
245
  },
246
  "obl:agent": {
247
- "p": 0.9166666667,
248
- "r": 0.7857142857,
249
- "f": 0.8461538462
250
  },
251
  "orphan": {
252
  "p": 0.0,
@@ -254,99 +182,172 @@
254
  "f": 0.0
255
  }
256
  },
257
- "ents_p": 0.7817002882,
258
- "ents_r": 0.7503457815,
259
- "ents_f": 0.7657021877,
260
  "ents_per_type": {
261
- "DATE": {
262
- "p": 0.9306930693,
263
- "r": 0.9155844156,
264
- "f": 0.9230769231
265
  },
266
- "NORP": {
267
- "p": 0.8111111111,
268
- "r": 0.8795180723,
269
- "f": 0.8439306358
270
  },
271
- "ORG": {
272
- "p": 0.7313432836,
273
- "r": 0.5798816568,
274
- "f": 0.6468646865
275
  },
276
  "CARDINAL": {
277
- "p": 0.8933333333,
278
- "r": 0.9571428571,
279
- "f": 0.924137931
280
  },
281
  "GPE": {
282
- "p": 0.7442922374,
283
- "r": 0.8956043956,
284
- "f": 0.812967581
285
  },
286
- "PERCENT": {
287
- "p": 0.7142857143,
288
- "r": 0.8333333333,
289
- "f": 0.7692307692
290
  },
291
- "PERSON": {
292
- "p": 0.7841269841,
293
- "r": 0.7993527508,
294
- "f": 0.7916666667
295
  },
296
- "LAW": {
297
- "p": 0.75,
298
- "r": 1.0,
299
- "f": 0.8571428571
300
  },
301
  "ORDINAL": {
302
- "p": 0.9117647059,
303
- "r": 0.9393939394,
304
- "f": 0.9253731343
305
  },
306
  "LANGUAGE": {
307
- "p": 0.5833333333,
308
- "r": 0.6363636364,
309
- "f": 0.6086956522
310
- },
311
- "EVENT": {
312
- "p": 0.3461538462,
313
- "r": 0.3913043478,
314
- "f": 0.3673469388
315
  },
316
- "QUANTITY": {
317
- "p": 0.8333333333,
318
- "r": 0.8333333333,
319
- "f": 0.8333333333
320
  },
321
  "LOC": {
322
- "p": 0.7,
323
- "r": 0.2058823529,
324
- "f": 0.3181818182
325
- },
326
- "FAC": {
327
- "p": 0.1904761905,
328
- "r": 0.2857142857,
329
- "f": 0.2285714286
330
  },
331
- "PRODUCT": {
332
  "p": 0.0,
333
  "r": 0.0,
334
  "f": 0.0
335
  },
336
- "WORK_OF_ART": {
337
- "p": 0.624,
338
- "r": 0.4333333333,
339
- "f": 0.5114754098
340
  },
341
- "MONEY": {
342
  "p": 0.0,
343
  "r": 0.0,
344
  "f": 0.0
345
  },
346
- "TIME": {
347
  "p": 1.0,
348
  "r": 1.0,
349
  "f": 1.0
350
  }
351
- }
 
352
  }
 
1
  {
2
+ "tag_acc": 0.9482224646,
3
+ "sents_p": 0.8472032742,
4
+ "sents_r": 0.8909612626,
5
+ "sents_f": 0.8685314685,
6
+ "dep_uas": 0.8688732323,
7
+ "dep_las": 0.8211769476,
8
  "dep_las_per_type": {
9
+ "nmod:poss": {
10
+ "p": 0.9522058824,
11
+ "r": 0.9452554745,
12
+ "f": 0.9487179487
13
  },
14
  "nsubj": {
15
+ "p": 0.8497072219,
16
+ "r": 0.8586456279,
17
+ "f": 0.8541530412
18
  },
19
+ "aux": {
20
+ "p": 0.9013157895,
21
+ "r": 0.9013157895,
22
+ "f": 0.9013157895
23
  },
24
+ "advmod": {
25
+ "p": 0.7848537005,
26
+ "r": 0.8142857143,
27
+ "f": 0.7992988606
28
  },
29
+ "root": {
30
+ "p": 0.8676671214,
31
+ "r": 0.912482066,
32
+ "f": 0.8895104895
33
  },
34
+ "det": {
35
+ "p": 0.9465937763,
36
+ "r": 0.9740372133,
37
+ "f": 0.9601194284
38
  },
39
+ "amod": {
40
+ "p": 0.8758802817,
41
+ "r": 0.8915770609,
42
+ "f": 0.8836589698
43
  },
44
+ "obl": {
45
+ "p": 0.7556008147,
46
+ "r": 0.7535545024,
47
+ "f": 0.7545762712
48
  },
49
  "mark": {
50
+ "p": 0.8709677419,
51
+ "r": 0.8836363636,
52
+ "f": 0.8772563177
53
  },
54
+ "ccomp": {
55
+ "p": 0.7222222222,
56
+ "r": 0.6074766355,
57
+ "f": 0.6598984772
58
  },
59
+ "case": {
60
+ "p": 0.9368290669,
61
+ "r": 0.9610334604,
62
+ "f": 0.9487769183
63
  },
64
+ "appos": {
65
+ "p": 0.7060606061,
66
+ "r": 0.7060606061,
67
+ "f": 0.7060606061
68
  },
69
+ "obj": {
70
+ "p": 0.7666666667,
71
+ "r": 0.772519084,
72
+ "f": 0.769581749
73
  },
74
+ "compound:prt": {
75
+ "p": 0.7696335079,
76
+ "r": 0.6901408451,
77
+ "f": 0.7277227723
78
  },
79
+ "xcomp": {
80
+ "p": 0.6373239437,
81
+ "r": 0.6581818182,
82
+ "f": 0.6475849732
83
  },
84
  "flat": {
85
+ "p": 0.7919911012,
86
+ "r": 0.7550371156,
87
+ "f": 0.773072747
88
  },
89
+ "expl:pv": {
90
+ "p": 0.8,
91
+ "r": 0.8181818182,
92
+ "f": 0.808988764
93
  },
94
+ "acl": {
95
+ "p": 0.5176470588,
96
+ "r": 0.4489795918,
97
+ "f": 0.4808743169
98
  },
99
+ "advcl": {
100
+ "p": 0.5211267606,
101
+ "r": 0.5,
102
+ "f": 0.5103448276
103
  },
104
  "nummod": {
105
+ "p": 0.7961783439,
106
+ "r": 0.8333333333,
107
+ "f": 0.8143322476
108
  },
109
+ "nmod": {
110
+ "p": 0.7246011755,
111
+ "r": 0.7504347826,
112
+ "f": 0.7372917557
113
  },
114
+ "cc": {
115
+ "p": 0.8579545455,
116
+ "r": 0.8579545455,
117
+ "f": 0.8579545455
118
  },
119
+ "conj": {
120
+ "p": 0.6610407876,
121
+ "r": 0.6385869565,
122
+ "f": 0.6496199032
123
  },
124
  "nsubj:pass": {
125
+ "p": 0.8,
126
+ "r": 0.8301886792,
127
+ "f": 0.8148148148
128
  },
129
  "aux:pass": {
130
+ "p": 0.8911917098,
131
+ "r": 0.9555555556,
132
+ "f": 0.9222520107
133
  },
134
+ "cop": {
135
+ "p": 0.7810218978,
136
+ "r": 0.7838827839,
137
+ "f": 0.7824497258
138
  },
139
+ "acl:relcl": {
140
+ "p": 0.6647398844,
141
+ "r": 0.7232704403,
142
+ "f": 0.6927710843
143
  },
144
  "parataxis": {
145
+ "p": 0.3712871287,
146
+ "r": 0.2717391304,
147
+ "f": 0.3138075314
148
  },
149
+ "fixed": {
150
+ "p": 0.6714285714,
151
+ "r": 0.424954792,
152
+ "f": 0.5204872647
153
  },
154
  "iobj": {
155
+ "p": 0.5789473684,
156
+ "r": 0.3333333333,
157
+ "f": 0.4230769231
158
  },
159
  "obl:agent": {
160
+ "p": 0.8666666667,
161
+ "r": 0.8965517241,
162
+ "f": 0.8813559322
163
+ },
164
+ "expl": {
165
+ "p": 0.6153846154,
166
+ "r": 0.380952381,
167
+ "f": 0.4705882353
168
+ },
169
+ "csubj": {
170
+ "p": 0.5555555556,
171
+ "r": 0.25,
172
+ "f": 0.3448275862
173
+ },
174
+ "dep": {
175
+ "p": 0.0,
176
+ "r": 0.0,
177
+ "f": 0.0
178
  },
179
  "orphan": {
180
  "p": 0.0,
 
182
  "f": 0.0
183
  }
184
  },
185
+ "ents_p": 0.775862069,
186
+ "ents_r": 0.7468879668,
187
+ "ents_f": 0.7610993658,
188
  "ents_per_type": {
189
+ "ORG": {
190
+ "p": 0.0,
191
+ "r": 0.0,
192
+ "f": 0.0
193
  },
194
+ "PERSON": {
195
+ "p": 0.0,
196
+ "r": 0.0,
197
+ "f": 0.0
198
  },
199
+ "QUANTITY": {
200
+ "p": 0.0,
201
+ "r": 0.0,
202
+ "f": 0.0
203
  },
204
  "CARDINAL": {
205
+ "p": 0.0,
206
+ "r": 0.0,
207
+ "f": 0.0
208
  },
209
  "GPE": {
210
+ "p": 0.0,
211
+ "r": 0.0,
212
+ "f": 0.0
213
  },
214
+ "WORK_OF_ART": {
215
+ "p": 0.0,
216
+ "r": 0.0,
217
+ "f": 0.0
218
  },
219
+ "NORP": {
220
+ "p": 0.0,
221
+ "r": 0.0,
222
+ "f": 0.0
223
  },
224
+ "PRODUCT": {
225
+ "p": 0.0,
226
+ "r": 0.0,
227
+ "f": 0.0
228
+ },
229
+ "EVENT": {
230
+ "p": 0.0,
231
+ "r": 0.0,
232
+ "f": 0.0
233
+ },
234
+ "FAC": {
235
+ "p": 0.0,
236
+ "r": 0.0,
237
+ "f": 0.0
238
+ },
239
+ "DATE": {
240
+ "p": 0.0,
241
+ "r": 0.0,
242
+ "f": 0.0
243
  },
244
  "ORDINAL": {
245
+ "p": 0.0,
246
+ "r": 0.0,
247
+ "f": 0.0
248
  },
249
  "LANGUAGE": {
250
+ "p": 0.0,
251
+ "r": 0.0,
252
+ "f": 0.0
253
  },
254
+ "TIME": {
255
+ "p": 0.0,
256
+ "r": 0.0,
257
+ "f": 0.0
258
  },
259
  "LOC": {
260
+ "p": 0.0,
261
+ "r": 0.0,
262
+ "f": 0.0
263
  },
264
+ "MONEY": {
265
  "p": 0.0,
266
  "r": 0.0,
267
  "f": 0.0
268
  },
269
+ "PERCENT": {
270
+ "p": 0.0,
271
+ "r": 0.0,
272
+ "f": 0.0
273
  },
274
+ "LAW": {
275
  "p": 0.0,
276
  "r": 0.0,
277
  "f": 0.0
278
+ }
279
+ },
280
+ "speed": 2996.2278607909,
281
+ "token_acc": 0.9997165842,
282
+ "token_p": 0.9974281853,
283
+ "token_r": 0.9975586363,
284
+ "token_f": 0.9974934066,
285
+ "pos_acc": 0.9646673937,
286
+ "morph_acc": 0.9599947649,
287
+ "morph_micro_p": 0.9697827701,
288
+ "morph_micro_r": 0.951281701,
289
+ "morph_micro_f": 0.9604431471,
290
+ "morph_per_feat": {
291
+ "Person": {
292
+ "p": 0.9931640625,
293
+ "r": 0.9722753346,
294
+ "f": 0.9826086957
295
  },
296
+ "Poss": {
297
+ "p": 0.9923664122,
298
+ "r": 0.9961685824,
299
+ "f": 0.9942638623
300
+ },
301
+ "PronType": {
302
+ "p": 0.9931914894,
303
+ "r": 0.970074813,
304
+ "f": 0.9814970563
305
+ },
306
+ "Gender": {
307
+ "p": 0.9223479863,
308
+ "r": 0.8913762401,
309
+ "f": 0.9065976714
310
+ },
311
+ "Number": {
312
+ "p": 0.9819737244,
313
+ "r": 0.9621314175,
314
+ "f": 0.9719513117
315
+ },
316
+ "Tense": {
317
+ "p": 0.9787353106,
318
+ "r": 0.9615173172,
319
+ "f": 0.9700499168
320
+ },
321
+ "VerbForm": {
322
+ "p": 0.9617625637,
323
+ "r": 0.9503418496,
324
+ "f": 0.9560180995
325
+ },
326
+ "Degree": {
327
+ "p": 0.9489795918,
328
+ "r": 0.9353448276,
329
+ "f": 0.9421128799
330
+ },
331
+ "Definite": {
332
+ "p": 0.9973486522,
333
+ "r": 0.9933978873,
334
+ "f": 0.9953693495
335
+ },
336
+ "Case": {
337
+ "p": 0.996007984,
338
+ "r": 0.9940239044,
339
+ "f": 0.9950149551
340
+ },
341
+ "Reflex": {
342
  "p": 1.0,
343
  "r": 1.0,
344
  "f": 1.0
345
+ },
346
+ "Abbr": {
347
+ "p": 1.0,
348
+ "r": 0.5555555556,
349
+ "f": 0.7142857143
350
  }
351
+ },
352
+ "lemma_acc": 0.8150554986
353
  }
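Each per-type entry in the scores above carries a precision `p`, a recall `r`, and an `f` value. The following is a minimal sketch, assuming `f` is the usual harmonic mean of `p` and `r` (the diff does not state the formula), that recomputes it for three entries copied from the updated `dep_las_per_type` block. The helper name `f_score` is hypothetical and is not part of spaCy.

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (p, r, f) triples copied from the updated dep_las_per_type scores.
reported = {
    "nsubj": (0.8497072219, 0.8586456279, 0.8541530412),
    "aux": (0.9013157895, 0.9013157895, 0.9013157895),
    "ccomp": (0.7222222222, 0.6074766355, 0.6598984772),
}

for label, (p, r, f) in reported.items():
    # Print the recomputed value next to the reported one for comparison.
    print(label, round(f_score(p, r), 6), f)
```

The zero guard matters for labels such as `dep` and `orphan`, where both precision and recall are reported as 0.0.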
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/nl-dep-news/train.spacy"
3
- dev = "corpus/nl-dep-news/dev.spacy"
4
- vectors = "corpus/nl_vectors"
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "rule"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -48,6 +51,7 @@ upstream = "tok2vec"
48
  factory = "ner"
49
  incorrect_spans_key = null
50
  moves = null
 
51
  update_with_oracle_cut_size = 100
52
 
53
  [components.ner.model]
@@ -65,8 +69,8 @@ nO = null
65
  [components.ner.model.tok2vec.embed]
66
  @architectures = "spacy.MultiHashEmbed.v2"
67
  width = 96
68
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
69
- rows = [5000,2500,2500,2500]
70
  include_static_vectors = true
71
 
72
  [components.ner.model.tok2vec.encode]
@@ -81,6 +85,7 @@ factory = "parser"
81
  learn_tokens = false
82
  min_action_freq = 30
83
  moves = null
 
84
  update_with_oracle_cut_size = 100
85
 
86
  [components.parser.model]
@@ -99,6 +104,8 @@ upstream = "tok2vec"
99
 
100
  [components.senter]
101
  factory = "senter"
102
 
103
  [components.senter.model]
104
  @architectures = "spacy.Tagger.v1"
@@ -110,8 +117,8 @@ nO = null
110
  [components.senter.model.tok2vec.embed]
111
  @architectures = "spacy.MultiHashEmbed.v2"
112
  width = 16
113
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
114
- rows = [1000,500,500,500]
115
  include_static_vectors = true
116
 
117
  [components.senter.model.tok2vec.encode]
@@ -123,6 +130,8 @@ maxout_pieces = 2
123
 
124
  [components.tagger]
125
  factory = "tagger"
126
 
127
  [components.tagger.model]
128
  @architectures = "spacy.Tagger.v1"
@@ -142,8 +151,8 @@ factory = "tok2vec"
142
  [components.tok2vec.model.embed]
143
  @architectures = "spacy.MultiHashEmbed.v2"
144
  width = ${components.tok2vec.model.encode:width}
145
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
146
- rows = [5000,2500,2500,2500]
147
  include_static_vectors = true
148
 
149
  [components.tok2vec.model.encode]
@@ -157,22 +166,19 @@ maxout_pieces = 3
157
 
158
  [corpora.dev]
159
  @readers = "spacy.Corpus.v1"
160
- limit = 0
161
- max_length = 0
162
- path = ${paths:dev}
163
  gold_preproc = false
164
  augmenter = null
165
 
166
  [corpora.train]
167
  @readers = "spacy.Corpus.v1"
168
- path = ${paths:train}
169
- max_length = 5000
170
  gold_preproc = false
 
171
  limit = 0
172
-
173
- [corpora.train.augmenter]
174
- @augmenters = "spacy.lower_case.v1"
175
- level = 0.1
176
 
177
  [training]
178
  train_corpus = "corpora.train"
@@ -203,9 +209,8 @@ compound = 1.001
203
  t = 0.0
204
 
205
  [training.logger]
206
- @loggers = "spacy.WandbLogger.v1"
207
- project_name = "spacy-v3.0.0a2"
208
- remove_config_values = []
209
 
210
  [training.optimizer]
211
  @optimizers = "Adam.v1"
@@ -219,26 +224,27 @@ eps = 0.00000001
219
  learn_rate = 0.001
220
 
221
  [training.score_weights]
222
- pos_acc = 0.05
223
  morph_acc = 0.05
224
  morph_per_feat = null
225
- tag_acc = 0.05
226
  dep_uas = 0.0
227
  dep_las = 0.16
228
  dep_las_per_type = null
229
  sents_p = null
230
  sents_r = null
231
  sents_f = 0.02
232
- lemma_acc = 0.33
233
- ents_f = 0.33
234
  ents_p = 0.0
235
  ents_r = 0.0
236
  ents_per_type = null
 
237
 
238
  [pretraining]
239
 
240
  [initialize]
241
- vocab_data = ${paths.vocab_data}
242
  vectors = ${paths.vectors}
243
  init_tok2vec = ${paths.init_tok2vec}
244
  before_init = null
 
1
  [paths]
2
+ train = null
3
+ dev = null
4
+ vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = null
 
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
 
30
  mode = "rule"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
 
51
  factory = "ner"
52
  incorrect_spans_key = null
53
  moves = null
54
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
55
  update_with_oracle_cut_size = 100
56
 
57
  [components.ner.model]
 
69
  [components.ner.model.tok2vec.embed]
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
+ rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = true
75
 
76
  [components.ner.model.tok2vec.encode]
 
85
  learn_tokens = false
86
  min_action_freq = 30
87
  moves = null
88
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
89
  update_with_oracle_cut_size = 100
90
 
91
  [components.parser.model]
 
104
 
105
  [components.senter]
106
  factory = "senter"
107
+ overwrite = false
108
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
  @architectures = "spacy.Tagger.v1"
 
117
  [components.senter.model.tok2vec.embed]
118
  @architectures = "spacy.MultiHashEmbed.v2"
119
  width = 16
120
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
121
+ rows = [1000,500,500,500,50]
122
  include_static_vectors = true
123
 
124
  [components.senter.model.tok2vec.encode]
 
130
 
131
  [components.tagger]
132
  factory = "tagger"
133
+ overwrite = false
134
+ scorer = {"@scorers":"spacy.tagger_scorer.v1"}
135
 
136
  [components.tagger.model]
137
  @architectures = "spacy.Tagger.v1"
 
151
  [components.tok2vec.model.embed]
152
  @architectures = "spacy.MultiHashEmbed.v2"
153
  width = ${components.tok2vec.model.encode:width}
154
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
155
+ rows = [5000,2500,2500,2500,100]
156
  include_static_vectors = true
157
 
158
  [components.tok2vec.model.encode]
 
166
 
167
  [corpora.dev]
168
  @readers = "spacy.Corpus.v1"
169
+ path = ${paths.dev}
170
  gold_preproc = false
171
+ max_length = 0
172
+ limit = 0
173
  augmenter = null
174
 
175
  [corpora.train]
176
  @readers = "spacy.Corpus.v1"
177
+ path = ${paths.train}
 
178
  gold_preproc = false
179
+ max_length = 0
180
  limit = 0
181
+ augmenter = null
182
 
183
  [training]
184
  train_corpus = "corpora.train"
 
209
  t = 0.0
210
 
211
  [training.logger]
212
+ @loggers = "spacy.ConsoleLogger.v1"
213
+ progress_bar = false
 
214
 
215
  [training.optimizer]
216
  @optimizers = "Adam.v1"
 
224
  learn_rate = 0.001
225
 
226
  [training.score_weights]
227
+ pos_acc = 0.06
228
  morph_acc = 0.05
229
  morph_per_feat = null
230
+ tag_acc = 0.06
231
  dep_uas = 0.0
232
  dep_las = 0.16
233
  dep_las_per_type = null
234
  sents_p = null
235
  sents_r = null
236
  sents_f = 0.02
237
+ lemma_acc = 0.5
238
+ ents_f = 0.16
239
  ents_p = 0.0
240
  ents_r = 0.0
241
  ents_per_type = null
242
+ speed = 0.0
243
 
244
  [pretraining]
245
 
246
  [initialize]
247
+ vocab_data = null
248
  vectors = ${paths.vectors}
249
  init_tok2vec = ${paths.init_tok2vec}
250
  before_init = null
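The updated `[training.score_weights]` section shifts weight toward `lemma_acc` and away from `ents_f`. The sketch below uses only the Python standard library rather than spaCy's own config loader, and assumes each value can be decoded as JSON (so `null` becomes `None`). It reads the new-side values copied from the diff and totals the non-zero weights; `load_score_weights` is a hypothetical helper name.

```python
import configparser
import json

# New-side [training.score_weights] values copied from the config.cfg diff.
CONFIG_TEXT = """
[training.score_weights]
pos_acc = 0.06
morph_acc = 0.05
morph_per_feat = null
tag_acc = 0.06
dep_uas = 0.0
dep_las = 0.16
dep_las_per_type = null
sents_p = null
sents_r = null
sents_f = 0.02
lemma_acc = 0.5
ents_f = 0.16
ents_p = 0.0
ents_r = 0.0
ents_per_type = null
speed = 0.0
"""


def load_score_weights(text: str) -> dict:
    """Parse the section and decode each value as JSON (null -> None)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["training.score_weights"]
    return {key: json.loads(value) for key, value in section.items()}


weights = load_score_weights(CONFIG_TEXT)
# Keep only the scores that actually contribute (drops null and 0.0 entries).
active = {key: value for key, value in weights.items() if value}
total = sum(active.values())
print(active)
print(round(total, 2))
```

The weights in the file are rounded to two decimals, so the total need not be exactly 1.0.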
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"nl",
3
  "name":"core_news_md",
4
- "version":"3.1.0",
5
  "description":"Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
@@ -68,7 +68,6 @@
68
  "Abbr=Yes|POS=X",
69
  "Gender=Com,Neut|Number=Sing|POS=PROPN",
70
  "Degree=Sup|POS=ADJ",
71
- "Foreign=Yes|POS=X",
72
  "POS=ADJ",
73
  "Number=Sing|POS=PROPN",
74
  "POS=PRON|PronType=Dem",
@@ -78,13 +77,14 @@
78
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs",
79
  "Case=Dat|POS=PRON|PronType=Dem",
80
  "Case=Nom|POS=PRON|Person=2|PronType=Prs",
81
- "POS=X",
82
  "POS=INTJ",
 
83
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
84
  "POS=PRON|PronType=Int",
85
- "Case=Acc|POS=PRON|Person=2|PronType=Prs",
86
  "POS=PRON|Person=2|PronType=Prs",
87
- "Case=Gen|POS=PRON|Person=2|PronType=Prs"
88
  ],
89
  "tagger":[
90
  "ADJ|nom|basis|met-e|mv-n",
@@ -95,6 +95,7 @@
95
  "ADJ|nom|comp|met-e|mv-n",
96
  "ADJ|nom|comp|met-e|zonder-n|stan",
97
  "ADJ|nom|sup|met-e|mv-n",
 
98
  "ADJ|nom|sup|met-e|zonder-n|stan",
99
  "ADJ|nom|sup|zonder|zonder-n",
100
  "ADJ|postnom|basis|met-s",
@@ -106,6 +107,7 @@
106
  "ADJ|prenom|comp|met-e|stan",
107
  "ADJ|prenom|comp|zonder",
108
  "ADJ|prenom|sup|met-e|stan",
 
109
  "ADJ|vrij|basis|zonder",
110
  "ADJ|vrij|comp|zonder",
111
  "ADJ|vrij|dim|zonder",
@@ -175,6 +177,7 @@
175
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
176
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
177
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
 
178
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
179
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
180
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
@@ -187,10 +190,12 @@
187
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
188
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
189
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
 
190
  "VNW|onbep|adv-pron|gen|red|3|getal",
191
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
192
  "VNW|onbep|det|stan|nom|met-e|mv-n",
193
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
 
194
  "VNW|onbep|det|stan|prenom|met-e|agr",
195
  "VNW|onbep|det|stan|prenom|met-e|evz",
196
  "VNW|onbep|det|stan|prenom|met-e|mv",
@@ -377,254 +382,182 @@
377
  "senter"
378
  ],
379
  "performance":{
380
- "token_acc":0.9997165842,
381
- "tag_acc":0.9429007634,
382
- "pos_acc":0.960436205,
383
- "morph_acc":0.9541280401,
384
- "lemma_acc":0.8531303836,
385
- "dep_uas":0.8651950224,
386
- "dep_las":0.8202138983,
387
- "sents_p":0.8534067447,
388
- "sents_r":0.8895265423,
389
- "sents_f":0.8710923779,
390
- "speed":4346.394666307,
391
- "morph_per_feat":{
392
- "Person":{
393
- "p":0.9960784314,
394
- "r":0.970391595,
395
- "f":0.9830672472
396
- },
397
- "Poss":{
398
- "p":0.9923664122,
399
- "r":0.9885931559,
400
- "f":0.9904761905
401
- },
402
- "PronType":{
403
- "p":0.994017094,
404
- "r":0.9659468439,
405
- "f":0.9797809604
406
- },
407
- "Gender":{
408
- "p":0.9102564103,
409
- "r":0.8809825272,
410
- "f":0.8953802599
411
- },
412
- "Number":{
413
- "p":0.9794238683,
414
- "r":0.9591044776,
415
- "f":0.9691576804
416
- },
417
- "Tense":{
418
- "p":0.9738015608,
419
- "r":0.9598901099,
420
- "f":0.9667957941
421
- },
422
- "VerbForm":{
423
- "p":0.9548269581,
424
- "r":0.9431450162,
425
- "f":0.9489500362
426
- },
427
- "Degree":{
428
- "p":0.9421547361,
429
- "r":0.9293865906,
430
- "f":0.9357271095
431
- },
432
- "Definite":{
433
- "p":0.9969217238,
434
- "r":0.9973603168,
435
- "f":0.9971409721
436
- },
437
- "Case":{
438
- "p":0.998,
439
- "r":0.9940239044,
440
- "f":0.996007984
441
- },
442
- "Reflex":{
443
- "p":1.0,
444
- "r":1.0,
445
- "f":1.0
446
- },
447
- "Foreign":{
448
- "p":0.6451612903,
449
- "r":0.3076923077,
450
- "f":0.4166666667
451
- },
452
- "Abbr":{
453
- "p":1.0,
454
- "r":1.0,
455
- "f":1.0
456
- }
457
- },
458
  "dep_las_per_type":{
459
- "det":{
460
- "p":0.8819255223,
461
- "r":0.9575936884,
462
- "f":0.9182033097
463
  },
464
  "nsubj":{
465
- "p":0.7484939759,
466
- "r":0.8068181818,
467
- "f":0.7765625
468
  },
469
- "root":{
470
- "p":0.7513586957,
471
- "r":0.8253731343,
472
- "f":0.786628734
473
  },
474
- "case":{
475
- "p":0.8877284595,
476
- "r":0.9400921659,
477
- "f":0.9131602507
478
  },
479
- "obl":{
480
- "p":0.7350565428,
481
- "r":0.7210776545,
482
- "f":0.728
483
  },
484
- "nmod":{
485
- "p":0.6371100164,
486
- "r":0.6771378709,
487
- "f":0.6565143824
488
  },
489
- "advmod":{
490
- "p":0.7519379845,
491
- "r":0.7838383838,
492
- "f":0.7675568744
493
  },
494
- "obj":{
495
- "p":0.7448979592,
496
- "r":0.8081180812,
497
- "f":0.7752212389
498
  },
499
  "mark":{
500
- "p":0.8242677824,
501
- "r":0.8140495868,
502
- "f":0.8191268191
503
- },
504
- "advcl":{
505
- "p":0.5,
506
- "r":0.4545454545,
507
- "f":0.4761904762
508
  },
509
- "amod":{
510
- "p":0.8095238095,
511
- "r":0.8686131387,
512
- "f":0.838028169
513
  },
514
- "acl:relcl":{
515
- "p":0.6470588235,
516
- "r":0.6707317073,
517
- "f":0.6586826347
518
  },
519
- "cop":{
520
- "p":0.7364864865,
521
- "r":0.5989010989,
522
- "f":0.6606060606
523
  },
524
- "cc":{
525
- "p":0.7948717949,
526
- "r":0.852233677,
527
- "f":0.8225538972
528
  },
529
- "conj":{
530
- "p":0.6227848101,
531
- "r":0.5442477876,
532
- "f":0.5808736718
533
  },
534
- "fixed":{
535
- "p":0.7132867133,
536
- "r":0.2764227642,
537
- "f":0.3984375
538
  },
539
  "flat":{
540
- "p":0.8263473054,
541
- "r":0.7052810903,
542
- "f":0.7610294118
543
  },
544
- "csubj":{
545
- "p":0.3333333333,
546
- "r":0.3333333333,
547
- "f":0.3333333333
548
  },
549
- "aux":{
550
- "p":0.7676767677,
551
- "r":0.7378640777,
552
- "f":0.7524752475
553
  },
554
- "compound:prt":{
555
- "p":0.7361111111,
556
- "r":0.6883116883,
557
- "f":0.711409396
558
  },
559
  "nummod":{
560
- "p":0.5792682927,
561
- "r":0.6506849315,
562
- "f":0.6129032258
563
- },
564
- "acl":{
565
- "p":0.4042553191,
566
- "r":0.3220338983,
567
- "f":0.358490566
568
  },
569
- "expl":{
570
- "p":0.1428571429,
571
- "r":0.1666666667,
572
- "f":0.1538461538
573
  },
574
- "appos":{
575
- "p":0.527607362,
576
- "r":0.4971098266,
577
- "f":0.5119047619
578
  },
579
- "dep":{
580
- "p":0.0,
581
- "r":0.0,
582
- "f":0.0
583
  },
584
  "nsubj:pass":{
585
- "p":0.8414634146,
586
- "r":0.8023255814,
587
- "f":0.8214285714
588
  },
589
  "aux:pass":{
590
- "p":0.8921568627,
591
- "r":0.9285714286,
592
- "f":0.91
593
  },
594
- "xcomp":{
595
- "p":0.3382352941,
596
- "r":0.6301369863,
597
- "f":0.4401913876
598
  },
599
- "ccomp":{
600
- "p":0.5333333333,
601
- "r":0.4705882353,
602
- "f":0.5
603
  },
604
  "parataxis":{
605
- "p":0.3333333333,
606
- "r":0.2751677852,
607
- "f":0.3014705882
608
  },
609
- "expl:pv":{
610
- "p":0.9375,
611
- "r":0.7894736842,
612
- "f":0.8571428571
613
  },
614
  "iobj":{
615
- "p":0.3333333333,
616
- "r":0.3,
617
- "f":0.3157894737
618
- },
619
- "nmod:poss":{
620
- "p":0.8851351351,
621
- "r":0.8562091503,
622
- "f":0.8704318937
623
  },
624
  "obl:agent":{
625
- "p":0.9166666667,
626
- "r":0.7857142857,
627
- "f":0.8461538462
628
  },
629
  "orphan":{
630
  "p":0.0,
@@ -632,105 +565,178 @@
632
  "f":0.0
633
  }
634
  },
635
- "ents_p":0.7817002882,
636
- "ents_r":0.7503457815,
637
- "ents_f":0.7657021877,
638
  "ents_per_type":{
639
- "DATE":{
640
- "p":0.9306930693,
641
- "r":0.9155844156,
642
- "f":0.9230769231
643
  },
644
- "NORP":{
645
- "p":0.8111111111,
646
- "r":0.8795180723,
647
- "f":0.8439306358
648
  },
649
- "ORG":{
650
- "p":0.7313432836,
651
- "r":0.5798816568,
652
- "f":0.6468646865
653
  },
654
  "CARDINAL":{
655
- "p":0.8933333333,
656
- "r":0.9571428571,
657
- "f":0.924137931
658
  },
659
  "GPE":{
660
- "p":0.7442922374,
661
- "r":0.8956043956,
662
- "f":0.812967581
663
  },
664
- "PERCENT":{
665
- "p":0.7142857143,
666
- "r":0.8333333333,
667
- "f":0.7692307692
668
  },
669
- "PERSON":{
670
- "p":0.7841269841,
671
- "r":0.7993527508,
672
- "f":0.7916666667
673
  },
674
- "LAW":{
675
- "p":0.75,
676
- "r":1.0,
677
- "f":0.8571428571
678
  },
679
  "ORDINAL":{
680
- "p":0.9117647059,
681
- "r":0.9393939394,
682
- "f":0.9253731343
683
  },
684
  "LANGUAGE":{
685
- "p":0.5833333333,
686
- "r":0.6363636364,
687
- "f":0.6086956522
688
- },
689
- "EVENT":{
690
- "p":0.3461538462,
691
- "r":0.3913043478,
692
- "f":0.3673469388
693
  },
694
- "QUANTITY":{
695
- "p":0.8333333333,
696
- "r":0.8333333333,
697
- "f":0.8333333333
698
  },
699
  "LOC":{
700
- "p":0.7,
701
- "r":0.2058823529,
702
- "f":0.3181818182
703
- },
704
- "FAC":{
705
- "p":0.1904761905,
706
- "r":0.2857142857,
707
- "f":0.2285714286
708
  },
709
- "PRODUCT":{
710
  "p":0.0,
711
  "r":0.0,
712
  "f":0.0
713
  },
714
- "WORK_OF_ART":{
715
- "p":0.624,
716
- "r":0.4333333333,
717
- "f":0.5114754098
718
  },
719
- "MONEY":{
720
  "p":0.0,
721
  "r":0.0,
722
  "f":0.0
723
  },
724
- "TIME":{
725
  "p":1.0,
726
  "r":1.0,
727
  "f":1.0
728
  }
729
- }
 
730
  },
731
  "sources":[
732
  {
733
- "name":"UD Dutch LassySmall v2.5",
734
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
735
  "license":"CC BY-SA 4.0",
736
  "author":"Bouma, Gosse; van Noord, Gertjan"
@@ -742,13 +748,13 @@
742
  "author":"NLP Town"
743
  },
744
  {
745
- "name":"UD Dutch LassySmall v2.5",
746
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
747
  "license":"CC BY-SA 4.0",
748
  "author":"Bouma, Gosse; van Noord, Gertjan"
749
  },
750
  {
751
- "name":"UD Dutch Alpino v2.5",
752
  "url":"https://github.com/UniversalDependencies/UD_Dutch-Alpino",
753
  "license":"CC BY-SA 4.0",
754
  "author":"Zeman, Daniel; \u017dabokrtsk\u00fd, Zden\u011bk; Bouma, Gosse; van Noord, Gertjan"
 
1
  {
2
  "lang":"nl",
3
  "name":"core_news_md",
4
+ "version":"3.2.0",
5
  "description":"Dutch pipeline optimized for CPU. Components: tok2vec, morphologizer, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
 
68
  "Abbr=Yes|POS=X",
69
  "Gender=Com,Neut|Number=Sing|POS=PROPN",
70
  "Degree=Sup|POS=ADJ",
 
71
  "POS=ADJ",
72
  "Number=Sing|POS=PROPN",
73
  "POS=PRON|PronType=Dem",
 
77
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs",
78
  "Case=Dat|POS=PRON|PronType=Dem",
79
  "Case=Nom|POS=PRON|Person=2|PronType=Prs",
 
80
  "POS=INTJ",
81
+ "Case=Acc|POS=PRON|Person=2|PronType=Prs",
82
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs",
83
  "POS=PRON|PronType=Int",
 
84
  "POS=PRON|Person=2|PronType=Prs",
85
+ "POS=PRON|Person=3",
86
+ "Case=Gen|POS=PRON|Person=2|PronType=Prs",
87
+ "POS=X"
88
  ],
89
  "tagger":[
90
  "ADJ|nom|basis|met-e|mv-n",
 
95
  "ADJ|nom|comp|met-e|mv-n",
96
  "ADJ|nom|comp|met-e|zonder-n|stan",
97
  "ADJ|nom|sup|met-e|mv-n",
98
+ "ADJ|nom|sup|met-e|zonder-n|bijz",
99
  "ADJ|nom|sup|met-e|zonder-n|stan",
100
  "ADJ|nom|sup|zonder|zonder-n",
101
  "ADJ|postnom|basis|met-s",
 
107
  "ADJ|prenom|comp|met-e|stan",
108
  "ADJ|prenom|comp|zonder",
109
  "ADJ|prenom|sup|met-e|stan",
110
+ "ADJ|prenom|sup|zonder",
111
  "ADJ|vrij|basis|zonder",
112
  "ADJ|vrij|comp|zonder",
113
  "ADJ|vrij|dim|zonder",
 
177
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
178
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
179
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
180
+ "VNW|bez|det|stan|vol|1|ev|prenom|met-e|rest",
181
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
182
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
183
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
 
190
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
191
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
192
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
193
+ "VNW|excl|pron|stan|vol|3|getal",
194
  "VNW|onbep|adv-pron|gen|red|3|getal",
195
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
196
  "VNW|onbep|det|stan|nom|met-e|mv-n",
197
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
198
+ "VNW|onbep|det|stan|nom|zonder|zonder-n",
199
  "VNW|onbep|det|stan|prenom|met-e|agr",
200
  "VNW|onbep|det|stan|prenom|met-e|evz",
201
  "VNW|onbep|det|stan|prenom|met-e|mv",
 
382
  "senter"
383
  ],
384
  "performance":{
385
+ "tag_acc":0.9482224646,
386
+ "sents_p":0.8472032742,
387
+ "sents_r":0.8909612626,
388
+ "sents_f":0.8685314685,
389
+ "dep_uas":0.8688732323,
390
+ "dep_las":0.8211769476,
391
  "dep_las_per_type":{
392
+ "nmod:poss":{
393
+ "p":0.9522058824,
394
+ "r":0.9452554745,
395
+ "f":0.9487179487
396
  },
397
  "nsubj":{
398
+ "p":0.8497072219,
399
+ "r":0.8586456279,
400
+ "f":0.8541530412
401
  },
402
+ "aux":{
403
+ "p":0.9013157895,
404
+ "r":0.9013157895,
405
+ "f":0.9013157895
406
  },
407
+ "advmod":{
408
+ "p":0.7848537005,
409
+ "r":0.8142857143,
410
+ "f":0.7992988606
411
  },
412
+ "root":{
413
+ "p":0.8676671214,
414
+ "r":0.912482066,
415
+ "f":0.8895104895
416
  },
417
+ "det":{
418
+ "p":0.9465937763,
419
+ "r":0.9740372133,
420
+ "f":0.9601194284
421
  },
422
+ "amod":{
423
+ "p":0.8758802817,
424
+ "r":0.8915770609,
425
+ "f":0.8836589698
426
  },
427
+ "obl":{
428
+ "p":0.7556008147,
429
+ "r":0.7535545024,
430
+ "f":0.7545762712
431
  },
432
  "mark":{
433
+ "p":0.8709677419,
434
+ "r":0.8836363636,
435
+ "f":0.8772563177
436
  },
437
+ "ccomp":{
438
+ "p":0.7222222222,
439
+ "r":0.6074766355,
440
+ "f":0.6598984772
441
  },
442
+ "case":{
443
+ "p":0.9368290669,
444
+ "r":0.9610334604,
445
+ "f":0.9487769183
446
  },
447
+ "appos":{
448
+ "p":0.7060606061,
449
+ "r":0.7060606061,
450
+ "f":0.7060606061
451
  },
452
+ "obj":{
453
+ "p":0.7666666667,
454
+ "r":0.772519084,
455
+ "f":0.769581749
456
  },
457
+ "compound:prt":{
458
+ "p":0.7696335079,
459
+ "r":0.6901408451,
460
+ "f":0.7277227723
461
  },
462
+ "xcomp":{
463
+ "p":0.6373239437,
464
+ "r":0.6581818182,
465
+ "f":0.6475849732
466
  },
467
  "flat":{
468
+ "p":0.7919911012,
469
+ "r":0.7550371156,
470
+ "f":0.773072747
471
  },
472
+ "expl:pv":{
473
+ "p":0.8,
474
+ "r":0.8181818182,
475
+ "f":0.808988764
476
  },
477
+ "acl":{
478
+ "p":0.5176470588,
479
+ "r":0.4489795918,
480
+ "f":0.4808743169
481
  },
482
+ "advcl":{
483
+ "p":0.5211267606,
484
+ "r":0.5,
485
+ "f":0.5103448276
486
  },
487
  "nummod":{
488
+ "p":0.7961783439,
489
+ "r":0.8333333333,
490
+ "f":0.8143322476
491
  },
492
+ "nmod":{
493
+ "p":0.7246011755,
494
+ "r":0.7504347826,
495
+ "f":0.7372917557
496
  },
497
+ "cc":{
498
+ "p":0.8579545455,
499
+ "r":0.8579545455,
500
+ "f":0.8579545455
501
  },
502
+ "conj":{
503
+ "p":0.6610407876,
504
+ "r":0.6385869565,
505
+ "f":0.6496199032
506
  },
507
  "nsubj:pass":{
508
+ "p":0.8,
509
+ "r":0.8301886792,
510
+ "f":0.8148148148
511
  },
512
  "aux:pass":{
513
+ "p":0.8911917098,
514
+ "r":0.9555555556,
515
+ "f":0.9222520107
516
  },
517
+ "cop":{
518
+ "p":0.7810218978,
519
+ "r":0.7838827839,
520
+ "f":0.7824497258
521
  },
522
+ "acl:relcl":{
523
+ "p":0.6647398844,
524
+ "r":0.7232704403,
525
+ "f":0.6927710843
526
  },
527
  "parataxis":{
528
+ "p":0.3712871287,
529
+ "r":0.2717391304,
530
+ "f":0.3138075314
531
  },
532
+ "fixed":{
533
+ "p":0.6714285714,
534
+ "r":0.424954792,
535
+ "f":0.5204872647
536
  },
537
  "iobj":{
538
+ "p":0.5789473684,
539
+ "r":0.3333333333,
540
+ "f":0.4230769231
541
  },
542
  "obl:agent":{
543
+ "p":0.8666666667,
544
+ "r":0.8965517241,
545
+ "f":0.8813559322
546
+ },
547
+ "expl":{
548
+ "p":0.6153846154,
549
+ "r":0.380952381,
550
+ "f":0.4705882353
551
+ },
552
+ "csubj":{
553
+ "p":0.5555555556,
554
+ "r":0.25,
555
+ "f":0.3448275862
556
+ },
557
+ "dep":{
558
+ "p":0.0,
559
+ "r":0.0,
560
+ "f":0.0
561
  },
562
  "orphan":{
563
  "p":0.0,
 
565
  "f":0.0
566
  }
567
  },
568
+ "ents_p":0.775862069,
569
+ "ents_r":0.7468879668,
570
+ "ents_f":0.7610993658,
571
  "ents_per_type":{
572
+ "ORG":{
573
+ "p":0.0,
574
+ "r":0.0,
575
+ "f":0.0
576
  },
577
+ "PERSON":{
578
+ "p":0.0,
579
+ "r":0.0,
580
+ "f":0.0
581
  },
582
+ "QUANTITY":{
583
+ "p":0.0,
584
+ "r":0.0,
585
+ "f":0.0
586
  },
587
  "CARDINAL":{
588
+ "p":0.0,
589
+ "r":0.0,
590
+ "f":0.0
591
  },
592
  "GPE":{
593
+ "p":0.0,
594
+ "r":0.0,
595
+ "f":0.0
596
  },
597
+ "WORK_OF_ART":{
598
+ "p":0.0,
599
+ "r":0.0,
600
+ "f":0.0
601
  },
602
+ "NORP":{
603
+ "p":0.0,
604
+ "r":0.0,
605
+ "f":0.0
606
  },
607
+ "PRODUCT":{
608
+ "p":0.0,
609
+ "r":0.0,
610
+ "f":0.0
611
+ },
612
+ "EVENT":{
613
+ "p":0.0,
614
+ "r":0.0,
615
+ "f":0.0
616
+ },
617
+ "FAC":{
618
+ "p":0.0,
619
+ "r":0.0,
620
+ "f":0.0
621
+ },
622
+ "DATE":{
623
+ "p":0.0,
624
+ "r":0.0,
625
+ "f":0.0
626
  },
627
  "ORDINAL":{
628
+ "p":0.0,
629
+ "r":0.0,
630
+ "f":0.0
631
  },
632
  "LANGUAGE":{
633
+ "p":0.0,
634
+ "r":0.0,
635
+ "f":0.0
636
  },
637
+ "TIME":{
638
+ "p":0.0,
639
+ "r":0.0,
640
+ "f":0.0
641
  },
642
  "LOC":{
643
+ "p":0.0,
644
+ "r":0.0,
645
+ "f":0.0
646
  },
647
+ "MONEY":{
648
  "p":0.0,
649
  "r":0.0,
650
  "f":0.0
651
  },
652
+ "PERCENT":{
653
+ "p":0.0,
654
+ "r":0.0,
655
+ "f":0.0
656
  },
657
+ "LAW":{
658
  "p":0.0,
659
  "r":0.0,
660
  "f":0.0
661
+ }
662
+ },
663
+ "speed":2996.2278607909,
664
+ "token_acc":0.9997165842,
665
+ "token_p":0.9974281853,
666
+ "token_r":0.9975586363,
667
+ "token_f":0.9974934066,
668
+ "pos_acc":0.9646673937,
669
+ "morph_acc":0.9599947649,
670
+ "morph_micro_p":0.9697827701,
671
+ "morph_micro_r":0.951281701,
672
+ "morph_micro_f":0.9604431471,
673
+ "morph_per_feat":{
674
+ "Person":{
675
+ "p":0.9931640625,
676
+ "r":0.9722753346,
677
+ "f":0.9826086957
678
  },
679
+ "Poss":{
680
+ "p":0.9923664122,
681
+ "r":0.9961685824,
682
+ "f":0.9942638623
683
+ },
684
+ "PronType":{
685
+ "p":0.9931914894,
686
+ "r":0.970074813,
687
+ "f":0.9814970563
688
+ },
689
+ "Gender":{
690
+ "p":0.9223479863,
691
+ "r":0.8913762401,
692
+ "f":0.9065976714
693
+ },
694
+ "Number":{
695
+ "p":0.9819737244,
696
+ "r":0.9621314175,
697
+ "f":0.9719513117
698
+ },
699
+ "Tense":{
700
+ "p":0.9787353106,
701
+ "r":0.9615173172,
702
+ "f":0.9700499168
703
+ },
704
+ "VerbForm":{
705
+ "p":0.9617625637,
706
+ "r":0.9503418496,
707
+ "f":0.9560180995
708
+ },
709
+ "Degree":{
710
+ "p":0.9489795918,
711
+ "r":0.9353448276,
712
+ "f":0.9421128799
713
+ },
714
+ "Definite":{
715
+ "p":0.9973486522,
716
+ "r":0.9933978873,
717
+ "f":0.9953693495
718
+ },
719
+ "Case":{
720
+ "p":0.996007984,
721
+ "r":0.9940239044,
722
+ "f":0.9950149551
723
+ },
724
+ "Reflex":{
725
  "p":1.0,
726
  "r":1.0,
727
  "f":1.0
728
+ },
729
+ "Abbr":{
730
+ "p":1.0,
731
+ "r":0.5555555556,
732
+ "f":0.7142857143
733
  }
734
+ },
735
+ "lemma_acc":0.8150554986
736
  },
737
  "sources":[
738
  {
739
+ "name":"UD Dutch LassySmall v2.8",
740
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
741
  "license":"CC BY-SA 4.0",
742
  "author":"Bouma, Gosse; van Noord, Gertjan"
 
748
  "author":"NLP Town"
749
  },
750
  {
751
+ "name":"UD Dutch LassySmall v2.8",
752
  "url":"https://github.com/UniversalDependencies/UD_Dutch-LassySmall",
753
  "license":"CC BY-SA 4.0",
754
  "author":"Bouma, Gosse; van Noord, Gertjan"
755
  },
756
  {
757
+ "name":"UD Dutch Alpino v2.8",
758
  "url":"https://github.com/UniversalDependencies/UD_Dutch-Alpino",
759
  "license":"CC BY-SA 4.0",
760
  "author":"Zeman, Daniel; \u017dabokrtsk\u00fd, Zden\u011bk; Bouma, Gosse; van Noord, Gertjan"
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "POS=PRON|Person=3|PronType=Dem":"Person=3|PronType=Dem",
4
  "Number=Sing|POS=AUX|Tense=Pres|VerbForm=Fin":"Number=Sing|Tense=Pres|VerbForm=Fin",
@@ -48,7 +49,6 @@
48
  "Abbr=Yes|POS=X":"Abbr=Yes",
49
  "Gender=Com,Neut|Number=Sing|POS=PROPN":"Gender=Com,Neut|Number=Sing",
50
  "Degree=Sup|POS=ADJ":"Degree=Sup",
51
- "Foreign=Yes|POS=X":"Foreign=Yes",
52
  "POS=ADJ":"",
53
  "Number=Sing|POS=PROPN":"Number=Sing",
54
  "POS=PRON|PronType=Dem":"PronType=Dem",
@@ -58,13 +58,14 @@
58
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":"Person=2|Poss=Yes|PronType=Prs",
59
  "Case=Dat|POS=PRON|PronType=Dem":"Case=Dat|PronType=Dem",
60
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Person=2|PronType=Prs",
61
- "POS=X":"",
62
  "POS=INTJ":"",
 
63
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Case=Gen|Person=3|Poss=Yes|PronType=Prs",
64
  "POS=PRON|PronType=Int":"PronType=Int",
65
- "Case=Acc|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Person=2|PronType=Prs",
66
  "POS=PRON|Person=2|PronType=Prs":"Person=2|PronType=Prs",
67
- "Case=Gen|POS=PRON|Person=2|PronType=Prs":"Case=Gen|Person=2|PronType=Prs"
 
 
68
  },
69
  "labels_pos":{
70
  "POS=PRON|Person=3|PronType=Dem":95,
@@ -115,7 +116,6 @@
115
  "Abbr=Yes|POS=X":101,
116
  "Gender=Com,Neut|Number=Sing|POS=PROPN":96,
117
  "Degree=Sup|POS=ADJ":84,
118
- "Foreign=Yes|POS=X":101,
119
  "POS=ADJ":84,
120
  "Number=Sing|POS=PROPN":96,
121
  "POS=PRON|PronType=Dem":95,
@@ -125,12 +125,14 @@
125
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":95,
126
  "Case=Dat|POS=PRON|PronType=Dem":95,
127
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":95,
128
- "POS=X":101,
129
  "POS=INTJ":91,
 
130
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
131
  "POS=PRON|PronType=Int":95,
132
- "Case=Acc|POS=PRON|Person=2|PronType=Prs":95,
133
  "POS=PRON|Person=2|PronType=Prs":95,
134
- "Case=Gen|POS=PRON|Person=2|PronType=Prs":95
135
- }
 
 
 
136
  }
 
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "POS=PRON|Person=3|PronType=Dem":"Person=3|PronType=Dem",
5
  "Number=Sing|POS=AUX|Tense=Pres|VerbForm=Fin":"Number=Sing|Tense=Pres|VerbForm=Fin",
 
49
  "Abbr=Yes|POS=X":"Abbr=Yes",
50
  "Gender=Com,Neut|Number=Sing|POS=PROPN":"Gender=Com,Neut|Number=Sing",
51
  "Degree=Sup|POS=ADJ":"Degree=Sup",
 
52
  "POS=ADJ":"",
53
  "Number=Sing|POS=PROPN":"Number=Sing",
54
  "POS=PRON|PronType=Dem":"PronType=Dem",
 
58
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":"Person=2|Poss=Yes|PronType=Prs",
59
  "Case=Dat|POS=PRON|PronType=Dem":"Case=Dat|PronType=Dem",
60
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":"Case=Nom|Person=2|PronType=Prs",
 
61
  "POS=INTJ":"",
62
+ "Case=Acc|POS=PRON|Person=2|PronType=Prs":"Case=Acc|Person=2|PronType=Prs",
63
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":"Case=Gen|Person=3|Poss=Yes|PronType=Prs",
64
  "POS=PRON|PronType=Int":"PronType=Int",
 
65
  "POS=PRON|Person=2|PronType=Prs":"Person=2|PronType=Prs",
66
+ "POS=PRON|Person=3":"Person=3",
67
+ "Case=Gen|POS=PRON|Person=2|PronType=Prs":"Case=Gen|Person=2|PronType=Prs",
68
+ "POS=X":""
69
  },
70
  "labels_pos":{
71
  "POS=PRON|Person=3|PronType=Dem":95,
 
116
  "Abbr=Yes|POS=X":101,
117
  "Gender=Com,Neut|Number=Sing|POS=PROPN":96,
118
  "Degree=Sup|POS=ADJ":84,
 
119
  "POS=ADJ":84,
120
  "Number=Sing|POS=PROPN":96,
121
  "POS=PRON|PronType=Dem":95,
 
125
  "POS=PRON|Person=2|Poss=Yes|PronType=Prs":95,
126
  "Case=Dat|POS=PRON|PronType=Dem":95,
127
  "Case=Nom|POS=PRON|Person=2|PronType=Prs":95,
 
128
  "POS=INTJ":91,
129
+ "Case=Acc|POS=PRON|Person=2|PronType=Prs":95,
130
  "Case=Gen|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
131
  "POS=PRON|PronType=Int":95,
 
132
  "POS=PRON|Person=2|PronType=Prs":95,
133
+ "POS=PRON|Person=3":95,
134
+ "Case=Gen|POS=PRON|Person=2|PronType=Prs":95,
135
+ "POS=X":101
136
+ },
137
+ "overwrite":true
138
  }
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
 
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
 
nl_core_news_md-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:80a83cb8edd13201ea49798b0854a3e478a90e1d7870e328890b72a775859487
3
- size 46551969
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ef6b4f150ff63782df0b9d2ae94098fcc343da296f792841265c005f1c530ffd
3
+ size 47325478
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
 
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves��{"0":{"":151388},"1":{"":91342},"2":{"det":29783,"case":26176,"nsubj":13581,"amod":12961,"punct":11732,"advmod":9779,"obl":8002,"mark":6673,"cc":5427,"obj":4504,"aux":4192,"nsubj:pass":2504,"aux:pass":2464,"cop":2079,"nummod":2057,"nmod:poss":2023,"nmod":1226,"xcomp":1155,"compound:prt":842,"advcl":643,"acl":491,"parataxis":419,"iobj":306,"expl":273,"advmod||xcomp":271,"obl||xcomp":266,"expl:pv":262,"obl:agent":227,"obj||xcomp":204,"case||obl":161,"ccomp":108,"expl||advcl":60,"case||advcl":51,"csubj":50,"advmod||ccomp":50,"obl||ccomp":48,"obj||ccomp":47,"obl||obj":38,"advcl||xcomp":30,"dep":0},"3":{"punct":19438,"nmod":13005,"flat":9123,"conj":7122,"obl":6774,"fixed":4655,"nsubj":4274,"appos":3317,"obj":3143,"advmod":3111,"parataxis":2300,"xcomp":2115,"acl:relcl":2029,"advcl":1591,"compound:prt":1377,"cop":1278,"ccomp":1228,"acl":767,"amod":504,"aux:pass":396,"csubj":394,"nummod":367,"aux":353,"iobj":229,"expl:pv":225,"obl:agent":220,"nmod||obj":179,"advcl||advmod":154,"case":148,"acl:relcl||obj":135,"case||obl":133,"acl:relcl||nsubj":98,"acl||obj":88,"expl":83,"orphan":72,"mark":69,"acl:relcl||nsubj:pass":56,"obl||xcomp":54,"expl||advcl":47,"cc":38,"advcl||amod":35,"advcl||nmod":32,"obl||obj":31,"nmod||nsubj":31,"dep":0},"4":{"ROOT":18043}}�cfg��neg_key�
 
1
+ ��moves��{"0":{"":151558},"1":{"":91349},"2":{"det":29810,"case":26215,"nsubj":13579,"amod":12918,"punct":11737,"advmod":9702,"obl":8128,"mark":6683,"cc":5438,"obj":4515,"aux":4218,"nsubj:pass":2513,"aux:pass":2468,"cop":2077,"nummod":2050,"nmod:poss":2023,"nmod":1255,"xcomp":1160,"compound:prt":839,"advcl":643,"acl":505,"parataxis":416,"iobj":307,"expl":273,"advmod||xcomp":266,"expl:pv":261,"obl||xcomp":259,"obl:agent":227,"obj||xcomp":200,"case||obl":162,"ccomp":108,"expl||advcl":60,"case||advcl":51,"obl||ccomp":50,"csubj":50,"advmod||ccomp":49,"obj||ccomp":47,"obl||obj":42,"advcl||xcomp":31,"dep":0},"3":{"punct":19438,"nmod":13028,"flat":9160,"conj":7136,"obl":6802,"fixed":4623,"nsubj":4273,"appos":3320,"obj":3142,"advmod":3090,"parataxis":2280,"xcomp":2095,"acl:relcl":2032,"advcl":1595,"compound:prt":1376,"cop":1281,"ccomp":1230,"acl":774,"amod":490,"aux:pass":398,"csubj":395,"nummod":365,"aux":355,"iobj":229,"expl:pv":225,"obl:agent":221,"nmod||obj":178,"advcl||advmod":152,"case":147,"acl:relcl||obj":135,"case||obl":132,"acl:relcl||nsubj":98,"acl||obj":88,"expl":83,"mark":69,"orphan":68,"acl:relcl||nsubj:pass":55,"obl||xcomp":53,"expl||advcl":47,"cc":35,"advcl||amod":35,"advcl||nmod":34,"obl||obj":32,"nmod||nsubj":31,"dep":0},"4":{"ROOT":18070}}�cfg��neg_key�
senter/cfg CHANGED
@@ -1,3 +1,3 @@
1
  {
2
-
3
  }
 
1
  {
2
+ "overwrite":false
3
  }
senter/model CHANGED
Binary files a/senter/model and b/senter/model differ
 
tagger/cfg CHANGED
@@ -8,6 +8,7 @@
8
  "ADJ|nom|comp|met-e|mv-n",
9
  "ADJ|nom|comp|met-e|zonder-n|stan",
10
  "ADJ|nom|sup|met-e|mv-n",
 
11
  "ADJ|nom|sup|met-e|zonder-n|stan",
12
  "ADJ|nom|sup|zonder|zonder-n",
13
  "ADJ|postnom|basis|met-s",
@@ -19,6 +20,7 @@
19
  "ADJ|prenom|comp|met-e|stan",
20
  "ADJ|prenom|comp|zonder",
21
  "ADJ|prenom|sup|met-e|stan",
 
22
  "ADJ|vrij|basis|zonder",
23
  "ADJ|vrij|comp|zonder",
24
  "ADJ|vrij|dim|zonder",
@@ -88,6 +90,7 @@
88
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
89
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
90
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
 
91
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
92
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
93
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
@@ -100,10 +103,12 @@
100
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
101
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
102
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
 
103
  "VNW|onbep|adv-pron|gen|red|3|getal",
104
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
105
  "VNW|onbep|det|stan|nom|met-e|mv-n",
106
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
 
107
  "VNW|onbep|det|stan|prenom|met-e|agr",
108
  "VNW|onbep|det|stan|prenom|met-e|evz",
109
  "VNW|onbep|det|stan|prenom|met-e|mv",
@@ -197,5 +202,6 @@
197
  "WW|vd|prenom|met-e",
198
  "WW|vd|prenom|zonder",
199
  "WW|vd|vrij|zonder"
200
- ]
 
201
  }
 
8
  "ADJ|nom|comp|met-e|mv-n",
9
  "ADJ|nom|comp|met-e|zonder-n|stan",
10
  "ADJ|nom|sup|met-e|mv-n",
11
+ "ADJ|nom|sup|met-e|zonder-n|bijz",
12
  "ADJ|nom|sup|met-e|zonder-n|stan",
13
  "ADJ|nom|sup|zonder|zonder-n",
14
  "ADJ|postnom|basis|met-s",
 
20
  "ADJ|prenom|comp|met-e|stan",
21
  "ADJ|prenom|comp|zonder",
22
  "ADJ|prenom|sup|met-e|stan",
23
+ "ADJ|prenom|sup|zonder",
24
  "ADJ|vrij|basis|zonder",
25
  "ADJ|vrij|comp|zonder",
26
  "ADJ|vrij|dim|zonder",
 
90
  "VNW|bez|det|stan|red|1|ev|prenom|zonder|agr",
91
  "VNW|bez|det|stan|red|2v|ev|prenom|zonder|agr",
92
  "VNW|bez|det|stan|red|3|ev|prenom|zonder|agr",
93
+ "VNW|bez|det|stan|vol|1|ev|prenom|met-e|rest",
94
  "VNW|bez|det|stan|vol|1|ev|prenom|zonder|agr",
95
  "VNW|bez|det|stan|vol|1|mv|prenom|met-e|rest",
96
  "VNW|bez|det|stan|vol|1|mv|prenom|zonder|evon",
 
103
  "VNW|bez|det|stan|vol|3v|ev|prenom|met-e|rest",
104
  "VNW|bez|det|stan|vol|3|ev|prenom|zonder|agr",
105
  "VNW|bez|det|stan|vol|3|mv|prenom|zonder|agr",
106
+ "VNW|excl|pron|stan|vol|3|getal",
107
  "VNW|onbep|adv-pron|gen|red|3|getal",
108
  "VNW|onbep|adv-pron|obl|vol|3o|getal",
109
  "VNW|onbep|det|stan|nom|met-e|mv-n",
110
  "VNW|onbep|det|stan|nom|met-e|zonder-n",
111
+ "VNW|onbep|det|stan|nom|zonder|zonder-n",
112
  "VNW|onbep|det|stan|prenom|met-e|agr",
113
  "VNW|onbep|det|stan|prenom|met-e|evz",
114
  "VNW|onbep|det|stan|prenom|met-e|mv",
 
202
  "WW|vd|prenom|met-e",
203
  "WW|vd|prenom|zonder",
204
  "WW|vd|vrij|zonder"
205
+ ],
206
+ "overwrite":false
207
  }
tagger/model CHANGED
Binary files a/tagger/model and b/tagger/model differ
 
tok2vec/model CHANGED
Binary files a/tok2vec/model and b/tok2vec/model differ
 
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b69066ea6b2c2eb490737d7114b88d9355d6dfa8f2883f8a133f8c87d9babaa2
3
- size 8135406
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a81d5d5168ce264f0ff117a04099a8d6cce7b7366419e368481c07dd253127e3
3
+ size 10135659
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "mode":"default"
3
+ }