osanseviero HF staff committed on
Commit
f799402
1 Parent(s): 058d178

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
1
- # UD Danish DDT v2.5
2
 
3
  * Author: Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara
4
  * URL: https://github.com/UniversalDependencies/UD_Danish-DDT
 
1
+ # UD Danish DDT v2.8
2
 
3
  * Author: Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara
4
  * URL: https://github.com/UniversalDependencies/UD_Danish-DDT
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - token-classification
5
  language:
6
  - da
7
- license: CC-BY-SA-4.0
8
  model-index:
9
  - name: da_core_news_lg
10
  results:
@@ -14,47 +14,47 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.8032786885
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8166666667
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8099173554
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9631961259
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
- value: 0.8709677419
38
  - name: SENTER Recall
39
  type: recall
40
- value: 0.8617021277
41
  - name: SENTER F Score
42
  type: f_score
43
- value: 0.8663101604
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.823174479
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
- value: 0.823174479
58
  ---
59
  ### Details: https://spacy.io/models/da#da_core_news_lg
60
 
@@ -63,12 +63,12 @@ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `da_core_news_lg` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
71
- | **Sources** | [UD Danish DDT v2.5](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[Lemmatization Lists](https://github.com/michmech/lemmatization-lists/) (Michal Měchura)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -76,12 +76,12 @@ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
76
 
77
  <details>
78
 
79
- <summary>View label scheme (194 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
  | **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
84
- | **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:loc`, `obl:tmod`, `punct`, `xcomp` |
85
  | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
87
 
@@ -92,15 +92,21 @@ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 99.95 |
95
- | `TAG_ACC` | 96.32 |
96
- | `POS_ACC` | 96.32 |
97
- | `MORPH_ACC` | 95.61 |
98
  | `LEMMA_ACC` | 84.91 |
99
- | `DEP_UAS` | 82.32 |
100
- | `DEP_LAS` | 78.29 |
101
- | `ENTS_P` | 80.33 |
102
- | `ENTS_R` | 81.67 |
103
- | `ENTS_F` | 80.99 |
104
- | `SENTS_P` | 87.10 |
105
- | `SENTS_R` | 86.17 |
106
- | `SENTS_F` | 86.63 |
 
4
  - token-classification
5
  language:
6
  - da
7
+ license: cc-by-sa-4.0
8
  model-index:
9
  - name: da_core_news_lg
10
  results:
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.8219461698
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8270833333
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8245067497
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
+ value: 0.9650363196
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
+ value: 0.9142335766
38
  - name: SENTER Recall
39
  type: recall
40
+ value: 0.8882978723
41
  - name: SENTER F Score
42
  type: f_score
43
+ value: 0.9010791367
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
+ value: 0.8225959658
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
+ value: 0.8225959658
58
  ---
59
  ### Details: https://spacy.io/models/da#da_core_news_lg
60
 
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `da_core_news_lg` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
71
+ | **Sources** | [UD Danish DDT v2.8](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[Lemmatization Lists](https://github.com/michmech/lemmatization-lists/) (Michal Měchura)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
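
The `spaCy` row above changes the supported range to `>=3.2.0,<3.3.0`. As a minimal sketch of what that constraint means, the check below compares plain numeric version strings as tuples. The helper names `parse_version` and `is_compatible` are hypothetical and not part of spaCy; real installs resolve this constraint through the package metadata.

```python
def parse_version(text):
    """Turn '3.2.1' into (3, 2, 1); assumes plain numeric release versions."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(spacy_version, lower="3.2.0", upper="3.3.0"):
    """True when lower <= spacy_version < upper, mirroring '>=3.2.0,<3.3.0'."""
    return parse_version(lower) <= parse_version(spacy_version) < parse_version(upper)


print(is_compatible("3.2.4"))  # a release inside the new range
print(is_compatible("3.1.0"))  # the previous release line, outside the new range
```

Tuple comparison is used here because it orders `3.10.0` after `3.9.0`, which a plain string comparison would not.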
74
 
 
76
 
77
  <details>
78
 
79
+ <summary>View label scheme (195 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
  | **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
84
+ | **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `advmod:lmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:lmod`, `obl:tmod`, `punct`, `xcomp` |
85
  | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
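
The label count in the summary line goes from 194 to 195, and the only label row that differs between the two sides of this diff is `parser`. A small sketch, with the two label lists copied from the removed and added `parser` rows, shows which labels account for the change:

```python
# Parser labels copied from the removed (v3.1.0) and added (v3.2.0) rows of this diff.
old_labels = {
    "ROOT", "acl:relcl", "advcl", "advmod", "amod", "appos", "aux", "case",
    "cc", "ccomp", "compound:prt", "conj", "cop", "dep", "det", "expl",
    "fixed", "flat", "iobj", "list", "mark", "nmod", "nmod:poss", "nsubj",
    "nummod", "obj", "obl", "obl:loc", "obl:tmod", "punct", "xcomp",
}
new_labels = {
    "ROOT", "acl:relcl", "advcl", "advmod", "advmod:lmod", "amod", "appos",
    "aux", "case", "cc", "ccomp", "compound:prt", "conj", "cop", "dep", "det",
    "expl", "fixed", "flat", "iobj", "list", "mark", "nmod", "nmod:poss",
    "nsubj", "nummod", "obj", "obl", "obl:lmod", "obl:tmod", "punct", "xcomp",
}

added = sorted(new_labels - old_labels)    # labels only in the new version
removed = sorted(old_labels - new_labels)  # labels only in the old version

print("added:", added)
print("removed:", removed)
print("net change:", len(new_labels) - len(old_labels))
```

Two labels are added (`advmod:lmod`, `obl:lmod`) and one is removed (`obl:loc`), for a net change of one label.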
87
 
 
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 99.95 |
95
+ | `TOKEN_P` | 99.78 |
96
+ | `TOKEN_R` | 99.75 |
97
+ | `TOKEN_F` | 99.76 |
98
+ | `POS_ACC` | 96.50 |
99
+ | `MORPH_ACC` | 95.72 |
100
+ | `MORPH_MICRO_P` | 97.22 |
101
+ | `MORPH_MICRO_R` | 96.68 |
102
+ | `MORPH_MICRO_F` | 96.95 |
103
+ | `SENTS_P` | 91.42 |
104
+ | `SENTS_R` | 88.83 |
105
+ | `SENTS_F` | 90.11 |
106
+ | `DEP_UAS` | 82.26 |
107
+ | `DEP_LAS` | 78.33 |
108
+ | `TAG_ACC` | 96.50 |
109
  | `LEMMA_ACC` | 84.91 |
110
+ | `ENTS_P` | 82.19 |
111
+ | `ENTS_R` | 82.71 |
112
+ | `ENTS_F` | 82.45 |
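
The F-score rows in this table look consistent with the harmonic mean of the matching precision and recall, and the table values look like the `model-index` fractions expressed as percentages with two decimals. The sketch below checks that reading against the NER figures in this commit. The function name `f_score` is an illustrative choice, not spaCy's scorer API.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# NER precision and recall as listed in the updated model-index metadata.
ner_p, ner_r = 0.8219461698, 0.8270833333
ner_f = f_score(ner_p, ner_r)

# The README table shows the same figure as a percentage with two decimals.
print(f"ENTS_F = {ner_f * 100:.2f}")
```

The same function applied to the SENTER precision and recall gives the `SENTS_F` value within rounding.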
accuracy.json CHANGED
@@ -1,58 +1,53 @@
1
  {
2
  "token_acc": 0.9994672349,
3
- "tag_acc": 0.9631961259,
4
- "pos_acc": 0.9631961259,
5
- "morph_acc": 0.956125908,
6
- "lemma_acc": 0.8491041162,
7
- "dep_uas": 0.823174479,
8
- "dep_las": 0.7829050279,
9
- "ents_p": 0.8032786885,
10
- "ents_r": 0.8166666667,
11
- "ents_f": 0.8099173554,
12
- "sents_p": 0.8709677419,
13
- "sents_r": 0.8617021277,
14
- "sents_f": 0.8663101604,
15
- "speed": 9700.6985683523,
16
  "morph_per_feat": {
17
  "Mood": {
18
- "p": 0.9799235182,
19
- "r": 0.9771210677,
20
- "f": 0.9785202864
21
  },
22
  "Tense": {
23
- "p": 0.9735049205,
24
- "r": 0.968373494,
25
- "f": 0.9709324273
26
  },
27
  "VerbForm": {
28
- "p": 0.9654747226,
29
- "r": 0.9583843329,
30
- "f": 0.9619164619
31
  },
32
  "Voice": {
33
- "p": 0.9805389222,
34
- "r": 0.9790732436,
35
- "f": 0.9798055348
36
  },
37
  "Definite": {
38
- "p": 0.9697090474,
39
- "r": 0.9612801264,
40
- "f": 0.9654761905
41
  },
42
  "Gender": {
43
- "p": 0.9583194398,
44
- "r": 0.9551345962,
45
- "f": 0.9567243675
46
  },
47
  "Number": {
48
- "p": 0.9684542587,
49
- "r": 0.9608763693,
50
- "f": 0.9646504321
51
  },
52
  "AdpType": {
53
- "p": 0.9964507542,
54
- "r": 0.9929266136,
55
- "f": 0.9946855624
56
  },
57
  "PartType": {
58
  "p": 1.0,
@@ -60,29 +55,29 @@
60
  "f": 1.0
61
  },
62
  "Case": {
63
- "p": 0.9823151125,
64
- "r": 0.9652448657,
65
- "f": 0.9737051793
66
  },
67
  "Person": {
68
- "p": 0.9787610619,
69
- "r": 0.9822380107,
70
- "f": 0.9804964539
71
  },
72
  "PronType": {
73
- "p": 0.9860312243,
74
- "r": 0.9868421053,
75
- "f": 0.9864364982
76
  },
77
  "NumType": {
78
- "p": 0.9795918367,
79
- "r": 0.9536423841,
80
- "f": 0.966442953
81
  },
82
  "Degree": {
83
- "p": 0.9548780488,
84
- "r": 0.943373494,
85
- "f": 0.9490909091
86
  },
87
  "Reflex": {
88
  "p": 1.0,
@@ -90,19 +85,19 @@
90
  "f": 1.0
91
  },
92
  "Number[psor]": {
93
- "p": 0.9772727273,
94
  "r": 1.0,
95
- "f": 0.9885057471
96
  },
97
  "Poss": {
98
- "p": 0.9887640449,
99
  "r": 1.0,
100
- "f": 0.9943502825
101
  },
102
  "Foreign": {
103
- "p": 1.0,
104
- "r": 0.3,
105
- "f": 0.4615384615
106
  },
107
  "Abbr": {
108
  "p": 1.0,
@@ -115,146 +110,146 @@
115
  "f": 1.0
116
  },
117
  "Polite": {
118
- "p": 0.6666666667,
119
  "r": 0.5,
120
- "f": 0.5714285714
121
  }
122
  },
123
  "dep_las_per_type": {
124
  "advmod": {
125
- "p": 0.6979591837,
126
- "r": 0.7245762712,
127
- "f": 0.711018711
128
  },
129
  "root": {
130
- "p": 0.8165467626,
131
- "r": 0.804964539,
132
- "f": 0.8107142857
133
  },
134
  "nsubj": {
135
- "p": 0.8518918919,
136
- "r": 0.8312236287,
137
- "f": 0.8414308596
138
  },
139
  "case": {
140
- "p": 0.8845014808,
141
- "r": 0.8853754941,
142
- "f": 0.8849382716
143
  },
144
  "obl": {
145
- "p": 0.7078651685,
146
- "r": 0.6858475894,
147
- "f": 0.6966824645
148
  },
149
  "cc": {
150
- "p": 0.7988338192,
151
- "r": 0.7965116279,
152
- "f": 0.7976710335
153
  },
154
  "conj": {
155
- "p": 0.654155496,
156
- "r": 0.6506666667,
157
- "f": 0.6524064171
158
  },
159
  "obj": {
160
- "p": 0.8052434457,
161
- "r": 0.8349514563,
162
- "f": 0.819828408
163
  },
164
  "aux": {
165
- "p": 0.8797653959,
166
- "r": 0.8746355685,
167
- "f": 0.8771929825
168
  },
169
  "acl:relcl": {
170
- "p": 0.6117021277,
171
- "r": 0.6216216216,
172
- "f": 0.6166219839
173
  },
174
- "obl:loc": {
175
- "p": 0.734375,
176
- "r": 0.6714285714,
177
- "f": 0.7014925373
178
  },
179
  "det": {
180
- "p": 0.9151712887,
181
- "r": 0.9242174629,
182
- "f": 0.9196721311
183
  },
184
  "amod": {
185
- "p": 0.8367346939,
186
- "r": 0.8395904437,
187
- "f": 0.8381601363
188
  },
189
  "nmod:poss": {
190
- "p": 0.6960784314,
191
- "r": 0.702970297,
192
- "f": 0.6995073892
193
  },
194
  "ccomp": {
195
- "p": 0.5967741935,
196
- "r": 0.5967741935,
197
- "f": 0.5967741935
198
  },
199
  "nummod": {
200
- "p": 0.8548387097,
201
- "r": 0.8833333333,
202
- "f": 0.868852459
203
  },
204
  "flat": {
205
- "p": 0.8101265823,
206
- "r": 0.8476821192,
207
- "f": 0.8284789644
208
  },
209
  "compound:prt": {
210
- "p": 0.4193548387,
211
- "r": 0.3170731707,
212
- "f": 0.3611111111
213
  },
214
  "advcl": {
215
- "p": 0.6260869565,
216
- "r": 0.6206896552,
217
- "f": 0.6233766234
218
  },
219
  "mark": {
220
- "p": 0.8770833333,
221
- "r": 0.864476386,
222
- "f": 0.8707342296
223
  },
224
  "cop": {
225
- "p": 0.7647058824,
226
- "r": 0.8171428571,
227
- "f": 0.7900552486
228
  },
229
  "dep": {
230
- "p": 0.2048192771,
231
- "r": 0.320754717,
232
- "f": 0.25
233
  },
234
  "nmod": {
235
- "p": 0.6310679612,
236
- "r": 0.634765625,
237
- "f": 0.6329113924
238
  },
239
  "iobj": {
240
- "p": 0.6470588235,
241
- "r": 0.5,
242
- "f": 0.5641025641
243
  },
244
  "xcomp": {
245
- "p": 0.5588235294,
246
- "r": 0.3220338983,
247
- "f": 0.4086021505
248
- },
249
- "appos": {
250
- "p": 0.53125,
251
- "r": 0.5151515152,
252
- "f": 0.5230769231
253
  },
254
  "list": {
255
- "p": 0.3333333333,
256
- "r": 0.2777777778,
257
- "f": 0.303030303
258
  },
259
  "vocative": {
260
  "p": 0.0,
@@ -262,46 +257,62 @@
262
  "f": 0.0
263
  },
264
  "fixed": {
265
- "p": 0.85,
266
- "r": 0.8095238095,
267
- "f": 0.8292682927
268
  },
269
  "expl": {
270
- "p": 0.8181818182,
271
- "r": 0.7941176471,
272
- "f": 0.8059701493
273
  },
274
- "obl:tmod": {
275
- "p": 0.6,
276
- "r": 0.3333333333,
277
  "f": 0.4285714286
278
  },
279
  "discourse": {
280
  "p": 0.0,
281
  "r": 0.0,
282
  "f": 0.0
283
  }
284
  },
285
  "ents_per_type": {
286
  "PER": {
287
- "p": 0.898089172,
288
- "r": 0.8493975904,
289
- "f": 0.8730650155
290
  },
291
  "ORG": {
292
- "p": 0.7529411765,
293
- "r": 0.7111111111,
294
- "f": 0.7314285714
295
  },
296
  "MISC": {
297
- "p": 0.6771653543,
298
- "r": 0.7610619469,
299
- "f": 0.7166666667
300
  },
301
  "LOC": {
302
- "p": 0.8487394958,
303
- "r": 0.9099099099,
304
- "f": 0.8782608696
305
  }
306
- }
 
307
  }
 
1
  {
2
  "token_acc": 0.9994672349,
3
+ "token_p": 0.9977732598,
4
+ "token_r": 0.9974835463,
5
+ "token_f": 0.997628382,
6
+ "pos_acc": 0.9650363196,
7
+ "morph_acc": 0.9571912833,
8
+ "morph_micro_p": 0.9722335025,
9
+ "morph_micro_r": 0.9667861289,
10
+ "morph_micro_f": 0.969502164,
11
  "morph_per_feat": {
12
  "Mood": {
13
+ "p": 0.9781160799,
14
+ "r": 0.9799809342,
15
+ "f": 0.979047619
16
  },
17
  "Tense": {
18
+ "p": 0.9765506808,
19
+ "r": 0.9721385542,
20
+ "f": 0.9743396226
21
  },
22
  "VerbForm": {
23
+ "p": 0.9697156984,
24
+ "r": 0.9602203182,
25
+ "f": 0.9649446494
26
  },
27
  "Voice": {
28
+ "p": 0.9797752809,
29
+ "r": 0.9775784753,
30
+ "f": 0.9786756453
31
  },
32
  "Definite": {
33
+ "p": 0.9666401906,
34
+ "r": 0.9616752272,
35
+ "f": 0.9641513171
36
  },
37
  "Gender": {
38
+ "p": 0.9588903743,
39
+ "r": 0.9534729146,
40
+ "f": 0.956173971
41
  },
42
  "Number": {
43
+ "p": 0.9666403993,
44
+ "r": 0.9598330725,
45
+ "f": 0.9632247088
46
  },
47
  "AdpType": {
48
+ "p": 0.9991071429,
49
+ "r": 0.9893899204,
50
+ "f": 0.994224789
51
  },
52
  "PartType": {
53
  "p": 1.0,
 
55
  "f": 1.0
56
  },
57
  "Case": {
58
+ "p": 0.9792332268,
59
+ "r": 0.9684044234,
60
+ "f": 0.9737887212
61
  },
62
  "Person": {
63
+ "p": 0.9805996473,
64
+ "r": 0.9875666075,
65
+ "f": 0.9840707965
66
  },
67
  "PronType": {
68
+ "p": 0.9860082305,
69
+ "r": 0.9851973684,
70
+ "f": 0.9856026327
71
  },
72
  "NumType": {
73
+ "p": 0.9731543624,
74
+ "r": 0.9602649007,
75
+ "f": 0.9666666667
76
  },
77
  "Degree": {
78
+ "p": 0.9587878788,
79
+ "r": 0.9530120482,
80
+ "f": 0.9558912387
81
  },
82
  "Reflex": {
83
  "p": 1.0,
 
85
  "f": 1.0
86
  },
87
  "Number[psor]": {
88
+ "p": 0.9885057471,
89
  "r": 1.0,
90
+ "f": 0.9942196532
91
  },
92
  "Poss": {
93
+ "p": 1.0,
94
  "r": 1.0,
95
+ "f": 1.0
96
  },
97
  "Foreign": {
98
+ "p": 0.6666666667,
99
+ "r": 0.4,
100
+ "f": 0.5
101
  },
102
  "Abbr": {
103
  "p": 1.0,
 
110
  "f": 1.0
111
  },
112
  "Polite": {
113
+ "p": 1.0,
114
  "r": 0.5,
115
+ "f": 0.6666666667
116
  }
117
  },
118
+ "sents_p": 0.9142335766,
119
+ "sents_r": 0.8882978723,
120
+ "sents_f": 0.9010791367,
121
+ "dep_uas": 0.8225959658,
122
+ "dep_las": 0.7833277461,
123
  "dep_las_per_type": {
124
  "advmod": {
125
+ "p": 0.6842105263,
126
+ "r": 0.697740113,
127
+ "f": 0.6909090909
128
  },
129
  "root": {
130
+ "p": 0.8513761468,
131
+ "r": 0.8226950355,
132
+ "f": 0.8367899008
133
  },
134
  "nsubj": {
135
+ "p": 0.8508583691,
136
+ "r": 0.8364978903,
137
+ "f": 0.8436170213
138
  },
139
  "case": {
140
+ "p": 0.8953603159,
141
+ "r": 0.8944773176,
142
+ "f": 0.8949185989
143
  },
144
  "obl": {
145
+ "p": 0.71973466,
146
+ "r": 0.6739130435,
147
+ "f": 0.6960705694
148
  },
149
  "cc": {
150
+ "p": 0.7885714286,
151
+ "r": 0.8023255814,
152
+ "f": 0.795389049
153
  },
154
  "conj": {
155
+ "p": 0.647696477,
156
+ "r": 0.6373333333,
157
+ "f": 0.6424731183
158
  },
159
  "obj": {
160
+ "p": 0.8161764706,
161
+ "r": 0.8621359223,
162
+ "f": 0.8385269122
163
  },
164
  "aux": {
165
+ "p": 0.8742690058,
166
+ "r": 0.8717201166,
167
+ "f": 0.8729927007
168
  },
169
  "acl:relcl": {
170
+ "p": 0.5773195876,
171
+ "r": 0.6054054054,
172
+ "f": 0.5910290237
173
  },
174
+ "advmod:lmod": {
175
+ "p": 0.6714285714,
176
+ "r": 0.7014925373,
177
+ "f": 0.6861313869
178
  },
179
  "det": {
180
+ "p": 0.9253731343,
181
+ "r": 0.9192751236,
182
+ "f": 0.9223140496
183
  },
184
  "amod": {
185
+ "p": 0.8313458262,
186
+ "r": 0.8327645051,
187
+ "f": 0.832054561
188
  },
189
  "nmod:poss": {
190
+ "p": 0.6326530612,
191
+ "r": 0.6138613861,
192
+ "f": 0.6231155779
193
  },
194
  "ccomp": {
195
+ "p": 0.676056338,
196
+ "r": 0.7741935484,
197
+ "f": 0.7218045113
198
  },
199
  "nummod": {
200
+ "p": 0.8620689655,
201
+ "r": 0.8333333333,
202
+ "f": 0.8474576271
203
  },
204
  "flat": {
205
+ "p": 0.8,
206
+ "r": 0.8741721854,
207
+ "f": 0.835443038
208
  },
209
  "compound:prt": {
210
+ "p": 0.4,
211
+ "r": 0.3414634146,
212
+ "f": 0.3684210526
213
  },
214
  "advcl": {
215
+ "p": 0.6181818182,
216
+ "r": 0.5862068966,
217
+ "f": 0.6017699115
218
  },
219
  "mark": {
220
+ "p": 0.8782051282,
221
+ "r": 0.8439425051,
222
+ "f": 0.8607329843
223
  },
224
  "cop": {
225
+ "p": 0.7842105263,
226
+ "r": 0.8514285714,
227
+ "f": 0.8164383562
228
  },
229
  "dep": {
230
+ "p": 0.1707317073,
231
+ "r": 0.2641509434,
232
+ "f": 0.2074074074
233
  },
234
  "nmod": {
235
+ "p": 0.6296992481,
236
+ "r": 0.654296875,
237
+ "f": 0.6417624521
238
  },
239
  "iobj": {
240
+ "p": 0.8,
241
+ "r": 0.5454545455,
242
+ "f": 0.6486486486
243
  },
244
  "xcomp": {
245
+ "p": 0.5333333333,
246
+ "r": 0.406779661,
247
+ "f": 0.4615384615
 
 
 
 
 
248
  },
249
  "list": {
250
+ "p": 0.5454545455,
251
+ "r": 0.3333333333,
252
+ "f": 0.4137931034
253
  },
254
  "vocative": {
255
  "p": 0.0,
 
257
  "f": 0.0
258
  },
259
  "fixed": {
260
+ "p": 0.8648648649,
261
+ "r": 0.7804878049,
262
+ "f": 0.8205128205
263
  },
264
  "expl": {
265
+ "p": 0.8,
266
+ "r": 0.8235294118,
267
+ "f": 0.8115942029
268
  },
269
+ "appos": {
270
+ "p": 0.4054054054,
271
+ "r": 0.4545454545,
272
  "f": 0.4285714286
273
  },
274
+ "obl:tmod": {
275
+ "p": 0.5555555556,
276
+ "r": 0.2777777778,
277
+ "f": 0.3703703704
278
+ },
279
  "discourse": {
280
  "p": 0.0,
281
  "r": 0.0,
282
  "f": 0.0
283
+ },
284
+ "obl:lmod": {
285
+ "p": 0.0,
286
+ "r": 0.0,
287
+ "f": 0.0
288
  }
289
  },
290
+ "tag_acc": 0.9650363196,
291
+ "lemma_acc": 0.8491041162,
292
+ "ents_p": 0.8219461698,
293
+ "ents_r": 0.8270833333,
294
+ "ents_f": 0.8245067497,
295
  "ents_per_type": {
296
  "PER": {
297
+ "p": 0.9171974522,
298
+ "r": 0.8674698795,
299
+ "f": 0.8916408669
300
  },
301
  "ORG": {
302
+ "p": 0.7840909091,
303
+ "r": 0.7666666667,
304
+ "f": 0.7752808989
305
  },
306
  "MISC": {
307
+ "p": 0.6776859504,
308
+ "r": 0.7256637168,
309
+ "f": 0.7008547009
310
  },
311
  "LOC": {
312
+ "p": 0.8717948718,
313
+ "r": 0.9189189189,
314
+ "f": 0.8947368421
315
  }
316
+ },
317
+ "speed": 8840.2170640394
318
  }
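
The `p`, `r`, and `f` triples in the metrics above look consistent with `f` being the harmonic mean of precision and recall. The following is a minimal sketch under that assumption (the standard F1 definition); the helper name `f_score` is hypothetical and is not taken from spaCy. It re-derives `f` for three entries copied from the new values in this diff.

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (precision, recall, reported f) copied from the updated metrics in the diff.
reported = {
    "Foreign": (0.6666666667, 0.4, 0.5),
    "Polite": (1.0, 0.5, 0.6666666667),
    "discourse": (0.0, 0.0, 0.0),
}

for name, (p, r, f) in reported.items():
    # Reported values are rounded to 10 decimals, so compare with a tolerance.
    assert abs(f_score(p, r) - f) < 1e-6, name
```

Low-support labels such as `Foreign` and `Polite` move by large steps between releases because a single extra correct or incorrect prediction changes `p` or `r` substantially.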
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/da-core-news/train.spacy"
3
- dev = "corpus/da-core-news/dev.spacy"
4
- vectors = "corpus/da_vectors"
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "lookup"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
 
 
 
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -48,6 +51,7 @@ upstream = "tok2vec"
48
  factory = "ner"
49
  incorrect_spans_key = null
50
  moves = null
 
51
  update_with_oracle_cut_size = 100
52
 
53
  [components.ner.model]
@@ -65,8 +69,8 @@ nO = null
65
  [components.ner.model.tok2vec.embed]
66
  @architectures = "spacy.MultiHashEmbed.v2"
67
  width = 96
68
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
69
- rows = [5000,2500,2500,2500]
70
  include_static_vectors = true
71
 
72
  [components.ner.model.tok2vec.encode]
@@ -81,6 +85,7 @@ factory = "parser"
81
  learn_tokens = false
82
  min_action_freq = 30
83
  moves = null
 
84
  update_with_oracle_cut_size = 100
85
 
86
  [components.parser.model]
@@ -99,6 +104,8 @@ upstream = "tok2vec"
99
 
100
  [components.senter]
101
  factory = "senter"
 
 
102
 
103
  [components.senter.model]
104
  @architectures = "spacy.Tagger.v1"
@@ -110,8 +117,8 @@ nO = null
110
  [components.senter.model.tok2vec.embed]
111
  @architectures = "spacy.MultiHashEmbed.v2"
112
  width = 16
113
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
114
- rows = [1000,500,500,500]
115
  include_static_vectors = true
116
 
117
  [components.senter.model.tok2vec.encode]
@@ -130,8 +137,8 @@ factory = "tok2vec"
130
  [components.tok2vec.model.embed]
131
  @architectures = "spacy.MultiHashEmbed.v2"
132
  width = ${components.tok2vec.model.encode:width}
133
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
134
- rows = [5000,2500,2500,2500]
135
  include_static_vectors = true
136
 
137
  [components.tok2vec.model.encode]
@@ -145,22 +152,19 @@ maxout_pieces = 3
145
 
146
  [corpora.dev]
147
  @readers = "spacy.Corpus.v1"
148
- limit = 0
149
- max_length = 0
150
- path = ${paths:dev}
151
  gold_preproc = false
 
 
152
  augmenter = null
153
 
154
  [corpora.train]
155
  @readers = "spacy.Corpus.v1"
156
- path = ${paths:train}
157
- max_length = 5000
158
  gold_preproc = false
 
159
  limit = 0
160
-
161
- [corpora.train.augmenter]
162
- @augmenters = "spacy.lower_case.v1"
163
- level = 0.1
164
 
165
  [training]
166
  train_corpus = "corpora.train"
@@ -191,9 +195,8 @@ compound = 1.001
191
  t = 0.0
192
 
193
  [training.logger]
194
- @loggers = "spacy.WandbLogger.v1"
195
- project_name = "spacy-v3.0.0a2"
196
- remove_config_values = []
197
 
198
  [training.optimizer]
199
  @optimizers = "Adam.v1"
@@ -216,16 +219,17 @@ dep_las_per_type = null
216
  sents_p = null
217
  sents_r = null
218
  sents_f = 0.02
219
- lemma_acc = 0.33
220
- ents_f = 0.33
221
  ents_p = 0.0
222
  ents_r = 0.0
223
  ents_per_type = null
 
224
 
225
  [pretraining]
226
 
227
  [initialize]
228
- vocab_data = ${paths.vocab_data}
229
  vectors = ${paths.vectors}
230
  init_tok2vec = ${paths.init_tok2vec}
231
  before_init = null
 
1
  [paths]
2
+ train = null
3
+ dev = null
4
+ vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = null
 
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
 
30
  mode = "lookup"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
 
51
  factory = "ner"
52
  incorrect_spans_key = null
53
  moves = null
54
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
55
  update_with_oracle_cut_size = 100
56
 
57
  [components.ner.model]
 
69
  [components.ner.model.tok2vec.embed]
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
+ rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = true
75
 
76
  [components.ner.model.tok2vec.encode]
 
85
  learn_tokens = false
86
  min_action_freq = 30
87
  moves = null
88
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
89
  update_with_oracle_cut_size = 100
90
 
91
  [components.parser.model]
 
104
 
105
  [components.senter]
106
  factory = "senter"
107
+ overwrite = false
108
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
  @architectures = "spacy.Tagger.v1"
 
117
  [components.senter.model.tok2vec.embed]
118
  @architectures = "spacy.MultiHashEmbed.v2"
119
  width = 16
120
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
121
+ rows = [1000,500,500,500,50]
122
  include_static_vectors = true
123
 
124
  [components.senter.model.tok2vec.encode]
 
137
  [components.tok2vec.model.embed]
138
  @architectures = "spacy.MultiHashEmbed.v2"
139
  width = ${components.tok2vec.model.encode:width}
140
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141
+ rows = [5000,2500,2500,2500,100]
142
  include_static_vectors = true
143
 
144
  [components.tok2vec.model.encode]
 
152
 
153
  [corpora.dev]
154
  @readers = "spacy.Corpus.v1"
155
+ path = ${paths.dev}
 
 
156
  gold_preproc = false
157
+ max_length = 0
158
+ limit = 0
159
  augmenter = null
160
 
161
  [corpora.train]
162
  @readers = "spacy.Corpus.v1"
163
+ path = ${paths.train}
 
164
  gold_preproc = false
165
+ max_length = 0
166
  limit = 0
167
+ augmenter = null
 
 
 
168
 
169
  [training]
170
  train_corpus = "corpora.train"
 
195
  t = 0.0
196
 
197
  [training.logger]
198
+ @loggers = "spacy.ConsoleLogger.v1"
199
+ progress_bar = false
 
200
 
201
  [training.optimizer]
202
  @optimizers = "Adam.v1"
 
219
  sents_p = null
220
  sents_r = null
221
  sents_f = 0.02
222
+ lemma_acc = 0.5
223
+ ents_f = 0.16
224
  ents_p = 0.0
225
  ents_r = 0.0
226
  ents_per_type = null
227
+ speed = 0.0
228
 
229
  [pretraining]
230
 
231
  [initialize]
232
+ vocab_data = null
233
  vectors = ${paths.vectors}
234
  init_tok2vec = ${paths.init_tok2vec}
235
  before_init = null
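
In the `MultiHashEmbed` blocks of this config, each entry in `attrs` gets its own embedding table, so `rows` needs one entry per attribute. That is why adding `"SPACY"` to `attrs` comes with an extra value in `rows`. The sketch below is only an illustration using the standard library, not spaCy's own config loader; the function name `embed_table_sizes` is hypothetical. It pairs the two lists from the updated `ner` embed section and fails if their lengths differ.

```python
import configparser
import json

# Section copied from the updated config.cfg in this diff.
CONFIG_SNIPPET = """
[components.ner.model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 96
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
rows = [5000,2500,2500,2500,100]
include_static_vectors = true
"""


def embed_table_sizes(cfg_text: str, section: str) -> dict:
    """Return {attribute: row count}, raising if the two lists are misaligned."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # Both values are written as JSON-compatible lists in the config.
    attrs = json.loads(parser[section]["attrs"])
    rows = json.loads(parser[section]["rows"])
    if len(attrs) != len(rows):
        raise ValueError(f"{section}: {len(attrs)} attrs but {len(rows)} rows")
    return dict(zip(attrs, rows))


sizes = embed_table_sizes(CONFIG_SNIPPET, "components.ner.model.tok2vec.embed")
print(sizes)
```

For real pipelines, loading the config through spaCy itself is the safer route, since it also resolves `${...}` variable references that this sketch deliberately ignores.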
da_core_news_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e1eeb857234ad808c96a8196f919cc4bdd7fd319895e9dd14b62ea5ba2478062
3
- size 573201191
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d6b7f22582316614bc9c5c122a532b4eff39fc3b411054ee16ad4f89e14de765
3
+ size 573820420
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"da",
3
  "name":"core_news_lg",
4
- "version":"3.1.0",
5
  "description":"Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
@@ -183,6 +183,7 @@
183
  "acl:relcl",
184
  "advcl",
185
  "advmod",
 
186
  "amod",
187
  "appos",
188
  "aux",
@@ -206,7 +207,7 @@
206
  "nummod",
207
  "obj",
208
  "obl",
209
- "obl:loc",
210
  "obl:tmod",
211
  "punct",
212
  "xcomp"
@@ -250,59 +251,54 @@
250
  ],
251
  "performance":{
252
  "token_acc":0.9994672349,
253
- "tag_acc":0.9631961259,
254
- "pos_acc":0.9631961259,
255
- "morph_acc":0.956125908,
256
- "lemma_acc":0.8491041162,
257
- "dep_uas":0.823174479,
258
- "dep_las":0.7829050279,
259
- "ents_p":0.8032786885,
260
- "ents_r":0.8166666667,
261
- "ents_f":0.8099173554,
262
- "sents_p":0.8709677419,
263
- "sents_r":0.8617021277,
264
- "sents_f":0.8663101604,
265
- "speed":9700.6985683523,
266
  "morph_per_feat":{
267
  "Mood":{
268
- "p":0.9799235182,
269
- "r":0.9771210677,
270
- "f":0.9785202864
271
  },
272
  "Tense":{
273
- "p":0.9735049205,
274
- "r":0.968373494,
275
- "f":0.9709324273
276
  },
277
  "VerbForm":{
278
- "p":0.9654747226,
279
- "r":0.9583843329,
280
- "f":0.9619164619
281
  },
282
  "Voice":{
283
- "p":0.9805389222,
284
- "r":0.9790732436,
285
- "f":0.9798055348
286
  },
287
  "Definite":{
288
- "p":0.9697090474,
289
- "r":0.9612801264,
290
- "f":0.9654761905
291
  },
292
  "Gender":{
293
- "p":0.9583194398,
294
- "r":0.9551345962,
295
- "f":0.9567243675
296
  },
297
  "Number":{
298
- "p":0.9684542587,
299
- "r":0.9608763693,
300
- "f":0.9646504321
301
  },
302
  "AdpType":{
303
- "p":0.9964507542,
304
- "r":0.9929266136,
305
- "f":0.9946855624
306
  },
307
  "PartType":{
308
  "p":1.0,
@@ -310,29 +306,29 @@
310
  "f":1.0
311
  },
312
  "Case":{
313
- "p":0.9823151125,
314
- "r":0.9652448657,
315
- "f":0.9737051793
316
  },
317
  "Person":{
318
- "p":0.9787610619,
319
- "r":0.9822380107,
320
- "f":0.9804964539
321
  },
322
  "PronType":{
323
- "p":0.9860312243,
324
- "r":0.9868421053,
325
- "f":0.9864364982
326
  },
327
  "NumType":{
328
- "p":0.9795918367,
329
- "r":0.9536423841,
330
- "f":0.966442953
331
  },
332
  "Degree":{
333
- "p":0.9548780488,
334
- "r":0.943373494,
335
- "f":0.9490909091
336
  },
337
  "Reflex":{
338
  "p":1.0,
@@ -340,19 +336,19 @@
340
  "f":1.0
341
  },
342
  "Number[psor]":{
343
- "p":0.9772727273,
344
  "r":1.0,
345
- "f":0.9885057471
346
  },
347
  "Poss":{
348
- "p":0.9887640449,
349
  "r":1.0,
350
- "f":0.9943502825
351
  },
352
  "Foreign":{
353
- "p":1.0,
354
- "r":0.3,
355
- "f":0.4615384615
356
  },
357
  "Abbr":{
358
  "p":1.0,
@@ -365,146 +361,146 @@
365
  "f":1.0
366
  },
367
  "Polite":{
368
- "p":0.6666666667,
369
  "r":0.5,
370
- "f":0.5714285714
371
  }
372
  },
 
 
 
 
 
373
  "dep_las_per_type":{
374
  "advmod":{
375
- "p":0.6979591837,
376
- "r":0.7245762712,
377
- "f":0.711018711
378
  },
379
  "root":{
380
- "p":0.8165467626,
381
- "r":0.804964539,
382
- "f":0.8107142857
383
  },
384
  "nsubj":{
385
- "p":0.8518918919,
386
- "r":0.8312236287,
387
- "f":0.8414308596
388
  },
389
  "case":{
390
- "p":0.8845014808,
391
- "r":0.8853754941,
392
- "f":0.8849382716
393
  },
394
  "obl":{
395
- "p":0.7078651685,
396
- "r":0.6858475894,
397
- "f":0.6966824645
398
  },
399
  "cc":{
400
- "p":0.7988338192,
401
- "r":0.7965116279,
402
- "f":0.7976710335
403
  },
404
  "conj":{
405
- "p":0.654155496,
406
- "r":0.6506666667,
407
- "f":0.6524064171
408
  },
409
  "obj":{
410
- "p":0.8052434457,
411
- "r":0.8349514563,
412
- "f":0.819828408
413
  },
414
  "aux":{
415
- "p":0.8797653959,
416
- "r":0.8746355685,
417
- "f":0.8771929825
418
  },
419
  "acl:relcl":{
420
- "p":0.6117021277,
421
- "r":0.6216216216,
422
- "f":0.6166219839
423
  },
424
- "obl:loc":{
425
- "p":0.734375,
426
- "r":0.6714285714,
427
- "f":0.7014925373
428
  },
429
  "det":{
430
- "p":0.9151712887,
431
- "r":0.9242174629,
432
- "f":0.9196721311
433
  },
434
  "amod":{
435
- "p":0.8367346939,
436
- "r":0.8395904437,
437
- "f":0.8381601363
438
  },
439
  "nmod:poss":{
440
- "p":0.6960784314,
441
- "r":0.702970297,
442
- "f":0.6995073892
443
  },
444
  "ccomp":{
445
- "p":0.5967741935,
446
- "r":0.5967741935,
447
- "f":0.5967741935
448
  },
449
  "nummod":{
450
- "p":0.8548387097,
451
- "r":0.8833333333,
452
- "f":0.868852459
453
  },
454
  "flat":{
455
- "p":0.8101265823,
456
- "r":0.8476821192,
457
- "f":0.8284789644
458
  },
459
  "compound:prt":{
460
- "p":0.4193548387,
461
- "r":0.3170731707,
462
- "f":0.3611111111
463
  },
464
  "advcl":{
465
- "p":0.6260869565,
466
- "r":0.6206896552,
467
- "f":0.6233766234
468
  },
469
  "mark":{
470
- "p":0.8770833333,
471
- "r":0.864476386,
472
- "f":0.8707342296
473
  },
474
  "cop":{
475
- "p":0.7647058824,
476
- "r":0.8171428571,
477
- "f":0.7900552486
478
  },
479
  "dep":{
480
- "p":0.2048192771,
481
- "r":0.320754717,
482
- "f":0.25
483
  },
484
  "nmod":{
485
- "p":0.6310679612,
486
- "r":0.634765625,
487
- "f":0.6329113924
488
  },
489
  "iobj":{
490
- "p":0.6470588235,
491
- "r":0.5,
492
- "f":0.5641025641
493
  },
494
  "xcomp":{
495
- "p":0.5588235294,
496
- "r":0.3220338983,
497
- "f":0.4086021505
498
- },
499
- "appos":{
500
- "p":0.53125,
501
- "r":0.5151515152,
502
- "f":0.5230769231
503
  },
504
  "list":{
505
- "p":0.3333333333,
506
- "r":0.2777777778,
507
- "f":0.303030303
508
  },
509
  "vocative":{
510
  "p":0.0,
@@ -512,52 +508,68 @@
512
  "f":0.0
513
  },
514
  "fixed":{
515
- "p":0.85,
516
- "r":0.8095238095,
517
- "f":0.8292682927
518
  },
519
  "expl":{
520
- "p":0.8181818182,
521
- "r":0.7941176471,
522
- "f":0.8059701493
523
  },
524
- "obl:tmod":{
525
- "p":0.6,
526
- "r":0.3333333333,
527
  "f":0.4285714286
528
  },
 
 
 
 
 
529
  "discourse":{
530
  "p":0.0,
531
  "r":0.0,
532
  "f":0.0
 
 
 
 
 
533
  }
534
  },
 
 
 
 
 
535
  "ents_per_type":{
536
  "PER":{
537
- "p":0.898089172,
538
- "r":0.8493975904,
539
- "f":0.8730650155
540
  },
541
  "ORG":{
542
- "p":0.7529411765,
543
- "r":0.7111111111,
544
- "f":0.7314285714
545
  },
546
  "MISC":{
547
- "p":0.6771653543,
548
- "r":0.7610619469,
549
- "f":0.7166666667
550
  },
551
  "LOC":{
552
- "p":0.8487394958,
553
- "r":0.9099099099,
554
- "f":0.8782608696
555
  }
556
- }
 
557
  },
558
  "sources":[
559
  {
560
- "name":"UD Danish DDT v2.5",
561
  "url":"https://github.com/UniversalDependencies/UD_Danish-DDT",
562
  "license":"CC BY-SA 4.0",
563
  "author":"Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara"
 
1
  {
2
  "lang":"da",
3
  "name":"core_news_lg",
4
+ "version":"3.2.0",
5
  "description":"Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":500000,
 
183
  "acl:relcl",
184
  "advcl",
185
  "advmod",
186
+ "advmod:lmod",
187
  "amod",
188
  "appos",
189
  "aux",
 
207
  "nummod",
208
  "obj",
209
  "obl",
210
+ "obl:lmod",
211
  "obl:tmod",
212
  "punct",
213
  "xcomp"
 
251
  ],
252
  "performance":{
253
  "token_acc":0.9994672349,
254
+ "token_p":0.9977732598,
255
+ "token_r":0.9974835463,
256
+ "token_f":0.997628382,
257
+ "pos_acc":0.9650363196,
258
+ "morph_acc":0.9571912833,
259
+ "morph_micro_p":0.9722335025,
260
+ "morph_micro_r":0.9667861289,
261
+ "morph_micro_f":0.969502164,
 
 
 
 
 
262
  "morph_per_feat":{
263
  "Mood":{
264
+ "p":0.9781160799,
265
+ "r":0.9799809342,
266
+ "f":0.979047619
267
  },
268
  "Tense":{
269
+ "p":0.9765506808,
270
+ "r":0.9721385542,
271
+ "f":0.9743396226
272
  },
273
  "VerbForm":{
274
+ "p":0.9697156984,
275
+ "r":0.9602203182,
276
+ "f":0.9649446494
277
  },
278
  "Voice":{
279
+ "p":0.9797752809,
280
+ "r":0.9775784753,
281
+ "f":0.9786756453
282
  },
283
  "Definite":{
284
+ "p":0.9666401906,
285
+ "r":0.9616752272,
286
+ "f":0.9641513171
287
  },
288
  "Gender":{
289
+ "p":0.9588903743,
290
+ "r":0.9534729146,
291
+ "f":0.956173971
292
  },
293
  "Number":{
294
+ "p":0.9666403993,
295
+ "r":0.9598330725,
296
+ "f":0.9632247088
297
  },
298
  "AdpType":{
299
+ "p":0.9991071429,
300
+ "r":0.9893899204,
301
+ "f":0.994224789
302
  },
303
  "PartType":{
304
  "p":1.0,
 
306
  "f":1.0
307
  },
308
  "Case":{
309
+ "p":0.9792332268,
310
+ "r":0.9684044234,
311
+ "f":0.9737887212
312
  },
313
  "Person":{
314
+ "p":0.9805996473,
315
+ "r":0.9875666075,
316
+ "f":0.9840707965
317
  },
318
  "PronType":{
319
+ "p":0.9860082305,
320
+ "r":0.9851973684,
321
+ "f":0.9856026327
322
  },
323
  "NumType":{
324
+ "p":0.9731543624,
325
+ "r":0.9602649007,
326
+ "f":0.9666666667
327
  },
328
  "Degree":{
329
+ "p":0.9587878788,
330
+ "r":0.9530120482,
331
+ "f":0.9558912387
332
  },
333
  "Reflex":{
334
  "p":1.0,
 
336
  "f":1.0
337
  },
338
  "Number[psor]":{
339
+ "p":0.9885057471,
340
  "r":1.0,
341
+ "f":0.9942196532
342
  },
343
  "Poss":{
344
+ "p":1.0,
345
  "r":1.0,
346
+ "f":1.0
347
  },
348
  "Foreign":{
349
+ "p":0.6666666667,
350
+ "r":0.4,
351
+ "f":0.5
352
  },
353
  "Abbr":{
354
  "p":1.0,
 
361
  "f":1.0
362
  },
363
  "Polite":{
364
+ "p":1.0,
365
  "r":0.5,
366
+ "f":0.6666666667
367
  }
368
  },
369
+ "sents_p":0.9142335766,
370
+ "sents_r":0.8882978723,
371
+ "sents_f":0.9010791367,
372
+ "dep_uas":0.8225959658,
373
+ "dep_las":0.7833277461,
374
  "dep_las_per_type":{
375
  "advmod":{
376
+ "p":0.6842105263,
377
+ "r":0.697740113,
378
+ "f":0.6909090909
379
  },
380
  "root":{
381
+ "p":0.8513761468,
382
+ "r":0.8226950355,
383
+ "f":0.8367899008
384
  },
385
  "nsubj":{
386
+ "p":0.8508583691,
387
+ "r":0.8364978903,
388
+ "f":0.8436170213
389
  },
390
  "case":{
391
+ "p":0.8953603159,
392
+ "r":0.8944773176,
393
+ "f":0.8949185989
394
  },
395
  "obl":{
396
+ "p":0.71973466,
397
+ "r":0.6739130435,
398
+ "f":0.6960705694
399
  },
400
  "cc":{
401
+ "p":0.7885714286,
402
+ "r":0.8023255814,
403
+ "f":0.795389049
404
  },
405
  "conj":{
406
+ "p":0.647696477,
407
+ "r":0.6373333333,
408
+ "f":0.6424731183
409
  },
410
  "obj":{
411
+ "p":0.8161764706,
412
+ "r":0.8621359223,
413
+ "f":0.8385269122
414
  },
415
  "aux":{
416
+ "p":0.8742690058,
417
+ "r":0.8717201166,
418
+ "f":0.8729927007
419
  },
420
  "acl:relcl":{
421
+ "p":0.5773195876,
422
+ "r":0.6054054054,
423
+ "f":0.5910290237
424
  },
425
+ "advmod:lmod":{
426
+ "p":0.6714285714,
427
+ "r":0.7014925373,
428
+ "f":0.6861313869
429
  },
430
  "det":{
431
+ "p":0.9253731343,
432
+ "r":0.9192751236,
433
+ "f":0.9223140496
434
  },
435
  "amod":{
436
+ "p":0.8313458262,
437
+ "r":0.8327645051,
438
+ "f":0.832054561
439
  },
440
  "nmod:poss":{
441
+ "p":0.6326530612,
442
+ "r":0.6138613861,
443
+ "f":0.6231155779
444
  },
445
  "ccomp":{
446
+ "p":0.676056338,
447
+ "r":0.7741935484,
448
+ "f":0.7218045113
449
  },
450
  "nummod":{
451
+ "p":0.8620689655,
452
+ "r":0.8333333333,
453
+ "f":0.8474576271
454
  },
455
  "flat":{
456
+ "p":0.8,
457
+ "r":0.8741721854,
458
+ "f":0.835443038
459
  },
460
  "compound:prt":{
461
+ "p":0.4,
462
+ "r":0.3414634146,
463
+ "f":0.3684210526
464
  },
465
  "advcl":{
466
+ "p":0.6181818182,
467
+ "r":0.5862068966,
468
+ "f":0.6017699115
469
  },
470
  "mark":{
471
+ "p":0.8782051282,
472
+ "r":0.8439425051,
473
+ "f":0.8607329843
474
  },
475
  "cop":{
476
+ "p":0.7842105263,
477
+ "r":0.8514285714,
478
+ "f":0.8164383562
479
  },
480
  "dep":{
481
+ "p":0.1707317073,
482
+ "r":0.2641509434,
483
+ "f":0.2074074074
484
  },
485
  "nmod":{
486
+ "p":0.6296992481,
487
+ "r":0.654296875,
488
+ "f":0.6417624521
489
  },
490
  "iobj":{
491
+ "p":0.8,
492
+ "r":0.5454545455,
493
+ "f":0.6486486486
494
  },
495
  "xcomp":{
496
+ "p":0.5333333333,
497
+ "r":0.406779661,
498
+ "f":0.4615384615
 
 
 
 
 
499
  },
500
  "list":{
501
+ "p":0.5454545455,
502
+ "r":0.3333333333,
503
+ "f":0.4137931034
504
  },
505
  "vocative":{
506
  "p":0.0,
 
508
  "f":0.0
509
  },
510
  "fixed":{
511
+ "p":0.8648648649,
512
+ "r":0.7804878049,
513
+ "f":0.8205128205
514
  },
515
  "expl":{
516
+ "p":0.8,
517
+ "r":0.8235294118,
518
+ "f":0.8115942029
519
  },
520
+ "appos":{
521
+ "p":0.4054054054,
522
+ "r":0.4545454545,
523
  "f":0.4285714286
524
  },
525
+ "obl:tmod":{
526
+ "p":0.5555555556,
527
+ "r":0.2777777778,
528
+ "f":0.3703703704
529
+ },
530
  "discourse":{
531
  "p":0.0,
532
  "r":0.0,
533
  "f":0.0
534
+ },
535
+ "obl:lmod":{
536
+ "p":0.0,
537
+ "r":0.0,
538
+ "f":0.0
539
  }
540
  },
541
+ "tag_acc":0.9650363196,
542
+ "lemma_acc":0.8491041162,
543
+ "ents_p":0.8219461698,
544
+ "ents_r":0.8270833333,
545
+ "ents_f":0.8245067497,
546
  "ents_per_type":{
547
  "PER":{
548
+ "p":0.9171974522,
549
+ "r":0.8674698795,
550
+ "f":0.8916408669
551
  },
552
  "ORG":{
553
+ "p":0.7840909091,
554
+ "r":0.7666666667,
555
+ "f":0.7752808989
556
  },
557
  "MISC":{
558
+ "p":0.6776859504,
559
+ "r":0.7256637168,
560
+ "f":0.7008547009
561
  },
562
  "LOC":{
563
+ "p":0.8717948718,
564
+ "r":0.9189189189,
565
+ "f":0.8947368421
566
  }
567
+ },
568
+ "speed":8840.2170640394
569
  },
570
  "sources":[
571
  {
572
+ "name":"UD Danish DDT v2.8",
573
  "url":"https://github.com/UniversalDependencies/UD_Danish-DDT",
574
  "license":"CC BY-SA 4.0",
575
  "author":"Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara"
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "AdpType=Prep|POS=ADP":"AdpType=Prep",
4
  "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Com|Number=Sing",
@@ -316,5 +317,6 @@
316
  "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
317
  "POS=DET|PronType=Dem":90,
318
  "Definite=Def|Number=Plur|POS=NOUN":92
319
- }
 
320
  }
 
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "AdpType=Prep|POS=ADP":"AdpType=Prep",
5
  "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN":"Definite=Ind|Gender=Com|Number=Sing",
 
317
  "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs":95,
318
  "POS=DET|PronType=Dem":90,
319
  "Definite=Def|Number=Plur|POS=NOUN":92
320
+ },
321
+ "overwrite":true
322
  }
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
 
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
 
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
 
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves�2{"0":{"":41514},"1":{"":34295},"2":{"case":7489,"nsubj":6009,"det":4334,"amod":3968,"advmod":3657,"mark":3529,"aux":2432,"cc":2261,"punct":2182,"cop":1329,"obl":894,"nummod":799,"nmod:poss":651,"nmod":460,"expl":291,"ccomp":202,"obj":195,"xcomp":122,"case||nmod":73,"obl:tmod":53,"dep":49,"acl:relcl":43},"3":{"punct":8601,"obl":3949,"obj":3758,"nmod":3565,"conj":2745,"advmod":2095,"flat":1295,"nsubj":1172,"acl:relcl":1131,"advcl":808,"amod":628,"obl:loc":467,"fixed":390,"dep":322,"xcomp":272,"appos":268,"compound:prt":261,"ccomp":252,"acl:relcl||nsubj":237,"case":202,"nummod":167,"list":161,"nmod:poss":156,"punct||conj":151,"mark":137,"cc":135,"iobj":107,"expl":77,"cop":69,"nmod||case":60,"aux":48,"obl:tmod":45,"cc||case":43,"advcl||advmod":43,"cc||conj":40,"case||obl":38,"punct||case":33},"4":{"ROOT":4367}}�cfg��neg_key�
 
1
+ ��moves�D{"0":{"":41514},"1":{"":34295},"2":{"case":7489,"nsubj":6009,"det":4334,"amod":3968,"advmod":3657,"mark":3529,"aux":2432,"cc":2261,"punct":2182,"cop":1329,"obl":894,"nummod":799,"nmod:poss":651,"nmod":460,"expl":291,"ccomp":202,"obj":195,"xcomp":122,"case||nmod":73,"obl:tmod":53,"dep":49,"acl:relcl":43},"3":{"punct":8601,"obl":3949,"obj":3758,"nmod":3565,"conj":2745,"advmod":2095,"flat":1295,"nsubj":1172,"acl:relcl":1131,"advcl":808,"amod":628,"advmod:lmod":423,"fixed":390,"dep":322,"xcomp":272,"appos":268,"compound:prt":261,"ccomp":252,"acl:relcl||nsubj":237,"case":202,"nummod":167,"list":161,"nmod:poss":156,"punct||conj":151,"mark":137,"cc":135,"iobj":107,"expl":77,"cop":69,"nmod||case":60,"aux":48,"obl:tmod":45,"obl:lmod":44,"cc||case":43,"advcl||advmod":43,"cc||conj":40,"case||obl":38,"punct||case":33},"4":{"ROOT":4367}}�cfg��neg_key�
senter/cfg CHANGED
@@ -1,3 +1,3 @@
1
  {
2
-
3
  }
 
1
  {
2
+ "overwrite":false
3
  }
senter/model CHANGED
Binary files a/senter/model and b/senter/model differ
 
tok2vec/model CHANGED
Binary files a/tok2vec/model and b/tok2vec/model differ
 
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:efaceca843effeab261a6ce55a6d3a99ae99949b7e28bdf0541fdc4c2d3e2c5a
3
- size 8625506
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:42fe610567bec6fa69da580a4753d083f1f4429efd32b3c1fa638b6a07a6757e
3
+ size 10070327
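
Large files in this commit, such as `vocab/strings.json`, are stored as Git LFS pointers, so the diff shows only a `version`, an `oid`, and a `size` line. A small sketch for reading those fields and comparing the old and new pointer; the helper `parse_lfs_pointer` is hypothetical and handles only the three keys seen here:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Old and new pointers for vocab/strings.json, copied from the diff.
old = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:efaceca843effeab261a6ce55a6d3a99ae99949b7e28bdf0541fdc4c2d3e2c5a
size 8625506""")
new = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:42fe610567bec6fa69da580a4753d083f1f4429efd32b3c1fa638b6a07a6757e
size 10070327""")

# Size change of the stored file, in bytes.
print(new["size"] - old["size"])
```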
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ {
2
+ "mode":"default"
3
+ }