osanseviero HF staff committed on
Commit
53b5461
1 Parent(s): 2dfbac7

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
- # UD French Sequoia v2.5
+ # UD French Sequoia v2.8
 
  * Author: Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno
  * URL: https://github.com/UniversalDependencies/UD_French-Sequoia
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - token-classification
5
  language:
6
  - fr
7
- license: lgpllr
8
  model-index:
9
  - name: fr_core_news_md
10
  results:
@@ -14,47 +14,47 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.8342645439
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8338136984
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8340390602
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9429704807
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
- value: 0.8845208845
38
  - name: SENTER Recall
39
  type: recall
40
- value: 0.9042340262
41
  - name: SENTER F Score
42
  type: f_score
43
- value: 0.8791208791
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.8916150686
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
- value: 0.8916150686
58
  ---
59
  ### Details: https://spacy.io/models/fr#fr_core_news_md
60
 
@@ -63,12 +63,12 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `fr_core_news_md` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 20000 unique vectors (300 dimensions) |
71
- | **Sources** | [UD French Sequoia v2.5](https://github.com/UniversalDependencies/UD_French-Sequoia) (Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno)<br />[WikiNER](https://figshare.com/articles/Learning_multilingual_named_entity_recognition_from_Wikipedia/5462500) (Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R Curran)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `LGPL-LR` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -76,11 +76,11 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
76
 
77
  <details>
78
 
79
- <summary>View label scheme (240 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
- | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `POS=PART`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
84
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
85
  | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
@@ -92,15 +92,21 @@ French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 99.90 |
95
- | `TAG_ACC` | 94.30 |
96
- | `POS_ACC` | 97.18 |
97
- | `MORPH_ACC` | 96.31 |
98
- | `LEMMA_ACC` | 90.79 |
99
- | `DEP_UAS` | 89.16 |
100
- | `DEP_LAS` | 85.46 |
101
- | `SENTS_P` | 88.45 |
102
- | `SENTS_R` | 90.42 |
103
- | `SENTS_F` | 87.91 |
104
- | `ENTS_P` | 83.43 |
105
- | `ENTS_R` | 83.38 |
106
- | `ENTS_F` | 83.40 |
 
 
 
 
 
 
 
4
  - token-classification
5
  language:
6
  - fr
7
+ license: lgpl-lr
8
  model-index:
9
  - name: fr_core_news_md
10
  results:
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.8332265556
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8329930747
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8331097988
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
+ value: 0.9448023502
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
+ value: 0.8782816229
38
  - name: SENTER Recall
39
  type: recall
40
+ value: 0.8932038835
41
  - name: SENTER F Score
42
  type: f_score
43
+ value: 0.8856799037
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
+ value: 0.8967353554
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
+ value: 0.8967353554
58
  ---
59
  ### Details: https://spacy.io/models/fr#fr_core_news_md
60
 
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `fr_core_news_md` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 500000 keys, 20000 unique vectors (300 dimensions) |
71
+ | **Sources** | [UD French Sequoia v2.8](https://github.com/UniversalDependencies/UD_French-Sequoia) (Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno)<br />[WikiNER](https://figshare.com/articles/Learning_multilingual_named_entity_recognition_from_Wikipedia/5462500) (Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R Curran)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
72
  | **License** | `LGPL-LR` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
 
76
 
77
  <details>
78
 
79
+ <summary>View label scheme (238 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
+ | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
84
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
85
  | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
 
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 99.90 |
95
+ | `TOKEN_P` | 98.44 |
96
+ | `TOKEN_R` | 98.96 |
97
+ | `TOKEN_F` | 98.70 |
98
+ | `POS_ACC` | 97.35 |
99
+ | `MORPH_ACC` | 96.46 |
100
+ | `MORPH_MICRO_P` | 98.70 |
101
+ | `MORPH_MICRO_R` | 97.89 |
102
+ | `MORPH_MICRO_F` | 98.30 |
103
+ | `SENTS_P` | 87.83 |
104
+ | `SENTS_R` | 89.32 |
105
+ | `SENTS_F` | 88.57 |
106
+ | `DEP_UAS` | 89.67 |
107
+ | `DEP_LAS` | 85.80 |
108
+ | `TAG_ACC` | 94.48 |
109
+ | `LEMMA_ACC` | 90.70 |
110
+ | `ENTS_P` | 83.32 |
111
+ | `ENTS_R` | 83.30 |
112
+ | `ENTS_F` | 83.31 |
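
The README scores above appear to be the raw fractions from the model-index metadata, scaled to percentages and rounded to two decimals, with each F score being the harmonic mean of its precision and recall. A minimal sketch that checks this for the new SENTER figures follows. The helper names `f_score` and `as_percent` are hypothetical and are not part of spaCy or of this repository.

```python
def f_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def as_percent(fraction: float) -> str:
    """Format a 0..1 score the way the README table shows it (assumed: 2 decimals)."""
    return f"{fraction * 100:.2f}"


# New SENTER values copied from the README front matter in this commit.
senter_p = 0.8782816229
senter_r = 0.8932038835
senter_f = f_score(senter_p, senter_r)

# Compare with SENTS_P, SENTS_R and SENTS_F in the README table.
print(as_percent(senter_p), as_percent(senter_r), as_percent(senter_f))
```

The same check can be run on any precision/recall/F triple in `accuracy.json` to spot values that were mistyped or copied into the wrong field.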
accuracy.json CHANGED
@@ -1,70 +1,68 @@
1
  {
2
  "token_acc": 0.9989751998,
3
- "tag_acc": 0.9429704807,
4
- "pos_acc": 0.971820102,
5
- "morph_acc": 0.9631079967,
6
- "lemma_acc": 0.9078954127,
7
- "dep_uas": 0.8916150686,
8
- "dep_las": 0.8546474637,
9
- "sents_p": 0.8845208845,
10
- "sents_r": 0.9042340262,
11
- "sents_f": 0.8791208791,
12
- "speed": 5248.8393910966,
13
  "morph_per_feat": {
14
  "Definite": {
15
- "p": 0.9904831625,
16
- "r": 0.9868708972,
17
- "f": 0.9886737304
18
  },
19
  "Number": {
20
- "p": 0.9908599142,
21
- "r": 0.9844329133,
22
- "f": 0.987635958
23
  },
24
  "PronType": {
25
- "p": 0.9967762734,
26
- "r": 0.9884910486,
27
- "f": 0.9926163724
28
  },
29
  "Gender": {
30
- "p": 0.9807792208,
31
- "r": 0.9757105943,
32
- "f": 0.978238342
33
  },
34
  "Mood": {
35
- "p": 0.9692028986,
36
- "r": 0.9502664298,
37
- "f": 0.9596412556
38
  },
39
  "Person": {
40
- "p": 0.9872286079,
41
- "r": 0.9711055276,
42
- "f": 0.9791006966
43
  },
44
  "Tense": {
45
- "p": 0.9689119171,
46
- "r": 0.9550561798,
47
- "f": 0.9619341564
48
  },
49
  "VerbForm": {
50
- "p": 0.9823973177,
51
- "r": 0.9701986755,
52
- "f": 0.9762598917
53
  },
54
  "NumType": {
55
- "p": 0.9858657244,
56
- "r": 0.9620689655,
57
- "f": 0.9738219895
58
  },
59
  "Reflex": {
60
- "p": 1.0,
61
  "r": 1.0,
62
- "f": 1.0
63
  },
64
  "Voice": {
65
- "p": 0.906779661,
66
  "r": 0.9553571429,
67
- "f": 0.9304347826
68
  },
69
  "Poss": {
70
  "p": 1.0,
@@ -77,171 +75,176 @@
77
  "f": 0.9880952381
78
  }
79
  },
 
 
 
 
 
80
  "dep_las_per_type": {
81
  "det": {
82
- "p": 0.9781553398,
83
- "r": 0.9757869249,
84
- "f": 0.976969697
85
  },
86
  "nsubj": {
87
- "p": 0.875,
88
- "r": 0.8602409639,
89
- "f": 0.8675577157
90
  },
91
  "aux:tense": {
92
- "p": 0.9349593496,
93
- "r": 0.92,
94
- "f": 0.9274193548
95
  },
96
  "root": {
97
- "p": 0.8902439024,
98
- "r": 0.8859223301,
99
- "f": 0.8880778589
100
  },
101
  "obj": {
102
- "p": 0.8489425982,
103
- "r": 0.8338278932,
104
- "f": 0.8413173653
105
  },
106
  "cc": {
107
- "p": 0.8772727273,
108
- "r": 0.8894009217,
109
- "f": 0.8832951945
110
  },
111
  "case": {
112
- "p": 0.9715061058,
113
- "r": 0.9754768392,
114
- "f": 0.9734874235
115
  },
116
  "obl:mod": {
117
- "p": 0.6903225806,
118
- "r": 0.6369047619,
119
- "f": 0.6625386997
120
  },
121
  "nmod": {
122
- "p": 0.7932011331,
123
- "r": 0.8383233533,
124
- "f": 0.8151382824
125
  },
126
  "conj": {
127
- "p": 0.5338345865,
128
- "r": 0.5590551181,
129
- "f": 0.5461538462
130
  },
131
  "nummod": {
132
- "p": 0.9090909091,
133
- "r": 0.8928571429,
134
- "f": 0.9009009009
135
  },
136
  "amod": {
137
- "p": 0.9288389513,
138
- "r": 0.9051094891,
139
- "f": 0.9168207024
140
  },
141
  "acl": {
142
- "p": 0.6722222222,
143
  "r": 0.6994219653,
144
- "f": 0.6855524079
145
  },
146
  "mark": {
147
- "p": 0.8818181818,
148
- "r": 0.8546255507,
149
- "f": 0.8680089485
150
  },
151
  "xcomp": {
152
- "p": 0.8466666667,
153
- "r": 0.8410596026,
154
- "f": 0.8438538206
155
  },
156
  "flat:name": {
157
- "p": 0.9126213592,
158
- "r": 0.8952380952,
159
- "f": 0.9038461538
160
  },
161
  "cop": {
162
- "p": 0.8876404494,
163
- "r": 0.8777777778,
164
- "f": 0.8826815642
165
  },
166
  "advmod": {
167
- "p": 0.8459119497,
168
- "r": 0.8432601881,
169
- "f": 0.8445839874
170
  },
171
  "obl:arg": {
172
- "p": 0.6774193548,
173
- "r": 0.6681818182,
174
- "f": 0.6727688787
175
  },
176
  "appos": {
177
- "p": 0.5526315789,
178
- "r": 0.5060240964,
179
- "f": 0.5283018868
180
  },
181
  "nsubj:pass": {
182
- "p": 0.9024390244,
183
- "r": 0.8705882353,
184
- "f": 0.8862275449
185
  },
186
  "aux:pass": {
187
- "p": 0.947826087,
188
- "r": 0.9732142857,
189
- "f": 0.9603524229
190
  },
191
  "acl:relcl": {
192
- "p": 0.6455696203,
193
- "r": 0.5930232558,
194
- "f": 0.6181818182
195
  },
196
  "advcl": {
197
- "p": 0.5714285714,
198
- "r": 0.5128205128,
199
- "f": 0.5405405405
200
  },
201
  "fixed": {
202
- "p": 0.7789473684,
203
- "r": 0.7326732673,
204
- "f": 0.7551020408
205
  },
206
  "dep": {
207
- "p": 0.2608695652,
208
- "r": 0.5806451613,
209
- "f": 0.36
210
  },
211
  "expl:subj": {
212
- "p": 0.78125,
213
- "r": 0.78125,
214
- "f": 0.78125
215
  },
216
  "expl:comp": {
217
- "p": 0.6097560976,
218
- "r": 0.8333333333,
219
- "f": 0.7042253521
220
  },
221
  "expl:pass": {
222
- "p": 0.5,
223
  "r": 0.1428571429,
224
- "f": 0.2222222222
225
  },
226
  "ccomp": {
227
- "p": 0.7037037037,
228
- "r": 0.7450980392,
229
- "f": 0.7238095238
230
  },
231
  "parataxis": {
232
  "p": 0.5,
233
- "r": 0.3928571429,
234
- "f": 0.44
235
  },
236
  "iobj": {
237
- "p": 0.6666666667,
238
- "r": 0.48,
239
- "f": 0.5581395349
240
  },
241
  "obl:agent": {
242
- "p": 0.9210526316,
243
- "r": 0.8333333333,
244
- "f": 0.875
245
  },
246
  "nsubj:caus": {
247
  "p": 0.0,
@@ -264,9 +267,9 @@
264
  "f": 0.0
265
  },
266
  "vocative": {
267
- "p": 0.8333333333,
268
  "r": 0.625,
269
- "f": 0.7142857143
270
  },
271
  "dislocated": {
272
  "p": 0.0,
@@ -294,29 +297,32 @@
294
  "f": 0.0
295
  }
296
  },
297
- "ents_p": 0.8342645439,
298
- "ents_r": 0.8338136984,
299
- "ents_f": 0.8340390602,
 
 
300
  "ents_per_type": {
301
  "PER": {
302
- "p": 0.8973945931,
303
- "r": 0.9180339615,
304
- "f": 0.9075969541
305
  },
306
  "LOC": {
307
- "p": 0.8399022936,
308
- "r": 0.8486154104,
309
- "f": 0.8442363712
310
  },
311
  "ORG": {
312
- "p": 0.7633558341,
313
- "r": 0.7553435115,
314
- "f": 0.7593285372
315
  },
316
  "MISC": {
317
- "p": 0.7282367838,
318
- "r": 0.6707974923,
319
- "f": 0.6983380132
320
  }
321
- }
 
322
  }
 
1
  {
2
  "token_acc": 0.9989751998,
3
+ "token_p": 0.9844389844,
4
+ "token_r": 0.9896058454,
5
+ "token_f": 0.9870156531,
6
+ "pos_acc": 0.9734577127,
7
+ "morph_acc": 0.9646282355,
8
+ "morph_micro_p": 0.9870385159,
9
+ "morph_micro_r": 0.9789204338,
10
+ "morph_micro_f": 0.9829627137,
 
 
11
  "morph_per_feat": {
12
  "Definite": {
13
+ "p": 0.9890270666,
14
+ "r": 0.9868613139,
15
+ "f": 0.9879430033
16
  },
17
  "Number": {
18
+ "p": 0.9940597735,
19
+ "r": 0.9858247423,
20
+ "f": 0.9899251317
21
  },
22
  "PronType": {
23
+ "p": 0.9954896907,
24
+ "r": 0.9884836852,
25
+ "f": 0.9919743178
26
  },
27
  "Gender": {
28
+ "p": 0.9824380165,
29
+ "r": 0.9721441349,
30
+ "f": 0.9772639692
31
  },
32
  "Mood": {
33
+ "p": 0.972972973,
34
+ "r": 0.9591474245,
35
+ "f": 0.9660107335
36
  },
37
  "Person": {
38
+ "p": 0.9897828863,
39
+ "r": 0.9748427673,
40
+ "f": 0.9822560203
41
  },
42
  "Tense": {
43
+ "p": 0.9672131148,
44
+ "r": 0.9642492339,
45
+ "f": 0.9657289003
46
  },
47
  "VerbForm": {
48
+ "p": 0.9816971714,
49
+ "r": 0.9768211921,
50
+ "f": 0.979253112
51
  },
52
  "NumType": {
53
+ "p": 0.9929577465,
54
+ "r": 0.9624573379,
55
+ "f": 0.9774696707
56
  },
57
  "Reflex": {
58
+ "p": 0.9777777778,
59
  "r": 1.0,
60
+ "f": 0.9887640449
61
  },
62
  "Voice": {
63
+ "p": 0.9224137931,
64
  "r": 0.9553571429,
65
+ "f": 0.9385964912
66
  },
67
  "Poss": {
68
  "p": 1.0,
 
75
  "f": 0.9880952381
76
  }
77
  },
78
+ "sents_p": 0.8782816229,
79
+ "sents_r": 0.8932038835,
80
+ "sents_f": 0.8856799037,
81
+ "dep_uas": 0.8967353554,
82
+ "dep_las": 0.8580193321,
83
  "dep_las_per_type": {
84
  "det": {
85
+ "p": 0.9805668016,
86
+ "r": 0.9774011299,
87
+ "f": 0.9789814066
88
  },
89
  "nsubj": {
90
+ "p": 0.8762376238,
91
+ "r": 0.8530120482,
92
+ "f": 0.8644688645
93
  },
94
  "aux:tense": {
95
+ "p": 0.9285714286,
96
+ "r": 0.936,
97
+ "f": 0.9322709163
98
  },
99
  "root": {
100
+ "p": 0.8865248227,
101
+ "r": 0.9101941748,
102
+ "f": 0.8982035928
103
  },
104
  "obj": {
105
+ "p": 0.849112426,
106
+ "r": 0.8516320475,
107
+ "f": 0.8503703704
108
  },
109
  "cc": {
110
+ "p": 0.8909090909,
111
+ "r": 0.9032258065,
112
+ "f": 0.8970251716
113
  },
114
  "case": {
115
+ "p": 0.9695740365,
116
+ "r": 0.9768392371,
117
+ "f": 0.9731930777
118
  },
119
  "obl:mod": {
120
+ "p": 0.6749226006,
121
+ "r": 0.6507462687,
122
+ "f": 0.6626139818
123
  },
124
  "nmod": {
125
+ "p": 0.8055028463,
126
+ "r": 0.8481518482,
127
+ "f": 0.8262773723
128
  },
129
  "conj": {
130
+ "p": 0.5241935484,
131
+ "r": 0.5118110236,
132
+ "f": 0.5179282869
133
  },
134
  "nummod": {
135
+ "p": 0.9141104294,
136
+ "r": 0.8816568047,
137
+ "f": 0.8975903614
138
  },
139
  "amod": {
140
+ "p": 0.9235074627,
141
+ "r": 0.9016393443,
142
+ "f": 0.9124423963
143
  },
144
  "acl": {
145
+ "p": 0.6994219653,
146
  "r": 0.6994219653,
147
+ "f": 0.6994219653
148
  },
149
  "mark": {
150
+ "p": 0.8689956332,
151
+ "r": 0.8766519824,
152
+ "f": 0.8728070175
153
  },
154
  "xcomp": {
155
+ "p": 0.8531468531,
156
+ "r": 0.8079470199,
157
+ "f": 0.8299319728
158
  },
159
  "flat:name": {
160
+ "p": 0.9504950495,
161
+ "r": 0.9142857143,
162
+ "f": 0.932038835
163
  },
164
  "cop": {
165
+ "p": 0.8602150538,
166
+ "r": 0.8888888889,
167
+ "f": 0.8743169399
168
  },
169
  "advmod": {
170
+ "p": 0.8566978193,
171
+ "r": 0.8620689655,
172
+ "f": 0.859375
173
  },
174
  "obl:arg": {
175
+ "p": 0.7562189055,
176
+ "r": 0.6909090909,
177
+ "f": 0.7220902613
178
  },
179
  "appos": {
180
+ "p": 0.4938271605,
181
+ "r": 0.4819277108,
182
+ "f": 0.487804878
183
  },
184
  "nsubj:pass": {
185
+ "p": 0.875,
186
+ "r": 0.8235294118,
187
+ "f": 0.8484848485
188
  },
189
  "aux:pass": {
190
+ "p": 0.9469026549,
191
+ "r": 0.9553571429,
192
+ "f": 0.9511111111
193
  },
194
  "acl:relcl": {
195
+ "p": 0.5764705882,
196
+ "r": 0.5697674419,
197
+ "f": 0.5730994152
198
  },
199
  "advcl": {
200
+ "p": 0.4698795181,
201
+ "r": 0.5,
202
+ "f": 0.4844720497
203
  },
204
  "fixed": {
205
+ "p": 0.8705882353,
206
+ "r": 0.74,
207
+ "f": 0.8
208
  },
209
  "dep": {
210
+ "p": 0.3392857143,
211
+ "r": 0.6551724138,
212
+ "f": 0.4470588235
213
  },
214
  "expl:subj": {
215
+ "p": 0.7647058824,
216
+ "r": 0.8125,
217
+ "f": 0.7878787879
218
  },
219
  "expl:comp": {
220
+ "p": 0.6585365854,
221
+ "r": 0.9,
222
+ "f": 0.7605633803
223
  },
224
  "expl:pass": {
225
+ "p": 0.3333333333,
226
  "r": 0.1428571429,
227
+ "f": 0.2
228
  },
229
  "ccomp": {
230
+ "p": 0.7058823529,
231
+ "r": 0.7058823529,
232
+ "f": 0.7058823529
233
  },
234
  "parataxis": {
235
  "p": 0.5,
236
+ "r": 0.3571428571,
237
+ "f": 0.4166666667
238
  },
239
  "iobj": {
240
+ "p": 0.7222222222,
241
+ "r": 0.52,
242
+ "f": 0.6046511628
243
  },
244
  "obl:agent": {
245
+ "p": 0.8684210526,
246
+ "r": 0.7857142857,
247
+ "f": 0.825
248
  },
249
  "nsubj:caus": {
250
  "p": 0.0,
 
267
  "f": 0.0
268
  },
269
  "vocative": {
270
+ "p": 1.0,
271
  "r": 0.625,
272
+ "f": 0.7692307692
273
  },
274
  "dislocated": {
275
  "p": 0.0,
 
297
  "f": 0.0
298
  }
299
  },
300
+ "tag_acc": 0.9448023502,
301
+ "lemma_acc": 0.9070078093,
302
+ "ents_p": 0.8332265556,
303
+ "ents_r": 0.8329930747,
304
+ "ents_f": 0.8331097988,
305
  "ents_per_type": {
306
  "PER": {
307
+ "p": 0.8976195492,
308
+ "r": 0.9158845024,
309
+ "f": 0.9066600468
310
  },
311
  "LOC": {
312
+ "p": 0.8411137734,
313
+ "r": 0.8516690371,
314
+ "f": 0.8463584968
315
  },
316
  "ORG": {
317
+ "p": 0.763618677,
318
+ "r": 0.7490458015,
319
+ "f": 0.7562620424
320
  },
321
  "MISC": {
322
+ "p": 0.7152963371,
323
+ "r": 0.6633620061,
324
+ "f": 0.6883509834
325
  }
326
+ },
327
+ "speed": 4392.7454027339
328
  }
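The `p`, `r`, and `f` triples in the scores above appear consistent with the usual F1 definition, the harmonic mean of precision and recall. A minimal sketch to spot-check that, assuming this definition; `f_score` is a hypothetical helper name, and the numbers are copied from the added (`+`) side of the diff:

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (precision, recall, reported F) taken from the updated scores above
reported = {
    "ents": (0.8332265556, 0.8329930747, 0.8331097988),
    "sents": (0.8782816229, 0.8932038835, 0.8856799037),
}

for name, (p, r, f) in reported.items():
    # Print the recomputed value next to the reported one for comparison
    print(name, round(f_score(p, r), 10), f)
```

The zero guard mirrors entries such as `nsubj:caus`, where precision, recall, and F are all reported as 0.0.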
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/fr-dep-news/train.spacy"
3
- dev = "corpus/fr-dep-news/dev.spacy"
4
- vectors = "corpus/fr_vectors"
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "rule"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
 
 
 
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -48,6 +51,7 @@ upstream = "tok2vec"
48
  factory = "ner"
49
  incorrect_spans_key = null
50
  moves = null
 
51
  update_with_oracle_cut_size = 100
52
 
53
  [components.ner.model]
@@ -65,8 +69,8 @@ nO = null
65
  [components.ner.model.tok2vec.embed]
66
  @architectures = "spacy.MultiHashEmbed.v2"
67
  width = 96
68
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
69
- rows = [5000,2500,2500,2500]
70
  include_static_vectors = true
71
 
72
  [components.ner.model.tok2vec.encode]
@@ -81,6 +85,7 @@ factory = "parser"
81
  learn_tokens = false
82
  min_action_freq = 30
83
  moves = null
 
84
  update_with_oracle_cut_size = 100
85
 
86
  [components.parser.model]
@@ -99,6 +104,8 @@ upstream = "tok2vec"
99
 
100
  [components.senter]
101
  factory = "senter"
 
 
102
 
103
  [components.senter.model]
104
  @architectures = "spacy.Tagger.v1"
@@ -110,8 +117,8 @@ nO = null
110
  [components.senter.model.tok2vec.embed]
111
  @architectures = "spacy.MultiHashEmbed.v2"
112
  width = 16
113
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
114
- rows = [1000,500,500,500]
115
  include_static_vectors = true
116
 
117
  [components.senter.model.tok2vec.encode]
@@ -130,8 +137,8 @@ factory = "tok2vec"
130
  [components.tok2vec.model.embed]
131
  @architectures = "spacy.MultiHashEmbed.v2"
132
  width = ${components.tok2vec.model.encode:width}
133
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
134
- rows = [5000,2500,2500,2500]
135
  include_static_vectors = true
136
 
137
  [components.tok2vec.model.encode]
@@ -145,22 +152,19 @@ maxout_pieces = 3
145
 
146
  [corpora.dev]
147
  @readers = "spacy.Corpus.v1"
148
- limit = 0
149
- max_length = 0
150
- path = ${paths:dev}
151
  gold_preproc = false
 
 
152
  augmenter = null
153
 
154
  [corpora.train]
155
  @readers = "spacy.Corpus.v1"
156
- path = ${paths:train}
157
- max_length = 5000
158
  gold_preproc = false
 
159
  limit = 0
160
-
161
- [corpora.train.augmenter]
162
- @augmenters = "spacy.lower_case.v1"
163
- level = 0.1
164
 
165
  [training]
166
  train_corpus = "corpora.train"
@@ -191,9 +195,8 @@ compound = 1.001
191
  t = 0.0
192
 
193
  [training.logger]
194
- @loggers = "spacy.WandbLogger.v1"
195
- project_name = "spacy-v3.0.0a2"
196
- remove_config_values = []
197
 
198
  [training.optimizer]
199
  @optimizers = "Adam.v1"
@@ -216,16 +219,17 @@ dep_las_per_type = null
216
  sents_p = null
217
  sents_r = null
218
  sents_f = 0.02
219
- lemma_acc = 0.33
220
- ents_f = 0.33
221
  ents_p = 0.0
222
  ents_r = 0.0
223
  ents_per_type = null
 
224
 
225
  [pretraining]
226
 
227
  [initialize]
228
- vocab_data = ${paths.vocab_data}
229
  vectors = ${paths.vectors}
230
  init_tok2vec = ${paths.init_tok2vec}
231
  before_init = null
 
1
  [paths]
2
+ train = null
3
+ dev = null
4
+ vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = null
 
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
 
30
  mode = "rule"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
 
51
  factory = "ner"
52
  incorrect_spans_key = null
53
  moves = null
54
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
55
  update_with_oracle_cut_size = 100
56
 
57
  [components.ner.model]
 
69
  [components.ner.model.tok2vec.embed]
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
+ rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = true
75
 
76
  [components.ner.model.tok2vec.encode]
 
85
  learn_tokens = false
86
  min_action_freq = 30
87
  moves = null
88
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
89
  update_with_oracle_cut_size = 100
90
 
91
  [components.parser.model]
 
104
 
105
  [components.senter]
106
  factory = "senter"
107
+ overwrite = false
108
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
  @architectures = "spacy.Tagger.v1"
 
117
  [components.senter.model.tok2vec.embed]
118
  @architectures = "spacy.MultiHashEmbed.v2"
119
  width = 16
120
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
121
+ rows = [1000,500,500,500,50]
122
  include_static_vectors = true
123
 
124
  [components.senter.model.tok2vec.encode]
 
137
  [components.tok2vec.model.embed]
138
  @architectures = "spacy.MultiHashEmbed.v2"
139
  width = ${components.tok2vec.model.encode:width}
140
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141
+ rows = [5000,2500,2500,2500,100]
142
  include_static_vectors = true
143
 
144
  [components.tok2vec.model.encode]
 
152
 
153
  [corpora.dev]
154
  @readers = "spacy.Corpus.v1"
155
+ path = ${paths.dev}
 
 
156
  gold_preproc = false
157
+ max_length = 0
158
+ limit = 0
159
  augmenter = null
160
 
161
  [corpora.train]
162
  @readers = "spacy.Corpus.v1"
163
+ path = ${paths.train}
 
164
  gold_preproc = false
165
+ max_length = 0
166
  limit = 0
167
+ augmenter = null
 
 
 
168
 
169
  [training]
170
  train_corpus = "corpora.train"
 
195
  t = 0.0
196
 
197
  [training.logger]
198
+ @loggers = "spacy.ConsoleLogger.v1"
199
+ progress_bar = false
 
200
 
201
  [training.optimizer]
202
  @optimizers = "Adam.v1"
 
219
  sents_p = null
220
  sents_r = null
221
  sents_f = 0.02
222
+ lemma_acc = 0.5
223
+ ents_f = 0.16
224
  ents_p = 0.0
225
  ents_r = 0.0
226
  ents_per_type = null
227
+ speed = 0.0
228
 
229
  [pretraining]
230
 
231
  [initialize]
232
+ vocab_data = null
233
  vectors = ${paths.vectors}
234
  init_tok2vec = ${paths.init_tok2vec}
235
  before_init = null
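The config change above adds `"SPACY"` to each `attrs` list and a matching fifth entry to each `rows` list. A minimal sketch for checking that the two lists stay aligned, assuming the values are JSON-parseable as they appear in the diff. It uses only the standard library rather than spaCy's own config loader, and `embed_tables` is a hypothetical helper name:

```python
import json
from configparser import ConfigParser

# One embed block copied from the new side of the config.cfg diff
SNIPPET = """
[components.ner.model.tok2vec.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 96
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
rows = [5000,2500,2500,2500,100]
include_static_vectors = true
"""


def embed_tables(cfg_text: str, section: str) -> dict:
    """Map each embedded attribute to its row count, checking list lengths."""
    parser = ConfigParser(interpolation=None)  # leave ${...} references untouched
    parser.optionxform = str  # keep key case as written
    parser.read_string(cfg_text)
    attrs = json.loads(parser[section]["attrs"])
    rows = json.loads(parser[section]["rows"])
    if len(attrs) != len(rows):
        raise ValueError(f"Mismatched lengths: {len(attrs)} vs {len(rows)}")
    return dict(zip(attrs, rows))


tables = embed_tables(SNIPPET, "components.ner.model.tok2vec.embed")
print(tables)
```

The same check can be pointed at the `senter` and shared `tok2vec` embed sections, which received the same kind of change.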
fr_core_news_md-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2982a140518bc3fb4de3927551747411c89df8967a2da740df4880c93704f0d8
- size 46122237
+ oid sha256:dc89844588a50d13c21f203e80e6fa6df3857d1a29b19c92a8e7338ec487f522
+ size 46938345
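Before installing the updated wheel, it may help to compare the installed spaCy version against the `spacy_version` range that this release's meta.json declares (`>=3.2.0,<3.3.0`). A minimal sketch using only the standard library; `parse_version` and `satisfies` are hypothetical helpers that handle plain `X.Y.Z` versions only, not pre-release tags:

```python
import operator

# Two-character operators are listed first so ">=" is not read as ">"
_OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text: str) -> tuple:
    """Turn '3.2.0' into (3, 2, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Return True if version meets every comma-separated clause in spec."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                if not compare(current, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("3.2.4", ">=3.2.0,<3.3.0"))
print(satisfies("3.1.0", ">=3.2.0,<3.3.0"))
```

For real installs, a full version-specifier parser is a safer choice than this tuple comparison.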
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"fr",
3
  "name":"core_news_md",
4
- "version":"3.1.0",
5
  "description":"French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
@@ -173,7 +173,6 @@
173
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int",
174
  "POS=DET",
175
  "Gender=Masc|Number=Plur|POS=PRON",
176
- "POS=PART",
177
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
178
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin",
179
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
@@ -213,7 +212,6 @@
213
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
214
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
215
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin",
216
- "Gender=Fem|POS=ADV",
217
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin",
218
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
219
  "Gender=Fem|Number=Plur|POS=PROPN",
@@ -296,71 +294,69 @@
296
  ],
297
  "performance":{
298
  "token_acc":0.9989751998,
299
- "tag_acc":0.9429704807,
300
- "pos_acc":0.971820102,
301
- "morph_acc":0.9631079967,
302
- "lemma_acc":0.9078954127,
303
- "dep_uas":0.8916150686,
304
- "dep_las":0.8546474637,
305
- "sents_p":0.8845208845,
306
- "sents_r":0.9042340262,
307
- "sents_f":0.8791208791,
308
- "speed":5248.8393910966,
309
  "morph_per_feat":{
310
  "Definite":{
311
- "p":0.9904831625,
312
- "r":0.9868708972,
313
- "f":0.9886737304
314
  },
315
  "Number":{
316
- "p":0.9908599142,
317
- "r":0.9844329133,
318
- "f":0.987635958
319
  },
320
  "PronType":{
321
- "p":0.9967762734,
322
- "r":0.9884910486,
323
- "f":0.9926163724
324
  },
325
  "Gender":{
326
- "p":0.9807792208,
327
- "r":0.9757105943,
328
- "f":0.978238342
329
  },
330
  "Mood":{
331
- "p":0.9692028986,
332
- "r":0.9502664298,
333
- "f":0.9596412556
334
  },
335
  "Person":{
336
- "p":0.9872286079,
337
- "r":0.9711055276,
338
- "f":0.9791006966
339
  },
340
  "Tense":{
341
- "p":0.9689119171,
342
- "r":0.9550561798,
343
- "f":0.9619341564
344
  },
345
  "VerbForm":{
346
- "p":0.9823973177,
347
- "r":0.9701986755,
348
- "f":0.9762598917
349
  },
350
  "NumType":{
351
- "p":0.9858657244,
352
- "r":0.9620689655,
353
- "f":0.9738219895
354
  },
355
  "Reflex":{
356
- "p":1.0,
357
  "r":1.0,
358
- "f":1.0
359
  },
360
  "Voice":{
361
- "p":0.906779661,
362
  "r":0.9553571429,
363
- "f":0.9304347826
364
  },
365
  "Poss":{
366
  "p":1.0,
@@ -373,171 +369,176 @@
373
  "f":0.9880952381
374
  }
375
  },
 
 
 
 
 
376
  "dep_las_per_type":{
377
  "det":{
378
- "p":0.9781553398,
379
- "r":0.9757869249,
380
- "f":0.976969697
381
  },
382
  "nsubj":{
383
- "p":0.875,
384
- "r":0.8602409639,
385
- "f":0.8675577157
386
  },
387
  "aux:tense":{
388
- "p":0.9349593496,
389
- "r":0.92,
390
- "f":0.9274193548
391
  },
392
  "root":{
393
- "p":0.8902439024,
394
- "r":0.8859223301,
395
- "f":0.8880778589
396
  },
397
  "obj":{
398
- "p":0.8489425982,
399
- "r":0.8338278932,
400
- "f":0.8413173653
401
  },
402
  "cc":{
403
- "p":0.8772727273,
404
- "r":0.8894009217,
405
- "f":0.8832951945
406
  },
407
  "case":{
408
- "p":0.9715061058,
409
- "r":0.9754768392,
410
- "f":0.9734874235
411
  },
412
  "obl:mod":{
413
- "p":0.6903225806,
414
- "r":0.6369047619,
415
- "f":0.6625386997
416
  },
417
  "nmod":{
418
- "p":0.7932011331,
419
- "r":0.8383233533,
420
- "f":0.8151382824
421
  },
422
  "conj":{
423
- "p":0.5338345865,
424
- "r":0.5590551181,
425
- "f":0.5461538462
426
  },
427
  "nummod":{
428
- "p":0.9090909091,
429
- "r":0.8928571429,
430
- "f":0.9009009009
431
  },
432
  "amod":{
433
- "p":0.9288389513,
434
- "r":0.9051094891,
435
- "f":0.9168207024
436
  },
437
  "acl":{
438
- "p":0.6722222222,
439
  "r":0.6994219653,
440
- "f":0.6855524079
441
  },
442
  "mark":{
443
- "p":0.8818181818,
444
- "r":0.8546255507,
445
- "f":0.8680089485
446
  },
447
  "xcomp":{
448
- "p":0.8466666667,
449
- "r":0.8410596026,
450
- "f":0.8438538206
451
  },
452
  "flat:name":{
453
- "p":0.9126213592,
454
- "r":0.8952380952,
455
- "f":0.9038461538
456
  },
457
  "cop":{
458
- "p":0.8876404494,
459
- "r":0.8777777778,
460
- "f":0.8826815642
461
  },
462
  "advmod":{
463
- "p":0.8459119497,
464
- "r":0.8432601881,
465
- "f":0.8445839874
466
  },
467
  "obl:arg":{
468
- "p":0.6774193548,
469
- "r":0.6681818182,
470
- "f":0.6727688787
471
  },
472
  "appos":{
473
- "p":0.5526315789,
474
- "r":0.5060240964,
475
- "f":0.5283018868
476
  },
477
  "nsubj:pass":{
478
- "p":0.9024390244,
479
- "r":0.8705882353,
480
- "f":0.8862275449
481
  },
482
  "aux:pass":{
483
- "p":0.947826087,
484
- "r":0.9732142857,
485
- "f":0.9603524229
486
  },
487
  "acl:relcl":{
488
- "p":0.6455696203,
489
- "r":0.5930232558,
490
- "f":0.6181818182
491
  },
492
  "advcl":{
493
- "p":0.5714285714,
494
- "r":0.5128205128,
495
- "f":0.5405405405
496
  },
497
  "fixed":{
498
- "p":0.7789473684,
499
- "r":0.7326732673,
500
- "f":0.7551020408
501
  },
502
  "dep":{
503
- "p":0.2608695652,
504
- "r":0.5806451613,
505
- "f":0.36
506
  },
507
  "expl:subj":{
508
- "p":0.78125,
509
- "r":0.78125,
510
- "f":0.78125
511
  },
512
  "expl:comp":{
513
- "p":0.6097560976,
514
- "r":0.8333333333,
515
- "f":0.7042253521
516
  },
517
  "expl:pass":{
518
- "p":0.5,
519
  "r":0.1428571429,
520
- "f":0.2222222222
521
  },
522
  "ccomp":{
523
- "p":0.7037037037,
524
- "r":0.7450980392,
525
- "f":0.7238095238
526
  },
527
  "parataxis":{
528
  "p":0.5,
529
- "r":0.3928571429,
530
- "f":0.44
531
  },
532
  "iobj":{
533
- "p":0.6666666667,
534
- "r":0.48,
535
- "f":0.5581395349
536
  },
537
  "obl:agent":{
538
- "p":0.9210526316,
539
- "r":0.8333333333,
540
- "f":0.875
541
  },
542
  "nsubj:caus":{
543
  "p":0.0,
@@ -560,9 +561,9 @@
560
  "f":0.0
561
  },
562
  "vocative":{
563
- "p":0.8333333333,
564
  "r":0.625,
565
- "f":0.7142857143
566
  },
567
  "dislocated":{
568
  "p":0.0,
@@ -590,35 +591,38 @@
590
  "f":0.0
591
  }
592
  },
593
- "ents_p":0.8342645439,
594
- "ents_r":0.8338136984,
595
- "ents_f":0.8340390602,
 
 
596
  "ents_per_type":{
597
  "PER":{
598
- "p":0.8973945931,
599
- "r":0.9180339615,
600
- "f":0.9075969541
601
  },
602
  "LOC":{
603
- "p":0.8399022936,
604
- "r":0.8486154104,
605
- "f":0.8442363712
606
  },
607
  "ORG":{
608
- "p":0.7633558341,
609
- "r":0.7553435115,
610
- "f":0.7593285372
611
  },
612
  "MISC":{
613
- "p":0.7282367838,
614
- "r":0.6707974923,
615
- "f":0.6983380132
616
  }
617
- }
 
618
  },
619
  "sources":[
620
  {
621
- "name":"UD French Sequoia v2.5",
622
  "url":"https://github.com/UniversalDependencies/UD_French-Sequoia",
623
  "license":"LGPL-LR",
624
  "author":"Candito, Marie; Seddah, Djam\u00e9; Perrier, Guy; Guillaume, Bruno"
 
1
  {
2
  "lang":"fr",
3
  "name":"core_news_md",
4
+ "version":"3.2.0",
5
  "description":"French pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
  "vectors":20000,
 
173
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int",
174
  "POS=DET",
175
  "Gender=Masc|Number=Plur|POS=PRON",
 
176
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
177
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin",
178
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
 
212
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
213
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
214
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin",
 
215
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin",
216
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
217
  "Gender=Fem|Number=Plur|POS=PROPN",
 
294
  ],
295
  "performance":{
296
  "token_acc":0.9989751998,
297
+ "token_p":0.9844389844,
298
+ "token_r":0.9896058454,
299
+ "token_f":0.9870156531,
300
+ "pos_acc":0.9734577127,
301
+ "morph_acc":0.9646282355,
302
+ "morph_micro_p":0.9870385159,
303
+ "morph_micro_r":0.9789204338,
304
+ "morph_micro_f":0.9829627137,
 
 
305
  "morph_per_feat":{
306
  "Definite":{
307
+ "p":0.9890270666,
308
+ "r":0.9868613139,
309
+ "f":0.9879430033
310
  },
311
  "Number":{
312
+ "p":0.9940597735,
313
+ "r":0.9858247423,
314
+ "f":0.9899251317
315
  },
316
  "PronType":{
317
+ "p":0.9954896907,
318
+ "r":0.9884836852,
319
+ "f":0.9919743178
320
  },
321
  "Gender":{
322
+ "p":0.9824380165,
323
+ "r":0.9721441349,
324
+ "f":0.9772639692
325
  },
326
  "Mood":{
327
+ "p":0.972972973,
328
+ "r":0.9591474245,
329
+ "f":0.9660107335
330
  },
331
  "Person":{
332
+ "p":0.9897828863,
333
+ "r":0.9748427673,
334
+ "f":0.9822560203
335
  },
336
  "Tense":{
337
+ "p":0.9672131148,
338
+ "r":0.9642492339,
339
+ "f":0.9657289003
340
  },
341
  "VerbForm":{
342
+ "p":0.9816971714,
343
+ "r":0.9768211921,
344
+ "f":0.979253112
345
  },
346
  "NumType":{
347
+ "p":0.9929577465,
348
+ "r":0.9624573379,
349
+ "f":0.9774696707
350
  },
351
  "Reflex":{
352
+ "p":0.9777777778,
353
  "r":1.0,
354
+ "f":0.9887640449
355
  },
356
  "Voice":{
357
+ "p":0.9224137931,
358
  "r":0.9553571429,
359
+ "f":0.9385964912
360
  },
361
  "Poss":{
362
  "p":1.0,
 
369
  "f":0.9880952381
370
  }
371
  },
372
+ "sents_p":0.8782816229,
373
+ "sents_r":0.8932038835,
374
+ "sents_f":0.8856799037,
375
+ "dep_uas":0.8967353554,
376
+ "dep_las":0.8580193321,
377
  "dep_las_per_type":{
378
  "det":{
379
+ "p":0.9805668016,
380
+ "r":0.9774011299,
381
+ "f":0.9789814066
382
  },
383
  "nsubj":{
384
+ "p":0.8762376238,
385
+ "r":0.8530120482,
386
+ "f":0.8644688645
387
  },
388
  "aux:tense":{
389
+ "p":0.9285714286,
390
+ "r":0.936,
391
+ "f":0.9322709163
392
  },
393
  "root":{
394
+ "p":0.8865248227,
395
+ "r":0.9101941748,
396
+ "f":0.8982035928
397
  },
398
  "obj":{
399
+ "p":0.849112426,
400
+ "r":0.8516320475,
401
+ "f":0.8503703704
402
  },
403
  "cc":{
404
+ "p":0.8909090909,
405
+ "r":0.9032258065,
406
+ "f":0.8970251716
407
  },
408
  "case":{
409
+ "p":0.9695740365,
410
+ "r":0.9768392371,
411
+ "f":0.9731930777
412
  },
413
  "obl:mod":{
414
+ "p":0.6749226006,
415
+ "r":0.6507462687,
416
+ "f":0.6626139818
417
  },
418
  "nmod":{
419
+ "p":0.8055028463,
420
+ "r":0.8481518482,
421
+ "f":0.8262773723
422
  },
423
  "conj":{
424
+ "p":0.5241935484,
425
+ "r":0.5118110236,
426
+ "f":0.5179282869
427
  },
428
  "nummod":{
429
+ "p":0.9141104294,
430
+ "r":0.8816568047,
431
+ "f":0.8975903614
432
  },
433
  "amod":{
434
+ "p":0.9235074627,
435
+ "r":0.9016393443,
436
+ "f":0.9124423963
437
  },
438
  "acl":{
439
+ "p":0.6994219653,
440
  "r":0.6994219653,
441
+ "f":0.6994219653
442
  },
443
  "mark":{
444
+ "p":0.8689956332,
445
+ "r":0.8766519824,
446
+ "f":0.8728070175
447
  },
448
  "xcomp":{
449
+ "p":0.8531468531,
450
+ "r":0.8079470199,
451
+ "f":0.8299319728
452
  },
453
  "flat:name":{
454
+ "p":0.9504950495,
455
+ "r":0.9142857143,
456
+ "f":0.932038835
457
  },
458
  "cop":{
459
+ "p":0.8602150538,
460
+ "r":0.8888888889,
461
+ "f":0.8743169399
462
  },
463
  "advmod":{
464
+ "p":0.8566978193,
465
+ "r":0.8620689655,
466
+ "f":0.859375
467
  },
468
  "obl:arg":{
469
+ "p":0.7562189055,
470
+ "r":0.6909090909,
471
+ "f":0.7220902613
472
  },
473
  "appos":{
474
+ "p":0.4938271605,
475
+ "r":0.4819277108,
476
+ "f":0.487804878
477
  },
478
  "nsubj:pass":{
479
+ "p":0.875,
480
+ "r":0.8235294118,
481
+ "f":0.8484848485
482
  },
483
  "aux:pass":{
484
+ "p":0.9469026549,
485
+ "r":0.9553571429,
486
+ "f":0.9511111111
487
  },
488
  "acl:relcl":{
489
+ "p":0.5764705882,
490
+ "r":0.5697674419,
491
+ "f":0.5730994152
492
  },
493
  "advcl":{
494
+ "p":0.4698795181,
495
+ "r":0.5,
496
+ "f":0.4844720497
497
  },
498
  "fixed":{
499
+ "p":0.8705882353,
500
+ "r":0.74,
501
+ "f":0.8
502
  },
503
  "dep":{
504
+ "p":0.3392857143,
505
+ "r":0.6551724138,
506
+ "f":0.4470588235
507
  },
508
  "expl:subj":{
509
+ "p":0.7647058824,
510
+ "r":0.8125,
511
+ "f":0.7878787879
512
  },
513
  "expl:comp":{
514
+ "p":0.6585365854,
515
+ "r":0.9,
516
+ "f":0.7605633803
517
  },
518
  "expl:pass":{
519
+ "p":0.3333333333,
520
  "r":0.1428571429,
521
+ "f":0.2
522
  },
523
  "ccomp":{
524
+ "p":0.7058823529,
525
+ "r":0.7058823529,
526
+ "f":0.7058823529
527
  },
528
  "parataxis":{
529
  "p":0.5,
530
+ "r":0.3571428571,
531
+ "f":0.4166666667
532
  },
533
  "iobj":{
534
+ "p":0.7222222222,
535
+ "r":0.52,
536
+ "f":0.6046511628
537
  },
538
  "obl:agent":{
539
+ "p":0.8684210526,
540
+ "r":0.7857142857,
541
+ "f":0.825
542
  },
543
  "nsubj:caus":{
544
  "p":0.0,
 
561
  "f":0.0
562
  },
563
  "vocative":{
564
+ "p":1.0,
565
  "r":0.625,
566
+ "f":0.7692307692
567
  },
568
  "dislocated":{
569
  "p":0.0,
 
591
  "f":0.0
592
  }
593
  },
594
+ "tag_acc":0.9448023502,
595
+ "lemma_acc":0.9070078093,
596
+ "ents_p":0.8332265556,
597
+ "ents_r":0.8329930747,
598
+ "ents_f":0.8331097988,
599
  "ents_per_type":{
600
  "PER":{
601
+ "p":0.8976195492,
602
+ "r":0.9158845024,
603
+ "f":0.9066600468
604
  },
605
  "LOC":{
606
+ "p":0.8411137734,
607
+ "r":0.8516690371,
608
+ "f":0.8463584968
609
  },
610
  "ORG":{
611
+ "p":0.763618677,
612
+ "r":0.7490458015,
613
+ "f":0.7562620424
614
  },
615
  "MISC":{
616
+ "p":0.7152963371,
617
+ "r":0.6633620061,
618
+ "f":0.6883509834
619
  }
620
+ },
621
+ "speed":4392.7454027339
622
  },
623
  "sources":[
624
  {
625
+ "name":"UD French Sequoia v2.8",
626
  "url":"https://github.com/UniversalDependencies/UD_French-Sequoia",
627
  "license":"LGPL-LR",
628
  "author":"Candito, Marie; Seddah, Djam\u00e9; Perrier, Guy; Guillaume, Bruno"
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "POS=PROPN":"",
4
  "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":"Gender=Fem|Number=Sing|PronType=Dem",
@@ -153,7 +154,6 @@
153
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":"Gender=Fem|Number=Plur|PronType=Int",
154
  "POS=DET":"",
155
  "Gender=Masc|Number=Plur|POS=PRON":"Gender=Masc|Number=Plur",
156
- "POS=PART":"",
157
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
158
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":"Mood=Ind|Person=3|VerbForm=Fin",
159
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
@@ -193,7 +193,6 @@
193
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
194
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
195
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
196
- "Gender=Fem|POS=ADV":"Gender=Fem",
197
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=2|Tense=Imp|VerbForm=Fin",
198
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
199
  "Gender=Fem|Number=Plur|POS=PROPN":"Gender=Fem|Number=Plur",
@@ -353,7 +352,6 @@
353
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":90,
354
  "POS=DET":90,
355
  "Gender=Masc|Number=Plur|POS=PRON":95,
356
- "POS=PART":94,
357
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
358
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":100,
359
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
@@ -393,10 +391,10 @@
393
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
394
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
395
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":100,
396
- "Gender=Fem|POS=ADV":86,
397
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":87,
398
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
399
  "Gender=Fem|Number=Plur|POS=PROPN":96,
400
  "Gender=Masc|NumType=Card|POS=NUM":93
401
- }
 
402
  }
 
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "POS=PROPN":"",
5
  "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":"Gender=Fem|Number=Sing|PronType=Dem",
 
154
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":"Gender=Fem|Number=Plur|PronType=Int",
155
  "POS=DET":"",
156
  "Gender=Masc|Number=Plur|POS=PRON":"Gender=Masc|Number=Plur",
 
157
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
158
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":"Mood=Ind|Person=3|VerbForm=Fin",
159
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
 
193
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
194
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
195
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
 
196
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=2|Tense=Imp|VerbForm=Fin",
197
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
198
  "Gender=Fem|Number=Plur|POS=PROPN":"Gender=Fem|Number=Plur",
 
352
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":90,
353
  "POS=DET":90,
354
  "Gender=Masc|Number=Plur|POS=PRON":95,
 
355
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
356
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":100,
357
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
 
391
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
392
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
393
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":100,
 
394
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":87,
395
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
396
  "Gender=Fem|Number=Plur|POS=PROPN":96,
397
  "Gender=Masc|NumType=Card|POS=NUM":93
398
+ },
399
+ "overwrite":true
400
  }
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
 
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
 
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
 
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves��{"0":{"":25247},"1":{"":21688},"2":{"case":7258,"det":6062,"nsubj":1972,"punct":1645,"advmod":1210,"cc":1205,"mark":1051,"aux:tense":673,"amod":662,"nummod":595,"aux:pass":544,"obl:mod":483,"nsubj:pass":425,"cop":365,"expl:comp":204,"obj":170,"expl:subj":163,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":5132,"punct":3954,"amod":2083,"conj":1517,"obj":1410,"obl:mod":1184,"obl:arg":1078,"acl":782,"xcomp":739,"flat:name":657,"advmod":562,"fixed":418,"appos":408,"acl:relcl":365,"advcl":306,"ccomp":238,"obl:agent":206,"dep":138,"nummod":117,"parataxis":92,"nsubj":75,"flat:foreign":63},"4":{"ROOT":2219}}�cfg��neg_key�
 
1
+ ��moves��{"0":{"":25255},"1":{"":21680},"2":{"case":7258,"det":6062,"nsubj":1982,"punct":1645,"advmod":1210,"cc":1205,"mark":1051,"aux:tense":673,"amod":662,"nummod":595,"aux:pass":544,"obl:mod":483,"nsubj:pass":425,"cop":365,"expl:comp":204,"obj":170,"expl:subj":164,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":5132,"punct":3954,"amod":2083,"conj":1517,"obj":1410,"obl:mod":1184,"obl:arg":1078,"acl":782,"xcomp":739,"flat:name":657,"advmod":562,"fixed":409,"appos":408,"acl:relcl":365,"advcl":306,"ccomp":238,"obl:agent":206,"dep":138,"nummod":117,"parataxis":92,"nsubj":75,"flat:foreign":63},"4":{"ROOT":2219}}�cfg��neg_key�
senter/cfg CHANGED
@@ -1,3 +1,3 @@
  {
-
+ "overwrite":false
  }
senter/model CHANGED
Binary files a/senter/model and b/senter/model differ
 
tok2vec/model CHANGED
Binary files a/tok2vec/model and b/tok2vec/model differ
 
tokenizer CHANGED
The diff for this file is too large to render.
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:66010bb54a8e9c0a19c43ac74caebe515c456d921dbded4ad4e10e559d554de5
- size 8159993
+ oid sha256:960cc1797e729519ed569b69bfca8107e6be1c051008015967e4636720ecfc67
+ size 10419856
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
+ {
+ "mode":"default"
+ }