osanseviero HF staff committed on
Commit a325d13 (1 parent: d498d46)

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
1
- # UD French Sequoia v2.5
2
 
3
  * Author: Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno
4
  * URL: https://github.com/UniversalDependencies/UD_French-Sequoia
1
+ # UD French Sequoia v2.8
2
 
3
  * Author: Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno
4
  * URL: https://github.com/UniversalDependencies/UD_French-Sequoia
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - token-classification
5
  language:
6
  - fr
7
- license: lgpllr
8
  model-index:
9
  - name: fr_dep_news_trf
10
  results:
@@ -14,34 +14,34 @@ model-index:
14
  metrics:
15
  - name: POS Accuracy
16
  type: accuracy
17
- value: 0.9575257732
18
  - task:
19
  name: SENTER
20
  type: token-classification
21
  metrics:
22
  - name: SENTER Precision
23
  type: precision
24
- value: 0.9478672986
25
  - name: SENTER Recall
26
  type: recall
27
- value: 0.9708737864
28
  - name: SENTER F Score
29
  type: f_score
30
- value: 0.9592326139
31
  - task:
32
  name: UNLABELED_DEPENDENCIES
33
  type: token-classification
34
  metrics:
35
  - name: Unlabeled Dependencies Accuracy
36
  type: accuracy
37
- value: 0.9479347449
38
  - task:
39
  name: LABELED_DEPENDENCIES
40
  type: token-classification
41
  metrics:
42
  - name: Labeled Dependencies Accuracy
43
  type: accuracy
44
- value: 0.9479347449
45
  ---
46
  ### Details: https://spacy.io/models/fr#fr_dep_news_trf
47
 
@@ -50,12 +50,12 @@ French transformer pipeline (camembert-base). Components: transformer, morpholog
50
  | Feature | Description |
51
  | --- | --- |
52
  | **Name** | `fr_dep_news_trf` |
53
- | **Version** | `3.1.0` |
54
- | **spaCy** | `>=3.1.0,<3.2.0` |
55
  | **Default Pipeline** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer` |
56
  | **Components** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer` |
57
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
58
- | **Sources** | [UD French Sequoia v2.5](https://github.com/UniversalDependencies/UD_French-Sequoia) (Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[camembert-base](https://huggingface.co/camembert-base) (Martin, Louis and Muller, Benjamin and Suarez, Pedro Javier Ortiz and Dupont, Yoann and Romary, Laurent and de la Clergerie, Eric Villemonte and Seddah, Djame and Sagot, Benoit}) |
59
  | **License** | `LGPL-LR` |
60
  | **Author** | [Explosion](https://explosion.ai) |
61
 
@@ -63,11 +63,11 @@ French transformer pipeline (camembert-base). Components: transformer, morpholog
63
 
64
  <details>
65
 
66
- <summary>View label scheme (234 labels for 2 components)</summary>
67
 
68
  | Component | Labels |
69
  | --- | --- |
70
- | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `POS=PART`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
71
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
72
 
73
  </details>
@@ -77,12 +77,18 @@ French transformer pipeline (camembert-base). Components: transformer, morpholog
77
  | Type | Score |
78
  | --- | --- |
79
  | `TOKEN_ACC` | 99.90 |
80
- | `TAG_ACC` | 95.75 |
81
- | `POS_ACC` | 98.66 |
82
- | `MORPH_ACC` | 97.89 |
83
- | `LEMMA_ACC` | 91.13 |
84
- | `DEP_UAS` | 94.79 |
85
- | `DEP_LAS` | 92.57 |
86
- | `SENTS_P` | 94.79 |
87
- | `SENTS_R` | 97.09 |
88
- | `SENTS_F` | 95.92 |
 
 
 
 
 
 
4
  - token-classification
5
  language:
6
  - fr
7
+ license: lgpl-lr
8
  model-index:
9
  - name: fr_dep_news_trf
10
  results:
14
  metrics:
15
  - name: POS Accuracy
16
  type: accuracy
17
+ value: 0.9573063834
18
  - task:
19
  name: SENTER
20
  type: token-classification
21
  metrics:
22
  - name: SENTER Precision
23
  type: precision
24
+ value: 0.8835616438
25
  - name: SENTER Recall
26
  type: recall
27
+ value: 0.9393203883
28
  - name: SENTER F Score
29
  type: f_score
30
+ value: 0.9105882353
31
  - task:
32
  name: UNLABELED_DEPENDENCIES
33
  type: token-classification
34
  metrics:
35
  - name: Unlabeled Dependencies Accuracy
36
  type: accuracy
37
+ value: 0.9492309471
38
  - task:
39
  name: LABELED_DEPENDENCIES
40
  type: token-classification
41
  metrics:
42
  - name: Labeled Dependencies Accuracy
43
  type: accuracy
44
+ value: 0.9492309471
45
  ---
46
  ### Details: https://spacy.io/models/fr#fr_dep_news_trf
47
 
50
  | Feature | Description |
51
  | --- | --- |
52
  | **Name** | `fr_dep_news_trf` |
53
+ | **Version** | `3.2.0` |
54
+ | **spaCy** | `>=3.2.0,<3.3.0` |
55
  | **Default Pipeline** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer` |
56
  | **Components** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer` |
57
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
58
+ | **Sources** | [UD French Sequoia v2.8](https://github.com/UniversalDependencies/UD_French-Sequoia) (Candito, Marie; Seddah, Djamé; Perrier, Guy; Guillaume, Bruno)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion)<br />[camembert-base](https://huggingface.co/camembert-base) (Martin, Louis and Muller, Benjamin and Suarez, Pedro Javier Ortiz and Dupont, Yoann and Romary, Laurent and de la Clergerie, Eric Villemonte and Seddah, Djame and Sagot, Benoit}) |
59
  | **License** | `LGPL-LR` |
60
  | **Author** | [Explosion](https://explosion.ai) |
61
 
63
 
64
  <details>
65
 
66
+ <summary>View label scheme (232 labels for 2 components)</summary>
67
 
68
  | Component | Labels |
69
  | --- | --- |
70
+ | **`morphologizer`** | `POS=PROPN`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Dem`, `Gender=Fem\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=SCONJ`, `POS=ADP`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Ord\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=NOUN`, `POS=PUNCT`, `Gender=Masc\|Number=Sing\|POS=PROPN`, `Number=Plur\|POS=ADJ`, `Gender=Masc\|Number=Plur\|POS=NOUN`, `Definite=Ind\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Number=Sing\|POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `POS=ADV`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Definite=Def\|Gender=Fem\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PROPN`, `Definite=Def\|Number=Sing\|POS=DET\|PronType=Art`, `NumType=Card\|POS=NUM`, `Definite=Def\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Masc\|Number=Plur\|POS=ADJ`, `POS=CCONJ`, `Gender=Fem\|Number=Plur\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=ADJ`, `POS=ADJ`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `POS=PRON\|PronType=Rel`, `Number=Sing\|POS=DET\|Poss=Yes`, `Definite=Def\|Gender=Masc\|Number=Sing\|POS=ADP\|PronType=Art`, `Definite=Def\|Number=Plur\|POS=ADP\|PronType=Art`, `Definite=Ind\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=VERB\|VerbForm=Inf`, `Gender=Fem\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3`, `Number=Plur\|POS=DET`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=ADJ`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Dem`, `POS=ADV\|PronType=Int`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Gender=Fem\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Masc\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Masc\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Number=Plur\|POS=DET\|Poss=Yes`, `POS=AUX\|VerbForm=Inf`, `Gender=Masc\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=ADV\|Polarity=Neg`, `Definite=Ind\|Number=Sing\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3`, `POS=PRON\|Person=3\|Reflex=Yes`, `Gender=Masc\|POS=NOUN`, `POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=PRON\|Person=3`, `Number=Plur\|POS=NOUN`, `NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Gender=Masc\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=3`, `Number=Sing\|POS=NOUN`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Gender=Fem\|NumType=Ord\|Number=Sing\|POS=ADJ`, `Number=Plur\|POS=PROPN`, `Number=Sing\|POS=PROPN`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=DET`, `Gender=Fem\|Number=Sing\|POS=DET\|Poss=Yes`, `Gender=Masc\|POS=PRON`, `POS=NOUN`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON`, `Gender=Masc\|NumType=Ord\|Number=Plur\|POS=ADJ`, 
`Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Number=Sing\|POS=PRON`, `Number=Sing\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|VerbForm=Fin`, `Number=Plur\|POS=DET\|PronType=Dem`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON`, `Gender=Masc\|Number=Sing\|POS=PRON\|Person=3\|PronType=Dem`, `Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=3\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|NumType=Ord\|Number=Sing\|POS=ADJ`, `POS=PRON`, `POS=NUM`, `Gender=Fem\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON`, `Number=Plur\|POS=PRON\|Person=3`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Sing\|POS=PRON\|Person=1`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Past\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON`, `Gender=Fem\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `POS=INTJ`, `Number=Plur\|POS=PRON\|Person=2`, `NumType=Card\|POS=PRON`, `Definite=Ind\|Gender=Fem\|Number=Plur\|POS=DET\|PronType=Art`, `Gender=Fem\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `NumType=Card\|POS=NOUN`, `POS=PRON\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3`, `Gender=Fem\|Number=Sing\|POS=DET`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=DET`, `Mood=Sub\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Definite=Ind\|Gender=Masc\|Number=Plur\|POS=DET\|PronType=Art`, `Mood=Cnd\|Number=Sing\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, 
`Gender=Masc\|Number=Sing\|POS=PRON\|PronType=Dem`, `Gender=Masc\|Number=Plur\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Dem`, `Number=Sing\|POS=DET`, `Gender=Masc\|NumType=Card\|Number=Plur\|POS=NOUN`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Dem`, `Mood=Ind\|POS=VERB\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|POS=PRON`, `Gender=Masc\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|Number=Sing\|POS=PRON\|PronType=Rel`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `POS=SYM`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Fem\|Number=Plur\|POS=DET\|PronType=Int`, `POS=DET`, `Gender=Masc\|Number=Plur\|POS=PRON`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|POS=VERB\|Person=3\|VerbForm=Fin`, `Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Mood=Cnd\|Number=Plur\|POS=VERB\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Gender=Fem\|Number=Sing\|POS=DET\|PronType=Int`, `Gender=Masc\|Number=Plur\|POS=DET`, `Gender=Fem\|Number=Plur\|POS=PRON\|PronType=Rel`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Masc\|Number=Plur\|POS=PRON\|PronType=Rel`, `POS=VERB\|Tense=Past\|VerbForm=Part\|Voice=Pass`, `Gender=Fem\|NumType=Ord\|Number=Plur\|POS=ADJ`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Fut\|VerbForm=Fin`, `Mood=Imp\|POS=VERB\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|Reflex=Yes`, 
`Mood=Cnd\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|Reflex=Yes`, `Gender=Masc\|NumType=Card\|Number=Sing\|POS=NOUN`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Number=Sing\|POS=PRON\|Person=1\|Reflex=Yes`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=AUX\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Imp\|VerbForm=Fin`, `Mood=Sub\|Number=Sing\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Gender=Masc\|POS=PROPN`, `Mood=Cnd\|Number=Plur\|POS=AUX\|Person=3\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Sub\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Mood=Ind\|Number=Sing\|POS=VERB\|Person=1\|Tense=Fut\|VerbForm=Fin`, `Gender=Fem\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Mood=Cnd\|Number=Sing\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Imp\|Number=Plur\|POS=VERB\|Person=1\|Tense=Pres\|VerbForm=Fin`, `Mood=Sub\|Number=Plur\|POS=AUX\|Person=2\|Tense=Pres\|VerbForm=Fin`, `Mood=Ind\|Number=Plur\|POS=VERB\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Mood=Ind\|Number=Sing\|POS=AUX\|Person=2\|Tense=Imp\|VerbForm=Fin`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Gender=Fem\|Number=Plur\|POS=PROPN`, `Gender=Masc\|NumType=Card\|POS=NUM` |
71
  | **`parser`** | `ROOT`, `acl`, `acl:relcl`, `advcl`, `advmod`, `amod`, `appos`, `aux:pass`, `aux:tense`, `case`, `cc`, `ccomp`, `conj`, `cop`, `dep`, `det`, `expl:comp`, `expl:pass`, `expl:subj`, `fixed`, `flat:foreign`, `flat:name`, `iobj`, `mark`, `nmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl:agent`, `obl:arg`, `obl:mod`, `parataxis`, `punct`, `vocative`, `xcomp` |
72
 
73
  </details>
77
  | Type | Score |
78
  | --- | --- |
79
  | `TOKEN_ACC` | 99.90 |
80
+ | `TOKEN_P` | 98.44 |
81
+ | `TOKEN_R` | 98.96 |
82
+ | `TOKEN_F` | 98.70 |
83
+ | `POS_ACC` | 98.64 |
84
+ | `MORPH_ACC` | 97.92 |
85
+ | `MORPH_MICRO_P` | 99.34 |
86
+ | `MORPH_MICRO_R` | 99.08 |
87
+ | `MORPH_MICRO_F` | 99.21 |
88
+ | `SENTS_P` | 88.36 |
89
+ | `SENTS_R` | 93.93 |
90
+ | `SENTS_F` | 91.06 |
91
+ | `DEP_UAS` | 94.92 |
92
+ | `DEP_LAS` | 92.96 |
93
+ | `TAG_ACC` | 95.73 |
94
+ | `LEMMA_ACC` | 91.18 |
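
Review note on the README.md change above: the sentence-segmentation scores drop noticeably in this update (`SENTS_F` goes from 95.92 to 91.06). The diff does not say how `SENTS_F` is computed, so the sketch below assumes it is the usual harmonic mean of `SENTS_P` and `SENTS_R` and checks that assumption against the precision and recall values in the front matter. The `f1` helper is illustrative only and is not part of spaCy.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SENTER precision/recall copied from the README front matter in this diff.
old_f = f1(0.9478672986, 0.9708737864)  # removed lines (version 3.1.0)
new_f = f1(0.8835616438, 0.9393203883)  # added lines (version 3.2.0)

print(f"old SENTS_F: {old_f * 100:.2f}")
print(f"new SENTS_F: {new_f * 100:.2f}")
print(f"change: {(new_f - old_f) * 100:+.2f} points")
```

Under that assumption both results agree with the reported `SENTER F Score` values (0.9592326139 and 0.9105882353), so the lower F-score follows directly from the lower precision and recall and is not a transcription error in the table.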
accuracy.json CHANGED
@@ -1,60 +1,58 @@
1
  {
2
  "token_acc": 0.9989751998,
3
- "tag_acc": 0.9575257732,
4
- "pos_acc": 0.9865979381,
5
- "morph_acc": 0.9788681579,
6
- "lemma_acc": 0.9112857657,
7
- "dep_uas": 0.9479347449,
8
- "dep_las": 0.9257096606,
9
- "sents_p": 0.9478672986,
10
- "sents_r": 0.9708737864,
11
- "sents_f": 0.9592326139,
12
- "speed": 3075.0647850323,
13
  "morph_per_feat": {
14
  "Definite": {
15
- "p": 0.9985380117,
16
- "r": 0.996353027,
17
- "f": 0.9974443227
18
  },
19
  "Number": {
20
- "p": 0.9973953488,
21
- "r": 0.9935137139,
22
- "f": 0.9954507474
23
  },
24
  "PronType": {
25
- "p": 0.998714653,
26
- "r": 0.9936061381,
27
- "f": 0.9961538462
28
  },
29
  "Gender": {
30
- "p": 0.9904269082,
31
- "r": 0.9891472868,
32
- "f": 0.9897866839
33
  },
34
  "Mood": {
35
- "p": 0.9964412811,
36
- "r": 0.9946714032,
37
- "f": 0.9955555556
38
  },
39
  "Person": {
40
- "p": 0.9974779319,
41
- "r": 0.993718593,
42
- "f": 0.9955947137
43
  },
44
  "Tense": {
45
- "p": 0.9878419453,
46
  "r": 0.9959141982,
47
- "f": 0.991861648
48
  },
49
  "VerbForm": {
50
- "p": 0.9909315746,
51
  "r": 0.9950331126,
52
- "f": 0.9929781082
53
  },
54
  "NumType": {
55
- "p": 0.9963369963,
56
- "r": 0.9379310345,
57
- "f": 0.9662522202
58
  },
59
  "Reflex": {
60
  "p": 1.0,
@@ -62,9 +60,9 @@
62
  "f": 1.0
63
  },
64
  "Voice": {
65
- "p": 0.9652173913,
66
  "r": 0.9910714286,
67
- "f": 0.9779735683
68
  },
69
  "Poss": {
70
  "p": 1.0,
@@ -77,151 +75,161 @@
77
  "f": 1.0
78
  }
79
  },
 
 
 
 
 
80
  "dep_las_per_type": {
81
  "det": {
82
- "p": 0.9887278583,
83
  "r": 0.9911218725,
84
- "f": 0.989923418
85
  },
86
  "nsubj": {
87
- "p": 0.9406175772,
88
- "r": 0.9542168675,
89
- "f": 0.9473684211
90
  },
91
  "aux:tense": {
92
- "p": 0.984,
93
  "r": 0.984,
94
- "f": 0.984
95
  },
96
  "root": {
97
- "p": 0.9481132075,
98
- "r": 0.9757281553,
99
- "f": 0.961722488
100
  },
101
  "obj": {
102
- "p": 0.9454545455,
103
- "r": 0.9258160237,
104
- "f": 0.9355322339
105
  },
106
  "cc": {
107
- "p": 0.9449541284,
108
- "r": 0.9493087558,
109
- "f": 0.9471264368
110
  },
111
  "case": {
112
- "p": 0.9850034083,
113
- "r": 0.9843324251,
114
- "f": 0.9846678024
115
  },
116
  "obl:mod": {
117
- "p": 0.8303571429,
118
- "r": 0.8303571429,
119
- "f": 0.8303571429
120
  },
121
  "nmod": {
122
- "p": 0.9073514602,
123
- "r": 0.8992015968,
124
- "f": 0.9032581454
125
  },
126
  "conj": {
127
- "p": 0.7751004016,
128
- "r": 0.7598425197,
129
- "f": 0.7673956262
130
  },
131
  "nummod": {
132
- "p": 0.8735632184,
133
- "r": 0.9047619048,
134
- "f": 0.8888888889
135
  },
136
  "amod": {
137
- "p": 0.9521178637,
138
- "r": 0.9434306569,
139
- "f": 0.9477543538
140
  },
141
  "acl": {
142
- "p": 0.8324022346,
143
- "r": 0.8612716763,
144
- "f": 0.8465909091
145
  },
146
  "mark": {
147
- "p": 0.9688888889,
148
- "r": 0.9603524229,
149
- "f": 0.9646017699
150
  },
151
  "xcomp": {
152
- "p": 0.9276315789,
153
- "r": 0.9337748344,
154
- "f": 0.9306930693
155
  },
156
  "flat:name": {
157
- "p": 0.9619047619,
158
- "r": 0.9619047619,
159
- "f": 0.9619047619
160
  },
161
  "cop": {
162
- "p": 0.9444444444,
163
  "r": 0.9444444444,
164
- "f": 0.9444444444
165
  },
166
  "advmod": {
167
- "p": 0.9301587302,
168
  "r": 0.9184952978,
169
- "f": 0.9242902208
170
  },
171
  "obl:arg": {
172
- "p": 0.8944954128,
173
- "r": 0.8863636364,
174
- "f": 0.8904109589
175
  },
176
  "appos": {
177
- "p": 0.7159090909,
178
  "r": 0.7590361446,
179
- "f": 0.7368421053
180
  },
181
  "nsubj:pass": {
182
- "p": 1.0,
183
  "r": 0.9647058824,
184
- "f": 0.9820359281
185
  },
186
  "aux:pass": {
187
- "p": 0.9652173913,
188
  "r": 0.9910714286,
189
- "f": 0.9779735683
190
  },
191
  "acl:relcl": {
192
- "p": 0.8470588235,
193
- "r": 0.8372093023,
194
- "f": 0.8421052632
195
  },
196
  "advcl": {
197
- "p": 0.7407407407,
198
  "r": 0.7692307692,
199
- "f": 0.7547169811
200
  },
201
  "fixed": {
202
- "p": 0.9642857143,
203
- "r": 0.801980198,
204
  "f": 0.8756756757
205
  },
206
  "dep": {
207
- "p": 0.2463768116,
208
- "r": 0.5483870968,
209
- "f": 0.34
210
  },
211
  "expl:subj": {
212
- "p": 0.9,
213
- "r": 0.84375,
214
- "f": 0.8709677419
215
  },
216
  "expl:comp": {
217
- "p": 0.7435897436,
218
- "r": 0.9666666667,
219
- "f": 0.8405797101
220
  },
221
  "expl:pass": {
222
- "p": 0.6,
223
  "r": 0.4285714286,
224
- "f": 0.5
 
 
 
 
 
225
  },
226
  "ccomp": {
227
  "p": 0.96,
@@ -229,19 +237,14 @@
229
  "f": 0.9504950495
230
  },
231
  "parataxis": {
232
- "p": 0.5833333333,
233
- "r": 0.5,
234
- "f": 0.5384615385
235
  },
236
  "iobj": {
237
- "p": 0.7142857143,
238
- "r": 0.6,
239
- "f": 0.652173913
240
- },
241
- "obl:agent": {
242
- "p": 1.0,
243
- "r": 0.9523809524,
244
- "f": 0.9756097561
245
  },
246
  "nsubj:caus": {
247
  "p": 0.0,
@@ -264,9 +267,9 @@
264
  "f": 0.0
265
  },
266
  "vocative": {
267
- "p": 0.8333333333,
268
  "r": 0.625,
269
- "f": 0.7142857143
270
  },
271
  "dislocated": {
272
  "p": 0.0,
@@ -293,5 +296,8 @@
293
  "r": 0.0,
294
  "f": 0.0
295
  }
296
- }
 
 
 
297
  }
1
  {
2
  "token_acc": 0.9989751998,
3
+ "token_p": 0.9844389844,
4
+ "token_r": 0.9896058454,
5
+ "token_f": 0.9870156531,
6
+ "pos_acc": 0.9863875425,
7
+ "morph_acc": 0.9791731106,
8
+ "morph_micro_p": 0.9934029687,
9
+ "morph_micro_r": 0.9908005361,
10
+ "morph_micro_f": 0.9921000458,
 
 
11
  "morph_per_feat": {
12
  "Definite": {
13
+ "p": 0.9985358712,
14
+ "r": 0.995620438,
15
+ "f": 0.9970760234
16
  },
17
  "Number": {
18
+ "p": 0.9959409594,
19
+ "r": 0.9937407953,
20
+ "f": 0.9948396609
21
  },
22
  "PronType": {
23
+ "p": 0.9974293059,
24
+ "r": 0.9929622521,
25
+ "f": 0.9951907663
26
  },
27
  "Gender": {
28
+ "p": 0.988963039,
29
+ "r": 0.9846664963,
30
+ "f": 0.9868100909
31
  },
32
  "Mood": {
33
+ "p": 0.9946619217,
34
+ "r": 0.9928952043,
35
+ "f": 0.9937777778
36
  },
37
  "Person": {
38
+ "p": 0.9974683544,
39
+ "r": 0.9911949686,
40
+ "f": 0.9943217666
41
  },
42
  "Tense": {
43
+ "p": 0.9868421053,
44
  "r": 0.9959141982,
45
+ "f": 0.9913573971
46
  },
47
  "VerbForm": {
48
+ "p": 0.9901153213,
49
  "r": 0.9950331126,
50
+ "f": 0.9925681255
51
  },
52
  "NumType": {
53
+ "p": 0.9927797834,
54
+ "r": 0.9385665529,
55
+ "f": 0.9649122807
56
  },
57
  "Reflex": {
58
  "p": 1.0,
60
  "f": 1.0
61
  },
62
  "Voice": {
63
+ "p": 0.9568965517,
64
  "r": 0.9910714286,
65
+ "f": 0.9736842105
66
  },
67
  "Poss": {
68
  "p": 1.0,
75
  "f": 1.0
76
  }
77
  },
78
+ "sents_p": 0.8835616438,
79
+ "sents_r": 0.9393203883,
80
+ "sents_f": 0.9105882353,
81
+ "dep_uas": 0.9492309471,
82
+ "dep_las": 0.9295953757,
83
  "dep_las_per_type": {
84
  "det": {
85
+ "p": 0.9871382637,
86
  "r": 0.9911218725,
87
+ "f": 0.9891260572
88
  },
89
  "nsubj": {
90
+ "p": 0.945368171,
91
+ "r": 0.9590361446,
92
+ "f": 0.95215311
93
  },
94
  "aux:tense": {
95
+ "p": 0.9761904762,
96
  "r": 0.984,
97
+ "f": 0.9800796813
98
  },
99
  "root": {
100
+ "p": 0.9179954442,
101
+ "r": 0.9781553398,
102
+ "f": 0.9471210341
103
  },
104
  "obj": {
105
+ "p": 0.9451219512,
106
+ "r": 0.9198813056,
107
+ "f": 0.9323308271
108
  },
109
  "cc": {
110
+ "p": 0.9495412844,
111
+ "r": 0.9539170507,
112
+ "f": 0.9517241379
113
  },
114
  "case": {
115
+ "p": 0.9850238257,
116
+ "r": 0.9856948229,
117
+ "f": 0.9853592101
118
  },
119
  "obl:mod": {
120
+ "p": 0.8553459119,
121
+ "r": 0.8119402985,
122
+ "f": 0.8330781011
123
  },
124
  "nmod": {
125
+ "p": 0.9012961117,
126
+ "r": 0.9030969031,
127
+ "f": 0.9021956088
128
  },
129
  "conj": {
130
+ "p": 0.832,
131
+ "r": 0.8188976378,
132
+ "f": 0.8253968254
133
  },
134
  "nummod": {
135
+ "p": 0.9069767442,
136
+ "r": 0.9230769231,
137
+ "f": 0.9149560117
138
  },
139
  "amod": {
140
+ "p": 0.9664804469,
141
+ "r": 0.9453551913,
142
+ "f": 0.955801105
143
  },
144
  "acl": {
145
+ "p": 0.85,
146
+ "r": 0.8843930636,
147
+ "f": 0.8668555241
148
  },
149
  "mark": {
150
+ "p": 0.9733333333,
151
+ "r": 0.9647577093,
152
+ "f": 0.9690265487
153
  },
154
  "xcomp": {
155
+ "p": 0.9350649351,
156
+ "r": 0.9536423841,
157
+ "f": 0.9442622951
158
  },
159
  "flat:name": {
160
+ "p": 0.953271028,
161
+ "r": 0.9714285714,
162
+ "f": 0.9622641509
163
  },
164
  "cop": {
165
+ "p": 0.9659090909,
166
  "r": 0.9444444444,
167
+ "f": 0.9550561798
168
  },
169
  "advmod": {
170
+ "p": 0.9361022364,
171
  "r": 0.9184952978,
172
+ "f": 0.9272151899
173
  },
174
  "obl:arg": {
175
+ "p": 0.8878923767,
176
+ "r": 0.9,
177
+ "f": 0.8939051919
178
  },
179
  "appos": {
180
+ "p": 0.7078651685,
181
  "r": 0.7590361446,
182
+ "f": 0.7325581395
183
  },
184
  "nsubj:pass": {
185
+ "p": 0.9761904762,
186
  "r": 0.9647058824,
187
+ "f": 0.9704142012
188
  },
189
  "aux:pass": {
190
+ "p": 0.9910714286,
191
  "r": 0.9910714286,
192
+ "f": 0.9910714286
193
  },
194
  "acl:relcl": {
195
+ "p": 0.8795180723,
196
+ "r": 0.8488372093,
197
+ "f": 0.8639053254
198
  },
199
  "advcl": {
200
+ "p": 0.7594936709,
201
  "r": 0.7692307692,
202
+ "f": 0.7643312102
203
  },
204
  "fixed": {
205
+ "p": 0.9529411765,
206
+ "r": 0.81,
207
  "f": 0.8756756757
208
  },
209
  "dep": {
210
+ "p": 0.2388059701,
211
+ "r": 0.5517241379,
212
+ "f": 0.3333333333
213
  },
214
  "expl:subj": {
215
+ "p": 0.8666666667,
216
+ "r": 0.8125,
217
+ "f": 0.8387096774
218
  },
219
  "expl:comp": {
220
+ "p": 0.75,
221
+ "r": 1.0,
222
+ "f": 0.8571428571
223
  },
224
  "expl:pass": {
225
+ "p": 0.75,
226
  "r": 0.4285714286,
227
+ "f": 0.5454545455
228
+ },
229
+ "obl:agent": {
230
+ "p": 0.975,
231
+ "r": 0.9285714286,
232
+ "f": 0.9512195122
233
  },
234
  "ccomp": {
235
  "p": 0.96,
237
  "f": 0.9504950495
238
  },
239
  "parataxis": {
240
+ "p": 0.7307692308,
241
+ "r": 0.6785714286,
242
+ "f": 0.7037037037
243
  },
244
  "iobj": {
245
+ "p": 0.7222222222,
246
+ "r": 0.52,
247
+ "f": 0.6046511628
248
  },
249
  "nsubj:caus": {
250
  "p": 0.0,
267
  "f": 0.0
268
  },
269
  "vocative": {
270
+ "p": 1.0,
271
  "r": 0.625,
272
+ "f": 0.7692307692
273
  },
274
  "dislocated": {
275
  "p": 0.0,
296
  "r": 0.0,
297
  "f": 0.0
298
  }
299
+ },
300
+ "tag_acc": 0.9573063834,
301
+ "lemma_acc": 0.911837238,
302
+ "speed": 1637.6043941996
303
  }
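The per-label `f` values in the updated accuracy.json appear to be the usual harmonic mean of `p` and `r`. A minimal sketch to spot-check that, using three rows copied from the diff above; the helper name `f_score` is hypothetical and not part of spaCy:

```python
import math


def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (precision, recall, reported F) copied from the updated accuracy.json.
reported = {
    "sents": (0.8835616438, 0.9393203883, 0.9105882353),
    "expl:comp": (0.75, 1.0, 0.8571428571),
    "vocative": (1.0, 0.625, 0.7692307692),
}

for label, (p, r, f) in reported.items():
    # The published values are rounded to 10 decimals, so compare loosely.
    assert math.isclose(f_score(p, r), f, abs_tol=1e-6), label
```

The tolerance is deliberately loose because the inputs are already rounded.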
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/fr-dep-news/train.spacy"
3
- dev = "corpus/fr-dep-news/dev.spacy"
4
  vectors = null
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = "pytorch"
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "rule"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -50,6 +53,7 @@ factory = "parser"
50
  learn_tokens = false
51
  min_action_freq = 30
52
  moves = null
 
53
  update_with_oracle_cut_size = 100
54
 
55
  [components.parser.model]
@@ -73,37 +77,39 @@ max_batch_items = 4096
73
  set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
74
 
75
  [components.transformer.model]
76
- @architectures = "spacy-transformers.TransformerModel.v1"
77
  name = "camembert-base"
 
78
 
79
  [components.transformer.model.get_spans]
80
  @span_getters = "spacy-transformers.strided_spans.v1"
81
  window = 128
82
  stride = 96
83
 
 
 
84
  [components.transformer.model.tokenizer_config]
85
  use_fast = true
86
 
 
 
87
  [corpora]
88
 
89
  [corpora.dev]
90
  @readers = "spacy.Corpus.v1"
91
- limit = 0
92
- max_length = 0
93
- path = ${paths:dev}
94
  gold_preproc = false
 
 
95
  augmenter = null
96
 
97
  [corpora.train]
98
  @readers = "spacy.Corpus.v1"
99
- path = ${paths:train}
100
- max_length = 500
101
  gold_preproc = false
 
102
  limit = 0
103
-
104
- [corpora.train.augmenter]
105
- @augmenters = "spacy.lower_case.v1"
106
- level = 0.1
107
 
108
  [training]
109
  train_corpus = "corpora.train"
@@ -148,21 +154,25 @@ total_steps = 20000
148
  initial_rate = 0.00005
149
 
150
  [training.score_weights]
151
- pos_acc = 0.12
152
- morph_acc = 0.12
153
  morph_per_feat = null
154
  dep_uas = 0.0
155
- dep_las = 0.23
156
  dep_las_per_type = null
157
  sents_p = null
158
  sents_r = null
159
- sents_f = 0.03
160
  lemma_acc = 0.5
161
 
162
  [pretraining]
163
 
164
  [initialize]
165
- vocab_data = ${paths.vocab_data}
166
  vectors = ${paths.vectors}
167
  init_tok2vec = ${paths.init_tok2vec}
168
  before_init = null
1
  [paths]
2
+ train = null
3
+ dev = null
4
  vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = "pytorch"
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
30
  mode = "rule"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
53
  learn_tokens = false
54
  min_action_freq = 30
55
  moves = null
56
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
57
  update_with_oracle_cut_size = 100
58
 
59
  [components.parser.model]
77
  set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
78
 
79
  [components.transformer.model]
80
+ @architectures = "spacy-transformers.TransformerModel.v3"
81
  name = "camembert-base"
82
+ mixed_precision = false
83
 
84
  [components.transformer.model.get_spans]
85
  @span_getters = "spacy-transformers.strided_spans.v1"
86
  window = 128
87
  stride = 96
88
 
89
+ [components.transformer.model.grad_scaler_config]
90
+
91
  [components.transformer.model.tokenizer_config]
92
  use_fast = true
93
 
94
+ [components.transformer.model.transformer_config]
95
+
96
  [corpora]
97
 
98
  [corpora.dev]
99
  @readers = "spacy.Corpus.v1"
100
+ path = ${paths.dev}
 
 
101
  gold_preproc = false
102
+ max_length = 0
103
+ limit = 0
104
  augmenter = null
105
 
106
  [corpora.train]
107
  @readers = "spacy.Corpus.v1"
108
+ path = ${paths.train}
 
109
  gold_preproc = false
110
+ max_length = 0
111
  limit = 0
112
+ augmenter = null
113
 
114
  [training]
115
  train_corpus = "corpora.train"
154
  initial_rate = 0.00005
155
 
156
  [training.score_weights]
157
+ pos_acc = 0.08
158
+ morph_acc = 0.08
159
  morph_per_feat = null
160
  dep_uas = 0.0
161
+ dep_las = 0.16
162
  dep_las_per_type = null
163
  sents_p = null
164
  sents_r = null
165
+ sents_f = 0.02
166
  lemma_acc = 0.5
167
+ ents_f = 0.16
168
+ ents_p = 0.0
169
+ ents_r = 0.0
170
+ speed = 0.0
171
 
172
  [pretraining]
173
 
174
  [initialize]
175
+ vocab_data = null
176
  vectors = ${paths.vectors}
177
  init_tok2vec = ${paths.init_tok2vec}
178
  before_init = null
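The `[training.score_weights]` change rescales the existing weights to make room for `ents_f`. A rough sketch of the effect, assuming the final training score is a weighted sum of the listed metrics and that a metric the pipeline does not report counts as 0.0; that assumption and the helper name `weighted_score` are illustrative, not taken from this commit. Entries set to `null` or `0.0` are omitted because they contribute nothing:

```python
import math

# Non-zero weights copied from the old and new config.cfg.
old_weights = {"pos_acc": 0.12, "morph_acc": 0.12, "dep_las": 0.23,
               "sents_f": 0.03, "lemma_acc": 0.5}
new_weights = {"pos_acc": 0.08, "morph_acc": 0.08, "dep_las": 0.16,
               "sents_f": 0.02, "lemma_acc": 0.5, "ents_f": 0.16}

# Metrics copied from the updated accuracy.json (no ents_f is reported there).
scores = {"pos_acc": 0.9863875425, "morph_acc": 0.9791731106,
          "dep_las": 0.9295953757, "sents_f": 0.9105882353,
          "lemma_acc": 0.911837238}


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted sum; a metric missing from `scores` is treated as 0.0 (assumption)."""
    return sum(scores.get(name, 0.0) * weight for name, weight in weights.items())


# Both weight sets sum to 1.0, up to floating-point rounding.
assert math.isclose(sum(old_weights.values()), 1.0, abs_tol=1e-9)
assert math.isclose(sum(new_weights.values()), 1.0, abs_tol=1e-9)

print(weighted_score(scores, old_weights))
print(weighted_score(scores, new_weights))
```

Under that assumption the new weights yield a lower combined score for the same metrics, since 0.16 of the total weight sits on a metric this pipeline does not produce.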
fr_dep_news_trf-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:086fc3c7e5a5f337f435eaa52d2103489e5785bf73e6758f66347fe1109be676
3
- size 400698074
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:620ddfdd1ca028837fbc8686f8f22738dbc7d34754c7a293a9cef07a166ef8bf
3
+ size 400717231
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"fr",
3
  "name":"dep_news_trf",
4
- "version":"3.1.0",
5
  "description":"French transformer pipeline (camembert-base). Components: transformer, morphologizer, parser, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -173,7 +173,6 @@
173
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int",
174
  "POS=DET",
175
  "Gender=Masc|Number=Plur|POS=PRON",
176
- "POS=PART",
177
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
178
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin",
179
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
@@ -213,7 +212,6 @@
213
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
214
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
215
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin",
216
- "Gender=Fem|POS=ADV",
217
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin",
218
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
219
  "Gender=Fem|Number=Plur|POS=PROPN",
@@ -283,61 +281,59 @@
283
  ],
284
  "performance":{
285
  "token_acc":0.9989751998,
286
- "tag_acc":0.9575257732,
287
- "pos_acc":0.9865979381,
288
- "morph_acc":0.9788681579,
289
- "lemma_acc":0.9112857657,
290
- "dep_uas":0.9479347449,
291
- "dep_las":0.9257096606,
292
- "sents_p":0.9478672986,
293
- "sents_r":0.9708737864,
294
- "sents_f":0.9592326139,
295
- "speed":3075.0647850323,
296
  "morph_per_feat":{
297
  "Definite":{
298
- "p":0.9985380117,
299
- "r":0.996353027,
300
- "f":0.9974443227
301
  },
302
  "Number":{
303
- "p":0.9973953488,
304
- "r":0.9935137139,
305
- "f":0.9954507474
306
  },
307
  "PronType":{
308
- "p":0.998714653,
309
- "r":0.9936061381,
310
- "f":0.9961538462
311
  },
312
  "Gender":{
313
- "p":0.9904269082,
314
- "r":0.9891472868,
315
- "f":0.9897866839
316
  },
317
  "Mood":{
318
- "p":0.9964412811,
319
- "r":0.9946714032,
320
- "f":0.9955555556
321
  },
322
  "Person":{
323
- "p":0.9974779319,
324
- "r":0.993718593,
325
- "f":0.9955947137
326
  },
327
  "Tense":{
328
- "p":0.9878419453,
329
  "r":0.9959141982,
330
- "f":0.991861648
331
  },
332
  "VerbForm":{
333
- "p":0.9909315746,
334
  "r":0.9950331126,
335
- "f":0.9929781082
336
  },
337
  "NumType":{
338
- "p":0.9963369963,
339
- "r":0.9379310345,
340
- "f":0.9662522202
341
  },
342
  "Reflex":{
343
  "p":1.0,
@@ -345,9 +341,9 @@
345
  "f":1.0
346
  },
347
  "Voice":{
348
- "p":0.9652173913,
349
  "r":0.9910714286,
350
- "f":0.9779735683
351
  },
352
  "Poss":{
353
  "p":1.0,
@@ -360,151 +356,161 @@
360
  "f":1.0
361
  }
362
  },
363
  "dep_las_per_type":{
364
  "det":{
365
- "p":0.9887278583,
366
  "r":0.9911218725,
367
- "f":0.989923418
368
  },
369
  "nsubj":{
370
- "p":0.9406175772,
371
- "r":0.9542168675,
372
- "f":0.9473684211
373
  },
374
  "aux:tense":{
375
- "p":0.984,
376
  "r":0.984,
377
- "f":0.984
378
  },
379
  "root":{
380
- "p":0.9481132075,
381
- "r":0.9757281553,
382
- "f":0.961722488
383
  },
384
  "obj":{
385
- "p":0.9454545455,
386
- "r":0.9258160237,
387
- "f":0.9355322339
388
  },
389
  "cc":{
390
- "p":0.9449541284,
391
- "r":0.9493087558,
392
- "f":0.9471264368
393
  },
394
  "case":{
395
- "p":0.9850034083,
396
- "r":0.9843324251,
397
- "f":0.9846678024
398
  },
399
  "obl:mod":{
400
- "p":0.8303571429,
401
- "r":0.8303571429,
402
- "f":0.8303571429
403
  },
404
  "nmod":{
405
- "p":0.9073514602,
406
- "r":0.8992015968,
407
- "f":0.9032581454
408
  },
409
  "conj":{
410
- "p":0.7751004016,
411
- "r":0.7598425197,
412
- "f":0.7673956262
413
  },
414
  "nummod":{
415
- "p":0.8735632184,
416
- "r":0.9047619048,
417
- "f":0.8888888889
418
  },
419
  "amod":{
420
- "p":0.9521178637,
421
- "r":0.9434306569,
422
- "f":0.9477543538
423
  },
424
  "acl":{
425
- "p":0.8324022346,
426
- "r":0.8612716763,
427
- "f":0.8465909091
428
  },
429
  "mark":{
430
- "p":0.9688888889,
431
- "r":0.9603524229,
432
- "f":0.9646017699
433
  },
434
  "xcomp":{
435
- "p":0.9276315789,
436
- "r":0.9337748344,
437
- "f":0.9306930693
438
  },
439
  "flat:name":{
440
- "p":0.9619047619,
441
- "r":0.9619047619,
442
- "f":0.9619047619
443
  },
444
  "cop":{
445
- "p":0.9444444444,
446
  "r":0.9444444444,
447
- "f":0.9444444444
448
  },
449
  "advmod":{
450
- "p":0.9301587302,
451
  "r":0.9184952978,
452
- "f":0.9242902208
453
  },
454
  "obl:arg":{
455
- "p":0.8944954128,
456
- "r":0.8863636364,
457
- "f":0.8904109589
458
  },
459
  "appos":{
460
- "p":0.7159090909,
461
  "r":0.7590361446,
462
- "f":0.7368421053
463
  },
464
  "nsubj:pass":{
465
- "p":1.0,
466
  "r":0.9647058824,
467
- "f":0.9820359281
468
  },
469
  "aux:pass":{
470
- "p":0.9652173913,
471
  "r":0.9910714286,
472
- "f":0.9779735683
473
  },
474
  "acl:relcl":{
475
- "p":0.8470588235,
476
- "r":0.8372093023,
477
- "f":0.8421052632
478
  },
479
  "advcl":{
480
- "p":0.7407407407,
481
  "r":0.7692307692,
482
- "f":0.7547169811
483
  },
484
  "fixed":{
485
- "p":0.9642857143,
486
- "r":0.801980198,
487
  "f":0.8756756757
488
  },
489
  "dep":{
490
- "p":0.2463768116,
491
- "r":0.5483870968,
492
- "f":0.34
493
  },
494
  "expl:subj":{
495
- "p":0.9,
496
- "r":0.84375,
497
- "f":0.8709677419
498
  },
499
  "expl:comp":{
500
- "p":0.7435897436,
501
- "r":0.9666666667,
502
- "f":0.8405797101
503
  },
504
  "expl:pass":{
505
- "p":0.6,
506
  "r":0.4285714286,
507
- "f":0.5
508
  },
509
  "ccomp":{
510
  "p":0.96,
@@ -512,19 +518,14 @@
512
  "f":0.9504950495
513
  },
514
  "parataxis":{
515
- "p":0.5833333333,
516
- "r":0.5,
517
- "f":0.5384615385
518
  },
519
  "iobj":{
520
- "p":0.7142857143,
521
- "r":0.6,
522
- "f":0.652173913
523
- },
524
- "obl:agent":{
525
- "p":1.0,
526
- "r":0.9523809524,
527
- "f":0.9756097561
528
  },
529
  "nsubj:caus":{
530
  "p":0.0,
@@ -547,9 +548,9 @@
547
  "f":0.0
548
  },
549
  "vocative":{
550
- "p":0.8333333333,
551
  "r":0.625,
552
- "f":0.7142857143
553
  },
554
  "dislocated":{
555
  "p":0.0,
@@ -576,11 +577,14 @@
576
  "r":0.0,
577
  "f":0.0
578
  }
579
- }
580
  },
581
  "sources":[
582
  {
583
- "name":"UD French Sequoia v2.5",
584
  "url":"https://github.com/UniversalDependencies/UD_French-Sequoia",
585
  "license":"LGPL-LR",
586
  "author":"Candito, Marie; Seddah, Djam\u00e9; Perrier, Guy; Guillaume, Bruno"
@@ -599,8 +603,8 @@
599
  }
600
  ],
601
  "requirements":[
602
- "spacy-transformers>=1.0.3,<1.1.0",
603
- "sentencepiece==0.1.91",
604
  "protobuf"
605
  ]
606
  }
1
  {
2
  "lang":"fr",
3
  "name":"dep_news_trf",
4
+ "version":"3.2.0",
5
  "description":"French transformer pipeline (camembert-base). Components: transformer, morphologizer, parser, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"LGPL-LR",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
173
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int",
174
  "POS=DET",
175
  "Gender=Masc|Number=Plur|POS=PRON",
 
176
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin",
177
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin",
178
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass",
212
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin",
213
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin",
214
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin",
 
215
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin",
216
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part",
217
  "Gender=Fem|Number=Plur|POS=PROPN",
281
  ],
282
  "performance":{
283
  "token_acc":0.9989751998,
284
+ "token_p":0.9844389844,
285
+ "token_r":0.9896058454,
286
+ "token_f":0.9870156531,
287
+ "pos_acc":0.9863875425,
288
+ "morph_acc":0.9791731106,
289
+ "morph_micro_p":0.9934029687,
290
+ "morph_micro_r":0.9908005361,
291
+ "morph_micro_f":0.9921000458,
 
 
292
  "morph_per_feat":{
293
  "Definite":{
294
+ "p":0.9985358712,
295
+ "r":0.995620438,
296
+ "f":0.9970760234
297
  },
298
  "Number":{
299
+ "p":0.9959409594,
300
+ "r":0.9937407953,
301
+ "f":0.9948396609
302
  },
303
  "PronType":{
304
+ "p":0.9974293059,
305
+ "r":0.9929622521,
306
+ "f":0.9951907663
307
  },
308
  "Gender":{
309
+ "p":0.988963039,
310
+ "r":0.9846664963,
311
+ "f":0.9868100909
312
  },
313
  "Mood":{
314
+ "p":0.9946619217,
315
+ "r":0.9928952043,
316
+ "f":0.9937777778
317
  },
318
  "Person":{
319
+ "p":0.9974683544,
320
+ "r":0.9911949686,
321
+ "f":0.9943217666
322
  },
323
  "Tense":{
324
+ "p":0.9868421053,
325
  "r":0.9959141982,
326
+ "f":0.9913573971
327
  },
328
  "VerbForm":{
329
+ "p":0.9901153213,
330
  "r":0.9950331126,
331
+ "f":0.9925681255
332
  },
333
  "NumType":{
334
+ "p":0.9927797834,
335
+ "r":0.9385665529,
336
+ "f":0.9649122807
337
  },
338
  "Reflex":{
339
  "p":1.0,
341
  "f":1.0
342
  },
343
  "Voice":{
344
+ "p":0.9568965517,
345
  "r":0.9910714286,
346
+ "f":0.9736842105
347
  },
348
  "Poss":{
349
  "p":1.0,
356
  "f":1.0
357
  }
358
  },
359
+ "sents_p":0.8835616438,
360
+ "sents_r":0.9393203883,
361
+ "sents_f":0.9105882353,
362
+ "dep_uas":0.9492309471,
363
+ "dep_las":0.9295953757,
364
  "dep_las_per_type":{
365
  "det":{
366
+ "p":0.9871382637,
367
  "r":0.9911218725,
368
+ "f":0.9891260572
369
  },
370
  "nsubj":{
371
+ "p":0.945368171,
372
+ "r":0.9590361446,
373
+ "f":0.95215311
374
  },
375
  "aux:tense":{
376
+ "p":0.9761904762,
377
  "r":0.984,
378
+ "f":0.9800796813
379
  },
380
  "root":{
381
+ "p":0.9179954442,
382
+ "r":0.9781553398,
383
+ "f":0.9471210341
384
  },
385
  "obj":{
386
+ "p":0.9451219512,
387
+ "r":0.9198813056,
388
+ "f":0.9323308271
389
  },
390
  "cc":{
391
+ "p":0.9495412844,
392
+ "r":0.9539170507,
393
+ "f":0.9517241379
394
  },
395
  "case":{
396
+ "p":0.9850238257,
397
+ "r":0.9856948229,
398
+ "f":0.9853592101
399
  },
400
  "obl:mod":{
401
+ "p":0.8553459119,
402
+ "r":0.8119402985,
403
+ "f":0.8330781011
404
  },
405
  "nmod":{
406
+ "p":0.9012961117,
407
+ "r":0.9030969031,
408
+ "f":0.9021956088
409
  },
410
  "conj":{
411
+ "p":0.832,
412
+ "r":0.8188976378,
413
+ "f":0.8253968254
414
  },
415
  "nummod":{
416
+ "p":0.9069767442,
417
+ "r":0.9230769231,
418
+ "f":0.9149560117
419
  },
420
  "amod":{
421
+ "p":0.9664804469,
422
+ "r":0.9453551913,
423
+ "f":0.955801105
424
  },
425
  "acl":{
426
+ "p":0.85,
427
+ "r":0.8843930636,
428
+ "f":0.8668555241
429
  },
430
  "mark":{
431
+ "p":0.9733333333,
432
+ "r":0.9647577093,
433
+ "f":0.9690265487
434
  },
435
  "xcomp":{
436
+ "p":0.9350649351,
437
+ "r":0.9536423841,
438
+ "f":0.9442622951
439
  },
440
  "flat:name":{
441
+ "p":0.953271028,
442
+ "r":0.9714285714,
443
+ "f":0.9622641509
444
  },
445
  "cop":{
446
+ "p":0.9659090909,
447
  "r":0.9444444444,
448
+ "f":0.9550561798
449
  },
450
  "advmod":{
451
+ "p":0.9361022364,
452
  "r":0.9184952978,
453
+ "f":0.9272151899
454
  },
455
  "obl:arg":{
456
+ "p":0.8878923767,
457
+ "r":0.9,
458
+ "f":0.8939051919
459
  },
460
  "appos":{
461
+ "p":0.7078651685,
462
  "r":0.7590361446,
463
+ "f":0.7325581395
464
  },
465
  "nsubj:pass":{
466
+ "p":0.9761904762,
467
  "r":0.9647058824,
468
+ "f":0.9704142012
469
  },
470
  "aux:pass":{
471
+ "p":0.9910714286,
472
  "r":0.9910714286,
473
+ "f":0.9910714286
474
  },
475
  "acl:relcl":{
476
+ "p":0.8795180723,
477
+ "r":0.8488372093,
478
+ "f":0.8639053254
479
  },
480
  "advcl":{
481
+ "p":0.7594936709,
482
  "r":0.7692307692,
483
+ "f":0.7643312102
484
  },
485
  "fixed":{
486
+ "p":0.9529411765,
487
+ "r":0.81,
488
  "f":0.8756756757
489
  },
490
  "dep":{
491
+ "p":0.2388059701,
492
+ "r":0.5517241379,
493
+ "f":0.3333333333
494
  },
495
  "expl:subj":{
496
+ "p":0.8666666667,
497
+ "r":0.8125,
498
+ "f":0.8387096774
499
  },
500
  "expl:comp":{
501
+ "p":0.75,
502
+ "r":1.0,
503
+ "f":0.8571428571
504
  },
505
  "expl:pass":{
506
+ "p":0.75,
507
  "r":0.4285714286,
508
+ "f":0.5454545455
509
+ },
510
+ "obl:agent":{
511
+ "p":0.975,
512
+ "r":0.9285714286,
513
+ "f":0.9512195122
514
  },
515
  "ccomp":{
516
  "p":0.96,
518
  "f":0.9504950495
519
  },
520
  "parataxis":{
521
+ "p":0.7307692308,
522
+ "r":0.6785714286,
523
+ "f":0.7037037037
524
  },
525
  "iobj":{
526
+ "p":0.7222222222,
527
+ "r":0.52,
528
+ "f":0.6046511628
529
  },
530
  "nsubj:caus":{
531
  "p":0.0,
548
  "f":0.0
549
  },
550
  "vocative":{
551
+ "p":1.0,
552
  "r":0.625,
553
+ "f":0.7692307692
554
  },
555
  "dislocated":{
556
  "p":0.0,
577
  "r":0.0,
578
  "f":0.0
579
  }
580
+ },
581
+ "tag_acc":0.9573063834,
582
+ "lemma_acc":0.911837238,
583
+ "speed":1637.6043941996
584
  },
585
  "sources":[
586
  {
587
+ "name":"UD French Sequoia v2.8",
588
  "url":"https://github.com/UniversalDependencies/UD_French-Sequoia",
589
  "license":"LGPL-LR",
590
  "author":"Candito, Marie; Seddah, Djam\u00e9; Perrier, Guy; Guillaume, Bruno"
603
  }
604
  ],
605
  "requirements":[
606
+ "spacy-transformers>=1.1.2,<1.2.0",
607
+ "sentencepiece>=0.1.91,!=0.1.92",
608
  "protobuf"
609
  ]
610
  }
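The new meta.json narrows `spacy_version` to `>=3.2.0,<3.3.0` and relaxes the `sentencepiece` pin to `>=0.1.91,!=0.1.92`. A simplified sketch of how such comma-separated specifiers can be evaluated; the helpers `parse_version` and `satisfies` are hypothetical, handle plain dotted integers only, and are not a substitute for a full PEP 440 implementation:

```python
from typing import Tuple

# Two-character operators come first so ">=" is matched before ">".
OPERATORS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.2.0' into (3, 2, 0); pre-release tags are not supported."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Return True if `version` meets every comma-separated clause in `spec`."""
    installed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                if not compare(installed, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("3.2.1", ">=3.2.0,<3.3.0"))        # spaCy range from meta.json
print(satisfies("0.1.92", ">=0.1.91,!=0.1.92"))    # excluded sentencepiece release
```

Versions are compared as integer tuples, so inputs with differing numbers of components (for example `3.2` against `3.2.0`) are not normalized here.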
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "POS=PROPN":"",
4
  "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":"Gender=Fem|Number=Sing|PronType=Dem",
@@ -153,7 +154,6 @@
153
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":"Gender=Fem|Number=Plur|PronType=Int",
154
  "POS=DET":"",
155
  "Gender=Masc|Number=Plur|POS=PRON":"Gender=Masc|Number=Plur",
156
- "POS=PART":"",
157
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
158
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":"Mood=Ind|Person=3|VerbForm=Fin",
159
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
@@ -193,7 +193,6 @@
193
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
194
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
195
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
196
- "Gender=Fem|POS=ADV":"Gender=Fem",
197
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=2|Tense=Imp|VerbForm=Fin",
198
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
199
  "Gender=Fem|Number=Plur|POS=PROPN":"Gender=Fem|Number=Plur",
@@ -353,7 +352,6 @@
353
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":90,
354
  "POS=DET":90,
355
  "Gender=Masc|Number=Plur|POS=PRON":95,
356
- "POS=PART":94,
357
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
358
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":100,
359
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
@@ -393,10 +391,10 @@
393
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
394
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
395
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":100,
396
- "Gender=Fem|POS=ADV":86,
397
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":87,
398
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
399
  "Gender=Fem|Number=Plur|POS=PROPN":96,
400
  "Gender=Masc|NumType=Card|POS=NUM":93
401
- }
 
402
  }
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "POS=PROPN":"",
5
  "Gender=Fem|Number=Sing|POS=DET|PronType=Dem":"Gender=Fem|Number=Sing|PronType=Dem",
154
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":"Gender=Fem|Number=Plur|PronType=Int",
155
  "POS=DET":"",
156
  "Gender=Masc|Number=Plur|POS=PRON":"Gender=Masc|Number=Plur",
 
157
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin",
158
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":"Mood=Ind|Person=3|VerbForm=Fin",
159
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":"Number=Sing|Tense=Past|VerbForm=Part|Voice=Pass",
193
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":"Mood=Imp|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin",
194
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":"Mood=Sub|Number=Plur|Person=2|Tense=Pres|VerbForm=Fin",
195
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Plur|Person=2|Tense=Imp|VerbForm=Fin",
 
196
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":"Mood=Ind|Number=Sing|Person=2|Tense=Imp|VerbForm=Fin",
197
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":"Number=Plur|Tense=Past|VerbForm=Part",
198
  "Gender=Fem|Number=Plur|POS=PROPN":"Gender=Fem|Number=Plur",
352
  "Gender=Fem|Number=Plur|POS=DET|PronType=Int":90,
353
  "POS=DET":90,
354
  "Gender=Masc|Number=Plur|POS=PRON":95,
 
355
  "Mood=Sub|Number=Plur|POS=AUX|Person=3|Tense=Pres|VerbForm=Fin":87,
356
  "Mood=Ind|POS=VERB|Person=3|VerbForm=Fin":100,
357
  "Number=Sing|POS=VERB|Tense=Past|VerbForm=Part|Voice=Pass":100,
391
  "Mood=Imp|Number=Plur|POS=VERB|Person=1|Tense=Pres|VerbForm=Fin":100,
392
  "Mood=Sub|Number=Plur|POS=AUX|Person=2|Tense=Pres|VerbForm=Fin":87,
393
  "Mood=Ind|Number=Plur|POS=VERB|Person=2|Tense=Imp|VerbForm=Fin":100,
 
394
  "Mood=Ind|Number=Sing|POS=AUX|Person=2|Tense=Imp|VerbForm=Fin":87,
395
  "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part":100,
396
  "Gender=Fem|Number=Plur|POS=PROPN":96,
397
  "Gender=Masc|NumType=Card|POS=NUM":93
398
+ },
399
+ "overwrite":true
400
  }
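In `labels_morph`, each key is a combined label that includes a `POS=` field, and each value is the same label with that field removed. A small sketch of that split, checked against two entries shown in the diff above; the function name `split_label` is hypothetical and this is not spaCy's own implementation:

```python
from typing import Tuple


def split_label(label: str) -> Tuple[str, str]:
    """Split 'Feat=Val|POS=X|...' into (POS tag, remaining morph features)."""
    pos = ""
    feats = []
    for field in label.split("|"):
        name, _, value = field.partition("=")
        if name == "POS":
            pos = value
        else:
            feats.append(field)
    return pos, "|".join(feats)


# Key/value pairs copied from morphologizer/cfg.
assert split_label("POS=PROPN") == ("PROPN", "")
assert split_label("Gender=Fem|Number=Sing|POS=DET|PronType=Dem") == (
    "DET",
    "Gender=Fem|Number=Sing|PronType=Dem",
)
```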
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
parser/moves CHANGED
@@ -1 +1 @@
1
- ��moves��{"0":{"":25245},"1":{"":21689},"2":{"case":7257,"det":6061,"nsubj":1972,"punct":1645,"advmod":1210,"cc":1205,"mark":1051,"aux:tense":673,"amod":662,"nummod":595,"aux:pass":544,"obl:mod":483,"nsubj:pass":425,"cop":365,"expl:comp":204,"obj":170,"expl:subj":163,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":5132,"punct":3954,"amod":2083,"conj":1517,"obj":1410,"obl:mod":1184,"obl:arg":1079,"acl":782,"xcomp":739,"flat:name":657,"advmod":562,"fixed":418,"appos":408,"acl:relcl":365,"advcl":306,"ccomp":238,"obl:agent":206,"dep":138,"nummod":117,"parataxis":92,"nsubj":75,"flat:foreign":63},"4":{"ROOT":2219}}�cfg��neg_key�
1
+ ��moves��{"0":{"":25253},"1":{"":21681},"2":{"case":7257,"det":6061,"nsubj":1982,"punct":1645,"advmod":1210,"cc":1205,"mark":1051,"aux:tense":673,"amod":662,"nummod":595,"aux:pass":544,"obl:mod":483,"nsubj:pass":425,"cop":365,"expl:comp":204,"obj":170,"expl:subj":164,"iobj":139,"advcl":123,"nmod":92,"expl:pass":40,"vocative":35,"dep":0},"3":{"nmod":5132,"punct":3954,"amod":2083,"conj":1517,"obj":1410,"obl:mod":1184,"obl:arg":1079,"acl":782,"xcomp":739,"flat:name":657,"advmod":562,"fixed":409,"appos":408,"acl:relcl":365,"advcl":306,"ccomp":238,"obl:agent":206,"dep":138,"nummod":117,"parataxis":92,"nsubj":75,"flat:foreign":63},"4":{"ROOT":2219}}�cfg��neg_key�
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
transformer/{model/pytorch_model.bin → model} RENAMED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e60d9b9cd129f8665a3975eed63b67730d901c0c14567d3371674736d38a2f3a
3
- size 442566449
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:12846daf03978627be431b4b4f08f9b5554803fad9cc07794ad72a49a864b73a
3
+ size 444773224
transformer/model/config.json DELETED
@@ -1,27 +0,0 @@
1
- {
2
- "_name_or_path": "/mnt/scratch/tmp/fr_dep_news_trf/b8b49951-9975-442a-8cca-bd0ecc66506e/training/syntax/model-best/transformer/model",
3
- "architectures": [
4
- "CamembertForMaskedLM"
5
- ],
6
- "attention_probs_dropout_prob": 0.1,
7
- "bos_token_id": 5,
8
- "eos_token_id": 6,
9
- "gradient_checkpointing": false,
10
- "hidden_act": "gelu",
11
- "hidden_dropout_prob": 0.1,
12
- "hidden_size": 768,
13
- "initializer_range": 0.02,
14
- "intermediate_size": 3072,
15
- "layer_norm_eps": 1e-05,
16
- "max_position_embeddings": 514,
17
- "model_type": "camembert",
18
- "num_attention_heads": 12,
19
- "num_hidden_layers": 12,
20
- "output_past": true,
21
- "pad_token_id": 1,
22
- "position_embedding_type": "absolute",
23
- "transformers_version": "4.6.1",
24
- "type_vocab_size": 1,
25
- "use_cache": true,
26
- "vocab_size": 32005
27
- }
transformer/model/sentencepiece.bpe.model DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:988bc5a00281c6d210a5d34bd143d0363741a432fefe741bf71e61b1869d4314
3
- size 810912
transformer/model/special_tokens_map.json DELETED
@@ -1 +0,0 @@
1
- {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": false}, "additional_special_tokens": ["<s>NOTUSED", "</s>NOTUSED"]}
 
transformer/model/tokenizer.json DELETED
The diff for this file is too large to render. See raw diff
transformer/model/tokenizer_config.json DELETED
@@ -1 +0,0 @@
1
- {"bos_token": "<s>", "eos_token": "</s>", "sep_token": "</s>", "cls_token": "<s>", "unk_token": "<unk>", "pad_token": "<pad>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "additional_special_tokens": ["<s>NOTUSED", "</s>NOTUSED"], "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "/mnt/scratch/tmp/fr_dep_news_trf/b8b49951-9975-442a-8cca-bd0ecc66506e/training/syntax/model-best/transformer/model"}
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b80c7c43be06141e88658ca4c29aacaf6f4e43ae4716328b92c0be72bb279b09
3
- size 228520
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fa26197ca28d4b7e425ab50aeb529b2be869ef9c151d20f88b642d52716ec142
3
+ size 228652
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "mode":"default"
3
+ }