EC2 Default User committed on
Commit 1d03e65
1 Parent(s): 9a8869b

Update spaCy pipeline

.gitattributes CHANGED
@@ -19,3 +19,4 @@
  *strings.json filter=lfs diff=lfs merge=lfs -text
  vectors filter=lfs diff=lfs merge=lfs -text
  model filter=lfs diff=lfs merge=lfs -text
+ *key2row filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -14,47 +14,41 @@ model-index:
  metrics:
  - name: NER Precision
  type: precision
- value: 0.8574246409
+ value: 0.8602117695
  - name: NER Recall
  type: recall
- value: 0.8490084135
+ value: 0.8462540064
  - name: NER F Score
  type: f_score
- value: 0.8531957725
+ value: 0.8531758053
  - task:
- name: POS
+ name: TAG
  type: token-classification
  metrics:
- - name: POS Accuracy
+ - name: TAG (XPOS) Accuracy
  type: accuracy
- value: 0.9741780493
+ value: 0.9738145328
  - task:
- name: SENTER
+ name: UNLABELED_DEPENDENCIES
  type: token-classification
  metrics:
- - name: SENTER Precision
- type: precision
- value: 0.9179358172
- - name: SENTER Recall
- type: recall
- value: 0.8906260307
- - name: SENTER F Score
+ - name: Unlabeled Attachment Score (UAS)
  type: f_score
- value: 0.9040747313
+ value: 0.9188508811
  - task:
- name: UNLABELED_DEPENDENCIES
+ name: LABELED_DEPENDENCIES
  type: token-classification
  metrics:
- - name: Unlabeled Dependencies Accuracy
- type: accuracy
- value: 0.9200593914
+ - name: Labeled Attachment Score (LAS)
+ type: f_score
+ value: 0.9008477499
  - task:
- name: LABELED_DEPENDENCIES
+ name: SENTS
  type: token-classification
  metrics:
- - name: Labeled Dependencies Accuracy
- type: accuracy
- value: 0.9200593914
+ - name: Sentences F-Score
+ type: f_score
+ value: 0.9033533215
  ---
  ### Details: https://spacy.io/models/en#en_core_web_lg

@@ -63,11 +57,11 @@ English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter,
  | Feature | Description |
  | --- | --- |
  | **Name** | `en_core_web_lg` |
- | **Version** | `3.2.0` |
- | **spaCy** | `>=3.2.0,<3.3.0` |
+ | **Version** | `3.3.0` |
+ | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
  | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
  | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
- | **Vectors** | 684830 keys, 684830 unique vectors (300 dimensions) |
+ | **Vectors** | 684830 keys, 342918 unique vectors (300 dimensions) |
  | **Sources** | [OntoNotes 5](https://catalog.ldc.upenn.edu/LDC2013T19) (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston)<br />[ClearNLP Constituent-to-Dependency Conversion](https://github.com/clir/clearnlp-guidelines/blob/master/md/components/dependency_conversion.md) (Emory University)<br />[WordNet 3.0](https://wordnet.princeton.edu/) (Princeton University)<br />[GloVe Common Crawl](https://nlp.stanford.edu/projects/glove/) (Jeffrey Pennington, Richard Socher, and Christopher D. Manning) |
  | **License** | `MIT` |
  | **Author** | [Explosion](https://explosion.ai) |
@@ -76,13 +70,12 @@ English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter,

  <details>

- <summary>View label scheme (114 labels for 4 components)</summary>
+ <summary>View label scheme (112 labels for 3 components)</summary>

  | Component | Labels |
  | --- | --- |
  | **`tagger`** | `$`, `''`, `,`, `-LRB-`, `-RRB-`, `.`, `:`, `ADD`, `AFX`, `CC`, `CD`, `DT`, `EX`, `FW`, `HYPH`, `IN`, `JJ`, `JJR`, `JJS`, `LS`, `MD`, `NFP`, `NN`, `NNP`, `NNPS`, `NNS`, `PDT`, `POS`, `PRP`, `PRP$`, `RB`, `RBR`, `RBS`, `RP`, `SYM`, `TO`, `UH`, `VB`, `VBD`, `VBG`, `VBN`, `VBP`, `VBZ`, `WDT`, `WP`, `WP$`, `WRB`, `XX`, ```` |
  | **`parser`** | `ROOT`, `acl`, `acomp`, `advcl`, `advmod`, `agent`, `amod`, `appos`, `attr`, `aux`, `auxpass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `csubj`, `csubjpass`, `dative`, `dep`, `det`, `dobj`, `expl`, `intj`, `mark`, `meta`, `neg`, `nmod`, `npadvmod`, `nsubj`, `nsubjpass`, `nummod`, `oprd`, `parataxis`, `pcomp`, `pobj`, `poss`, `preconj`, `predet`, `prep`, `prt`, `punct`, `quantmod`, `relcl`, `xcomp` |
- | **`senter`** | `I`, `S` |
  | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |

  </details>
@@ -95,12 +88,12 @@ English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter,
  | `TOKEN_P` | 99.57 |
  | `TOKEN_R` | 99.58 |
  | `TOKEN_F` | 99.57 |
- | `TAG_ACC` | 97.42 |
- | `SENTS_P` | 91.79 |
- | `SENTS_R` | 89.06 |
- | `SENTS_F` | 90.41 |
- | `DEP_UAS` | 92.01 |
- | `DEP_LAS` | 90.22 |
- | `ENTS_P` | 85.74 |
- | `ENTS_R` | 84.90 |
+ | `TAG_ACC` | 97.38 |
+ | `SENTS_P` | 91.77 |
+ | `SENTS_R` | 88.94 |
+ | `SENTS_F` | 90.34 |
+ | `DEP_UAS` | 91.89 |
+ | `DEP_LAS` | 90.08 |
+ | `ENTS_P` | 85.02 |
+ | `ENTS_R` | 84.63 |
  | `ENTS_F` | 85.32 |
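Reviewer note: the precision, recall and F values in the README hunk can be sanity-checked against each other. The sketch below assumes the reported F scores are the plain harmonic mean of precision and recall (the usual F1 definition). The helper names `f_score` and `max_deviation` are illustrative and not part of the commit. The NER figures are copied from the added README front matter, and the sentence figures from the added `accuracy.json` lines of this same commit.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the added lines of this commit.
REPORTED = {
    "ents": (0.8602117695, 0.8462540064, 0.8531758053),
    "sents": (0.9177103185, 0.8894386173, 0.9033533215),
}


def max_deviation(reported: dict) -> float:
    """Largest absolute gap between the recomputed and the reported F score."""
    return max(abs(f_score(p, r) - f) for p, r, f in reported.values())


if __name__ == "__main__":
    for name, (p, r, f) in REPORTED.items():
        print(f"{name}: recomputed={f_score(p, r):.10f} reported={f:.10f}")
```

If the assumption holds, the recomputed and reported values should agree to within rounding of the ten printed decimals.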
accuracy.json CHANGED
@@ -1,229 +1,229 @@
1
  {
2
- "token_acc": 0.9993053983,
3
- "token_p": 0.9956742163,
4
- "token_r": 0.9957505887,
5
- "token_f": 0.9957124011,
6
- "tag_acc": 0.9741780493,
7
- "sents_p": 0.9179358172,
8
- "sents_r": 0.8906260307,
9
- "sents_f": 0.9040747313,
10
- "dep_uas": 0.9200593914,
11
- "dep_las": 0.9021556352,
12
  "dep_las_per_type": {
13
  "prep": {
14
- "p": 0.8578239976,
15
- "r": 0.8669702144,
16
- "f": 0.8623728558
17
  },
18
  "det": {
19
- "p": 0.9798012706,
20
- "r": 0.9809997554,
21
- "f": 0.9804001467
22
  },
23
  "pobj": {
24
- "p": 0.9617211838,
25
- "r": 0.9698409582,
26
- "f": 0.9657640043
27
  },
28
  "nsubj": {
29
- "p": 0.9608471258,
30
- "r": 0.9461555312,
31
- "f": 0.9534447363
32
  },
33
  "aux": {
34
- "p": 0.9807231056,
35
- "r": 0.9828184813,
36
- "f": 0.9817696754
37
  },
38
  "advmod": {
39
- "p": 0.8600169779,
40
- "r": 0.8523472993,
41
- "f": 0.8561649624
42
  },
43
  "relcl": {
44
- "p": 0.755259467,
45
- "r": 0.7815674891,
46
- "f": 0.7681883024
47
  },
48
  "root": {
49
- "p": 0.9196325281,
50
- "r": 0.8914836071,
51
- "f": 0.905339318
52
  },
53
  "xcomp": {
54
- "p": 0.8886906885,
55
- "r": 0.8941134243,
56
- "f": 0.8913938093
57
  },
58
  "amod": {
59
- "p": 0.9216490817,
60
- "r": 0.9168124393,
61
- "f": 0.9192243983
62
  },
63
  "compound": {
64
- "p": 0.9188224309,
65
- "r": 0.9316662954,
66
- "f": 0.9251997898
67
  },
68
  "poss": {
69
- "p": 0.9760755931,
70
- "r": 0.9772544283,
71
- "f": 0.976664655
72
  },
73
  "ccomp": {
74
- "p": 0.7797340326,
75
- "r": 0.8478615071,
76
- "f": 0.8123719387
77
  },
78
  "attr": {
79
- "p": 0.8931845357,
80
- "r": 0.9423885618,
81
- "f": 0.9171270718
82
  },
83
  "case": {
84
- "p": 0.9782823297,
85
- "r": 0.991991992,
86
- "f": 0.9850894632
87
  },
88
  "mark": {
89
- "p": 0.9105669417,
90
- "r": 0.9064652888,
91
- "f": 0.9085114859
92
  },
93
  "intj": {
94
- "p": 0.6737089202,
95
- "r": 0.6307692308,
96
- "f": 0.6515323496
97
  },
98
  "advcl": {
99
- "p": 0.67003282,
100
- "r": 0.6683455049,
101
- "f": 0.6691880988
102
  },
103
  "cc": {
104
- "p": 0.8372232916,
105
- "r": 0.8323167085,
106
- "f": 0.8347627901
107
  },
108
  "neg": {
109
- "p": 0.9408548708,
110
- "r": 0.9498243853,
111
- "f": 0.9453183521
112
  },
113
  "conj": {
114
- "p": 0.7653624433,
115
- "r": 0.7854984894,
116
- "f": 0.7752997453
117
  },
118
  "nsubjpass": {
119
- "p": 0.9327377824,
120
- "r": 0.9102564103,
121
- "f": 0.9213599792
122
  },
123
  "auxpass": {
124
- "p": 0.9543624161,
125
- "r": 0.9717539863,
126
- "f": 0.962979684
127
  },
128
  "dobj": {
129
- "p": 0.9222826087,
130
- "r": 0.9466092916,
131
- "f": 0.9342876244
132
  },
133
  "nummod": {
134
- "p": 0.9430395913,
135
- "r": 0.9323232323,
136
- "f": 0.9376507937
137
  },
138
  "npadvmod": {
139
- "p": 0.7866254349,
140
- "r": 0.7229129663,
141
- "f": 0.7534246575
142
  },
143
  "prt": {
144
- "p": 0.8190082645,
145
- "r": 0.8879928315,
146
- "f": 0.8521066208
147
  },
148
  "pcomp": {
149
- "p": 0.879020979,
150
- "r": 0.8802521008,
151
- "f": 0.8796361092
152
  },
153
  "expl": {
154
- "p": 0.9809725159,
155
- "r": 0.9935760171,
156
- "f": 0.9872340426
157
  },
158
  "acl": {
159
- "p": 0.7443997702,
160
- "r": 0.7070376432,
161
- "f": 0.7252378288
162
  },
163
  "agent": {
164
- "p": 0.8928571429,
165
- "r": 0.9408602151,
166
- "f": 0.9162303665
167
  },
168
  "dative": {
169
- "p": 0.7729591837,
170
- "r": 0.6949541284,
171
- "f": 0.731884058
172
  },
173
  "acomp": {
174
- "p": 0.9102505695,
175
- "r": 0.906122449,
176
- "f": 0.9081818182
177
  },
178
  "dep": {
179
- "p": 0.4491525424,
180
- "r": 0.1720779221,
181
- "f": 0.2488262911
182
  },
183
  "csubj": {
184
- "p": 0.7243589744,
185
- "r": 0.6686390533,
186
- "f": 0.6953846154
187
  },
188
  "quantmod": {
189
- "p": 0.8686779059,
190
- "r": 0.7952883834,
191
- "f": 0.8303647159
192
  },
193
  "nmod": {
194
- "p": 0.76,
195
- "r": 0.5789152956,
196
- "f": 0.6572120374
197
  },
198
  "appos": {
199
- "p": 0.7035040431,
200
- "r": 0.6793926247,
201
- "f": 0.6912381373
202
  },
203
  "predet": {
204
- "p": 0.8300395257,
205
- "r": 0.9012875536,
206
- "f": 0.8641975309
207
  },
208
  "preconj": {
209
- "p": 0.5784313725,
210
- "r": 0.6860465116,
211
- "f": 0.6276595745
212
  },
213
  "oprd": {
214
- "p": 0.8379310345,
215
- "r": 0.7253731343,
216
- "f": 0.7776
217
  },
218
  "parataxis": {
219
- "p": 0.6312849162,
220
- "r": 0.4902386117,
221
- "f": 0.5518925519
222
  },
223
  "meta": {
224
- "p": 0.7647058824,
225
- "r": 0.25,
226
- "f": 0.3768115942
227
  },
228
  "csubjpass": {
229
  "p": 0.5555555556,
@@ -231,100 +231,100 @@
231
  "f": 0.6666666667
232
  }
233
  },
234
- "ents_p": 0.8574246409,
235
- "ents_r": 0.8490084135,
236
- "ents_f": 0.8531957725,
237
  "ents_per_type": {
238
  "DATE": {
239
- "p": 0.8695102686,
240
- "r": 0.8736507937,
241
- "f": 0.8715756136
242
  },
243
  "GPE": {
244
- "p": 0.9231641622,
245
- "r": 0.9082287308,
246
- "f": 0.9156355456
247
  },
248
  "ORDINAL": {
249
- "p": 0.7971428571,
250
- "r": 0.8664596273,
251
- "f": 0.8303571429
252
  },
253
  "ORG": {
254
- "p": 0.8194444444,
255
- "r": 0.8290031813,
256
- "f": 0.8241960991
257
- },
258
- "QUANTITY": {
259
- "p": 0.7959183673,
260
- "r": 0.6428571429,
261
- "f": 0.7112462006
262
  },
263
  "CARDINAL": {
264
- "p": 0.8221709007,
265
- "r": 0.8466111772,
266
- "f": 0.834212068
267
  },
268
  "PERSON": {
269
- "p": 0.8823895457,
270
- "r": 0.9255874674,
271
- "f": 0.9034724435
272
  },
273
  "NORP": {
274
- "p": 0.9027888446,
275
- "r": 0.9064,
276
- "f": 0.9045908184
277
  },
278
  "LOC": {
279
- "p": 0.7185185185,
280
- "r": 0.6178343949,
281
- "f": 0.6643835616
282
  },
283
  "FAC": {
284
- "p": 0.4263565891,
285
- "r": 0.4230769231,
286
- "f": 0.4247104247
287
  },
288
  "TIME": {
289
- "p": 0.7396825397,
290
- "r": 0.6812865497,
291
- "f": 0.7092846271
292
  },
293
- "PRODUCT": {
294
- "p": 0.6022727273,
295
- "r": 0.2511848341,
296
- "f": 0.3545150502
297
  },
298
  "EVENT": {
299
- "p": 0.5882352941,
300
- "r": 0.2873563218,
301
- "f": 0.3861003861
302
  },
303
  "WORK_OF_ART": {
304
- "p": 0.4692307692,
305
- "r": 0.3144329897,
306
- "f": 0.3765432099
307
- },
308
- "LAW": {
309
- "p": 0.5272727273,
310
- "r": 0.453125,
311
- "f": 0.487394958
312
  },
313
  "MONEY": {
314
- "p": 0.8990498812,
315
- "r": 0.893742621,
316
- "f": 0.8963883955
 
 
 
 
 
317
  },
318
  "PERCENT": {
319
- "p": 0.9202551834,
320
  "r": 0.8836140888,
321
- "f": 0.9015625
322
  },
323
  "LANGUAGE": {
324
- "p": 0.8,
325
- "r": 0.625,
326
- "f": 0.701754386
 
 
 
 
 
327
  }
328
  },
329
- "speed": 7471.5995598921
330
  }
 
1
  {
2
+ "token_acc": 0.9993092439,
3
+ "token_p": 0.9956819193,
4
+ "token_r": 0.9957659295,
5
+ "token_f": 0.9957239226,
6
+ "tag_acc": 0.9738145328,
7
+ "sents_p": 0.9177103185,
8
+ "sents_r": 0.8894386173,
9
+ "sents_f": 0.9033533215,
10
+ "dep_uas": 0.9188508811,
11
+ "dep_las": 0.9008477499,
12
  "dep_las_per_type": {
13
  "prep": {
14
+ "p": 0.8537864878,
15
+ "r": 0.8645418327,
16
+ "f": 0.8591305004
17
  },
18
  "det": {
19
+ "p": 0.9790682522,
20
+ "r": 0.9802658403,
21
+ "f": 0.9796666802
22
  },
23
  "pobj": {
24
+ "p": 0.9633579437,
25
+ "r": 0.9684272531,
26
+ "f": 0.965885947
27
  },
28
  "nsubj": {
29
+ "p": 0.9564757243,
30
+ "r": 0.9502738226,
31
+ "f": 0.9533646873
32
  },
33
  "aux": {
34
+ "p": 0.9809760868,
35
+ "r": 0.9823733642,
36
+ "f": 0.9816742283
37
  },
38
  "advmod": {
39
+ "p": 0.8550492715,
40
+ "r": 0.8541140838,
41
+ "f": 0.8545814218
42
  },
43
  "relcl": {
44
+ "p": 0.7709000356,
45
+ "r": 0.7862844702,
46
+ "f": 0.7785162565
47
  },
48
  "root": {
49
+ "p": 0.9183576195,
50
+ "r": 0.889702487,
51
+ "f": 0.9038029821
52
  },
53
  "xcomp": {
54
+ "p": 0.882620883,
55
+ "r": 0.9041636755,
56
+ "f": 0.8932624113
57
  },
58
  "amod": {
59
+ "p": 0.9195970101,
60
+ "r": 0.9166180758,
61
+ "f": 0.9181051265
62
  },
63
  "compound": {
64
+ "p": 0.9193539526,
65
+ "r": 0.9320004455,
66
+ "f": 0.9256340054
67
  },
68
  "poss": {
69
+ "p": 0.9711422846,
70
+ "r": 0.9754428341,
71
+ "f": 0.9732878088
72
  },
73
  "ccomp": {
74
+ "p": 0.7727868239,
75
+ "r": 0.8409368635,
76
+ "f": 0.8054228031
77
  },
78
  "attr": {
79
+ "p": 0.8955042527,
80
+ "r": 0.9297729184,
81
+ "f": 0.912316897
82
  },
83
  "case": {
84
+ "p": 0.9758144126,
85
+ "r": 0.9894894895,
86
+ "f": 0.9826043738
87
  },
88
  "mark": {
89
+ "p": 0.9062829989,
90
+ "r": 0.9096449391,
91
+ "f": 0.9079608569
92
  },
93
  "intj": {
94
+ "p": 0.6653322658,
95
+ "r": 0.6087912088,
96
+ "f": 0.635807192
97
  },
98
  "advcl": {
99
+ "p": 0.6779661017,
100
+ "r": 0.6648199446,
101
+ "f": 0.6713286713
102
  },
103
  "cc": {
104
+ "p": 0.8292624233,
105
+ "r": 0.824303313,
106
+ "f": 0.8267754319
107
  },
108
  "neg": {
109
+ "p": 0.9393336648,
110
+ "r": 0.9478173608,
111
+ "f": 0.9435564436
112
  },
113
  "conj": {
114
+ "p": 0.763665795,
115
+ "r": 0.7720292044,
116
+ "f": 0.7678247261
117
  },
118
  "nsubjpass": {
119
+ "p": 0.9263266358,
120
+ "r": 0.9220512821,
121
+ "f": 0.9241840144
122
  },
123
  "auxpass": {
124
+ "p": 0.9499329459,
125
+ "r": 0.9681093394,
126
+ "f": 0.9589350181
127
  },
128
  "dobj": {
129
+ "p": 0.926432648,
130
+ "r": 0.9442983505,
131
+ "f": 0.9352801894
132
  },
133
  "nummod": {
134
+ "p": 0.9362134689,
135
+ "r": 0.9303030303,
136
+ "f": 0.9332488917
137
  },
138
  "npadvmod": {
139
+ "p": 0.7723030982,
140
+ "r": 0.734991119,
141
+ "f": 0.753185293
142
  },
143
  "prt": {
144
+ "p": 0.8160066007,
145
+ "r": 0.8862007168,
146
+ "f": 0.8496563574
147
  },
148
  "pcomp": {
149
+ "p": 0.8800841515,
150
+ "r": 0.8788515406,
151
+ "f": 0.8794674142
152
  },
153
  "expl": {
154
+ "p": 0.9809322034,
155
+ "r": 0.9914346895,
156
+ "f": 0.9861554846
157
  },
158
  "acl": {
159
+ "p": 0.7556456283,
160
+ "r": 0.7119476268,
161
+ "f": 0.7331460674
162
  },
163
  "agent": {
164
+ "p": 0.8991452991,
165
+ "r": 0.9426523297,
166
+ "f": 0.9203849519
167
  },
168
  "dative": {
169
+ "p": 0.810298103,
170
+ "r": 0.6857798165,
171
+ "f": 0.7428571429
172
  },
173
  "acomp": {
174
+ "p": 0.9111721612,
175
+ "r": 0.9024943311,
176
+ "f": 0.9068124858
177
  },
178
  "dep": {
179
+ "p": 0.3930131004,
180
+ "r": 0.1461038961,
181
+ "f": 0.2130177515
182
  },
183
  "csubj": {
184
+ "p": 0.7068965517,
185
+ "r": 0.7278106509,
186
+ "f": 0.7172011662
187
  },
188
  "quantmod": {
189
+ "p": 0.8746594005,
190
+ "r": 0.7822908205,
191
+ "f": 0.8259005146
192
  },
193
  "nmod": {
194
+ "p": 0.7651217596,
195
+ "r": 0.5935405241,
196
+ "f": 0.6684969115
197
  },
198
  "appos": {
199
+ "p": 0.6994459834,
200
+ "r": 0.6572668113,
201
+ "f": 0.6777007381
202
  },
203
  "predet": {
204
+ "p": 0.8380566802,
205
+ "r": 0.8884120172,
206
+ "f": 0.8625
207
  },
208
  "preconj": {
209
+ "p": 0.537037037,
210
+ "r": 0.6744186047,
211
+ "f": 0.5979381443
212
  },
213
  "oprd": {
214
+ "p": 0.8477508651,
215
+ "r": 0.7313432836,
216
+ "f": 0.7852564103
217
  },
218
  "parataxis": {
219
+ "p": 0.6187845304,
220
+ "r": 0.4859002169,
221
+ "f": 0.5443499392
222
  },
223
  "meta": {
224
+ "p": 1.0,
225
+ "r": 0.3269230769,
226
+ "f": 0.4927536232
227
  },
228
  "csubjpass": {
229
  "p": 0.5555555556,
 
231
  "f": 0.6666666667
232
  }
233
  },
234
+ "ents_p": 0.8602117695,
235
+ "ents_r": 0.8462540064,
236
+ "ents_f": 0.8531758053,
237
  "ents_per_type": {
238
  "DATE": {
239
+ "p": 0.872593068,
240
+ "r": 0.8631746032,
241
+ "f": 0.8678582828
242
  },
243
  "GPE": {
244
+ "p": 0.9257256688,
245
+ "r": 0.9073919107,
246
+ "f": 0.916467108
247
  },
248
  "ORDINAL": {
249
+ "p": 0.787965616,
250
+ "r": 0.8540372671,
251
+ "f": 0.8196721311
252
  },
253
  "ORG": {
254
+ "p": 0.8203309693,
255
+ "r": 0.8279427359,
256
+ "f": 0.8241192769
 
 
 
 
 
257
  },
258
  "CARDINAL": {
259
+ "p": 0.8304398148,
260
+ "r": 0.8531510107,
261
+ "f": 0.8416422287
262
  },
263
  "PERSON": {
264
+ "p": 0.8953229399,
265
+ "r": 0.9184073107,
266
+ "f": 0.9067182214
267
  },
268
  "NORP": {
269
+ "p": 0.8794048551,
270
+ "r": 0.8984,
271
+ "f": 0.8888009497
272
  },
273
  "LOC": {
274
+ "p": 0.7147766323,
275
+ "r": 0.6624203822,
276
+ "f": 0.6876033058
277
  },
278
  "FAC": {
279
+ "p": 0.3949579832,
280
+ "r": 0.3615384615,
281
+ "f": 0.3775100402
282
  },
283
  "TIME": {
284
+ "p": 0.71875,
285
+ "r": 0.6725146199,
286
+ "f": 0.6948640483
287
  },
288
+ "QUANTITY": {
289
+ "p": 0.8014184397,
290
+ "r": 0.6208791209,
291
+ "f": 0.6996904025
292
  },
293
  "EVENT": {
294
+ "p": 0.6354166667,
295
+ "r": 0.3505747126,
296
+ "f": 0.4518518519
297
  },
298
  "WORK_OF_ART": {
299
+ "p": 0.5,
300
+ "r": 0.3092783505,
301
+ "f": 0.3821656051
 
 
 
 
 
302
  },
303
  "MONEY": {
304
+ "p": 0.9039145907,
305
+ "r": 0.8996458087,
306
+ "f": 0.9017751479
307
+ },
308
+ "LAW": {
309
+ "p": 0.6428571429,
310
+ "r": 0.421875,
311
+ "f": 0.5094339623
312
  },
313
  "PERCENT": {
314
+ "p": 0.9187898089,
315
  "r": 0.8836140888,
316
+ "f": 0.9008587041
317
  },
318
  "LANGUAGE": {
319
+ "p": 0.75,
320
+ "r": 0.65625,
321
+ "f": 0.7
322
+ },
323
+ "PRODUCT": {
324
+ "p": 0.6097560976,
325
+ "r": 0.2369668246,
326
+ "f": 0.3412969283
327
  }
328
  },
329
+ "speed": 7281.6726563626
330
  }
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -55,7 +55,7 @@ nO = null
  @architectures = "spacy.MultiHashEmbed.v2"
  width = 96
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
- rows = [5000,2500,2500,2500,100]
+ rows = [5000,1000,2500,2500,50]
  include_static_vectors = true

  [components.ner.model.tok2vec.encode]
@@ -93,8 +93,9 @@ overwrite = false
  scorer = {"@scorers":"spacy.senter_scorer.v1"}

  [components.senter.model]
- @architectures = "spacy.Tagger.v1"
+ @architectures = "spacy.Tagger.v2"
  nO = null
+ normalize = false

  [components.senter.model.tok2vec]
  @architectures = "spacy.Tok2Vec.v2"
@@ -115,12 +116,14 @@ maxout_pieces = 2

  [components.tagger]
  factory = "tagger"
+ neg_prefix = "!"
  overwrite = false
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}

  [components.tagger.model]
- @architectures = "spacy.Tagger.v1"
+ @architectures = "spacy.Tagger.v2"
  nO = null
+ normalize = false

  [components.tagger.model.tok2vec]
  @architectures = "spacy.Tok2VecListener.v1"
@@ -137,7 +140,7 @@ factory = "tok2vec"
  @architectures = "spacy.MultiHashEmbed.v2"
  width = ${components.tok2vec.model.encode:width}
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
- rows = [5000,2500,2500,2500,100]
+ rows = [5000,1000,2500,2500,50]
  include_static_vectors = true

  [components.tok2vec.model.encode]
@@ -174,7 +177,7 @@ dropout = 0.1
  accumulate_gradient = 1
  patience = 5000
  max_epochs = 0
- max_steps = 0
+ max_steps = 100000
  eval_frequency = 1000
  frozen_components = []
  before_to_disk = null
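Reviewer note: the `rows` change is easier to read when each entry is paired with its attribute from `attrs`. The following is a minimal sketch using only the Python standard library, not spaCy's own config loader. It assumes the values are JSON-compatible, and the section name `components.tok2vec.model.embed` is an assumption, since the hunk does not show the enclosing section header.

```python
import configparser
import json

# Fragment of the new config.cfg; the section header is assumed, not shown in the hunk.
NEW_FRAGMENT = """
[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
rows = [5000,1000,2500,2500,50]
include_static_vectors = true
"""

# Value of `rows` on the removed line of the same hunk.
OLD_ROWS = [5000, 2500, 2500, 2500, 100]


def load_section(text: str, section: str) -> dict:
    """Parse one section and decode each value as JSON."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written
    parser.read_string(text)
    return {key: json.loads(value) for key, value in parser[section].items()}


embed = load_section(NEW_FRAGMENT, "components.tok2vec.model.embed")

# Map each attribute whose row count changed to (old, new).
changed = {
    attr: (old, new)
    for attr, old, new in zip(embed["attrs"], OLD_ROWS, embed["rows"])
    if old != new
}

if __name__ == "__main__":
    print(changed)
```

Read this way, only the `PREFIX` and `SPACY` tables change size; `NORM`, `SUFFIX` and `SHAPE` keep their previous row counts.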
en_core_web_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ae37a39df924099ea9e9d9d4d2912bbed9c534089c9f5f3ac0a3564ca7815521
- size 777382778
+ oid sha256:6ce19d37dfe5280400f80a5954d41afca10cbc742b97bfcf4b0e452b6eb24273
+ size 400651786
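Reviewer note: the wheel is stored as a Git LFS pointer, so this hunk only shows the object hash and byte size changing. The sketch below parses the two pointer texts and compares sizes. `parse_lfs_pointer` is an illustrative helper, not part of any Git LFS tooling, and it assumes the simple `key value` line layout visible in the hunk.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:ae37a39df924099ea9e9d9d4d2912bbed9c534089c9f5f3ac0a3564ca7815521
size 777382778
"""

NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:6ce19d37dfe5280400f80a5954d41afca10cbc742b97bfcf4b0e452b6eb24273
size 400651786
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

saved_bytes = old["size"] - new["size"]
size_ratio = new["size"] / old["size"]

if __name__ == "__main__":
    print(f"saved {saved_bytes} bytes; new wheel is {size_ratio:.1%} of the old size")
```

The wheel shrinks to roughly half its previous size. This plausibly relates to the README hunk reporting 342918 unique vectors for the same 684830 keys, though the diff itself does not state the cause.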
meta.json CHANGED
@@ -1,17 +1,17 @@
1
  {
2
  "lang":"en",
3
  "name":"core_web_lg",
4
- "version":"3.2.0",
5
  "description":"English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"MIT",
10
- "spacy_version":">=3.2.0,<3.3.0",
11
- "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":300,
14
- "vectors":684830,
15
  "keys":684830,
16
  "name":"en_vectors"
17
  },
@@ -117,10 +117,6 @@
117
  "relcl",
118
  "xcomp"
119
  ],
120
- "senter":[
121
- "I",
122
- "S"
123
- ],
124
  "attribute_ruler":[
125
 
126
  ],
@@ -169,231 +165,231 @@
169
  "senter"
170
  ],
171
  "performance":{
172
- "token_acc":0.9993053983,
173
- "token_p":0.9956742163,
174
- "token_r":0.9957505887,
175
- "token_f":0.9957124011,
176
- "tag_acc":0.9741780493,
177
- "sents_p":0.9179358172,
178
- "sents_r":0.8906260307,
179
- "sents_f":0.9040747313,
180
- "dep_uas":0.9200593914,
181
- "dep_las":0.9021556352,
182
  "dep_las_per_type":{
183
  "prep":{
184
- "p":0.8578239976,
185
- "r":0.8669702144,
186
- "f":0.8623728558
187
  },
188
  "det":{
189
- "p":0.9798012706,
190
- "r":0.9809997554,
191
- "f":0.9804001467
192
  },
193
  "pobj":{
194
- "p":0.9617211838,
195
- "r":0.9698409582,
196
- "f":0.9657640043
197
  },
198
  "nsubj":{
199
- "p":0.9608471258,
200
- "r":0.9461555312,
201
- "f":0.9534447363
202
  },
203
  "aux":{
204
- "p":0.9807231056,
205
- "r":0.9828184813,
206
- "f":0.9817696754
207
  },
208
  "advmod":{
209
- "p":0.8600169779,
210
- "r":0.8523472993,
211
- "f":0.8561649624
212
  },
213
  "relcl":{
214
- "p":0.755259467,
215
- "r":0.7815674891,
216
- "f":0.7681883024
217
  },
218
  "root":{
219
- "p":0.9196325281,
220
- "r":0.8914836071,
221
- "f":0.905339318
222
  },
223
  "xcomp":{
224
- "p":0.8886906885,
225
- "r":0.8941134243,
226
- "f":0.8913938093
227
  },
228
  "amod":{
229
- "p":0.9216490817,
230
- "r":0.9168124393,
231
- "f":0.9192243983
232
  },
233
  "compound":{
234
- "p":0.9188224309,
235
- "r":0.9316662954,
236
- "f":0.9251997898
237
  },
238
  "poss":{
239
- "p":0.9760755931,
240
- "r":0.9772544283,
241
- "f":0.976664655
242
  },
243
  "ccomp":{
244
- "p":0.7797340326,
245
- "r":0.8478615071,
246
- "f":0.8123719387
247
  },
248
  "attr":{
249
- "p":0.8931845357,
250
- "r":0.9423885618,
251
- "f":0.9171270718
252
  },
253
  "case":{
254
- "p":0.9782823297,
255
- "r":0.991991992,
256
- "f":0.9850894632
257
  },
258
  "mark":{
259
- "p":0.9105669417,
260
- "r":0.9064652888,
261
- "f":0.9085114859
262
  },
263
  "intj":{
264
- "p":0.6737089202,
265
- "r":0.6307692308,
266
- "f":0.6515323496
267
  },
268
  "advcl":{
269
- "p":0.67003282,
270
- "r":0.6683455049,
271
- "f":0.6691880988
272
  },
273
  "cc":{
274
- "p":0.8372232916,
275
- "r":0.8323167085,
276
- "f":0.8347627901
277
  },
278
  "neg":{
279
- "p":0.9408548708,
280
- "r":0.9498243853,
281
- "f":0.9453183521
282
  },
283
  "conj":{
284
- "p":0.7653624433,
285
- "r":0.7854984894,
286
- "f":0.7752997453
287
  },
288
  "nsubjpass":{
289
- "p":0.9327377824,
290
- "r":0.9102564103,
291
- "f":0.9213599792
292
  },
293
  "auxpass":{
294
- "p":0.9543624161,
295
- "r":0.9717539863,
296
- "f":0.962979684
297
  },
298
  "dobj":{
299
- "p":0.9222826087,
300
- "r":0.9466092916,
301
- "f":0.9342876244
302
  },
303
  "nummod":{
304
- "p":0.9430395913,
305
- "r":0.9323232323,
306
- "f":0.9376507937
307
  },
308
  "npadvmod":{
309
- "p":0.7866254349,
310
- "r":0.7229129663,
311
- "f":0.7534246575
312
  },
313
  "prt":{
314
- "p":0.8190082645,
315
- "r":0.8879928315,
316
- "f":0.8521066208
317
  },
318
  "pcomp":{
319
- "p":0.879020979,
320
- "r":0.8802521008,
321
- "f":0.8796361092
322
  },
323
  "expl":{
324
- "p":0.9809725159,
325
- "r":0.9935760171,
326
- "f":0.9872340426
327
  },
328
  "acl":{
329
- "p":0.7443997702,
330
- "r":0.7070376432,
331
- "f":0.7252378288
332
  },
333
  "agent":{
334
- "p":0.8928571429,
335
- "r":0.9408602151,
336
- "f":0.9162303665
337
  },
338
  "dative":{
339
- "p":0.7729591837,
340
- "r":0.6949541284,
341
- "f":0.731884058
342
  },
343
  "acomp":{
344
- "p":0.9102505695,
345
- "r":0.906122449,
346
- "f":0.9081818182
347
  },
348
  "dep":{
349
- "p":0.4491525424,
350
- "r":0.1720779221,
351
- "f":0.2488262911
352
  },
353
  "csubj":{
354
- "p":0.7243589744,
355
- "r":0.6686390533,
356
- "f":0.6953846154
357
  },
358
  "quantmod":{
359
- "p":0.8686779059,
360
- "r":0.7952883834,
361
- "f":0.8303647159
362
  },
363
  "nmod":{
364
- "p":0.76,
365
- "r":0.5789152956,
366
- "f":0.6572120374
367
  },
368
  "appos":{
369
- "p":0.7035040431,
370
- "r":0.6793926247,
371
- "f":0.6912381373
372
  },
373
  "predet":{
374
- "p":0.8300395257,
375
- "r":0.9012875536,
376
- "f":0.8641975309
377
  },
378
  "preconj":{
379
- "p":0.5784313725,
380
- "r":0.6860465116,
381
- "f":0.6276595745
382
  },
383
  "oprd":{
384
- "p":0.8379310345,
385
- "r":0.7253731343,
386
- "f":0.7776
387
  },
388
  "parataxis":{
389
- "p":0.6312849162,
390
- "r":0.4902386117,
391
- "f":0.5518925519
392
  },
393
  "meta":{
394
- "p":0.7647058824,
395
- "r":0.25,
396
- "f":0.3768115942
397
  },
398
  "csubjpass":{
399
  "p":0.5555555556,
@@ -401,102 +397,102 @@
401
  "f":0.6666666667
402
  }
403
  },
404
- "ents_p":0.8574246409,
405
- "ents_r":0.8490084135,
406
- "ents_f":0.8531957725,
407
  "ents_per_type":{
408
  "DATE":{
409
- "p":0.8695102686,
410
- "r":0.8736507937,
411
- "f":0.8715756136
412
  },
413
  "GPE":{
414
- "p":0.9231641622,
415
- "r":0.9082287308,
416
- "f":0.9156355456
417
  },
418
  "ORDINAL":{
419
- "p":0.7971428571,
420
- "r":0.8664596273,
421
- "f":0.8303571429
422
  },
423
  "ORG":{
424
- "p":0.8194444444,
425
- "r":0.8290031813,
426
- "f":0.8241960991
427
- },
428
- "QUANTITY":{
429
- "p":0.7959183673,
430
- "r":0.6428571429,
431
- "f":0.7112462006
432
  },
433
  "CARDINAL":{
434
- "p":0.8221709007,
435
- "r":0.8466111772,
436
- "f":0.834212068
437
  },
438
  "PERSON":{
439
- "p":0.8823895457,
440
- "r":0.9255874674,
441
- "f":0.9034724435
442
  },
443
  "NORP":{
444
- "p":0.9027888446,
445
- "r":0.9064,
446
- "f":0.9045908184
447
  },
448
  "LOC":{
449
- "p":0.7185185185,
450
- "r":0.6178343949,
451
- "f":0.6643835616
452
  },
453
  "FAC":{
454
- "p":0.4263565891,
455
- "r":0.4230769231,
456
- "f":0.4247104247
457
  },
458
  "TIME":{
459
- "p":0.7396825397,
460
- "r":0.6812865497,
461
- "f":0.7092846271
462
  },
463
- "PRODUCT":{
464
- "p":0.6022727273,
465
- "r":0.2511848341,
466
- "f":0.3545150502
467
  },
468
  "EVENT":{
469
- "p":0.5882352941,
470
- "r":0.2873563218,
471
- "f":0.3861003861
472
  },
473
  "WORK_OF_ART":{
474
- "p":0.4692307692,
475
- "r":0.3144329897,
476
- "f":0.3765432099
477
- },
478
- "LAW":{
479
- "p":0.5272727273,
480
- "r":0.453125,
481
- "f":0.487394958
482
  },
483
  "MONEY":{
484
- "p":0.8990498812,
485
- "r":0.893742621,
486
- "f":0.8963883955
 
 
 
 
 
487
  },
488
  "PERCENT":{
489
- "p":0.9202551834,
490
  "r":0.8836140888,
491
- "f":0.9015625
492
  },
493
  "LANGUAGE":{
494
- "p":0.8,
495
- "r":0.625,
496
- "f":0.701754386
 
 
 
 
 
497
  }
498
  },
499
- "speed":7471.5995598921
500
  },
501
  "sources":[
502
  {
 
1
  {
2
  "lang":"en",
3
  "name":"core_web_lg",
4
+ "version":"3.3.0",
5
  "description":"English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"MIT",
10
+ "spacy_version":">=3.3.0.dev0,<3.4.0",
11
+ "spacy_git_version":"849bef2de",
12
  "vectors":{
13
  "width":300,
14
+ "vectors":342918,
15
  "keys":684830,
16
  "name":"en_vectors"
17
  },
 
117
  "relcl",
118
  "xcomp"
119
  ],
120
  "attribute_ruler":[
121
 
122
  ],
 
165
  "senter"
166
  ],
167
  "performance":{
168
+ "token_acc":0.9993092439,
169
+ "token_p":0.9956819193,
170
+ "token_r":0.9957659295,
171
+ "token_f":0.9957239226,
172
+ "tag_acc":0.9738145328,
173
+ "sents_p":0.9177103185,
174
+ "sents_r":0.8894386173,
175
+ "sents_f":0.9033533215,
176
+ "dep_uas":0.9188508811,
177
+ "dep_las":0.9008477499,
178
  "dep_las_per_type":{
179
  "prep":{
180
+ "p":0.8537864878,
181
+ "r":0.8645418327,
182
+ "f":0.8591305004
183
  },
184
  "det":{
185
+ "p":0.9790682522,
186
+ "r":0.9802658403,
187
+ "f":0.9796666802
188
  },
189
  "pobj":{
190
+ "p":0.9633579437,
191
+ "r":0.9684272531,
192
+ "f":0.965885947
193
  },
194
  "nsubj":{
195
+ "p":0.9564757243,
196
+ "r":0.9502738226,
197
+ "f":0.9533646873
198
  },
199
  "aux":{
200
+ "p":0.9809760868,
201
+ "r":0.9823733642,
202
+ "f":0.9816742283
203
  },
204
  "advmod":{
205
+ "p":0.8550492715,
206
+ "r":0.8541140838,
207
+ "f":0.8545814218
208
  },
209
  "relcl":{
210
+ "p":0.7709000356,
211
+ "r":0.7862844702,
212
+ "f":0.7785162565
213
  },
214
  "root":{
215
+ "p":0.9183576195,
216
+ "r":0.889702487,
217
+ "f":0.9038029821
218
  },
219
  "xcomp":{
220
+ "p":0.882620883,
221
+ "r":0.9041636755,
222
+ "f":0.8932624113
223
  },
224
  "amod":{
225
+ "p":0.9195970101,
226
+ "r":0.9166180758,
227
+ "f":0.9181051265
228
  },
229
  "compound":{
230
+ "p":0.9193539526,
231
+ "r":0.9320004455,
232
+ "f":0.9256340054
233
  },
234
  "poss":{
235
+ "p":0.9711422846,
236
+ "r":0.9754428341,
237
+ "f":0.9732878088
238
  },
239
  "ccomp":{
240
+ "p":0.7727868239,
241
+ "r":0.8409368635,
242
+ "f":0.8054228031
243
  },
244
  "attr":{
245
+ "p":0.8955042527,
246
+ "r":0.9297729184,
247
+ "f":0.912316897
248
  },
249
  "case":{
250
+ "p":0.9758144126,
251
+ "r":0.9894894895,
252
+ "f":0.9826043738
253
  },
254
  "mark":{
255
+ "p":0.9062829989,
256
+ "r":0.9096449391,
257
+ "f":0.9079608569
258
  },
259
  "intj":{
260
+ "p":0.6653322658,
261
+ "r":0.6087912088,
262
+ "f":0.635807192
263
  },
264
  "advcl":{
265
+ "p":0.6779661017,
266
+ "r":0.6648199446,
267
+ "f":0.6713286713
268
  },
269
  "cc":{
270
+ "p":0.8292624233,
271
+ "r":0.824303313,
272
+ "f":0.8267754319
273
  },
274
  "neg":{
275
+ "p":0.9393336648,
276
+ "r":0.9478173608,
277
+ "f":0.9435564436
278
  },
279
  "conj":{
280
+ "p":0.763665795,
281
+ "r":0.7720292044,
282
+ "f":0.7678247261
283
  },
284
  "nsubjpass":{
285
+ "p":0.9263266358,
286
+ "r":0.9220512821,
287
+ "f":0.9241840144
288
  },
289
  "auxpass":{
290
+ "p":0.9499329459,
291
+ "r":0.9681093394,
292
+ "f":0.9589350181
293
  },
294
  "dobj":{
295
+ "p":0.926432648,
296
+ "r":0.9442983505,
297
+ "f":0.9352801894
298
  },
299
  "nummod":{
300
+ "p":0.9362134689,
301
+ "r":0.9303030303,
302
+ "f":0.9332488917
303
  },
304
  "npadvmod":{
305
+ "p":0.7723030982,
306
+ "r":0.734991119,
307
+ "f":0.753185293
308
  },
309
  "prt":{
310
+ "p":0.8160066007,
311
+ "r":0.8862007168,
312
+ "f":0.8496563574
313
  },
314
  "pcomp":{
315
+ "p":0.8800841515,
316
+ "r":0.8788515406,
317
+ "f":0.8794674142
318
  },
319
  "expl":{
320
+ "p":0.9809322034,
321
+ "r":0.9914346895,
322
+ "f":0.9861554846
323
  },
324
  "acl":{
325
+ "p":0.7556456283,
326
+ "r":0.7119476268,
327
+ "f":0.7331460674
328
  },
329
  "agent":{
330
+ "p":0.8991452991,
331
+ "r":0.9426523297,
332
+ "f":0.9203849519
333
  },
334
  "dative":{
335
+ "p":0.810298103,
336
+ "r":0.6857798165,
337
+ "f":0.7428571429
338
  },
339
  "acomp":{
340
+ "p":0.9111721612,
341
+ "r":0.9024943311,
342
+ "f":0.9068124858
343
  },
344
  "dep":{
345
+ "p":0.3930131004,
346
+ "r":0.1461038961,
347
+ "f":0.2130177515
348
  },
349
  "csubj":{
350
+ "p":0.7068965517,
351
+ "r":0.7278106509,
352
+ "f":0.7172011662
353
  },
354
  "quantmod":{
355
+ "p":0.8746594005,
356
+ "r":0.7822908205,
357
+ "f":0.8259005146
358
  },
359
  "nmod":{
360
+ "p":0.7651217596,
361
+ "r":0.5935405241,
362
+ "f":0.6684969115
363
  },
364
  "appos":{
365
+ "p":0.6994459834,
366
+ "r":0.6572668113,
367
+ "f":0.6777007381
368
  },
369
  "predet":{
370
+ "p":0.8380566802,
371
+ "r":0.8884120172,
372
+ "f":0.8625
373
  },
374
  "preconj":{
375
+ "p":0.537037037,
376
+ "r":0.6744186047,
377
+ "f":0.5979381443
378
  },
379
  "oprd":{
380
+ "p":0.8477508651,
381
+ "r":0.7313432836,
382
+ "f":0.7852564103
383
  },
384
  "parataxis":{
385
+ "p":0.6187845304,
386
+ "r":0.4859002169,
387
+ "f":0.5443499392
388
  },
389
  "meta":{
390
+ "p":1.0,
391
+ "r":0.3269230769,
392
+ "f":0.4927536232
393
  },
394
  "csubjpass":{
395
  "p":0.5555555556,
 
397
  "f":0.6666666667
398
  }
399
  },
400
+ "ents_p":0.8602117695,
401
+ "ents_r":0.8462540064,
402
+ "ents_f":0.8531758053,
403
  "ents_per_type":{
404
  "DATE":{
405
+ "p":0.872593068,
406
+ "r":0.8631746032,
407
+ "f":0.8678582828
408
  },
409
  "GPE":{
410
+ "p":0.9257256688,
411
+ "r":0.9073919107,
412
+ "f":0.916467108
413
  },
414
  "ORDINAL":{
415
+ "p":0.787965616,
416
+ "r":0.8540372671,
417
+ "f":0.8196721311
418
  },
419
  "ORG":{
420
+ "p":0.8203309693,
421
+ "r":0.8279427359,
422
+ "f":0.8241192769
423
  },
424
  "CARDINAL":{
425
+ "p":0.8304398148,
426
+ "r":0.8531510107,
427
+ "f":0.8416422287
428
  },
429
  "PERSON":{
430
+ "p":0.8953229399,
431
+ "r":0.9184073107,
432
+ "f":0.9067182214
433
  },
434
  "NORP":{
435
+ "p":0.8794048551,
436
+ "r":0.8984,
437
+ "f":0.8888009497
438
  },
439
  "LOC":{
440
+ "p":0.7147766323,
441
+ "r":0.6624203822,
442
+ "f":0.6876033058
443
  },
444
  "FAC":{
445
+ "p":0.3949579832,
446
+ "r":0.3615384615,
447
+ "f":0.3775100402
448
  },
449
  "TIME":{
450
+ "p":0.71875,
451
+ "r":0.6725146199,
452
+ "f":0.6948640483
453
  },
454
+ "QUANTITY":{
455
+ "p":0.8014184397,
456
+ "r":0.6208791209,
457
+ "f":0.6996904025
458
  },
459
  "EVENT":{
460
+ "p":0.6354166667,
461
+ "r":0.3505747126,
462
+ "f":0.4518518519
463
  },
464
  "WORK_OF_ART":{
465
+ "p":0.5,
466
+ "r":0.3092783505,
467
+ "f":0.3821656051
468
  },
469
  "MONEY":{
470
+ "p":0.9039145907,
471
+ "r":0.8996458087,
472
+ "f":0.9017751479
473
+ },
474
+ "LAW":{
475
+ "p":0.6428571429,
476
+ "r":0.421875,
477
+ "f":0.5094339623
478
  },
479
  "PERCENT":{
480
+ "p":0.9187898089,
481
  "r":0.8836140888,
482
+ "f":0.9008587041
483
  },
484
  "LANGUAGE":{
485
+ "p":0.75,
486
+ "r":0.65625,
487
+ "f":0.7
488
+ },
489
+ "PRODUCT":{
490
+ "p":0.6097560976,
491
+ "r":0.2369668246,
492
+ "f":0.3412969283
493
  }
494
  },
495
+ "speed":7281.6726563626
496
  },
497
  "sources":[
498
  {
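
As a sanity check on the updated `performance` block, the reported F-scores can be recomputed from the precision and recall values on the `+` side of the diff. The sketch below assumes the usual F1 definition (harmonic mean of precision and recall); the helper name `f_score` and the choice of which entries to check are illustrative and not part of the pipeline itself.

```python
# Precision, recall, and reported F-score copied from the "+" lines above.
reported = {
    "ents": (0.8602117695, 0.8462540064, 0.8531758053),
    "sents": (0.9177103185, 0.8894386173, 0.9033533215),
    "meta": (1.0, 0.3269230769, 0.4927536232),
}


def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


for name, (p, r, f) in reported.items():
    # Small differences are expected because p and r are rounded to 10 digits.
    print(f"{name}: recomputed={f_score(p, r):.6f} reported={f:.6f}")
```

Agreement to several decimal places suggests the three values in each entry are mutually consistent; it does not say anything about how the evaluation itself was run.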
ner/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0d39a012887e3db760ffe6c029e7f6327733a42bd4743475257a0306a8be3380
- size 7106353
+ oid sha256:d9d8a97f17d882960a52360ae2e58d9c960937534c9c010e1d912a3b82767a8f
+ size 6511153
parser/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e358731646a09000cce13f69498c7b129e8da3a3c94d407392b664fde7ea3e6e
+ oid sha256:4b8abfdcfaa0d0a822556f61fa2ab7b48d5528e8ab25375e9c657af78d8e2368
  size 319909
parser/moves CHANGED
@@ -1,2 +1,2 @@
  ��moves�
- {"0":{"":995932},"1":{"":989662},"2":{"det":172430,"nsubj":165679,"compound":116803,"amod":106128,"aux":87078,"punct":65505,"advmod":62711,"poss":36427,"mark":27913,"nummod":22583,"auxpass":15597,"prep":13989,"nsubjpass":13867,"neg":12358,"cc":10694,"nmod":9572,"advcl":9063,"npadvmod":8135,"quantmod":7071,"intj":6557,"ccomp":5899,"dobj":3427,"expl":3360,"dep":3191,"predet":1945,"parataxis":1826,"csubj":1431,"preconj":620,"pobj||prep":615,"attr":578,"meta":448,"advmod||conj":367,"dobj||xcomp":352,"acomp":284,"nsubj||ccomp":224,"dative":206,"advmod||xcomp":149,"dobj||ccomp":70,"csubjpass":64,"dobj||conj":62,"prep||conj":51,"acl":48,"prep||nsubj":41,"prep||dobj":36,"xcomp":34,"advmod||ccomp":32,"oprd":31},"3":{"punct":183437,"pobj":182256,"prep":173845,"dobj":89650,"conj":59689,"cc":51858,"ccomp":30404,"advmod":22820,"xcomp":21045,"relcl":20968,"advcl":19833,"attr":17739,"acomp":16824,"appos":14963,"case":13361,"acl":12091,"pcomp":10345,"npadvmod":9702,"prt":8179,"agent":3884,"dative":3867,"nsubj":3465,"intj":2898,"neg":2871,"amod":2843,"nummod":2510,"oprd":2304,"dep":1518,"parataxis":1261,"quantmod":317,"nmod":296,"acl||dobj":202,"prep||dobj":190,"prep||nsubj":162,"acl||nsubj":159,"appos||nsubj":145,"relcl||dobj":134,"relcl||nsubj":111,"aux":103,"expl":96,"meta":93,"appos||dobj":86,"preconj":71,"csubj":65,"prep||nsubjpass":55,"prep||advmod":54,"prep||acomp":53,"det":51,"nsubjpass":45,"acl||nsubjpass":42,"relcl||pobj":41,"mark":40,"auxpass":39,"prep||pobj":36,"relcl||nsubjpass":32,"appos||nsubjpass":31},"4":{"ROOT":110979}}�cfg��neg_key�
+ {"0":{"":994267},"1":{"":990803},"2":{"det":172595,"nsubj":165748,"compound":116623,"amod":105184,"aux":86667,"punct":65478,"advmod":62763,"poss":36443,"mark":27941,"nummod":22598,"auxpass":15594,"prep":14001,"nsubjpass":13856,"neg":12357,"cc":10739,"nmod":9562,"advcl":9062,"npadvmod":8168,"quantmod":7101,"intj":6464,"ccomp":5896,"dobj":3427,"expl":3360,"dep":2806,"predet":1944,"parataxis":1837,"csubj":1428,"preconj":621,"pobj||prep":616,"attr":578,"meta":376,"advmod||conj":368,"dobj||xcomp":352,"acomp":284,"nsubj||ccomp":224,"dative":206,"advmod||xcomp":149,"dobj||ccomp":70,"csubjpass":64,"dobj||conj":62,"prep||conj":51,"acl":48,"prep||nsubj":41,"prep||dobj":36,"xcomp":34,"advmod||ccomp":32,"oprd":31},"3":{"punct":183790,"pobj":182191,"prep":174008,"dobj":89615,"conj":59687,"cc":51930,"ccomp":30385,"advmod":22861,"xcomp":21021,"relcl":20969,"advcl":19828,"attr":17741,"acomp":16922,"appos":15265,"case":13388,"acl":12085,"pcomp":10324,"npadvmod":9796,"prt":8179,"agent":3903,"dative":3866,"nsubj":3470,"neg":2906,"amod":2839,"intj":2819,"nummod":2732,"oprd":2301,"dep":1487,"parataxis":1261,"quantmod":319,"nmod":294,"acl||dobj":200,"prep||dobj":190,"prep||nsubj":162,"acl||nsubj":159,"appos||nsubj":145,"relcl||dobj":134,"relcl||nsubj":111,"aux":103,"expl":96,"meta":92,"appos||dobj":86,"preconj":71,"csubj":65,"prep||nsubjpass":55,"prep||advmod":54,"prep||acomp":53,"det":51,"nsubjpass":45,"relcl||pobj":42,"acl||nsubjpass":42,"mark":40,"auxpass":39,"prep||pobj":36,"relcl||nsubjpass":32,"appos||nsubjpass":31},"4":{"ROOT":111664}}�cfg��neg_key�
senter/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:db1ce4e224f67005f7b94661e1856f5792f140896dfed59015bf9e5b12f6585e
- size 219901
+ oid sha256:3a1bdccc5dc2d8c842081528c93680c54508411615b525cef695239f30bb0ed8
+ size 219953
tagger/cfg CHANGED
@@ -50,5 +50,6 @@
  "XX",
  "``"
  ],
+ "neg_prefix":"!",
  "overwrite":false
  }
tagger/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1c4b0e2f0c6552bcead2c1da64ff46da4912ab273e24dbb14154702fd213190b
- size 19389
+ oid sha256:4481bf82fdaea8773149ca8b637057e0dfaa4f8fa1cc5e8f19f33250568f6fc0
+ size 19441
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e0c6b29d6d78255e764d45f99bb4d4d594fd6e5709bf3cd62c3df00e702c442c
- size 6960804
+ oid sha256:71724ee469b871ec2287455264d692c9b229b1bf129aa5bc06130a4aeb9b7c0e
+ size 6365604
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
 
vocab/key2row CHANGED
Binary files a/vocab/key2row and b/vocab/key2row differ
 
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:266477264a21daaabda7d6b1200598290e20aaf2c72ebaf6e2a671f282f5e2bc
- size 9695169
+ oid sha256:649ca580aed1f07d3b761fa73308bc96f72b78e8bd4d51140a3a920b3429ba10
+ size 9694998
vocab/vectors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d90b9122eef03666021c0592972138a2d70f785920cdc86588b369ec327074a7
- size 821796128
+ oid sha256:dd82f972c4fca3d440c505cdd94c88efdded56457cc86851d584b751f7dea673
+ size 411501728
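
The drop in the `vocab/vectors` pointer size lines up with the vector table metadata in `meta.json` (`"width":300`, `"vectors":342918`, `"keys":684830`). The sketch below shows the arithmetic under two assumptions that are not stated anywhere in this commit: each value is stored as a 4-byte float, and the file carries a fixed 128-byte array header. The names `rows_from_size`, `BYTES_PER_VALUE`, and `HEADER_BYTES` are hypothetical.

```python
WIDTH = 300            # "width" from meta.json
BYTES_PER_VALUE = 4    # assumption: float32 storage
HEADER_BYTES = 128     # assumption: fixed-size array header


def rows_from_size(size_bytes: int) -> tuple[int, int]:
    """Return (row count, leftover bytes) implied by a vectors file size."""
    return divmod(size_bytes - HEADER_BYTES, WIDTH * BYTES_PER_VALUE)


old_rows, old_rest = rows_from_size(821796128)  # "-" side of the diff
new_rows, new_rest = rows_from_size(411501728)  # "+" side of the diff
print(old_rows, old_rest)
print(new_rows, new_rest)
```

Under these assumptions the new size corresponds to the 342918 rows listed in `meta.json`, and the old size corresponds to 684830 rows, the same number as the `keys` entry. That would be consistent with the vector table having been deduplicated so that multiple keys share a row, which is also what the changed `vocab/key2row` file suggests, but this is an inference from the sizes rather than something the commit states.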