osanseviero HF staff committed on
Commit
e3cf396
1 Parent(s): b9aabe8

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,4 +1,4 @@
- # UD Greek GDT v2.5
+ # UD Greek GDT v2.8
 
  * Author: Prokopidis, Prokopis
  * URL: https://github.com/UniversalDependencies/UD_Greek-GDT
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - token-classification
5
  language:
6
  - el
7
- license: cc-by-nc-sa-4.0
8
  model-index:
9
  - name: el_core_news_sm
10
  results:
@@ -14,47 +14,47 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.7142857143
18
  - name: NER Recall
19
  type: recall
20
- value: 0.6932773109
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.7036247335
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9142631761
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
- value: 0.8947368421
38
  - name: SENTER Recall
39
  type: recall
40
- value: 0.9280397022
41
  - name: SENTER F Score
42
  type: f_score
43
- value: 0.9110840438
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.851442704
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
- value: 0.851442704
58
  ---
59
  ### Details: https://spacy.io/models/el#el_core_news_sm
60
 
@@ -63,12 +63,12 @@ Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, se
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `el_core_news_sm` |
66
- | **Version** | `3.1.0` |
67
- | **spaCy** | `>=3.1.0,<3.2.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
71
- | **Sources** | [UD Greek GDT v2.5](https://github.com/UniversalDependencies/UD_Greek-GDT) (Prokopidis, Prokopis)<br />[Greek NER Corpus (Google Summer of Code 2018)](https://github.com/eellak/gsoc2018-spacy) (Giannis Daras)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion) |
72
  | **License** | `CC BY-NC-SA 3.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -92,15 +92,21 @@ Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, se
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 100.00 |
95
- | `DEP_UAS` | 85.14 |
96
- | `DEP_LAS` | 81.10 |
97
- | `ENTS_P` | 71.43 |
98
- | `ENTS_R` | 69.33 |
99
- | `ENTS_F` | 70.36 |
100
- | `SENTS_P` | 89.47 |
101
- | `SENTS_R` | 92.80 |
102
- | `SENTS_F` | 91.11 |
103
- | `TAG_ACC` | 91.43 |
104
- | `POS_ACC` | 94.42 |
105
- | `MORPH_ACC` | 88.72 |
106
- | `LEMMA_ACC` | 56.19 |
4
  - token-classification
5
  language:
6
  - el
7
+ license: cc-by-nc-sa-3.0
8
  model-index:
9
  - name: el_core_news_sm
10
  results:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.7348837209
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.6638655462
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.6975717439
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
  - name: POS Accuracy
29
  type: accuracy
30
+ value: 0.9134743381
31
  - task:
32
  name: SENTER
33
  type: token-classification
34
  metrics:
35
  - name: SENTER Precision
36
  type: precision
37
+ value: 0.9195121951
38
  - name: SENTER Recall
39
  type: recall
40
+ value: 0.935483871
41
  - name: SENTER F Score
42
  type: f_score
43
+ value: 0.9274292743
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
  - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
+ value: 0.8446911409
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
  - name: Labeled Dependencies Accuracy
56
  type: accuracy
57
+ value: 0.8446911409
58
  ---
59
  ### Details: https://spacy.io/models/el#el_core_news_sm
60
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `el_core_news_sm` |
66
+ | **Version** | `3.2.0` |
67
+ | **spaCy** | `>=3.2.0,<3.3.0` |
68
  | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
  | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
71
+ | **Sources** | [UD Greek GDT v2.8](https://github.com/UniversalDependencies/UD_Greek-GDT) (Prokopidis, Prokopis)<br />[Greek NER Corpus (Google Summer of Code 2018)](https://github.com/eellak/gsoc2018-spacy) (Giannis Daras)<br />[spaCy lookups data](https://github.com/explosion/spacy-lookups-data) (Explosion) |
72
  | **License** | `CC BY-NC-SA 3.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
92
  | Type | Score |
93
  | --- | --- |
94
  | `TOKEN_ACC` | 100.00 |
95
+ | `TOKEN_P` | 99.90 |
96
+ | `TOKEN_R` | 99.95 |
97
+ | `TOKEN_F` | 99.93 |
98
+ | `SENTS_P` | 91.95 |
99
+ | `SENTS_R` | 93.55 |
100
+ | `SENTS_F` | 92.74 |
101
+ | `DEP_UAS` | 84.47 |
102
+ | `DEP_LAS` | 80.48 |
103
+ | `ENTS_P` | 73.49 |
104
+ | `ENTS_R` | 66.39 |
105
+ | `ENTS_F` | 69.76 |
106
+ | `POS_ACC` | 94.35 |
107
+ | `MORPH_ACC` | 88.64 |
108
+ | `MORPH_MICRO_P` | 94.75 |
109
+ | `MORPH_MICRO_R` | 94.54 |
110
+ | `MORPH_MICRO_F` | 94.64 |
111
+ | `TAG_ACC` | 91.35 |
112
+ | `LEMMA_ACC` | 56.20 |
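The precision, recall and F-score values reported for the updated pipeline can be cross-checked against each other. The sketch below assumes the F-score is the usual harmonic mean of precision and recall; the helper name `f_score` is hypothetical and is not taken from this repository or from spaCy. The numbers are copied from the updated model-index metadata in this commit.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) from the updated README metadata.
reported = {
    "NER": (0.7348837209, 0.6638655462, 0.6975717439),
    "SENTER": (0.9195121951, 0.935483871, 0.9274292743),
}

for task, (p, r, f) in reported.items():
    # Print the recomputed value next to the reported one for comparison.
    print(task, round(f_score(p, r), 6), round(f, 6))
```

If the recomputed and reported values diverge by more than rounding error, the metadata was probably edited by hand or copied from a different evaluation run.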
accuracy.json CHANGED
@@ -1,257 +1,143 @@
1
  {
2
  "token_acc": 1.0,
3
- "dep_uas": 0.851442704,
4
- "dep_las": 0.8109920308,
5
- "ents_p": 0.7142857143,
6
- "ents_r": 0.6932773109,
7
- "ents_f": 0.7036247335,
8
- "sents_p": 0.8947368421,
9
- "sents_r": 0.9280397022,
10
- "sents_f": 0.9110840438,
11
- "speed": 3078.4152487192,
12
- "ents_per_type": {
13
- "PERSON": {
14
- "p": 0.724137931,
15
- "r": 0.65625,
16
- "f": 0.6885245902
17
- },
18
- "GPE": {
19
- "p": 0.7692307692,
20
- "r": 0.8045977011,
21
- "f": 0.7865168539
22
- },
23
- "ORG": {
24
- "p": 0.6714285714,
25
- "r": 0.661971831,
26
- "f": 0.6666666667
27
- },
28
- "PRODUCT": {
29
- "p": 0.75,
30
- "r": 0.375,
31
- "f": 0.5
32
- },
33
- "EVENT": {
34
- "p": 0.4285714286,
35
- "r": 0.5,
36
- "f": 0.4615384615
37
- },
38
- "LOC": {
39
- "p": 0.0,
40
- "r": 0.0,
41
- "f": 0.0
42
- }
43
- },
44
- "tag_acc": 0.9142631761,
45
- "pos_acc": 0.9442390179,
46
- "morph_acc": 0.8872454765,
47
- "lemma_acc": 0.5619484297,
48
- "morph_per_feat": {
49
- "Abbr": {
50
- "p": 0.9733333333,
51
- "r": 0.7849462366,
52
- "f": 0.869047619
53
- },
54
- "Case": {
55
- "p": 0.9228210246,
56
- "r": 0.9249749917,
57
- "f": 0.9238967527
58
- },
59
- "Gender": {
60
- "p": 0.9183300067,
61
- "r": 0.9204734912,
62
- "f": 0.9194004996
63
- },
64
- "Number": {
65
- "p": 0.968844656,
66
- "r": 0.9695438799,
67
- "f": 0.9691941418
68
- },
69
- "Aspect": {
70
- "p": 0.916751269,
71
- "r": 0.906626506,
72
- "f": 0.9116607774
73
- },
74
- "Mood": {
75
- "p": 0.9847991314,
76
- "r": 0.9752688172,
77
- "f": 0.980010805
78
- },
79
- "Person": {
80
- "p": 0.9725111441,
81
- "r": 0.958974359,
82
- "f": 0.9656953154
83
- },
84
- "Tense": {
85
- "p": 0.9588859416,
86
- "r": 0.9426336375,
87
- "f": 0.9506903353
88
- },
89
- "VerbForm": {
90
- "p": 0.9725888325,
91
- "r": 0.9618473896,
92
- "f": 0.9671882887
93
- },
94
- "Voice": {
95
- "p": 0.9553299492,
96
- "r": 0.9447791165,
97
- "f": 0.9500252398
98
- },
99
- "Definite": {
100
- "p": 0.9853107345,
101
- "r": 0.9971412236,
102
- "f": 0.9911906792
103
- },
104
- "PronType": {
105
- "p": 0.979498861,
106
- "r": 0.9844322344,
107
- "f": 0.9819593515
108
- },
109
- "Foreign": {
110
- "p": 0.6569343066,
111
- "r": 0.5590062112,
112
- "f": 0.6040268456
113
- },
114
- "NumType": {
115
- "p": 0.9533678756,
116
- "r": 0.8975609756,
117
- "f": 0.9246231156
118
- },
119
- "Poss": {
120
- "p": 0.9058823529,
121
- "r": 0.8651685393,
122
- "f": 0.8850574713
123
- },
124
- "Degree": {
125
- "p": 0.6666666667,
126
- "r": 0.5789473684,
127
- "f": 0.6197183099
128
- }
129
- },
130
  "dep_las_per_type": {
131
  "root": {
132
- "p": 0.8492822967,
133
- "r": 0.8808933002,
134
- "f": 0.8647990256
135
  },
136
  "nmod": {
137
- "p": 0.7398171239,
138
- "r": 0.764604811,
139
- "f": 0.7520067596
140
  },
141
  "vocative": {
142
- "p": 1.0,
143
- "r": 0.5714285714,
144
- "f": 0.7272727273
145
  },
146
  "cc": {
147
- "p": 0.8193146417,
148
- "r": 0.8193146417,
149
- "f": 0.8193146417
150
  },
151
  "conj": {
152
- "p": 0.5246376812,
153
- "r": 0.4972527473,
154
- "f": 0.5105782793
155
  },
156
  "aux": {
157
- "p": 0.9741697417,
158
- "r": 0.9705882353,
159
- "f": 0.9723756906
160
  },
161
  "advmod": {
162
- "p": 0.7313019391,
163
- "r": 0.7436619718,
164
- "f": 0.7374301676
165
  },
166
  "ccomp": {
167
- "p": 0.7611940299,
168
- "r": 0.7391304348,
169
- "f": 0.75
170
  },
171
  "det": {
172
- "p": 0.9257478632,
173
  "r": 0.9418478261,
174
- "f": 0.9337284483
175
  },
176
  "obj": {
177
- "p": 0.7934131737,
178
  "r": 0.8054711246,
179
- "f": 0.7993966817
180
  },
181
  "flat": {
182
- "p": 0.6933333333,
183
- "r": 0.5252525253,
184
- "f": 0.5977011494
185
  },
186
  "case": {
187
- "p": 0.9392324094,
188
- "r": 0.9412393162,
189
- "f": 0.9402347919
190
  },
191
  "amod": {
192
- "p": 0.8613861386,
193
- "r": 0.8446601942,
194
- "f": 0.8529411765
195
  },
196
  "obl": {
197
- "p": 0.75975039,
198
- "r": 0.7669291339,
199
- "f": 0.763322884
200
  },
201
  "acl:relcl": {
202
- "p": 0.7058823529,
203
- "r": 0.652173913,
204
- "f": 0.6779661017
205
  },
206
  "mark": {
207
- "p": 0.8785714286,
208
  "r": 0.8601398601,
209
- "f": 0.8692579505
210
  },
211
  "nsubj:pass": {
212
- "p": 0.7785234899,
213
- "r": 0.703030303,
214
- "f": 0.7388535032
215
  },
216
  "nsubj": {
217
- "p": 0.7107061503,
218
- "r": 0.7289719626,
219
- "f": 0.7197231834
220
  },
221
  "cop": {
222
- "p": 0.7113402062,
223
- "r": 0.6831683168,
224
- "f": 0.696969697
225
  },
226
  "parataxis": {
227
- "p": 0.4285714286,
228
- "r": 0.1764705882,
229
- "f": 0.25
230
  },
231
  "nummod": {
232
- "p": 0.8641975309,
233
- "r": 0.843373494,
234
- "f": 0.8536585366
235
  },
236
  "advcl": {
237
- "p": 0.4552845528,
238
- "r": 0.5283018868,
239
- "f": 0.4890829694
240
  },
241
  "xcomp": {
242
- "p": 0.6923076923,
243
- "r": 0.6428571429,
244
- "f": 0.6666666667
245
  },
246
  "csubj": {
247
- "p": 0.75,
248
- "r": 0.5454545455,
249
- "f": 0.6315789474
250
  },
251
  "fixed": {
252
- "p": 0.2857142857,
253
  "r": 0.5714285714,
254
- "f": 0.380952381
255
  },
256
  "compound": {
257
  "p": 0.0,
@@ -259,14 +145,9 @@
259
  "f": 0.0
260
  },
261
  "appos": {
262
- "p": 0.4102564103,
263
- "r": 0.3265306122,
264
- "f": 0.3636363636
265
- },
266
- "acl": {
267
- "p": 0.7692307692,
268
- "r": 0.4545454545,
269
- "f": 0.5714285714
270
  },
271
  "dep": {
272
  "p": 0.0,
@@ -274,14 +155,14 @@
274
  "f": 0.0
275
  },
276
  "csubj:pass": {
277
- "p": 0.8,
278
- "r": 0.6666666667,
279
- "f": 0.7272727273
280
  },
281
  "obl:agent": {
282
- "p": 0.5833333333,
283
- "r": 0.28,
284
- "f": 0.3783783784
285
  },
286
  "orphan": {
287
  "p": 0.0,
@@ -289,14 +170,139 @@
289
  "f": 0.0
290
  },
291
  "iobj": {
292
- "p": 0.5,
293
  "r": 1.0,
294
- "f": 0.6666666667
295
  },
296
  "expl": {
297
  "p": 0.0,
298
  "r": 0.0,
299
  "f": 0.0
300
  }
301
- }
302
  }
1
  {
2
  "token_acc": 1.0,
3
+ "token_p": 0.9990295973,
4
+ "token_r": 0.9995068547,
5
+ "token_f": 0.9992604644,
6
+ "sents_p": 0.9195121951,
7
+ "sents_r": 0.935483871,
8
+ "sents_f": 0.9274292743,
9
+ "dep_uas": 0.8446911409,
10
+ "dep_las": 0.804792262,
11
  "dep_las_per_type": {
12
  "root": {
13
+ "p": 0.8634146341,
14
+ "r": 0.8784119107,
15
+ "f": 0.8708487085
16
  },
17
  "nmod": {
18
+ "p": 0.7382154882,
19
+ "r": 0.7534364261,
20
+ "f": 0.7457482993
21
  },
22
  "vocative": {
23
+ "p": 0.75,
24
+ "r": 0.4285714286,
25
+ "f": 0.5454545455
26
  },
27
  "cc": {
28
+ "p": 0.8459119497,
29
+ "r": 0.8380062305,
30
+ "f": 0.8419405321
31
  },
32
  "conj": {
33
+ "p": 0.4827586207,
34
+ "r": 0.4615384615,
35
+ "f": 0.4719101124
36
  },
37
  "aux": {
38
+ "p": 0.9595588235,
39
+ "r": 0.9595588235,
40
+ "f": 0.9595588235
41
  },
42
  "advmod": {
43
+ "p": 0.7327823691,
44
+ "r": 0.7492957746,
45
+ "f": 0.7409470752
46
  },
47
  "ccomp": {
48
+ "p": 0.7397260274,
49
+ "r": 0.7826086957,
50
+ "f": 0.7605633803
51
  },
52
  "det": {
53
+ "p": 0.9307196563,
54
  "r": 0.9418478261,
55
+ "f": 0.9362506753
56
  },
57
  "obj": {
58
+ "p": 0.768115942,
59
  "r": 0.8054711246,
60
+ "f": 0.7863501484
61
  },
62
  "flat": {
63
+ "p": 0.6891891892,
64
+ "r": 0.5151515152,
65
+ "f": 0.5895953757
66
  },
67
  "case": {
68
+ "p": 0.9402985075,
69
+ "r": 0.9423076923,
70
+ "f": 0.9413020277
71
  },
72
  "amod": {
73
+ "p": 0.8497536946,
74
+ "r": 0.8373786408,
75
+ "f": 0.8435207824
76
  },
77
  "obl": {
78
+ "p": 0.7402799378,
79
+ "r": 0.7496062992,
80
+ "f": 0.744913928
81
  },
82
  "acl:relcl": {
83
+ "p": 0.7168674699,
84
+ "r": 0.6467391304,
85
+ "f": 0.68
86
  },
87
  "mark": {
88
+ "p": 0.8601398601,
89
  "r": 0.8601398601,
90
+ "f": 0.8601398601
91
  },
92
  "nsubj:pass": {
93
+ "p": 0.7448275862,
94
+ "r": 0.6545454545,
95
+ "f": 0.6967741935
96
  },
97
  "nsubj": {
98
+ "p": 0.7129411765,
99
+ "r": 0.7079439252,
100
+ "f": 0.7104337632
101
  },
102
  "cop": {
103
+ "p": 0.7272727273,
104
+ "r": 0.7128712871,
105
+ "f": 0.72
106
  },
107
  "parataxis": {
108
+ "p": 0.3076923077,
109
+ "r": 0.2352941176,
110
+ "f": 0.2666666667
111
  },
112
  "nummod": {
113
+ "p": 0.7578947368,
114
+ "r": 0.8674698795,
115
+ "f": 0.808988764
116
  },
117
  "advcl": {
118
+ "p": 0.4576271186,
119
+ "r": 0.5094339623,
120
+ "f": 0.4821428571
121
  },
122
  "xcomp": {
123
+ "p": 0.7,
124
+ "r": 0.6666666667,
125
+ "f": 0.6829268293
126
  },
127
  "csubj": {
128
+ "p": 0.7647058824,
129
+ "r": 0.5909090909,
130
+ "f": 0.6666666667
131
+ },
132
+ "acl": {
133
+ "p": 0.6818181818,
134
+ "r": 0.3409090909,
135
+ "f": 0.4545454545
136
  },
137
  "fixed": {
138
+ "p": 0.3636363636,
139
  "r": 0.5714285714,
140
+ "f": 0.4444444444
141
  },
142
  "compound": {
143
  "p": 0.0,
145
  "f": 0.0
146
  },
147
  "appos": {
148
+ "p": 0.2826086957,
149
+ "r": 0.2653061224,
150
+ "f": 0.2736842105
151
  },
152
  "dep": {
153
  "p": 0.0,
155
  "f": 0.0
156
  },
157
  "csubj:pass": {
158
+ "p": 0.75,
159
+ "r": 0.5,
160
+ "f": 0.6
161
  },
162
  "obl:agent": {
163
+ "p": 0.5714285714,
164
+ "r": 0.32,
165
+ "f": 0.4102564103
166
  },
167
  "orphan": {
168
  "p": 0.0,
170
  "f": 0.0
171
  },
172
  "iobj": {
173
+ "p": 1.0,
174
  "r": 1.0,
175
+ "f": 1.0
176
  },
177
  "expl": {
178
  "p": 0.0,
179
  "r": 0.0,
180
  "f": 0.0
181
  }
182
+ },
183
+ "ents_p": 0.7348837209,
184
+ "ents_r": 0.6638655462,
185
+ "ents_f": 0.6975717439,
186
+ "ents_per_type": {
187
+ "ORG": {
188
+ "p": 0.0,
189
+ "r": 0.0,
190
+ "f": 0.0
191
+ },
192
+ "PERSON": {
193
+ "p": 0.0,
194
+ "r": 0.0,
195
+ "f": 0.0
196
+ },
197
+ "GPE": {
198
+ "p": 0.0,
199
+ "r": 0.0,
200
+ "f": 0.0
201
+ },
202
+ "PRODUCT": {
203
+ "p": 0.0,
204
+ "r": 0.0,
205
+ "f": 0.0
206
+ },
207
+ "EVENT": {
208
+ "p": 0.0,
209
+ "r": 0.0,
210
+ "f": 0.0
211
+ },
212
+ "LOC": {
213
+ "p": 0.0,
214
+ "r": 0.0,
215
+ "f": 0.0
216
+ }
217
+ },
218
+ "speed": 2331.9277335651,
219
+ "pos_acc": 0.94345018,
220
+ "morph_acc": 0.8863580338,
221
+ "morph_micro_p": 0.9474649993,
222
+ "morph_micro_r": 0.9453768691,
223
+ "morph_micro_f": 0.9464197824,
224
+ "morph_per_feat": {
225
+ "Abbr": {
226
+ "p": 0.9487179487,
227
+ "r": 0.7956989247,
228
+ "f": 0.865497076
229
+ },
230
+ "Case": {
231
+ "p": 0.9229232562,
232
+ "r": 0.9243081027,
233
+ "f": 0.9236151603
234
+ },
235
+ "Gender": {
236
+ "p": 0.9189279174,
237
+ "r": 0.9203067689,
238
+ "f": 0.9196168263
239
+ },
240
+ "Number": {
241
+ "p": 0.9678442682,
242
+ "r": 0.9688221709,
243
+ "f": 0.9683329727
244
+ },
245
+ "Aspect": {
246
+ "p": 0.9227642276,
247
+ "r": 0.9116465863,
248
+ "f": 0.9171717172
249
+ },
250
+ "Mood": {
251
+ "p": 0.9827586207,
252
+ "r": 0.9806451613,
253
+ "f": 0.9817007535
254
+ },
255
+ "Person": {
256
+ "p": 0.9696521095,
257
+ "r": 0.9597069597,
258
+ "f": 0.9646539028
259
+ },
260
+ "Tense": {
261
+ "p": 0.9508408797,
262
+ "r": 0.9582790091,
263
+ "f": 0.9545454545
264
+ },
265
+ "VerbForm": {
266
+ "p": 0.9735772358,
267
+ "r": 0.9618473896,
268
+ "f": 0.9676767677
269
+ },
270
+ "Voice": {
271
+ "p": 0.9552845528,
272
+ "r": 0.9437751004,
273
+ "f": 0.9494949495
274
+ },
275
+ "Definite": {
276
+ "p": 0.9864253394,
277
+ "r": 0.9971412236,
278
+ "f": 0.9917543361
279
+ },
280
+ "PronType": {
281
+ "p": 0.9812870835,
282
+ "r": 0.9844322344,
283
+ "f": 0.9828571429
284
+ },
285
+ "Foreign": {
286
+ "p": 0.71875,
287
+ "r": 0.5714285714,
288
+ "f": 0.6366782007
289
+ },
290
+ "NumType": {
291
+ "p": 0.9427083333,
292
+ "r": 0.8829268293,
293
+ "f": 0.9118387909
294
+ },
295
+ "Poss": {
296
+ "p": 0.8977272727,
297
+ "r": 0.8876404494,
298
+ "f": 0.8926553672
299
+ },
300
+ "Degree": {
301
+ "p": 0.7666666667,
302
+ "r": 0.6052631579,
303
+ "f": 0.6764705882
304
+ }
305
+ },
306
+ "tag_acc": 0.9134743381,
307
+ "lemma_acc": 0.5620470345
308
  }
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
config.cfg CHANGED
@@ -1,10 +1,8 @@
1
  [paths]
2
- train = "corpus/el-dep-news/train.spacy"
3
- dev = "corpus/el-dep-news/dev.spacy"
4
  vectors = null
5
- raw = null
6
  init_tok2vec = null
7
- vocab_data = null
8
 
9
  [system]
10
  gpu_allocator = null
@@ -24,6 +22,7 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
24
 
25
  [components.attribute_ruler]
26
  factory = "attribute_ruler"
 
27
  validate = false
28
 
29
  [components.lemmatizer]
@@ -31,9 +30,13 @@ factory = "lemmatizer"
31
  mode = "rule"
32
  model = null
33
  overwrite = false
 
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
 
 
 
37
 
38
  [components.morphologizer.model]
39
  @architectures = "spacy.Tagger.v1"
@@ -48,6 +51,7 @@ upstream = "tok2vec"
48
  factory = "ner"
49
  incorrect_spans_key = null
50
  moves = null
 
51
  update_with_oracle_cut_size = 100
52
 
53
  [components.ner.model]
@@ -65,8 +69,8 @@ nO = null
65
  [components.ner.model.tok2vec.embed]
66
  @architectures = "spacy.MultiHashEmbed.v2"
67
  width = 96
68
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
69
- rows = [5000,2500,2500,2500]
70
  include_static_vectors = false
71
 
72
  [components.ner.model.tok2vec.encode]
@@ -81,6 +85,7 @@ factory = "parser"
81
  learn_tokens = false
82
  min_action_freq = 30
83
  moves = null
 
84
  update_with_oracle_cut_size = 100
85
 
86
  [components.parser.model]
@@ -99,6 +104,8 @@ upstream = "tok2vec"
99
 
100
  [components.senter]
101
  factory = "senter"
 
 
102
 
103
  [components.senter.model]
104
  @architectures = "spacy.Tagger.v1"
@@ -110,8 +117,8 @@ nO = null
110
  [components.senter.model.tok2vec.embed]
111
  @architectures = "spacy.MultiHashEmbed.v2"
112
  width = 16
113
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
114
- rows = [1000,500,500,500]
115
  include_static_vectors = false
116
 
117
  [components.senter.model.tok2vec.encode]
@@ -130,8 +137,8 @@ factory = "tok2vec"
130
  [components.tok2vec.model.embed]
131
  @architectures = "spacy.MultiHashEmbed.v2"
132
  width = ${components.tok2vec.model.encode:width}
133
- attrs = ["NORM","PREFIX","SUFFIX","SHAPE"]
134
- rows = [5000,2500,2500,2500]
135
  include_static_vectors = false
136
 
137
  [components.tok2vec.model.encode]
@@ -145,22 +152,19 @@ maxout_pieces = 3
145
 
146
  [corpora.dev]
147
  @readers = "spacy.Corpus.v1"
148
- limit = 0
149
- max_length = 0
150
- path = ${paths:dev}
151
  gold_preproc = false
 
 
152
  augmenter = null
153
 
154
  [corpora.train]
155
  @readers = "spacy.Corpus.v1"
156
- path = ${paths:train}
157
- max_length = 5000
158
  gold_preproc = false
 
159
  limit = 0
160
-
161
- [corpora.train.augmenter]
162
- @augmenters = "spacy.lower_case.v1"
163
- level = 0.1
164
 
165
  [training]
166
  train_corpus = "corpora.train"
@@ -191,9 +195,8 @@ compound = 1.001
191
  t = 0.0
192
 
193
  [training.logger]
194
- @loggers = "spacy.WandbLogger.v1"
195
- project_name = "spacy-v3.0.0a2"
196
- remove_config_values = []
197
 
198
  [training.optimizer]
199
  @optimizers = "Adam.v1"
@@ -216,16 +219,17 @@ dep_las_per_type = null
216
  sents_p = null
217
  sents_r = null
218
  sents_f = 0.02
219
- lemma_acc = 0.33
220
- ents_f = 0.33
221
  ents_p = 0.0
222
  ents_r = 0.0
223
  ents_per_type = null
 
224
 
225
  [pretraining]
226
 
227
  [initialize]
228
- vocab_data = ${paths.vocab_data}
229
  vectors = ${paths.vectors}
230
  init_tok2vec = ${paths.init_tok2vec}
231
  before_init = null
1
  [paths]
2
+ train = null
3
+ dev = null
4
  vectors = null
 
5
  init_tok2vec = null
 
6
 
7
  [system]
8
  gpu_allocator = null
22
 
23
  [components.attribute_ruler]
24
  factory = "attribute_ruler"
25
+ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
30
  mode = "rule"
31
  model = null
32
  overwrite = false
33
+ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
37
+ extend = false
38
+ overwrite = true
39
+ scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
  @architectures = "spacy.Tagger.v1"
51
  factory = "ner"
52
  incorrect_spans_key = null
53
  moves = null
54
+ scorer = {"@scorers":"spacy.ner_scorer.v1"}
55
  update_with_oracle_cut_size = 100
56
 
57
  [components.ner.model]
69
  [components.ner.model.tok2vec.embed]
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
+ rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = false
75
 
76
  [components.ner.model.tok2vec.encode]
85
  learn_tokens = false
86
  min_action_freq = 30
87
  moves = null
88
+ scorer = {"@scorers":"spacy.parser_scorer.v1"}
89
  update_with_oracle_cut_size = 100
90
 
91
  [components.parser.model]
104
 
105
  [components.senter]
106
  factory = "senter"
107
+ overwrite = false
108
+ scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
  @architectures = "spacy.Tagger.v1"
117
  [components.senter.model.tok2vec.embed]
118
  @architectures = "spacy.MultiHashEmbed.v2"
119
  width = 16
120
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
121
+ rows = [1000,500,500,500,50]
122
  include_static_vectors = false
123
 
124
  [components.senter.model.tok2vec.encode]
137
  [components.tok2vec.model.embed]
138
  @architectures = "spacy.MultiHashEmbed.v2"
139
  width = ${components.tok2vec.model.encode:width}
140
+ attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141
+ rows = [5000,2500,2500,2500,100]
142
  include_static_vectors = false
143
 
144
  [components.tok2vec.model.encode]
152
 
153
  [corpora.dev]
154
  @readers = "spacy.Corpus.v1"
155
+ path = ${paths.dev}
 
 
156
  gold_preproc = false
157
+ max_length = 0
158
+ limit = 0
159
  augmenter = null
160
 
161
  [corpora.train]
162
  @readers = "spacy.Corpus.v1"
163
+ path = ${paths.train}
 
164
  gold_preproc = false
165
+ max_length = 0
166
  limit = 0
167
+ augmenter = null
 
 
 
168
 
169
  [training]
170
  train_corpus = "corpora.train"
195
  t = 0.0
196
 
197
  [training.logger]
198
+ @loggers = "spacy.ConsoleLogger.v1"
199
+ progress_bar = false
 
200
 
201
  [training.optimizer]
202
  @optimizers = "Adam.v1"
219
  sents_p = null
220
  sents_r = null
221
  sents_f = 0.02
222
+ lemma_acc = 0.5
223
+ ents_f = 0.16
224
  ents_p = 0.0
225
  ents_r = 0.0
226
  ents_per_type = null
227
+ speed = 0.0
228
 
229
  [pretraining]
230
 
231
  [initialize]
232
+ vocab_data = null
233
  vectors = ${paths.vectors}
234
  init_tok2vec = ${paths.init_tok2vec}
235
  before_init = null
el_core_news_sm-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6dc7b13bb26efc37a45c86b648bc5174d2fc9f6cce6fdfda156c0a025a59172e
- size 13553771
+ oid sha256:b21b38878098e603e713a15b48c6b00493f1087470d8871abaf770d2d37a44ae
+ size 13829725
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"el",
3
  "name":"core_news_sm",
4
- "version":"3.1.0",
5
  "description":"Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-NC-SA 3.0",
10
- "spacy_version":">=3.1.0,<3.2.0",
11
- "spacy_git_version":"caba63b74",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -452,258 +452,144 @@
452
  ],
453
  "performance":{
454
  "token_acc":1.0,
455
- "dep_uas":0.851442704,
456
- "dep_las":0.8109920308,
457
- "ents_p":0.7142857143,
458
- "ents_r":0.6932773109,
459
- "ents_f":0.7036247335,
460
- "sents_p":0.8947368421,
461
- "sents_r":0.9280397022,
462
- "sents_f":0.9110840438,
463
- "speed":3078.4152487192,
464
- "ents_per_type":{
465
- "PERSON":{
466
- "p":0.724137931,
467
- "r":0.65625,
468
- "f":0.6885245902
469
- },
470
- "GPE":{
471
- "p":0.7692307692,
472
- "r":0.8045977011,
473
- "f":0.7865168539
474
- },
475
- "ORG":{
476
- "p":0.6714285714,
477
- "r":0.661971831,
478
- "f":0.6666666667
479
- },
480
- "PRODUCT":{
481
- "p":0.75,
482
- "r":0.375,
483
- "f":0.5
484
- },
485
- "EVENT":{
486
- "p":0.4285714286,
487
- "r":0.5,
488
- "f":0.4615384615
489
- },
490
- "LOC":{
491
- "p":0.0,
492
- "r":0.0,
493
- "f":0.0
494
- }
495
- },
496
- "tag_acc":0.9142631761,
497
- "pos_acc":0.9442390179,
498
- "morph_acc":0.8872454765,
499
- "lemma_acc":0.5619484297,
500
- "morph_per_feat":{
501
- "Abbr":{
502
- "p":0.9733333333,
503
- "r":0.7849462366,
504
- "f":0.869047619
505
- },
506
- "Case":{
507
- "p":0.9228210246,
508
- "r":0.9249749917,
509
- "f":0.9238967527
510
- },
511
- "Gender":{
512
- "p":0.9183300067,
513
- "r":0.9204734912,
514
- "f":0.9194004996
515
- },
516
- "Number":{
517
- "p":0.968844656,
518
- "r":0.9695438799,
519
- "f":0.9691941418
520
- },
521
- "Aspect":{
522
- "p":0.916751269,
523
- "r":0.906626506,
524
- "f":0.9116607774
525
- },
526
- "Mood":{
527
- "p":0.9847991314,
528
- "r":0.9752688172,
529
- "f":0.980010805
530
- },
531
- "Person":{
532
- "p":0.9725111441,
533
- "r":0.958974359,
534
- "f":0.9656953154
535
- },
536
- "Tense":{
537
- "p":0.9588859416,
538
- "r":0.9426336375,
539
- "f":0.9506903353
540
- },
541
- "VerbForm":{
542
- "p":0.9725888325,
543
- "r":0.9618473896,
544
- "f":0.9671882887
545
- },
546
- "Voice":{
547
- "p":0.9553299492,
548
- "r":0.9447791165,
549
- "f":0.9500252398
550
- },
551
- "Definite":{
552
- "p":0.9853107345,
553
- "r":0.9971412236,
554
- "f":0.9911906792
555
- },
556
- "PronType":{
557
- "p":0.979498861,
558
- "r":0.9844322344,
559
- "f":0.9819593515
560
- },
561
- "Foreign":{
562
- "p":0.6569343066,
563
- "r":0.5590062112,
564
- "f":0.6040268456
565
- },
566
- "NumType":{
567
- "p":0.9533678756,
568
- "r":0.8975609756,
569
- "f":0.9246231156
570
- },
571
- "Poss":{
572
- "p":0.9058823529,
573
- "r":0.8651685393,
574
- "f":0.8850574713
575
- },
576
- "Degree":{
577
- "p":0.6666666667,
578
- "r":0.5789473684,
579
- "f":0.6197183099
580
- }
581
- },
582
  "dep_las_per_type":{
583
  "root":{
584
- "p":0.8492822967,
585
- "r":0.8808933002,
586
- "f":0.8647990256
587
  },
588
  "nmod":{
589
- "p":0.7398171239,
590
- "r":0.764604811,
591
- "f":0.7520067596
592
  },
593
  "vocative":{
594
- "p":1.0,
595
- "r":0.5714285714,
596
- "f":0.7272727273
597
  },
598
  "cc":{
599
- "p":0.8193146417,
600
- "r":0.8193146417,
601
- "f":0.8193146417
602
  },
603
  "conj":{
604
- "p":0.5246376812,
605
- "r":0.4972527473,
606
- "f":0.5105782793
607
  },
608
  "aux":{
609
- "p":0.9741697417,
610
- "r":0.9705882353,
611
- "f":0.9723756906
612
  },
613
  "advmod":{
614
- "p":0.7313019391,
615
- "r":0.7436619718,
616
- "f":0.7374301676
617
  },
618
  "ccomp":{
619
- "p":0.7611940299,
620
- "r":0.7391304348,
621
- "f":0.75
622
  },
623
  "det":{
624
- "p":0.9257478632,
625
  "r":0.9418478261,
626
- "f":0.9337284483
627
  },
628
  "obj":{
629
- "p":0.7934131737,
630
  "r":0.8054711246,
631
- "f":0.7993966817
632
  },
633
  "flat":{
634
- "p":0.6933333333,
635
- "r":0.5252525253,
636
- "f":0.5977011494
637
  },
638
  "case":{
639
- "p":0.9392324094,
640
- "r":0.9412393162,
641
- "f":0.9402347919
642
  },
643
  "amod":{
644
- "p":0.8613861386,
645
- "r":0.8446601942,
646
- "f":0.8529411765
647
  },
648
  "obl":{
649
- "p":0.75975039,
650
- "r":0.7669291339,
651
- "f":0.763322884
652
  },
653
  "acl:relcl":{
654
- "p":0.7058823529,
655
- "r":0.652173913,
656
- "f":0.6779661017
657
  },
658
  "mark":{
659
- "p":0.8785714286,
660
  "r":0.8601398601,
661
- "f":0.8692579505
662
  },
663
  "nsubj:pass":{
664
- "p":0.7785234899,
665
- "r":0.703030303,
666
- "f":0.7388535032
667
  },
668
  "nsubj":{
669
- "p":0.7107061503,
670
- "r":0.7289719626,
671
- "f":0.7197231834
672
  },
673
  "cop":{
674
- "p":0.7113402062,
675
- "r":0.6831683168,
676
- "f":0.696969697
677
  },
678
  "parataxis":{
679
- "p":0.4285714286,
680
- "r":0.1764705882,
681
- "f":0.25
682
  },
683
  "nummod":{
684
- "p":0.8641975309,
685
- "r":0.843373494,
686
- "f":0.8536585366
687
  },
688
  "advcl":{
689
- "p":0.4552845528,
690
- "r":0.5283018868,
691
- "f":0.4890829694
692
  },
693
  "xcomp":{
694
- "p":0.6923076923,
695
- "r":0.6428571429,
696
- "f":0.6666666667
697
  },
698
  "csubj":{
699
- "p":0.75,
700
- "r":0.5454545455,
701
- "f":0.6315789474
702
  },
703
  "fixed":{
704
- "p":0.2857142857,
705
  "r":0.5714285714,
706
- "f":0.380952381
707
  },
708
  "compound":{
709
  "p":0.0,
@@ -711,14 +597,9 @@
711
  "f":0.0
712
  },
713
  "appos":{
714
- "p":0.4102564103,
715
- "r":0.3265306122,
716
- "f":0.3636363636
717
- },
718
- "acl":{
719
- "p":0.7692307692,
720
- "r":0.4545454545,
721
- "f":0.5714285714
722
  },
723
  "dep":{
724
  "p":0.0,
@@ -726,14 +607,14 @@
726
  "f":0.0
727
  },
728
  "csubj:pass":{
729
- "p":0.8,
730
- "r":0.6666666667,
731
- "f":0.7272727273
732
  },
733
  "obl:agent":{
734
- "p":0.5833333333,
735
- "r":0.28,
736
- "f":0.3783783784
737
  },
738
  "orphan":{
739
  "p":0.0,
@@ -741,20 +622,145 @@
741
  "f":0.0
742
  },
743
  "iobj":{
744
- "p":0.5,
745
  "r":1.0,
746
- "f":0.6666666667
747
  },
748
  "expl":{
749
  "p":0.0,
750
  "r":0.0,
751
  "f":0.0
752
  }
753
- }
754
  },
755
  "sources":[
756
  {
757
- "name":"UD Greek GDT v2.5",
758
  "url":"https://github.com/UniversalDependencies/UD_Greek-GDT",
759
  "license":"CC BY-NC-SA 3.0",
760
  "author":"Prokopidis, Prokopis"
1
  {
2
  "lang":"el",
3
  "name":"core_news_sm",
4
+ "version":"3.2.0",
5
  "description":"Greek pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-NC-SA 3.0",
10
+ "spacy_version":">=3.2.0,<3.3.0",
11
+ "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
452
  ],
453
  "performance":{
454
  "token_acc":1.0,
455
+ "token_p":0.9990295973,
456
+ "token_r":0.9995068547,
457
+ "token_f":0.9992604644,
458
+ "sents_p":0.9195121951,
459
+ "sents_r":0.935483871,
460
+ "sents_f":0.9274292743,
461
+ "dep_uas":0.8446911409,
462
+ "dep_las":0.804792262,
463
  "dep_las_per_type":{
464
  "root":{
465
+ "p":0.8634146341,
466
+ "r":0.8784119107,
467
+ "f":0.8708487085
468
  },
469
  "nmod":{
470
+ "p":0.7382154882,
471
+ "r":0.7534364261,
472
+ "f":0.7457482993
473
  },
474
  "vocative":{
475
+ "p":0.75,
476
+ "r":0.4285714286,
477
+ "f":0.5454545455
478
  },
479
  "cc":{
480
+ "p":0.8459119497,
481
+ "r":0.8380062305,
482
+ "f":0.8419405321
483
  },
484
  "conj":{
485
+ "p":0.4827586207,
486
+ "r":0.4615384615,
487
+ "f":0.4719101124
488
  },
489
  "aux":{
490
+ "p":0.9595588235,
491
+ "r":0.9595588235,
492
+ "f":0.9595588235
493
  },
494
  "advmod":{
495
+ "p":0.7327823691,
496
+ "r":0.7492957746,
497
+ "f":0.7409470752
498
  },
499
  "ccomp":{
500
+ "p":0.7397260274,
501
+ "r":0.7826086957,
502
+ "f":0.7605633803
503
  },
504
  "det":{
505
+ "p":0.9307196563,
506
  "r":0.9418478261,
507
+ "f":0.9362506753
508
  },
509
  "obj":{
510
+ "p":0.768115942,
511
  "r":0.8054711246,
512
+ "f":0.7863501484
513
  },
514
  "flat":{
515
+ "p":0.6891891892,
516
+ "r":0.5151515152,
517
+ "f":0.5895953757
518
  },
519
  "case":{
520
+ "p":0.9402985075,
521
+ "r":0.9423076923,
522
+ "f":0.9413020277
523
  },
524
  "amod":{
525
+ "p":0.8497536946,
526
+ "r":0.8373786408,
527
+ "f":0.8435207824
528
  },
529
  "obl":{
530
+ "p":0.7402799378,
531
+ "r":0.7496062992,
532
+ "f":0.744913928
533
  },
534
  "acl:relcl":{
535
+ "p":0.7168674699,
536
+ "r":0.6467391304,
537
+ "f":0.68
538
  },
539
  "mark":{
540
+ "p":0.8601398601,
541
  "r":0.8601398601,
542
+ "f":0.8601398601
543
  },
544
  "nsubj:pass":{
545
+ "p":0.7448275862,
546
+ "r":0.6545454545,
547
+ "f":0.6967741935
548
  },
549
  "nsubj":{
550
+ "p":0.7129411765,
551
+ "r":0.7079439252,
552
+ "f":0.7104337632
553
  },
554
  "cop":{
555
+ "p":0.7272727273,
556
+ "r":0.7128712871,
557
+ "f":0.72
558
  },
559
  "parataxis":{
560
+ "p":0.3076923077,
561
+ "r":0.2352941176,
562
+ "f":0.2666666667
563
  },
564
  "nummod":{
565
+ "p":0.7578947368,
566
+ "r":0.8674698795,
567
+ "f":0.808988764
568
  },
569
  "advcl":{
570
+ "p":0.4576271186,
571
+ "r":0.5094339623,
572
+ "f":0.4821428571
573
  },
574
  "xcomp":{
575
+ "p":0.7,
576
+ "r":0.6666666667,
577
+ "f":0.6829268293
578
  },
579
  "csubj":{
580
+ "p":0.7647058824,
581
+ "r":0.5909090909,
582
+ "f":0.6666666667
583
+ },
584
+ "acl":{
585
+ "p":0.6818181818,
586
+ "r":0.3409090909,
587
+ "f":0.4545454545
588
  },
589
  "fixed":{
590
+ "p":0.3636363636,
591
  "r":0.5714285714,
592
+ "f":0.4444444444
593
  },
594
  "compound":{
595
  "p":0.0,
597
  "f":0.0
598
  },
599
  "appos":{
600
+ "p":0.2826086957,
601
+ "r":0.2653061224,
602
+ "f":0.2736842105
603
  },
604
  "dep":{
605
  "p":0.0,
607
  "f":0.0
608
  },
609
  "csubj:pass":{
610
+ "p":0.75,
611
+ "r":0.5,
612
+ "f":0.6
613
  },
614
  "obl:agent":{
615
+ "p":0.5714285714,
616
+ "r":0.32,
617
+ "f":0.4102564103
618
  },
619
  "orphan":{
620
  "p":0.0,
622
  "f":0.0
623
  },
624
  "iobj":{
625
+ "p":1.0,
626
  "r":1.0,
627
+ "f":1.0
628
  },
629
  "expl":{
630
  "p":0.0,
631
  "r":0.0,
632
  "f":0.0
633
  }
634
+ },
635
+ "ents_p":0.7348837209,
636
+ "ents_r":0.6638655462,
637
+ "ents_f":0.6975717439,
638
+ "ents_per_type":{
639
+ "ORG":{
640
+ "p":0.0,
641
+ "r":0.0,
642
+ "f":0.0
643
+ },
644
+ "PERSON":{
645
+ "p":0.0,
646
+ "r":0.0,
647
+ "f":0.0
648
+ },
649
+ "GPE":{
650
+ "p":0.0,
651
+ "r":0.0,
652
+ "f":0.0
653
+ },
654
+ "PRODUCT":{
655
+ "p":0.0,
656
+ "r":0.0,
657
+ "f":0.0
658
+ },
659
+ "EVENT":{
660
+ "p":0.0,
661
+ "r":0.0,
662
+ "f":0.0
663
+ },
664
+ "LOC":{
665
+ "p":0.0,
666
+ "r":0.0,
667
+ "f":0.0
668
+ }
669
+ },
670
+ "speed":2331.9277335651,
671
+ "pos_acc":0.94345018,
672
+ "morph_acc":0.8863580338,
673
+ "morph_micro_p":0.9474649993,
674
+ "morph_micro_r":0.9453768691,
675
+ "morph_micro_f":0.9464197824,
676
+ "morph_per_feat":{
677
+ "Abbr":{
678
+ "p":0.9487179487,
679
+ "r":0.7956989247,
680
+ "f":0.865497076
681
+ },
682
+ "Case":{
683
+ "p":0.9229232562,
684
+ "r":0.9243081027,
685
+ "f":0.9236151603
686
+ },
687
+ "Gender":{
688
+ "p":0.9189279174,
689
+ "r":0.9203067689,
690
+ "f":0.9196168263
691
+ },
692
+ "Number":{
693
+ "p":0.9678442682,
694
+ "r":0.9688221709,
695
+ "f":0.9683329727
696
+ },
697
+ "Aspect":{
698
+ "p":0.9227642276,
699
+ "r":0.9116465863,
700
+ "f":0.9171717172
701
+ },
702
+ "Mood":{
703
+ "p":0.9827586207,
704
+ "r":0.9806451613,
705
+ "f":0.9817007535
706
+ },
707
+ "Person":{
708
+ "p":0.9696521095,
709
+ "r":0.9597069597,
710
+ "f":0.9646539028
711
+ },
712
+ "Tense":{
713
+ "p":0.9508408797,
714
+ "r":0.9582790091,
715
+ "f":0.9545454545
716
+ },
717
+ "VerbForm":{
718
+ "p":0.9735772358,
719
+ "r":0.9618473896,
720
+ "f":0.9676767677
721
+ },
722
+ "Voice":{
723
+ "p":0.9552845528,
724
+ "r":0.9437751004,
725
+ "f":0.9494949495
726
+ },
727
+ "Definite":{
728
+ "p":0.9864253394,
729
+ "r":0.9971412236,
730
+ "f":0.9917543361
731
+ },
732
+ "PronType":{
733
+ "p":0.9812870835,
734
+ "r":0.9844322344,
735
+ "f":0.9828571429
736
+ },
737
+ "Foreign":{
738
+ "p":0.71875,
739
+ "r":0.5714285714,
740
+ "f":0.6366782007
741
+ },
742
+ "NumType":{
743
+ "p":0.9427083333,
744
+ "r":0.8829268293,
745
+ "f":0.9118387909
746
+ },
747
+ "Poss":{
748
+ "p":0.8977272727,
749
+ "r":0.8876404494,
750
+ "f":0.8926553672
751
+ },
752
+ "Degree":{
753
+ "p":0.7666666667,
754
+ "r":0.6052631579,
755
+ "f":0.6764705882
756
+ }
757
+ },
758
+ "tag_acc":0.9134743381,
759
+ "lemma_acc":0.5620470345
760
  },
761
  "sources":[
762
  {
763
+ "name":"UD Greek GDT v2.8",
764
  "url":"https://github.com/UniversalDependencies/UD_Greek-GDT",
765
  "license":"CC BY-NC-SA 3.0",
766
  "author":"Prokopidis, Prokopis"
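
The `p`, `r`, and `f` triples in the `performance` block above look consistent with `f` being the harmonic mean of precision and recall. The following is a minimal Python sketch that checks this on a few values copied from the new side of the diff. The helper name `f_score` is hypothetical and not part of spaCy, and the tolerance is an assumption chosen to absorb the 10-digit rounding in the file.

```python
import math


def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (p, r, f) triples copied from the new side of the meta.json diff above.
reported = {
    "csubj:pass": (0.75, 0.5, 0.6),
    "fixed": (0.3636363636, 0.5714285714, 0.4444444444),
    "ents": (0.7348837209, 0.6638655462, 0.6975717439),
    "expl": (0.0, 0.0, 0.0),
}

for label, (p, r, f) in reported.items():
    # The file rounds to 10 decimal places, so compare with a small tolerance.
    assert math.isclose(f_score(p, r), f, abs_tol=1e-8), label
```

The zero guard matters for labels such as `expl`, `dep`, and `orphan`, where both precision and recall are reported as `0.0` and the plain formula would divide by zero.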
morphologizer/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels_morph":{
3
  "Case=Nom|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":"Case=Nom|Definite=Def|Gender=Fem|Number=Sing|PronType=Art",
4
  "Foreign=Yes|POS=X":"Foreign=Yes",
@@ -714,5 +715,6 @@
714
  "Case=Gen|Gender=Fem|NumType=Ord|Number=Plur|POS=NUM":93,
715
  "Case=Dat|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":90,
716
  "Case=Gen|Degree=Cmp|Gender=Masc|Number=Sing|POS=ADJ":84
717
- }
 
718
  }
1
  {
2
+ "extend":false,
3
  "labels_morph":{
4
  "Case=Nom|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":"Case=Nom|Definite=Def|Gender=Fem|Number=Sing|PronType=Art",
5
  "Foreign=Yes|POS=X":"Foreign=Yes",
715
  "Case=Gen|Gender=Fem|NumType=Ord|Number=Plur|POS=NUM":93,
716
  "Case=Dat|Definite=Def|Gender=Fem|Number=Sing|POS=DET|PronType=Art":90,
717
  "Case=Gen|Degree=Cmp|Gender=Masc|Number=Sing|POS=ADJ":84
718
+ },
719
+ "overwrite":true
720
  }
morphologizer/model CHANGED
Binary files a/morphologizer/model and b/morphologizer/model differ
ner/model CHANGED
Binary files a/ner/model and b/ner/model differ
parser/model CHANGED
Binary files a/parser/model and b/parser/model differ
senter/cfg CHANGED
@@ -1,3 +1,3 @@
1
  {
2
-
3
  }
1
  {
2
+ "overwrite":false
3
  }
senter/model CHANGED
Binary files a/senter/model and b/senter/model differ
tok2vec/model CHANGED
Binary files a/tok2vec/model and b/tok2vec/model differ
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ad9afc5bfb897749239840ae5bd366bb3886b60bf246a56044324827d68032a2
3
- size 1557424
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:62e96d9f1e2e9b95263c74526351921654abbdc3ef5987552922b52c7087e234
3
+ size 1557585
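
The `vocab/strings.json` entry above is a Git LFS pointer: it records the SHA-256 digest (`oid`) and byte size of the real file rather than its contents. The following is a hedged sketch of how a downloaded copy could be checked against such a pointer using only the Python standard library. The names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers for illustration, not part of Git LFS or any Hugging Face tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest named by the pointer."""
    algo, _, digest = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    same_size = len(data) == int(pointer["size"])
    return same_size and hashlib.sha256(data).hexdigest() == digest


# Pointer text copied from the new side of the diff above.
pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:62e96d9f1e2e9b95263c74526351921654abbdc3ef5987552922b52c7087e234
size 1557585"""
)
```

Assuming the real file has been fetched to a local path, a call such as `matches_pointer(open(path, "rb").read(), pointer)` would report whether it corresponds to this commit's version.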
vocab/vectors.cfg ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "mode":"default"
3
+ }