adrianeboyd committed on
Commit 402d614
1 Parent(s): 8c26f2a

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -11,10 +11,10 @@ http://www.gnu.org/licenses/gpl.html```
  
  
  
- # UD Catalan AnCora v2.8 + NER v3.2.8
+ # UD Catalan AnCora v2.8 + NER v3.2.9
  
  * Author: Carlos Rodríguez-Penagos and Carme Armentano-Oller
- * URL: https://github.com/TeMU-BSC/spacy/releases/tag/3.2.8
+ * URL: https://github.com/TeMU-BSC/spacy/releases/tag/3.2.9
  * License: CC BY 4.0
  
  ```
README.md CHANGED
@@ -14,62 +14,62 @@ model-index:
  metrics:
  - name: NER Precision
  type: precision
- value: 0.9170191995
+ value: 0.9222476315
  - name: NER Recall
  type: recall
- value: 0.9116790683
+ value: 0.9132966677
  - name: NER F Score
  type: f_score
- value: 0.9143413368
+ value: 0.9177503251
  - task:
  name: TAG
  type: token-classification
  metrics:
  - name: TAG (XPOS) Accuracy
  type: accuracy
- value: 0.9635621944
+ value: 0.9633641507
  - task:
  name: POS
  type: token-classification
  metrics:
  - name: POS (UPOS) Accuracy
  type: accuracy
- value: 0.9635621944
+ value: 0.9633641507
  - task:
  name: MORPH
  type: token-classification
  metrics:
  - name: Morph (UFeats) Accuracy
  type: accuracy
- value: 0.9569642087
+ value: 0.9571111935
  - task:
  name: LEMMA
  type: token-classification
  metrics:
  - name: Lemma Accuracy
  type: accuracy
- value: 0.9817147291
+ value: 0.9816802448
  - task:
  name: UNLABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Unlabeled Attachment Score (UAS)
  type: f_score
- value: 0.9481666365
+ value: 0.9484123582
  - task:
  name: LABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Labeled Attachment Score (LAS)
  type: f_score
- value: 0.9307828605
+ value: 0.930752303
  - task:
  name: SENTS
  type: token-classification
  metrics:
  - name: Sentences F-Score
  type: f_score
- value: 0.9885731028
+ value: 0.9926750659
  ---
  ### Details: https://spacy.io/models/ca#ca_core_news_trf
  
@@ -78,12 +78,12 @@ Catalan transformer pipeline (projecte-aina/roberta-base-ca-v2). Components: tra
  | Feature | Description |
  | --- | --- |
  | **Name** | `ca_core_news_trf` |
- | **Version** | `3.5.0` |
- | **spaCy** | `>=3.5.0,<3.6.0` |
+ | **Version** | `3.6.1` |
+ | **spaCy** | `>=3.6.0,<3.7.0` |
  | **Default Pipeline** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
  | **Components** | `transformer`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
- | **Sources** | [UD Catalan AnCora v2.8](https://github.com/UniversalDependencies/UD_Catalan-AnCora) (Martínez Alonso, Héctor; Pascual, Elena; Zeman, Daniel)<br />[UD Catalan AnCora v2.8 + NER v3.2.8](https://github.com/TeMU-BSC/spacy/releases/tag/3.2.8) (Carlos Rodríguez-Penagos and Carme Armentano-Oller)<br />[Catalan Lemmatizer](https://github.com/explosion/spacy-lookups-data) (Text Mining Unit, Barcelona Supercomputing Center)<br />[projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2) (Text Mining Unit (TeMU) at the Barcelona Supercomputing Center) |
+ | **Sources** | [UD Catalan AnCora v2.8](https://github.com/UniversalDependencies/UD_Catalan-AnCora) (Martínez Alonso, Héctor; Pascual, Elena; Zeman, Daniel)<br />[UD Catalan AnCora v2.8 + NER v3.2.9](https://github.com/TeMU-BSC/spacy/releases/tag/3.2.9) (Carlos Rodríguez-Penagos and Carme Armentano-Oller)<br />[Catalan Lemmatizer](https://github.com/explosion/spacy-lookups-data) (Text Mining Unit, Barcelona Supercomputing Center)<br />[projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2) (Text Mining Unit (TeMU) at the Barcelona Supercomputing Center) |
  | **License** | `GNU GPL 3.0` |
  | **Author** | [Explosion](https://explosion.ai) |
  
@@ -109,18 +109,18 @@ Catalan transformer pipeline (projecte-aina/roberta-base-ca-v2). Components: tra
  | `TOKEN_P` | 99.78 |
  | `TOKEN_R` | 99.79 |
  | `TOKEN_F` | 99.79 |
- | `POS_ACC` | 96.36 |
- | `MORPH_ACC` | 95.70 |
- | `MORPH_MICRO_P` | 99.37 |
- | `MORPH_MICRO_R` | 98.89 |
- | `MORPH_MICRO_F` | 99.13 |
- | `SENTS_P` | 99.00 |
- | `SENTS_R` | 98.71 |
- | `SENTS_F` | 98.86 |
- | `DEP_UAS` | 94.82 |
+ | `POS_ACC` | 96.34 |
+ | `MORPH_ACC` | 95.71 |
+ | `MORPH_MICRO_P` | 99.41 |
+ | `MORPH_MICRO_R` | 98.46 |
+ | `MORPH_MICRO_F` | 98.93 |
+ | `SENTS_P` | 99.41 |
+ | `SENTS_R` | 99.12 |
+ | `SENTS_F` | 99.27 |
+ | `DEP_UAS` | 94.84 |
  | `DEP_LAS` | 93.08 |
- | `TAG_ACC` | 96.36 |
+ | `TAG_ACC` | 96.34 |
  | `LEMMA_ACC` | 98.17 |
- | `ENTS_P` | 91.70 |
- | `ENTS_R` | 91.17 |
- | `ENTS_F` | 91.43 |
+ | `ENTS_P` | 92.22 |
+ | `ENTS_R` | 91.33 |
+ | `ENTS_F` | 91.78 |
accuracy.json CHANGED
@@ -3,66 +3,66 @@
3
  "token_p": 0.9978112128,
4
  "token_r": 0.9979488063,
5
  "token_f": 0.9978800048,
6
- "pos_acc": 0.9635621944,
7
- "morph_acc": 0.9569642087,
8
- "morph_micro_p": 0.9937208629,
9
- "morph_micro_r": 0.988864054,
10
- "morph_micro_f": 0.9912865095,
11
  "morph_per_feat": {
12
  "Mood": {
13
- "p": 0.9985228951,
14
  "r": 0.9980314961,
15
- "f": 0.9982771351
16
  },
17
  "Number": {
18
- "p": 0.9987019075,
19
- "r": 0.9940422783,
20
- "f": 0.9963666451
21
  },
22
  "Person": {
23
- "p": 0.9991145741,
24
- "r": 0.9977011494,
25
- "f": 0.9984073615
26
  },
27
  "Tense": {
28
- "p": 0.9935200669,
29
- "r": 0.996644999,
30
- "f": 0.9950800796
31
  },
32
  "VerbForm": {
33
- "p": 0.9977333333,
34
- "r": 0.9968029839,
35
- "f": 0.9972679416
36
  },
37
  "Gender": {
38
- "p": 0.9966032609,
39
- "r": 0.9890205143,
40
- "f": 0.992797409
41
  },
42
  "NumType": {
43
- "p": 0.9706994329,
44
- "r": 0.9679547597,
45
- "f": 0.9693251534
46
  },
47
  "Definite": {
48
- "p": 0.9967227767,
49
- "r": 0.9857100766,
50
- "f": 0.9911858381
51
  },
52
  "PronType": {
53
- "p": 0.9961737947,
54
- "r": 0.989527027,
55
- "f": 0.9928392865
56
  },
57
  "PunctType": {
58
- "p": 0.9540806841,
59
- "r": 0.935813275,
60
- "f": 0.9448586947
61
  },
62
  "NumForm": {
63
- "p": 0.9791666667,
64
- "r": 0.9901685393,
65
- "f": 0.9846368715
66
  },
67
  "Polarity": {
68
  "p": 1.0,
@@ -70,9 +70,9 @@
70
  "f": 0.993485342
71
  },
72
  "Case": {
73
- "p": 0.9974979149,
74
- "r": 0.9958368027,
75
- "f": 0.9966666667
76
  },
77
  "PrepCase": {
78
  "p": 0.9972413793,
@@ -91,18 +91,18 @@
91
  },
92
  "Poss": {
93
  "p": 1.0,
94
- "r": 0.9914529915,
95
- "f": 0.9957081545
96
  },
97
  "AdvType": {
98
- "p": 0.9769230769,
99
- "r": 0.9270072993,
100
- "f": 0.9513108614
101
  },
102
  "PunctSide": {
103
- "p": 0.8037190083,
104
- "r": 0.9396135266,
105
- "f": 0.8663697105
106
  },
107
  "Number[psor]": {
108
  "p": 1.0,
@@ -111,140 +111,140 @@
111
  },
112
  "Polite": {
113
  "p": 1.0,
114
- "r": 1.0,
115
- "f": 1.0
116
  }
117
  },
118
- "sents_p": 0.9900234742,
119
- "sents_r": 0.9871269748,
120
- "sents_f": 0.9885731028,
121
- "dep_uas": 0.9481666365,
122
- "dep_las": 0.9307828605,
123
  "dep_las_per_type": {
124
  "nsubj": {
125
- "p": 0.9525283798,
126
- "r": 0.9538408543,
127
- "f": 0.9531841652
128
  },
129
  "flat": {
130
- "p": 0.9299293862,
131
- "r": 0.9355191257,
132
- "f": 0.9327158812
133
  },
134
  "case": {
135
- "p": 0.9782345828,
136
- "r": 0.9727065047,
137
- "f": 0.9754627118
138
  },
139
  "aux": {
140
- "p": 0.9626666667,
141
- "r": 0.9636946076,
142
- "f": 0.9631803629
143
  },
144
  "root": {
145
- "p": 0.9683098592,
146
- "r": 0.9654768871,
147
- "f": 0.966891298
148
  },
149
  "nummod": {
150
- "p": 0.9256637168,
151
  "r": 0.9207746479,
152
- "f": 0.9232127096
153
  },
154
  "obj": {
155
- "p": 0.9203626562,
156
- "r": 0.9308550186,
157
- "f": 0.925579103
158
  },
159
  "det": {
160
- "p": 0.9870101986,
161
- "r": 0.9871161692,
162
- "f": 0.9870631811
163
  },
164
  "nmod": {
165
- "p": 0.8797849462,
166
- "r": 0.8762047548,
167
- "f": 0.8779912008
168
  },
169
  "amod": {
170
- "p": 0.9637325274,
171
- "r": 0.9626415094,
172
- "f": 0.9631867095
173
  },
174
  "obl": {
175
- "p": 0.8238050609,
176
- "r": 0.8005464481,
177
- "f": 0.8120092379
178
  },
179
  "cc": {
180
- "p": 0.9534883721,
181
- "r": 0.9558916194,
182
- "f": 0.9546884833
183
  },
184
  "fixed": {
185
- "p": 0.9308300395,
186
- "r": 0.9363817097,
187
- "f": 0.9335976214
188
  },
189
  "conj": {
190
- "p": 0.8433179724,
191
- "r": 0.8384879725,
192
- "f": 0.8408960368
193
  },
194
  "advmod": {
195
- "p": 0.8936294565,
196
- "r": 0.8899883586,
197
- "f": 0.891805191
198
  },
199
  "advcl": {
200
- "p": 0.7356181151,
201
- "r": 0.75125,
202
- "f": 0.7433518862
203
  },
204
  "compound": {
205
- "p": 0.9107806691,
206
  "r": 0.8844765343,
207
- "f": 0.8974358974
208
  },
209
  "mark": {
210
- "p": 0.9327146172,
211
  "r": 0.9409011118,
212
- "f": 0.9367899796
213
  },
214
  "cop": {
215
- "p": 0.8978723404,
216
- "r": 0.9134199134,
217
- "f": 0.9055793991
218
  },
219
  "ccomp": {
220
- "p": 0.863340564,
221
- "r": 0.8903803132,
222
- "f": 0.8766519824
223
  },
224
  "acl": {
225
- "p": 0.8622009569,
226
- "r": 0.8680154143,
227
- "f": 0.8650984157
228
  },
229
  "expl:pass": {
230
- "p": 0.7674418605,
231
- "r": 0.7173913043,
232
- "f": 0.7415730337
233
  },
234
  "appos": {
235
- "p": 0.8207070707,
236
- "r": 0.8269720102,
237
- "f": 0.8238276299
238
  },
239
  "xcomp": {
240
- "p": 0.8844339623,
241
  "r": 0.8802816901,
242
- "f": 0.8823529412
243
  },
244
  "iobj": {
245
- "p": 0.8284023669,
246
- "r": 0.7486631016,
247
- "f": 0.7865168539
248
  },
249
  "dep": {
250
  "p": 0.0,
@@ -252,14 +252,14 @@
252
  "f": 0.0
253
  },
254
  "csubj": {
255
- "p": 0.8514851485,
256
  "r": 0.8113207547,
257
- "f": 0.8309178744
258
  },
259
  "parataxis": {
260
- "p": 0.8461538462,
261
- "r": 0.6470588235,
262
- "f": 0.7333333333
263
  },
264
  "nsubj:pass": {
265
  "p": 0.0,
@@ -272,32 +272,32 @@
272
  "f": 0.0
273
  }
274
  },
275
- "tag_acc": 0.9635621944,
276
- "lemma_acc": 0.9817147291,
277
- "ents_p": 0.9170191995,
278
- "ents_r": 0.9116790683,
279
- "ents_f": 0.9143413368,
280
  "ents_per_type": {
281
  "ORG": {
282
- "p": 0.9015219338,
283
- "r": 0.908025248,
284
- "f": 0.9047619048
285
  },
286
  "LOC": {
287
- "p": 0.9416941694,
288
- "r": 0.9184549356,
289
- "f": 0.9299293862
290
  },
291
  "MISC": {
292
- "p": 0.8246073298,
293
- "r": 0.8224543081,
294
- "f": 0.8235294118
295
  },
296
  "PER": {
297
- "p": 0.962406015,
298
- "r": 0.9595202399,
299
- "f": 0.960960961
300
  }
301
  },
302
- "speed": 4373.51976419
303
  }
3
  "token_p": 0.9978112128,
4
  "token_r": 0.9979488063,
5
  "token_f": 0.9978800048,
6
+ "pos_acc": 0.9633641507,
7
+ "morph_acc": 0.9571111935,
8
+ "morph_micro_p": 0.9940523354,
9
+ "morph_micro_r": 0.9845527244,
10
+ "morph_micro_f": 0.9892797253,
11
  "morph_per_feat": {
12
  "Mood": {
13
+ "p": 0.9982771351,
14
  "r": 0.9980314961,
15
+ "f": 0.9981543005
16
  },
17
  "Number": {
18
+ "p": 0.9987333068,
19
+ "r": 0.9904173994,
20
+ "f": 0.9945579702
21
  },
22
  "Person": {
23
+ "p": 0.9989373007,
24
+ "r": 0.9973474801,
25
+ "f": 0.9981417574
26
  },
27
  "Tense": {
28
+ "p": 0.9933082392,
29
+ "r": 0.9960159363,
30
+ "f": 0.994660245
31
  },
32
  "VerbForm": {
33
+ "p": 0.9972026109,
34
+ "r": 0.9972026109,
35
+ "f": 0.9972026109
36
  },
37
  "Gender": {
38
+ "p": 0.9965883614,
39
+ "r": 0.9846865068,
40
+ "f": 0.9906016859
41
  },
42
  "NumType": {
43
+ "p": 0.9688385269,
44
+ "r": 0.9670122526,
45
+ "f": 0.9679245283
46
  },
47
  "Definite": {
48
+ "p": 0.9969747391,
49
+ "r": 0.9709781968,
50
+ "f": 0.9838047615
51
  },
52
  "PronType": {
53
+ "p": 0.9963973237,
54
+ "r": 0.9810810811,
55
+ "f": 0.9886798877
56
  },
57
  "PunctType": {
58
+ "p": 0.9537174721,
59
+ "r": 0.9356309263,
60
+ "f": 0.9445876289
61
  },
62
  "NumForm": {
63
+ "p": 0.9832167832,
64
+ "r": 0.9873595506,
65
+ "f": 0.9852838122
66
  },
67
  "Polarity": {
68
  "p": 1.0,
70
  "f": 0.993485342
71
  },
72
  "Case": {
73
+ "p": 0.9974958264,
74
+ "r": 0.9950041632,
75
+ "f": 0.9962484368
76
  },
77
  "PrepCase": {
78
  "p": 0.9972413793,
91
  },
92
  "Poss": {
93
  "p": 1.0,
94
+ "r": 0.9886039886,
95
+ "f": 0.994269341
96
  },
97
  "AdvType": {
98
+ "p": 0.9847328244,
99
+ "r": 0.9416058394,
100
+ "f": 0.9626865672
101
  },
102
  "PunctSide": {
103
+ "p": 0.8588807786,
104
+ "r": 0.8526570048,
105
+ "f": 0.8557575758
106
  },
107
  "Number[psor]": {
108
  "p": 1.0,
111
  },
112
  "Polite": {
113
  "p": 1.0,
114
+ "r": 0.75,
115
+ "f": 0.8571428571
116
  }
117
  },
118
+ "sents_p": 0.9941314554,
119
+ "sents_r": 0.9912229374,
120
+ "sents_f": 0.9926750659,
121
+ "dep_uas": 0.9484123582,
122
+ "dep_las": 0.930752303,
123
  "dep_las_per_type": {
124
  "nsubj": {
125
+ "p": 0.9565067311,
126
+ "r": 0.9545297968,
127
+ "f": 0.9555172414
128
  },
129
  "flat": {
130
+ "p": 0.9405940594,
131
+ "r": 0.9344262295,
132
+ "f": 0.9375
133
  },
134
  "case": {
135
+ "p": 0.9791868345,
136
+ "r": 0.9729469761,
137
+ "f": 0.9760569326
138
  },
139
  "aux": {
140
+ "p": 0.9641519529,
141
+ "r": 0.9620928991,
142
+ "f": 0.9631213255
143
  },
144
  "root": {
145
+ "p": 0.9700704225,
146
+ "r": 0.9672322996,
147
+ "f": 0.9686492822
148
  },
149
  "nummod": {
150
+ "p": 0.9240282686,
151
  "r": 0.9207746479,
152
+ "f": 0.9223985891
153
  },
154
  "obj": {
155
+ "p": 0.9109390126,
156
+ "r": 0.9328376704,
157
+ "f": 0.9217582956
158
  },
159
  "det": {
160
+ "p": 0.9860365199,
161
+ "r": 0.9856130556,
162
+ "f": 0.9858247423
163
  },
164
  "nmod": {
165
+ "p": 0.8819821778,
166
+ "r": 0.8691368601,
167
+ "f": 0.8755124056
168
  },
169
  "amod": {
170
+ "p": 0.9623210249,
171
+ "r": 0.9637735849,
172
+ "f": 0.9630467572
173
  },
174
  "obl": {
175
+ "p": 0.8178421298,
176
+ "r": 0.7973588342,
177
+ "f": 0.8074706018
178
  },
179
  "cc": {
180
+ "p": 0.955,
181
+ "r": 0.9628229364,
182
+ "f": 0.958895513
183
  },
184
  "fixed": {
185
+ "p": 0.9290640394,
186
+ "r": 0.9373757455,
187
+ "f": 0.9332013855
188
  },
189
  "conj": {
190
+ "p": 0.8430821147,
191
+ "r": 0.8585337915,
192
+ "f": 0.850737798
193
  },
194
  "advmod": {
195
+ "p": 0.898816568,
196
+ "r": 0.8841676368,
197
+ "f": 0.8914319249
198
  },
199
  "advcl": {
200
+ "p": 0.7384987893,
201
+ "r": 0.7625,
202
+ "f": 0.7503075031
203
  },
204
  "compound": {
205
+ "p": 0.9074074074,
206
  "r": 0.8844765343,
207
+ "f": 0.8957952468
208
  },
209
  "mark": {
210
+ "p": 0.9359720605,
211
  "r": 0.9409011118,
212
+ "f": 0.9384301138
213
  },
214
  "cop": {
215
+ "p": 0.8928571429,
216
+ "r": 0.9199134199,
217
+ "f": 0.9061833689
218
  },
219
  "ccomp": {
220
+ "p": 0.8586956522,
221
+ "r": 0.8836689038,
222
+ "f": 0.8710033076
223
  },
224
  "acl": {
225
+ "p": 0.8649951784,
226
+ "r": 0.8641618497,
227
+ "f": 0.8645783133
228
  },
229
  "expl:pass": {
230
+ "p": 0.6818181818,
231
+ "r": 0.652173913,
232
+ "f": 0.6666666667
233
  },
234
  "appos": {
235
+ "p": 0.8253164557,
236
+ "r": 0.8295165394,
237
+ "f": 0.8274111675
238
  },
239
  "xcomp": {
240
+ "p": 0.8886255924,
241
  "r": 0.8802816901,
242
+ "f": 0.8844339623
243
  },
244
  "iobj": {
245
+ "p": 0.8529411765,
246
+ "r": 0.7754010695,
247
+ "f": 0.81232493
248
  },
249
  "dep": {
250
  "p": 0.0,
252
  "f": 0.0
253
  },
254
  "csubj": {
255
+ "p": 0.8269230769,
256
  "r": 0.8113207547,
257
+ "f": 0.819047619
258
  },
259
  "parataxis": {
260
+ "p": 0.8571428571,
261
+ "r": 0.5294117647,
262
+ "f": 0.6545454545
263
  },
264
  "nsubj:pass": {
265
  "p": 0.0,
272
  "f": 0.0
273
  }
274
  },
275
+ "tag_acc": 0.9633641507,
276
+ "lemma_acc": 0.9816802448,
277
+ "ents_p": 0.9222476315,
278
+ "ents_r": 0.9132966677,
279
+ "ents_f": 0.9177503251,
280
  "ents_per_type": {
281
  "ORG": {
282
+ "p": 0.9086799277,
283
+ "r": 0.9062218215,
284
+ "f": 0.9074492099
285
  },
286
  "LOC": {
287
+ "p": 0.9441401972,
288
+ "r": 0.9248927039,
289
+ "f": 0.9344173442
290
  },
291
  "MISC": {
292
+ "p": 0.837398374,
293
+ "r": 0.8067885117,
294
+ "f": 0.8218085106
295
  },
296
  "PER": {
297
+ "p": 0.9613670134,
298
+ "r": 0.9700149925,
299
+ "f": 0.9656716418
300
  }
301
  },
302
+ "speed": 2440.4721637988
303
  }
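The `p`, `r`, and `f` triples in `accuracy.json` appear to follow the usual harmonic-mean relationship between precision, recall, and F-score. The sketch below is not part of the commit: it recomputes `ents_f` from the `ents_p` and `ents_r` values on both sides of the diff. The helper name `f_score` is hypothetical, and the zero guard is an assumption motivated by entries such as `dep`, which report `p` and `f` of `0.0`.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# ents_p / ents_r copied from the removed (-) and added (+) lines of accuracy.json
old_f = f_score(0.9170191995, 0.9116790683)
new_f = f_score(0.9222476315, 0.9132966677)

print(f"ents_f old={old_f:.10f} new={new_f:.10f} delta={new_f - old_f:+.4f}")
```

The same function can be pointed at any other `p`/`r` pair in the file to check that the reported `f` is consistent.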
ca_core_news_trf-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0034682b1385c3b6b560dd597ee16f21f8ae54fcbd48f4bbc3de4201617899ac
- size 459874587
+ oid sha256:334c4784bfa81b8608901ca77618b322ff24118a4350131a5009f6dcb593d32d
+ size 459879407
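The `.whl` and `*/model` entries in this commit are Git LFS pointer files: three `key value` lines giving the spec version, a `sha256` object ID, and the size in bytes. The sketch below is one way to check a downloaded file against such a pointer. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of any Git LFS tooling.

```python
import hashlib
import os


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,            # e.g. "sha256"
        "digest": digest,        # hex digest of the real file
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the added (+) side of the .whl diff
NEW_WHEEL_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:334c4784bfa81b8608901ca77618b322ff24118a4350131a5009f6dcb593d32d
size 459879407
"""
pointer = parse_lfs_pointer(NEW_WHEEL_POINTER)
```

Checking the size first is a cheap way to reject a truncated download before hashing several hundred megabytes.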
config.cfg CHANGED
@@ -35,6 +35,7 @@ scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
  [components.morphologizer]
  factory = "morphologizer"
  extend = false
+ label_smoothing = 0.0
  overwrite = true
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
  
@@ -100,8 +101,8 @@ max_batch_items = 4096
  set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}
  
  [components.transformer.model]
- @architectures = "spacy-transformers.TransformerModel.v3"
  name = "projecte-aina/roberta-base-ca-v2"
+ @architectures = "spacy-transformers.TransformerModel.v3"
  mixed_precision = false
  
  [components.transformer.model.get_spans]
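The only functional change in `config.cfg` is the new `label_smoothing = 0.0` key under `[components.morphologizer]`; the `@architectures` line merely moves below `name`. As a rough illustration, the fragment can be read with Python's standard `configparser`. This is an approximation that only works because the fragment is simple: spaCy reads its configs with its own config system, not `configparser`, and the `FRAGMENT` string here is copied from the added side of the hunk.

```python
import configparser

# Morphologizer section as it appears on the added (+) side of the diff
FRAGMENT = """
[components.morphologizer]
factory = "morphologizer"
extend = false
label_smoothing = 0.0
overwrite = true
scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
"""

# Restrict delimiters to "=" so the ":" inside the JSON-like scorer value is left alone,
# and disable interpolation so literal "%" characters would not be interpreted.
parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
parser.read_string(FRAGMENT)

section = parser["components.morphologizer"]
label_smoothing = section.getfloat("label_smoothing")
extend = section.getboolean("extend")
factory = section["factory"].strip('"')  # values keep their quotes in this format

print(label_smoothing, extend, factory)
```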
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"ca",
3
  "name":"core_news_trf",
4
- "version":"3.5.0",
5
  "description":"Catalan transformer pipeline (projecte-aina/roberta-base-ca-v2). Components: transformer, morphologizer, parser, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"GNU GPL 3.0",
10
- "spacy_version":">=3.5.0,<3.6.0",
11
- "spacy_git_version":"9e0322de1",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -372,66 +372,66 @@
372
  "token_p":0.9978112128,
373
  "token_r":0.9979488063,
374
  "token_f":0.9978800048,
375
- "pos_acc":0.9635621944,
376
- "morph_acc":0.9569642087,
377
- "morph_micro_p":0.9937208629,
378
- "morph_micro_r":0.988864054,
379
- "morph_micro_f":0.9912865095,
380
  "morph_per_feat":{
381
  "Mood":{
382
- "p":0.9985228951,
383
  "r":0.9980314961,
384
- "f":0.9982771351
385
  },
386
  "Number":{
387
- "p":0.9987019075,
388
- "r":0.9940422783,
389
- "f":0.9963666451
390
  },
391
  "Person":{
392
- "p":0.9991145741,
393
- "r":0.9977011494,
394
- "f":0.9984073615
395
  },
396
  "Tense":{
397
- "p":0.9935200669,
398
- "r":0.996644999,
399
- "f":0.9950800796
400
  },
401
  "VerbForm":{
402
- "p":0.9977333333,
403
- "r":0.9968029839,
404
- "f":0.9972679416
405
  },
406
  "Gender":{
407
- "p":0.9966032609,
408
- "r":0.9890205143,
409
- "f":0.992797409
410
  },
411
  "NumType":{
412
- "p":0.9706994329,
413
- "r":0.9679547597,
414
- "f":0.9693251534
415
  },
416
  "Definite":{
417
- "p":0.9967227767,
418
- "r":0.9857100766,
419
- "f":0.9911858381
420
  },
421
  "PronType":{
422
- "p":0.9961737947,
423
- "r":0.989527027,
424
- "f":0.9928392865
425
  },
426
  "PunctType":{
427
- "p":0.9540806841,
428
- "r":0.935813275,
429
- "f":0.9448586947
430
  },
431
  "NumForm":{
432
- "p":0.9791666667,
433
- "r":0.9901685393,
434
- "f":0.9846368715
435
  },
436
  "Polarity":{
437
  "p":1.0,
@@ -439,9 +439,9 @@
439
  "f":0.993485342
440
  },
441
  "Case":{
442
- "p":0.9974979149,
443
- "r":0.9958368027,
444
- "f":0.9966666667
445
  },
446
  "PrepCase":{
447
  "p":0.9972413793,
@@ -460,18 +460,18 @@
460
  },
461
  "Poss":{
462
  "p":1.0,
463
- "r":0.9914529915,
464
- "f":0.9957081545
465
  },
466
  "AdvType":{
467
- "p":0.9769230769,
468
- "r":0.9270072993,
469
- "f":0.9513108614
470
  },
471
  "PunctSide":{
472
- "p":0.8037190083,
473
- "r":0.9396135266,
474
- "f":0.8663697105
475
  },
476
  "Number[psor]":{
477
  "p":1.0,
@@ -480,140 +480,140 @@
480
  },
481
  "Polite":{
482
  "p":1.0,
483
- "r":1.0,
484
- "f":1.0
485
  }
486
  },
487
- "sents_p":0.9900234742,
488
- "sents_r":0.9871269748,
489
- "sents_f":0.9885731028,
490
- "dep_uas":0.9481666365,
491
- "dep_las":0.9307828605,
492
  "dep_las_per_type":{
493
  "nsubj":{
494
- "p":0.9525283798,
495
- "r":0.9538408543,
496
- "f":0.9531841652
497
  },
498
  "flat":{
499
- "p":0.9299293862,
500
- "r":0.9355191257,
501
- "f":0.9327158812
502
  },
503
  "case":{
504
- "p":0.9782345828,
505
- "r":0.9727065047,
506
- "f":0.9754627118
507
  },
508
  "aux":{
509
- "p":0.9626666667,
510
- "r":0.9636946076,
511
- "f":0.9631803629
512
  },
513
  "root":{
514
- "p":0.9683098592,
515
- "r":0.9654768871,
516
- "f":0.966891298
517
  },
518
  "nummod":{
519
- "p":0.9256637168,
520
  "r":0.9207746479,
521
- "f":0.9232127096
522
  },
523
  "obj":{
524
- "p":0.9203626562,
525
- "r":0.9308550186,
526
- "f":0.925579103
527
  },
528
  "det":{
529
- "p":0.9870101986,
530
- "r":0.9871161692,
531
- "f":0.9870631811
532
  },
533
  "nmod":{
534
- "p":0.8797849462,
535
- "r":0.8762047548,
536
- "f":0.8779912008
537
  },
538
  "amod":{
539
- "p":0.9637325274,
540
- "r":0.9626415094,
541
- "f":0.9631867095
542
  },
543
  "obl":{
544
- "p":0.8238050609,
545
- "r":0.8005464481,
546
- "f":0.8120092379
547
  },
548
  "cc":{
549
- "p":0.9534883721,
550
- "r":0.9558916194,
551
- "f":0.9546884833
552
  },
553
  "fixed":{
554
- "p":0.9308300395,
555
- "r":0.9363817097,
556
- "f":0.9335976214
557
  },
558
  "conj":{
559
- "p":0.8433179724,
560
- "r":0.8384879725,
561
- "f":0.8408960368
562
  },
563
  "advmod":{
564
- "p":0.8936294565,
565
- "r":0.8899883586,
566
- "f":0.891805191
567
  },
568
  "advcl":{
569
- "p":0.7356181151,
570
- "r":0.75125,
571
- "f":0.7433518862
572
  },
573
  "compound":{
574
- "p":0.9107806691,
575
  "r":0.8844765343,
576
- "f":0.8974358974
577
  },
578
  "mark":{
579
- "p":0.9327146172,
580
  "r":0.9409011118,
581
- "f":0.9367899796
582
  },
583
  "cop":{
584
- "p":0.8978723404,
585
- "r":0.9134199134,
586
- "f":0.9055793991
587
  },
588
  "ccomp":{
589
- "p":0.863340564,
590
- "r":0.8903803132,
591
- "f":0.8766519824
592
  },
593
  "acl":{
594
- "p":0.8622009569,
595
- "r":0.8680154143,
596
- "f":0.8650984157
597
  },
598
  "expl:pass":{
599
- "p":0.7674418605,
600
- "r":0.7173913043,
601
- "f":0.7415730337
602
  },
603
  "appos":{
604
- "p":0.8207070707,
605
- "r":0.8269720102,
606
- "f":0.8238276299
607
  },
608
  "xcomp":{
609
- "p":0.8844339623,
610
  "r":0.8802816901,
611
- "f":0.8823529412
612
  },
613
  "iobj":{
614
- "p":0.8284023669,
615
- "r":0.7486631016,
616
- "f":0.7865168539
617
  },
618
  "dep":{
619
  "p":0.0,
@@ -621,14 +621,14 @@
621
  "f":0.0
622
  },
623
  "csubj":{
624
- "p":0.8514851485,
625
  "r":0.8113207547,
626
- "f":0.8309178744
627
  },
628
  "parataxis":{
629
- "p":0.8461538462,
630
- "r":0.6470588235,
631
- "f":0.7333333333
632
  },
633
  "nsubj:pass":{
634
  "p":0.0,
@@ -641,34 +641,34 @@
641
  "f":0.0
642
  }
643
  },
644
- "tag_acc":0.9635621944,
645
- "lemma_acc":0.9817147291,
646
- "ents_p":0.9170191995,
647
- "ents_r":0.9116790683,
648
- "ents_f":0.9143413368,
649
  "ents_per_type":{
650
  "ORG":{
651
- "p":0.9015219338,
652
- "r":0.908025248,
653
- "f":0.9047619048
654
  },
655
  "LOC":{
656
- "p":0.9416941694,
657
- "r":0.9184549356,
658
- "f":0.9299293862
659
  },
660
  "MISC":{
661
- "p":0.8246073298,
662
- "r":0.8224543081,
663
- "f":0.8235294118
664
  },
665
  "PER":{
666
- "p":0.962406015,
667
- "r":0.9595202399,
668
- "f":0.960960961
669
  }
670
  },
671
- "speed":4373.51976419
672
  },
673
  "sources":[
674
  {
@@ -678,8 +678,8 @@
678
  "author":"Mart\u00ednez Alonso, H\u00e9ctor; Pascual, Elena; Zeman, Daniel"
679
  },
680
  {
681
- "name":"UD Catalan AnCora v2.8 + NER v3.2.8",
682
- "url":"https://github.com/TeMU-BSC/spacy/releases/tag/3.2.8",
683
  "license":"CC BY 4.0",
684
  "author":"Carlos Rodr\u00edguez-Penagos and Carme Armentano-Oller"
685
  },
@@ -697,6 +697,6 @@
697
  }
698
  ],
699
  "requirements":[
700
- "spacy-transformers>=1.2.0.dev0,<1.3.0"
701
  ]
702
  }
1
  {
2
  "lang":"ca",
3
  "name":"core_news_trf",
4
+ "version":"3.6.1",
5
  "description":"Catalan transformer pipeline (projecte-aina/roberta-base-ca-v2). Components: transformer, morphologizer, parser, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"GNU GPL 3.0",
10
+ "spacy_version":">=3.6.0,<3.7.0",
11
+ "spacy_git_version":"c067b5264",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
372
  "token_p":0.9978112128,
373
  "token_r":0.9979488063,
374
  "token_f":0.9978800048,
375
+ "pos_acc":0.9633641507,
376
+ "morph_acc":0.9571111935,
377
+ "morph_micro_p":0.9940523354,
378
+ "morph_micro_r":0.9845527244,
379
+ "morph_micro_f":0.9892797253,
380
  "morph_per_feat":{
381
  "Mood":{
382
+ "p":0.9982771351,
383
  "r":0.9980314961,
384
+ "f":0.9981543005
385
  },
386
  "Number":{
387
+ "p":0.9987333068,
388
+ "r":0.9904173994,
389
+ "f":0.9945579702
390
  },
391
  "Person":{
392
+ "p":0.9989373007,
393
+ "r":0.9973474801,
394
+ "f":0.9981417574
395
  },
396
  "Tense":{
397
+ "p":0.9933082392,
398
+ "r":0.9960159363,
399
+ "f":0.994660245
400
  },
401
  "VerbForm":{
402
+ "p":0.9972026109,
403
+ "r":0.9972026109,
404
+ "f":0.9972026109
405
  },
406
  "Gender":{
407
+ "p":0.9965883614,
408
+ "r":0.9846865068,
409
+ "f":0.9906016859
410
  },
411
  "NumType":{
412
+ "p":0.9688385269,
413
+ "r":0.9670122526,
414
+ "f":0.9679245283
415
  },
416
  "Definite":{
417
+ "p":0.9969747391,
418
+ "r":0.9709781968,
419
+ "f":0.9838047615
420
  },
421
  "PronType":{
422
+ "p":0.9963973237,
423
+ "r":0.9810810811,
424
+ "f":0.9886798877
425
  },
426
  "PunctType":{
427
+ "p":0.9537174721,
428
+ "r":0.9356309263,
429
+ "f":0.9445876289
430
  },
431
  "NumForm":{
432
+ "p":0.9832167832,
433
+ "r":0.9873595506,
434
+ "f":0.9852838122
435
  },
436
  "Polarity":{
437
  "p":1.0,
439
  "f":0.993485342
440
  },
441
  "Case":{
442
+ "p":0.9974958264,
443
+ "r":0.9950041632,
444
+ "f":0.9962484368
445
  },
446
  "PrepCase":{
447
  "p":0.9972413793,
460
  },
461
  "Poss":{
462
  "p":1.0,
463
+ "r":0.9886039886,
464
+ "f":0.994269341
465
  },
466
  "AdvType":{
467
+ "p":0.9847328244,
468
+ "r":0.9416058394,
469
+ "f":0.9626865672
470
  },
471
  "PunctSide":{
472
+ "p":0.8588807786,
473
+ "r":0.8526570048,
474
+ "f":0.8557575758
475
  },
476
  "Number[psor]":{
477
  "p":1.0,
480
  },
481
  "Polite":{
482
  "p":1.0,
483
+ "r":0.75,
484
+ "f":0.8571428571
485
  }
486
  },
487
+ "sents_p":0.9941314554,
488
+ "sents_r":0.9912229374,
489
+ "sents_f":0.9926750659,
490
+ "dep_uas":0.9484123582,
491
+ "dep_las":0.930752303,
492
  "dep_las_per_type":{
493
  "nsubj":{
494
+ "p":0.9565067311,
495
+ "r":0.9545297968,
496
+ "f":0.9555172414
497
  },
498
  "flat":{
499
+ "p":0.9405940594,
500
+ "r":0.9344262295,
501
+ "f":0.9375
502
  },
503
  "case":{
504
+ "p":0.9791868345,
505
+ "r":0.9729469761,
506
+ "f":0.9760569326
507
  },
508
  "aux":{
509
+ "p":0.9641519529,
510
+ "r":0.9620928991,
511
+ "f":0.9631213255
512
  },
513
  "root":{
514
+ "p":0.9700704225,
515
+ "r":0.9672322996,
516
+ "f":0.9686492822
517
  },
518
  "nummod":{
519
+ "p":0.9240282686,
520
  "r":0.9207746479,
521
+ "f":0.9223985891
522
  },
523
  "obj":{
524
+ "p":0.9109390126,
525
+ "r":0.9328376704,
526
+ "f":0.9217582956
527
  },
528
  "det":{
529
+ "p":0.9860365199,
530
+ "r":0.9856130556,
531
+ "f":0.9858247423
532
  },
533
  "nmod":{
534
+ "p":0.8819821778,
535
+ "r":0.8691368601,
536
+ "f":0.8755124056
537
  },
538
  "amod":{
539
+ "p":0.9623210249,
540
+ "r":0.9637735849,
541
+ "f":0.9630467572
542
  },
543
  "obl":{
544
+ "p":0.8178421298,
545
+ "r":0.7973588342,
546
+ "f":0.8074706018
547
  },
548
  "cc":{
549
+ "p":0.955,
550
+ "r":0.9628229364,
551
+ "f":0.958895513
552
  },
553
  "fixed":{
554
+ "p":0.9290640394,
555
+ "r":0.9373757455,
556
+ "f":0.9332013855
557
  },
558
  "conj":{
559
+ "p":0.8430821147,
560
+ "r":0.8585337915,
561
+ "f":0.850737798
562
  },
563
  "advmod":{
564
+ "p":0.898816568,
565
+ "r":0.8841676368,
566
+ "f":0.8914319249
567
  },
568
  "advcl":{
569
+ "p":0.7384987893,
570
+ "r":0.7625,
571
+ "f":0.7503075031
572
  },
573
  "compound":{
574
+ "p":0.9074074074,
575
  "r":0.8844765343,
576
+ "f":0.8957952468
577
  },
578
  "mark":{
579
+ "p":0.9359720605,
580
  "r":0.9409011118,
581
+ "f":0.9384301138
582
  },
583
  "cop":{
584
+ "p":0.8928571429,
585
+ "r":0.9199134199,
586
+ "f":0.9061833689
587
  },
588
  "ccomp":{
589
+ "p":0.8586956522,
590
+ "r":0.8836689038,
591
+ "f":0.8710033076
592
  },
593
  "acl":{
594
+ "p":0.8649951784,
595
+ "r":0.8641618497,
596
+ "f":0.8645783133
597
  },
598
  "expl:pass":{
599
+ "p":0.6818181818,
600
+ "r":0.652173913,
601
+ "f":0.6666666667
602
  },
603
  "appos":{
604
+ "p":0.8253164557,
605
+ "r":0.8295165394,
606
+ "f":0.8274111675
607
  },
608
  "xcomp":{
609
+ "p":0.8886255924,
610
  "r":0.8802816901,
611
+ "f":0.8844339623
612
  },
613
  "iobj":{
614
+ "p":0.8529411765,
615
+ "r":0.7754010695,
616
+ "f":0.81232493
617
  },
618
  "dep":{
619
  "p":0.0,
621
  "f":0.0
622
  },
623
  "csubj":{
624
+ "p":0.8269230769,
625
  "r":0.8113207547,
626
+ "f":0.819047619
627
  },
628
  "parataxis":{
629
+ "p":0.8571428571,
630
+ "r":0.5294117647,
631
+ "f":0.6545454545
632
  },
633
  "nsubj:pass":{
634
  "p":0.0,
641
  "f":0.0
642
  }
643
  },
644
+ "tag_acc":0.9633641507,
645
+ "lemma_acc":0.9816802448,
646
+ "ents_p":0.9222476315,
647
+ "ents_r":0.9132966677,
648
+ "ents_f":0.9177503251,
649
  "ents_per_type":{
650
  "ORG":{
651
+ "p":0.9086799277,
652
+ "r":0.9062218215,
653
+ "f":0.9074492099
654
  },
655
  "LOC":{
656
+ "p":0.9441401972,
657
+ "r":0.9248927039,
658
+ "f":0.9344173442
659
  },
660
  "MISC":{
661
+ "p":0.837398374,
662
+ "r":0.8067885117,
663
+ "f":0.8218085106
664
  },
665
  "PER":{
666
+ "p":0.9613670134,
667
+ "r":0.9700149925,
668
+ "f":0.9656716418
669
  }
670
  },
671
+ "speed":2440.4721637988
672
  },
673
  "sources":[
674
  {
678
  "author":"Mart\u00ednez Alonso, H\u00e9ctor; Pascual, Elena; Zeman, Daniel"
679
  },
680
  {
681
+ "name":"UD Catalan AnCora v2.8 + NER v3.2.9",
682
+ "url":"https://github.com/TeMU-BSC/spacy/releases/tag/3.2.9",
683
  "license":"CC BY 4.0",
684
  "author":"Carlos Rodr\u00edguez-Penagos and Carme Armentano-Oller"
685
  },
697
  }
698
  ],
699
  "requirements":[
700
+ "spacy-transformers>=1.2.2,<1.3.0"
701
  ]
702
  }
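`meta.json` now pins `spacy_version` to `>=3.6.0,<3.7.0` and raises the `spacy-transformers` requirement to `>=1.2.2,<1.3.0`. The sketch below shows how such a range can be evaluated. It is a deliberately simplified, hypothetical checker that handles only the `>=` and `<` operators and purely numeric versions; it would not parse the pre-release form `1.2.0.dev0` that appears on the removed side, and real tooling uses a full version-specifier implementation.

```python
def parse_version(text: str) -> tuple:
    """Turn '3.6.1' into (3, 6, 1); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check `version` against a comma-separated spec built from '>=' and '<' clauses."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            if current < parse_version(clause[2:]):
                return False
        elif clause.startswith("<"):
            if current >= parse_version(clause[1:]):
                return False
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


# Ranges copied from the added (+) side of meta.json
SPACY_RANGE = ">=3.6.0,<3.7.0"
TRANSFORMERS_RANGE = ">=1.2.2,<1.3.0"

print(satisfies("3.6.1", SPACY_RANGE))   # a 3.6.x release is inside the new range
print(satisfies("3.5.0", SPACY_RANGE))   # the previously supported release is not
```

Because the lower bound moved from 3.5.0 to 3.6.0, an environment still on spaCy 3.5 falls outside the range this pipeline version declares.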
morphologizer/cfg CHANGED
@@ -1,5 +1,6 @@
  {
  "extend":false,
+ "label_smoothing":0.0,
  "labels_morph":{
  "Definite=Def|Gender=Masc|Number=Sing|POS=DET|PronType=Art":"Definite=Def|Gender=Masc|Number=Sing|PronType=Art",
  "POS=PROPN":"",
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:13cdaf21db1a6ee3e2a662e3974fe776a1c2cde7cb9f5f5a997910648f091400
+ oid sha256:2e4c677b72c8e1cf128058b042bba6f392ef97aa5c508e066b9c8d275c1b6e46
  size 871161
ner/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:738084ba5bb6a92fab74a7063686e66c2dbf59274efbe61a4f8d6f7a82898973
+ oid sha256:b6c5724cc10367a4af366b99ea2033c4d1c5ae9e3da40b0139631040e0f79906
  size 225962
parser/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d59e4e1bfd3e2064a1ce9b374b335d4186084af8d498f618926b703f691ab403
+ oid sha256:996d6350c1b4f12065344c8cc1c8279d7440f521570b7bc41fb06be1cc1aa4e9
  size 460325
transformer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:066984391607d3df82c79a225303de0e9094bc04c0d2c029fa35e290e6d4a545
- size 502217416
+ oid sha256:ef7515cc05db5dac9bd310f575fffd807088cce8171bfa3734b7fc63cdb318a3
+ size 502217324