EC2 Default User committed on
Commit
0c4ebde
1 Parent(s): 239866e

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -1,554 +1,3 @@
1
- # Lemmatization Lists
2
-
3
- * Author: Michal Měchura
4
- * URL: https://github.com/michmech/lemmatization-lists/
5
- * License: ODbL
6
-
7
- ```
8
- ## ODC Open Database License (ODbL)
9
-
10
- ### Preamble
11
-
12
- The Open Database License (ODbL) is a license agreement intended to
13
- allow users to freely share, modify, and use this Database while
14
- maintaining this same freedom for others. Many databases are covered by
15
- copyright, and therefore this document licenses these rights. Some
16
- jurisdictions, mainly in the European Union, have specific rights that
17
- cover databases, and so the ODbL addresses these rights, too. Finally,
18
- the ODbL is also an agreement in contract for users of this Database to
19
- act in certain ways in return for accessing this Database.
20
-
21
- Databases can contain a wide variety of types of content (images,
22
- audiovisual material, and sounds all in the same database, for example),
23
- and so the ODbL only governs the rights over the Database, and not the
24
- contents of the Database individually. Licensors should use the ODbL
25
- together with another license for the contents, if the contents have a
26
- single set of rights that uniformly covers all of the contents. If the
27
- contents have multiple sets of different rights, Licensors should
28
- describe what rights govern what contents together in the individual
29
- record or in some other way that clarifies what rights apply.
30
-
31
- Sometimes the contents of a database, or the database itself, can be
32
- covered by other rights not addressed here (such as private contracts,
33
- trade mark over the name, or privacy rights / data protection rights
34
- over information in the contents), and so you are advised that you may
35
- have to consult other documents or clear other rights before doing
36
- activities not covered by this License.
37
-
38
- ------
39
-
40
- The Licensor (as defined below)
41
-
42
- and
43
-
44
- You (as defined below)
45
-
46
- agree as follows:
47
-
48
- ### 1.0 Definitions of Capitalised Words
49
-
50
- "Collective Database" – Means this Database in unmodified form as part
51
- of a collection of independent databases in themselves that together are
52
- assembled into a collective whole. A work that constitutes a Collective
53
- Database will not be considered a Derivative Database.
54
-
55
- "Convey" – As a verb, means Using the Database, a Derivative Database,
56
- or the Database as part of a Collective Database in any way that enables
57
- a Person to make or receive copies of the Database or a Derivative
58
- Database. Conveying does not include interaction with a user through a
59
- computer network, or creating and Using a Produced Work, where no
60
- transfer of a copy of the Database or a Derivative Database occurs.
61
- "Contents" – The contents of this Database, which includes the
62
- information, independent works, or other material collected into the
63
- Database. For example, the contents of the Database could be factual
64
- data or works such as images, audiovisual material, text, or sounds.
65
-
66
- "Database" – A collection of material (the Contents) arranged in a
67
- systematic or methodical way and individually accessible by electronic
68
- or other means offered under the terms of this License.
69
-
70
- "Database Directive" – Means Directive 96/9/EC of the European
71
- Parliament and of the Council of 11 March 1996 on the legal protection
72
- of databases, as amended or succeeded.
73
-
74
- "Database Right" – Means rights resulting from the Chapter III ("sui
75
- generis") rights in the Database Directive (as amended and as transposed
76
- by member states), which includes the Extraction and Re-utilisation of
77
- the whole or a Substantial part of the Contents, as well as any similar
78
- rights available in the relevant jurisdiction under Section 10.4.
79
-
80
- "Derivative Database" – Means a database based upon the Database, and
81
- includes any translation, adaptation, arrangement, modification, or any
82
- other alteration of the Database or of a Substantial part of the
83
- Contents. This includes, but is not limited to, Extracting or
84
- Re-utilising the whole or a Substantial part of the Contents in a new
85
- Database.
86
-
87
- "Extraction" – Means the permanent or temporary transfer of all or a
88
- Substantial part of the Contents to another medium by any means or in
89
- any form.
90
-
91
- "License" – Means this license agreement and is both a license of rights
92
- such as copyright and Database Rights and an agreement in contract.
93
-
94
- "Licensor" – Means the Person that offers the Database under the terms
95
- of this License.
96
-
97
- "Person" – Means a natural or legal person or a body of persons
98
- corporate or incorporate.
99
-
100
- "Produced Work" – a work (such as an image, audiovisual material, text,
101
- or sounds) resulting from using the whole or a Substantial part of the
102
- Contents (via a search or other query) from this Database, a Derivative
103
- Database, or this Database as part of a Collective Database.
104
-
105
- "Publicly" – means to Persons other than You or under Your control by
106
- either more than 50% ownership or by the power to direct their
107
- activities (such as contracting with an independent consultant).
108
-
109
- "Re-utilisation" – means any form of making available to the public all
110
- or a Substantial part of the Contents by the distribution of copies, by
111
- renting, by online or other forms of transmission.
112
-
113
- "Substantial" – Means substantial in terms of quantity or quality or a
114
- combination of both. The repeated and systematic Extraction or
115
- Re-utilisation of insubstantial parts of the Contents may amount to the
116
- Extraction or Re-utilisation of a Substantial part of the Contents.
117
-
118
- "Use" – As a verb, means doing any act that is restricted by copyright
119
- or Database Rights whether in the original medium or any other; and
120
- includes without limitation distributing, copying, publicly performing,
121
- publicly displaying, and preparing derivative works of the Database, as
122
- well as modifying the Database as may be technically necessary to use it
123
- in a different mode or format.
124
-
125
- "You" – Means a Person exercising rights under this License who has not
126
- previously violated the terms of this License with respect to the
127
- Database, or who has received express permission from the Licensor to
128
- exercise rights under this License despite a previous violation.
129
-
130
- Words in the singular include the plural and vice versa.
131
-
132
- ### 2.0 What this License covers
133
-
134
- 2.1. Legal effect of this document. This License is:
135
-
136
- a. A license of applicable copyright and neighbouring rights;
137
-
138
- b. A license of the Database Right; and
139
-
140
- c. An agreement in contract between You and the Licensor.
141
-
142
- 2.2 Legal rights covered. This License covers the legal rights in the
143
- Database, including:
144
-
145
- a. Copyright. Any copyright or neighbouring rights in the Database.
146
- The copyright licensed includes any individual elements of the
147
- Database, but does not cover the copyright over the Contents
148
- independent of this Database. See Section 2.4 for details. Copyright
149
- law varies between jurisdictions, but is likely to cover: the Database
150
- model or schema, which is the structure, arrangement, and organisation
151
- of the Database, and can also include the Database tables and table
152
- indexes; the data entry and output sheets; and the Field names of
153
- Contents stored in the Database;
154
-
155
- b. Database Rights. Database Rights only extend to the Extraction and
156
- Re-utilisation of the whole or a Substantial part of the Contents.
157
- Database Rights can apply even when there is no copyright over the
158
- Database. Database Rights can also apply when the Contents are removed
159
- from the Database and are selected and arranged in a way that would
160
- not infringe any applicable copyright; and
161
-
162
- c. Contract. This is an agreement between You and the Licensor for
163
- access to the Database. In return you agree to certain conditions of
164
- use on this access as outlined in this License.
165
-
166
- 2.3 Rights not covered.
167
-
168
- a. This License does not apply to computer programs used in the making
169
- or operation of the Database;
170
-
171
- b. This License does not cover any patents over the Contents or the
172
- Database; and
173
-
174
- c. This License does not cover any trademarks associated with the
175
- Database.
176
-
177
- 2.4 Relationship to Contents in the Database. The individual items of
178
- the Contents contained in this Database may be covered by other rights,
179
- including copyright, patent, data protection, privacy, or personality
180
- rights, and this License does not cover any rights (other than Database
181
- Rights or in contract) in individual Contents contained in the Database.
182
- For example, if used on a Database of images (the Contents), this
183
- License would not apply to copyright over individual images, which could
184
- have their own separate licenses, or one single license covering all of
185
- the rights over the images.
186
-
187
- ### 3.0 Rights granted
188
-
189
- 3.1 Subject to the terms and conditions of this License, the Licensor
190
- grants to You a worldwide, royalty-free, non-exclusive, terminable (but
191
- only under Section 9) license to Use the Database for the duration of
192
- any applicable copyright and Database Rights. These rights explicitly
193
- include commercial use, and do not exclude any field of endeavour. To
194
- the extent possible in the relevant jurisdiction, these rights may be
195
- exercised in all media and formats whether now known or created in the
196
- future.
197
-
198
- The rights granted cover, for example:
199
-
200
- a. Extraction and Re-utilisation of the whole or a Substantial part of
201
- the Contents;
202
-
203
- b. Creation of Derivative Databases;
204
-
205
- c. Creation of Collective Databases;
206
-
207
- d. Creation of temporary or permanent reproductions by any means and
208
- in any form, in whole or in part, including of any Derivative
209
- Databases or as a part of Collective Databases; and
210
-
211
- e. Distribution, communication, display, lending, making available, or
212
- performance to the public by any means and in any form, in whole or in
213
- part, including of any Derivative Database or as a part of Collective
214
- Databases.
215
-
216
- 3.2 Compulsory license schemes. For the avoidance of doubt:
217
-
218
- a. Non-waivable compulsory license schemes. In those jurisdictions in
219
- which the right to collect royalties through any statutory or
220
- compulsory licensing scheme cannot be waived, the Licensor reserves
221
- the exclusive right to collect such royalties for any exercise by You
222
- of the rights granted under this License;
223
-
224
- b. Waivable compulsory license schemes. In those jurisdictions in
225
- which the right to collect royalties through any statutory or
226
- compulsory licensing scheme can be waived, the Licensor waives the
227
- exclusive right to collect such royalties for any exercise by You of
228
- the rights granted under this License; and,
229
-
230
- c. Voluntary license schemes. The Licensor waives the right to collect
231
- royalties, whether individually or, in the event that the Licensor is
232
- a member of a collecting society that administers voluntary licensing
233
- schemes, via that society, from any exercise by You of the rights
234
- granted under this License.
235
-
236
- 3.3 The right to release the Database under different terms, or to stop
237
- distributing or making available the Database, is reserved. Note that
238
- this Database may be multiple-licensed, and so You may have the choice
239
- of using alternative licenses for this Database. Subject to Section
240
- 10.4, all other rights not expressly granted by Licensor are reserved.
241
-
242
- ### 4.0 Conditions of Use
243
-
244
- 4.1 The rights granted in Section 3 above are expressly made subject to
245
- Your complying with the following conditions of use. These are important
246
- conditions of this License, and if You fail to follow them, You will be
247
- in material breach of its terms.
248
-
249
- 4.2 Notices. If You Publicly Convey this Database, any Derivative
250
- Database, or the Database as part of a Collective Database, then You
251
- must:
252
-
253
- a. Do so only under the terms of this License or another license
254
- permitted under Section 4.4;
255
-
256
- b. Include a copy of this License (or, as applicable, a license
257
- permitted under Section 4.4) or its Uniform Resource Identifier (URI)
258
- with the Database or Derivative Database, including both in the
259
- Database or Derivative Database and in any relevant documentation; and
260
-
261
- c. Keep intact any copyright or Database Right notices and notices
262
- that refer to this License.
263
-
264
- d. If it is not possible to put the required notices in a particular
265
- file due to its structure, then You must include the notices in a
266
- location (such as a relevant directory) where users would be likely to
267
- look for it.
268
-
269
- 4.3 Notice for using output (Contents). Creating and Using a Produced
270
- Work does not require the notice in Section 4.2. However, if you
271
- Publicly Use a Produced Work, You must include a notice associated with
272
- the Produced Work reasonably calculated to make any Person that uses,
273
- views, accesses, interacts with, or is otherwise exposed to the Produced
274
- Work aware that Content was obtained from the Database, Derivative
275
- Database, or the Database as part of a Collective Database, and that it
276
- is available under this License.
277
-
278
- a. Example notice. The following text will satisfy notice under
279
- Section 4.3:
280
-
281
- Contains information from DATABASE NAME, which is made available
282
- here under the Open Database License (ODbL).
283
-
284
- DATABASE NAME should be replaced with the name of the Database and a
285
- hyperlink to the URI of the Database. "Open Database License" should
286
- contain a hyperlink to the URI of the text of this License. If
287
- hyperlinks are not possible, You should include the plain text of the
288
- required URI's with the above notice.
289
-
290
- 4.4 Share alike.
291
-
292
- a. Any Derivative Database that You Publicly Use must be only under
293
- the terms of:
294
-
295
- i. This License;
296
-
297
- ii. A later version of this License similar in spirit to this
298
- License; or
299
-
300
- iii. A compatible license.
301
-
302
- If You license the Derivative Database under one of the licenses
303
- mentioned in (iii), You must comply with the terms of that license.
304
-
305
- b. For the avoidance of doubt, Extraction or Re-utilisation of the
306
- whole or a Substantial part of the Contents into a new database is a
307
- Derivative Database and must comply with Section 4.4.
308
-
309
- c. Derivative Databases and Produced Works. A Derivative Database is
310
- Publicly Used and so must comply with Section 4.4. if a Produced Work
311
- created from the Derivative Database is Publicly Used.
312
-
313
- d. Share Alike and additional Contents. For the avoidance of doubt,
314
- You must not add Contents to Derivative Databases under Section 4.4 a
315
- that are incompatible with the rights granted under this License.
316
-
317
- e. Compatible licenses. Licensors may authorise a proxy to determine
318
- compatible licenses under Section 4.4 a iii. If they do so, the
319
- authorised proxy's public statement of acceptance of a compatible
320
- license grants You permission to use the compatible license.
321
-
322
-
323
- 4.5 Limits of Share Alike. The requirements of Section 4.4 do not apply
324
- in the following:
325
-
326
- a. For the avoidance of doubt, You are not required to license
327
- Collective Databases under this License if You incorporate this
328
- Database or a Derivative Database in the collection, but this License
329
- still applies to this Database or a Derivative Database as a part of
330
- the Collective Database;
331
-
332
- b. Using this Database, a Derivative Database, or this Database as
333
- part of a Collective Database to create a Produced Work does not
334
- create a Derivative Database for purposes of Section 4.4; and
335
-
336
- c. Use of a Derivative Database internally within an organisation is
337
- not to the public and therefore does not fall under the requirements
338
- of Section 4.4.
339
-
340
- 4.6 Access to Derivative Databases. If You Publicly Use a Derivative
341
- Database or a Produced Work from a Derivative Database, You must also
342
- offer to recipients of the Derivative Database or Produced Work a copy
343
- in a machine readable form of:
344
-
345
- a. The entire Derivative Database; or
346
-
347
- b. A file containing all of the alterations made to the Database or
348
- the method of making the alterations to the Database (such as an
349
- algorithm), including any additional Contents, that make up all the
350
- differences between the Database and the Derivative Database.
351
-
352
- The Derivative Database (under a.) or alteration file (under b.) must be
353
- available at no more than a reasonable production cost for physical
354
- distributions and free of charge if distributed over the internet.
355
-
356
- 4.7 Technological measures and additional terms
357
-
358
- a. This License does not allow You to impose (except subject to
359
- Section 4.7 b.) any terms or any technological measures on the
360
- Database, a Derivative Database, or the whole or a Substantial part of
361
- the Contents that alter or restrict the terms of this License, or any
362
- rights granted under it, or have the effect or intent of restricting
363
- the ability of any person to exercise those rights.
364
-
365
- b. Parallel distribution. You may impose terms or technological
366
- measures on the Database, a Derivative Database, or the whole or a
367
- Substantial part of the Contents (a "Restricted Database") in
368
- contravention of Section 4.74 a. only if You also make a copy of the
369
- Database or a Derivative Database available to the recipient of the
370
- Restricted Database:
371
-
372
- i. That is available without additional fee;
373
-
374
- ii. That is available in a medium that does not alter or restrict
375
- the terms of this License, or any rights granted under it, or have
376
- the effect or intent of restricting the ability of any person to
377
- exercise those rights (an "Unrestricted Database"); and
378
-
379
- iii. The Unrestricted Database is at least as accessible to the
380
- recipient as a practical matter as the Restricted Database.
381
-
382
- c. For the avoidance of doubt, You may place this Database or a
383
- Derivative Database in an authenticated environment, behind a
384
- password, or within a similar access control scheme provided that You
385
- do not alter or restrict the terms of this License or any rights
386
- granted under it or have the effect or intent of restricting the
387
- ability of any person to exercise those rights.
388
-
389
- 4.8 Licensing of others. You may not sublicense the Database. Each time
390
- You communicate the Database, the whole or Substantial part of the
391
- Contents, or any Derivative Database to anyone else in any way, the
392
- Licensor offers to the recipient a license to the Database on the same
393
- terms and conditions as this License. You are not responsible for
394
- enforcing compliance by third parties with this License, but You may
395
- enforce any rights that You have over a Derivative Database. You are
396
- solely responsible for any modifications of a Derivative Database made
397
- by You or another Person at Your direction. You may not impose any
398
- further restrictions on the exercise of the rights granted or affirmed
399
- under this License.
400
-
401
- ### 5.0 Moral rights
402
-
403
- 5.1 Moral rights. This section covers moral rights, including any rights
404
- to be identified as the author of the Database or to object to treatment
405
- that would otherwise prejudice the author's honour and reputation, or
406
- any other derogatory treatment:
407
-
408
- a. For jurisdictions allowing waiver of moral rights, Licensor waives
409
- all moral rights that Licensor may have in the Database to the fullest
410
- extent possible by the law of the relevant jurisdiction under Section
411
- 10.4;
412
-
413
- b. If waiver of moral rights under Section 5.1 a in the relevant
414
- jurisdiction is not possible, Licensor agrees not to assert any moral
415
- rights over the Database and waives all claims in moral rights to the
416
- fullest extent possible by the law of the relevant jurisdiction under
417
- Section 10.4; and
418
-
419
- c. For jurisdictions not allowing waiver or an agreement not to assert
420
- moral rights under Section 5.1 a and b, the author may retain their
421
- moral rights over certain aspects of the Database.
422
-
423
- Please note that some jurisdictions do not allow for the waiver of moral
424
- rights, and so moral rights may still subsist over the Database in some
425
- jurisdictions.
426
-
427
- ### 6.0 Fair dealing, Database exceptions, and other rights not affected
428
-
429
- 6.1 This License does not affect any rights that You or anyone else may
430
- independently have under any applicable law to make any use of this
431
- Database, including without limitation:
432
-
433
- a. Exceptions to the Database Right including: Extraction of Contents
434
- from non-electronic Databases for private purposes, Extraction for
435
- purposes of illustration for teaching or scientific research, and
436
- Extraction or Re-utilisation for public security or an administrative
437
- or judicial procedure.
438
-
439
- b. Fair dealing, fair use, or any other legally recognised limitation
440
- or exception to infringement of copyright or other applicable laws.
441
-
442
- 6.2 This License does not affect any rights of lawful users to Extract
443
- and Re-utilise insubstantial parts of the Contents, evaluated
444
- quantitatively or qualitatively, for any purposes whatsoever, including
445
- creating a Derivative Database (subject to other rights over the
446
- Contents, see Section 2.4). The repeated and systematic Extraction or
447
- Re-utilisation of insubstantial parts of the Contents may however amount
448
- to the Extraction or Re-utilisation of a Substantial part of the
449
- Contents.
450
-
451
- ### 7.0 Warranties and Disclaimer
452
-
453
- 7.1 The Database is licensed by the Licensor "as is" and without any
454
- warranty of any kind, either express, implied, or arising by statute,
455
- custom, course of dealing, or trade usage. Licensor specifically
456
- disclaims any and all implied warranties or conditions of title,
457
- non-infringement, accuracy or completeness, the presence or absence of
458
- errors, fitness for a particular purpose, merchantability, or otherwise.
459
- Some jurisdictions do not allow the exclusion of implied warranties, so
460
- this exclusion may not apply to You.
461
-
462
- ### 8.0 Limitation of liability
463
-
464
- 8.1 Subject to any liability that may not be excluded or limited by law,
465
- the Licensor is not liable for, and expressly excludes, all liability
466
- for loss or damage however and whenever caused to anyone by any use
467
- under this License, whether by You or by anyone else, and whether caused
468
- by any fault on the part of the Licensor or not. This exclusion of
469
- liability includes, but is not limited to, any special, incidental,
470
- consequential, punitive, or exemplary damages such as loss of revenue,
471
- data, anticipated profits, and lost business. This exclusion applies
472
- even if the Licensor has been advised of the possibility of such
473
- damages.
474
-
475
- 8.2 If liability may not be excluded by law, it is limited to actual and
476
- direct financial loss to the extent it is caused by proved negligence on
477
- the part of the Licensor.
478
-
479
- ### 9.0 Termination of Your rights under this License
480
-
481
- 9.1 Any breach by You of the terms and conditions of this License
482
- automatically terminates this License with immediate effect and without
483
- notice to You. For the avoidance of doubt, Persons who have received the
484
- Database, the whole or a Substantial part of the Contents, Derivative
485
- Databases, or the Database as part of a Collective Database from You
486
- under this License will not have their licenses terminated provided
487
- their use is in full compliance with this License or a license granted
488
- under Section 4.8 of this License. Sections 1, 2, 7, 8, 9 and 10 will
489
- survive any termination of this License.
490
-
491
- 9.2 If You are not in breach of the terms of this License, the Licensor
492
- will not terminate Your rights under it.
493
-
494
- 9.3 Unless terminated under Section 9.1, this License is granted to You
495
- for the duration of applicable rights in the Database.
496
-
497
- 9.4 Reinstatement of rights. If you cease any breach of the terms and
498
- conditions of this License, then your full rights under this License
499
- will be reinstated:
500
-
501
- a. Provisionally and subject to permanent termination until the 60th
502
- day after cessation of breach;
503
-
504
- b. Permanently on the 60th day after cessation of breach unless
505
- otherwise reasonably notified by the Licensor; or
506
-
507
- c. Permanently if reasonably notified by the Licensor of the
508
- violation, this is the first time You have received notice of
509
- violation of this License from the Licensor, and You cure the
510
- violation prior to 30 days after your receipt of the notice.
511
-
512
- Persons subject to permanent termination of rights are not eligible to
513
- be a recipient and receive a license under Section 4.8.
514
-
515
- 9.5 Notwithstanding the above, Licensor reserves the right to release
516
- the Database under different license terms or to stop distributing or
517
- making available the Database. Releasing the Database under different
518
- license terms or stopping the distribution of the Database will not
519
- withdraw this License (or any other license that has been, or is
520
- required to be, granted under the terms of this License), and this
521
- License will continue in full force and effect unless terminated as
522
- stated above.
523
-
524
- ### 10.0 General
525
-
526
- 10.1 If any provision of this License is held to be invalid or
527
- unenforceable, that must not affect the validity or enforceability of
528
- the remainder of the terms and conditions of this License and each
529
- remaining provision of this License shall be valid and enforced to the
530
- fullest extent permitted by law.
531
-
532
- 10.2 This License is the entire agreement between the parties with
533
- respect to the rights granted here over the Database. It replaces any
534
- earlier understandings, agreements or representations with respect to
535
- the Database.
536
-
537
- 10.3 If You are in breach of the terms of this License, You will not be
538
- entitled to rely on the terms of this License or to complain of any
539
- breach by the Licensor.
540
-
541
- 10.4 Choice of law. This License takes effect in and will be governed by
542
- the laws of the relevant jurisdiction in which the License terms are
543
- sought to be enforced. If the standard suite of rights granted under
544
- applicable copyright law and Database Rights in the relevant
545
- jurisdiction includes additional rights not granted under this License,
546
- these additional rights are granted in this License in order to meet the
547
- terms of this License.```
548
-
549
-
550
-
551
-
552
  # UD Romanian RRT v2.8
553
 
554
  * Author: Barbu Mititelu, Verginica; Irimia, Elena; Perez, Cenel-Augusto; Ion, Radu; Simionescu, Radu; Popel, Martin
README.md CHANGED
@@ -14,61 +14,76 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.7550713749
18
  - name: NER Recall
19
  type: recall
20
- value: 0.7721859393
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.7635327635
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
- - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9664291788
31
  - task:
32
- name: SENTER
33
  type: token-classification
34
  metrics:
35
- - name: SENTER Precision
36
- type: precision
37
- value: 0.954787234
38
- - name: SENTER Recall
39
- type: recall
40
- value: 0.954787234
41
- - name: SENTER F Score
42
- type: f_score
43
- value: 0.954787234
44
  - task:
45
- name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
- - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.8897462438
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
- - name: Labeled Dependencies Accuracy
56
- type: accuracy
57
- value: 0.8897462438
58
  ---
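
As a sanity check on the removed front-matter metrics above, the reported F score should equal the harmonic mean of precision and recall. The sketch below assumes that standard F1 definition, which the diff itself does not state; the helper name `f_score` is hypothetical, and the numbers are copied from the removed `value:` lines.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); assumed definition."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the removed NER metric lines in this diff.
ner_precision = 0.7550713749
ner_recall = 0.7721859393
ner_reported_f = 0.7635327635

# Values copied from the removed SENTER metric lines (all three identical).
senter_value = 0.954787234

ner_computed_f = f_score(ner_precision, ner_recall)
senter_computed_f = f_score(senter_value, senter_value)

print(f"NER computed F:    {ner_computed_f:.6f} (reported {ner_reported_f:.6f})")
print(f"SENTER computed F: {senter_computed_f:.6f} (reported {senter_value:.6f})")
```

When precision and recall are equal, as in the removed SENTER block, the harmonic mean collapses to that same value, which is why all three SENTER numbers coincide.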
59
  ### Details: https://spacy.io/models/ro#ro_core_news_lg
60
 
61
- Romanian pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.
62
 
  | Feature | Description |
  | --- | --- |
  | **Name** | `ro_core_news_lg` |
- | **Version** | `3.2.0` |
- | **spaCy** | `>=3.2.0,<3.3.0` |
- | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
- | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
- | **Sources** | [Lemmatization Lists](https://github.com/michmech/lemmatization-lists/) (Michal Měchura)<br />[UD Romanian RRT v2.8](https://github.com/UniversalDependencies/UD_Romanian-RRT) (Barbu Mititelu, Verginica; Irimia, Elena; Perez, Cenel-Augusto; Ion, Radu; Simionescu, Radu; Popel, Martin)<br />[RONEC - the Romanian Named Entity Corpus (ca9ce460)](https://github.com/dumitrescustefan/ronec) (Dumitrescu, Stefan Daniel; Avram, Andrei-Marius; Morogan, Luciana; Toma; Stefan)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
  | **License** | `CC BY-SA 4.0` |
  | **Author** | [Explosion](https://explosion.ai) |
 
@@ -76,13 +91,12 @@ Romanian pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter
76
 
77
  <details>
78
 
79
- <summary>View label scheme (541 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
  | **`tagger`** | `ARROW`, `Af`, `Afcfp-n`, `Afcfson`, `Afcfsrn`, `Afcmpoy`, `Afcms-n`, `Afp`, `Afp-p-n`, `Afp-poy`, `Afp-srn`, `Afpf--n`, `Afpfp-n`, `Afpfp-ny`, `Afpfpoy`, `Afpfpry`, `Afpfson`, `Afpfsoy`, `Afpfsrn`, `Afpfsry`, `Afpm--n`, `Afpmp-n`, `Afpmpoy`, `Afpmpry`, `Afpms-n`, `Afpmsoy`, `Afpmsry`, `Afsfp-n`, `Afsfsrn`, `BULLET`, `COLON`, `COMMA`, `Ccssp`, `Ccsspy`, `Crssp`, `Csssp`, `Cssspy`, `DASH`, `DBLQ`, `Dd3-po---e`, `Dd3-po---o`, `Dd3fpo`, `Dd3fpr`, `Dd3fpr---e`, `Dd3fpr---o`, `Dd3fpr--y`, `Dd3fso`, `Dd3fso---e`, `Dd3fsr`, `Dd3fsr---e`, `Dd3fsr---o`, `Dd3fsr--yo`, `Dd3mpo`, `Dd3mpr`, `Dd3mpr---e`, `Dd3mpr---o`, `Dd3mso---e`, `Dd3msr`, `Dd3msr---e`, `Dd3msr---o`, `Dh1ms`, `Dh3fp`, `Dh3fso`, `Dh3fsr`, `Dh3mp`, `Dh3ms`, `Di3`, `Di3-----y`, `Di3--r---e`, `Di3-po`, `Di3-po---e`, `Di3-sr`, `Di3-sr---e`, `Di3-sr--y`, `Di3fp`, `Di3fpr`, `Di3fpr---e`, `Di3fso`, `Di3fso---e`, `Di3fsr`, `Di3fsr---e`, `Di3mp`, `Di3mpr`, `Di3mpr---e`, `Di3ms`, `Di3ms----e`, `Di3mso---e`, `Di3msr`, `Di3msr---e`, `Ds1fp-p`, `Ds1fp-s`, `Ds1fsop`, `Ds1fsos`, `Ds1fsrp`, `Ds1fsrs`, `Ds1fsrs-y`, `Ds1mp-p`, `Ds1mp-s`, `Ds1ms-p`, `Ds1ms-s`, `Ds1msrs-y`, `Ds2---s`, `Ds2fp-p`, `Ds2fp-s`, `Ds2fsrp`, `Ds2fsrs`, `Ds2mp-p`, `Ds2mp-s`, `Ds2ms-p`, `Ds2ms-s`, `Ds3---p`, `Ds3---s`, `Ds3---sy`, `Ds3fp-s`, `Ds3fsos`, `Ds3fsrs`, `Ds3mp-s`, `Ds3ms-s`, `Dw3--r---e`, `Dw3-po---e`, `Dw3fpr`, `Dw3fso---e`, `Dw3fsr`, `Dw3mpr`, `Dw3mso---e`, `Dw3msr`, `Dz3fsr---e`, `Dz3mso---e`, `Dz3msr---e`, `EQUAL`, `EXCL`, `EXCLHELLIP`, `GE`, `GT`, `HELLIP`, `I`, `LCURL`, `LPAR`, `LSQR`, `LT`, `M`, `Mc-p-d`, `Mc-p-l`, `Mc-s-b`, `Mc-s-d`, `Mc-s-l`, `Mcfp-l`, `Mcfp-ln`, `Mcfprln`, `Mcfprly`, `Mcfsoln`, `Mcfsrl`, `Mcfsrln`, `Mcfsrly`, `Mcmp-l`, `Mcms-ln`, `Mcmsrl`, `Mcmsrln`, `Mcmsrly`, `Mffprln`, `Mffsrln`, `Mlfpo`, `Mlfpr`, `Mlmpr`, `Mo---l`, `Mo---ln`, `Mo-s-r`, `Mofp-ln`, `Mofpoly`, `Mofprly`, `Mofs-l`, `Mofsoln`, `Mofsoly`, `Mofsrln`, `Mofsrly`, `Mompoly`, `Momprly`, `Moms-l`, `Moms-ln`, `Momsoly`, `Momsrly`, `Nc`, 
`Nc---n`, `Ncf--n`, `Ncfp-n`, `Ncfpoy`, `Ncfpry`, `Ncfs-n`, `Ncfson`, `Ncfsoy`, `Ncfsrn`, `Ncfsry`, `Ncfsryy`, `Ncfsvy`, `Ncm--n`, `Ncmp-n`, `Ncmpoy`, `Ncmpry`, `Ncms-n`, `Ncms-ny`, `Ncms-y`, `Ncmsoy`, `Ncmsrn`, `Ncmsry`, `Ncmsryy`, `Ncmsvn`, `Ncmsvy`, `Np`, `Npfson`, `Npfsoy`, `Npfsrn`, `Npfsry`, `Npmpoy`, `Npmpry`, `Npms-n`, `Npmsoy`, `Npmsry`, `PERCENT`, `PERIOD`, `PLUS`, `PLUSMINUS`, `Pd3-po`, `Pd3fpr`, `Pd3fso`, `Pd3fsr`, `Pd3mpo`, `Pd3mpr`, `Pd3mpr--y`, `Pd3mso`, `Pd3msr`, `Pi3--r`, `Pi3-po`, `Pi3-so`, `Pi3-sr`, `Pi3fpr`, `Pi3fso`, `Pi3fsr`, `Pi3mpr`, `Pi3mso`, `Pi3msr`, `Pi3msr--y`, `Pp1-pa--------w`, `Pp1-pa--y-----w`, `Pp1-pd--------s`, `Pp1-pd--------w`, `Pp1-pd--y-----w`, `Pp1-pr--------s`, `Pp1-sa--------s`, `Pp1-sa--------w`, `Pp1-sa--y-----w`, `Pp1-sd--------s`, `Pp1-sd--------w`, `Pp1-sd--y-----w`, `Pp1-sn--------s`, `Pp2-----------s`, `Pp2-pa--------w`, `Pp2-pa--y-----w`, `Pp2-pd--------w`, `Pp2-pd--y-----w`, `Pp2-pr--------s`, `Pp2-sa--------s`, `Pp2-sa--------w`, `Pp2-sa--y-----w`, `Pp2-sd--------s`, `Pp2-sd--------w`, `Pp2-sd--y-----w`, `Pp2-sn--------s`, `Pp2-so--------s`, `Pp2-sr--------s`, `Pp3-p---------s`, `Pp3-pd--------w`, `Pp3-pd--y-----w`, `Pp3-po--------s`, `Pp3-sd--------w`, `Pp3-sd--y-----w`, `Pp3-so--------s`, `Pp3fpa--------w`, `Pp3fpa--y-----w`, `Pp3fpr--------s`, `Pp3fs---------s`, `Pp3fsa--------w`, `Pp3fsa--y-----w`, `Pp3fso--------s`, `Pp3fsr--------s`, `Pp3fsr--y-----s`, `Pp3mpa--------w`, `Pp3mpa--y-----w`, `Pp3mpr--------s`, `Pp3ms---------s`, `Pp3msa--------w`, `Pp3msa--y-----w`, `Pp3mso--------s`, `Pp3msr--------s`, `Pp3msr--y-----s`, `Ps1fp-s`, `Ps1fsrp`, `Ps1fsrs`, `Ps1mp-p`, `Ps1ms-p`, `Ps2fp-s`, `Ps2fsrp`, `Ps2fsrs`, `Ps3---p`, `Ps3---s`, `Ps3fp-s`, `Ps3fsrs`, `Ps3mp-s`, `Ps3ms-s`, `Pw3--r`, `Pw3-po`, `Pw3-so`, `Pw3fpr`, `Pw3fso`, `Pw3mpr`, `Pw3mso`, `Px3--a--------s`, `Px3--a--------w`, `Px3--a--y-----w`, `Px3--d--------w`, `Px3--d--y-----w`, `Pz3-sr`, `Pz3fsr`, `QUEST`, `QUOT`, `Qf`, `Qn`, `Qs`, `Qs-y`, `Qz`, `Qz-y`, 
`RCURL`, `RPAR`, `RSQR`, `Rc`, `Rgp`, `Rgpy`, `Rgs`, `Rp`, `Rw`, `Rw-y`, `Rz`, `SCOLON`, `SLASH`, `STAR`, `Sp`, `Spsa`, `Spsay`, `Spsd`, `Spsg`, `Td-po`, `Tdfpr`, `Tdfso`, `Tdfsr`, `Tdmpr`, `Tdmso`, `Tdmsr`, `Tf-so`, `Tffpoy`, `Tffpry`, `Tffs-y`, `Tfmpoy`, `Tfms-y`, `Tfmsoy`, `Tfmsry`, `Ti-po`, `Tifp-y`, `Tifso`, `Tifsr`, `Timso`, `Timsr`, `Tsfp`, `Tsfs`, `Tsmp`, `Tsms`, `UNDERSC`, `Va--1`, `Va--1-----y`, `Va--1p`, `Va--1s`, `Va--1s----y`, `Va--2p`, `Va--2p----y`, `Va--2s`, `Va--2s----y`, `Va--3`, `Va--3-----y`, `Va--3p`, `Va--3p----y`, `Va--3s`, `Va--3s----y`, `Vag`, `Vag-------y`, `Vaii1`, `Vaii2s`, `Vaii3p`, `Vaii3s`, `Vail3p`, `Vail3s`, `Vaip1p`, `Vaip1s`, `Vaip2p`, `Vaip2s`, `Vaip3p`, `Vaip3p----y`, `Vaip3s`, `Vaip3s----y`, `Vais3p`, `Vais3s`, `Vam-2s`, `Vanp`, `Vap--sm`, `Vasp1p`, `Vasp1s`, `Vasp2p`, `Vasp2s`, `Vasp3`, `Vmg`, `Vmg-------y`, `Vmii1`, `Vmii1-----y`, `Vmii2p`, `Vmii2s`, `Vmii3p`, `Vmii3p----y`, `Vmii3s`, `Vmii3s----y`, `Vmil1`, `Vmil1p`, `Vmil2s`, `Vmil3p`, `Vmil3p----y`, `Vmil3s`, `Vmil3s----y`, `Vmip1p`, `Vmip1p----y`, `Vmip1s`, `Vmip1s----y`, `Vmip2p`, `Vmip2s`, `Vmip2s----y`, `Vmip3`, `Vmip3-----y`, `Vmip3p`, `Vmip3s`, `Vmip3s----y`, `Vmis1p`, `Vmis1s`, `Vmis3p`, `Vmis3p----y`, `Vmis3s`, `Vmis3s----y`, `Vmm-2p`, `Vmm-2s`, `Vmnp`, `Vmnp------y`, `Vmp--pf`, `Vmp--pm`, `Vmp--sf`, `Vmp--sm`, `Vmp--sm---y`, `Vmsp1p`, `Vmsp2p`, `Vmsp2s`, `Vmsp3`, `Vmsp3-----y`, `X`, `Y`, `Ya`, `Yn`, `Ynfsoy`, `Ynfsry`, `Ynmsoy`, `Ynmsry`, `Yp`, `Yp,Yn`, `Yp-sr`, `Yr` |
84
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advcl:tcl`, `advmod`, `advmod:tmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `cc:preconj`, `ccomp`, `ccomp:pmod`, `compound`, `conj`, `cop`, `csubj`, `csubj:pass`, `dep`, `det`, `expl`, `expl:impers`, `expl:pass`, `expl:poss`, `expl:pv`, `fixed`, `flat`, `goeswith`, `iobj`, `mark`, `nmod`, `nmod:tmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `obl:pmod`, `orphan`, `parataxis`, `punct`, `vocative`, `xcomp` |
85
- | **`senter`** | `I`, `S` |
86
  | **`ner`** | `DATETIME`, `EVENT`, `FACILITY`, `GPE`, `LANGUAGE`, `LOC`, `MONEY`, `NAT_REL_POL`, `NUMERIC_VALUE`, `ORDINAL`, `ORGANIZATION`, `PERIOD`, `PERSON`, `PRODUCT`, `QUANTITY`, `WORK_OF_ART` |
87
 
88
  </details>
@@ -95,18 +109,18 @@ Romanian pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter
  | `TOKEN_P` | 99.67 |
  | `TOKEN_R` | 99.57 |
  | `TOKEN_F` | 99.59 |
- | `TAG_ACC` | 96.64 |
- | `SENTS_P` | 95.48 |
- | `SENTS_R` | 95.48 |
- | `SENTS_F` | 95.48 |
- | `DEP_UAS` | 88.97 |
- | `DEP_LAS` | 83.90 |
- | `POS_ACC` | 94.06 |
- | `MORPH_ACC` | 95.11 |
- | `MORPH_MICRO_P` | 98.96 |
- | `MORPH_MICRO_R` | 95.82 |
- | `MORPH_MICRO_F` | 97.07 |
- | `LEMMA_ACC` | 81.83 |
- | `ENTS_P` | 75.51 |
- | `ENTS_R` | 77.22 |
- | `ENTS_F` | 76.35 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.7552238806
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.7775643488
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.766231308
24
+ - task:
25
+ name: TAG
26
+ type: token-classification
27
+ metrics:
28
+ - name: TAG (XPOS) Accuracy
29
+ type: accuracy
30
+ value: 0.9667810127
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
+ - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9403951881
38
  - task:
39
+ name: MORPH
40
  type: token-classification
41
  metrics:
42
+ - name: Morph (UFeats) Accuracy
43
+ type: accuracy
44
+ value: 0.9512416806
45
  - task:
46
+ name: LEMMA
47
  type: token-classification
48
  metrics:
49
+ - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9585129152
52
+ - task:
53
+ name: UNLABELED_DEPENDENCIES
54
+ type: token-classification
55
+ metrics:
56
+ - name: Unlabeled Attachment Score (UAS)
57
+ type: f_score
58
+ value: 0.8881779116
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
+ - name: Labeled Attachment Score (LAS)
64
+ type: f_score
65
+ value: 0.8359210815
66
+ - task:
67
+ name: SENTS
68
+ type: token-classification
69
+ metrics:
70
+ - name: Sentences F-Score
71
+ type: f_score
72
+ value: 0.9699398798
73
  ---
74
  ### Details: https://spacy.io/models/ro#ro_core_news_lg
75
 
76
+ Romanian pipeline optimized for CPU. Components: tok2vec, tagger, parser, lemmatizer (trainable_lemmatizer), senter, ner, attribute_ruler.
77
 
  | Feature | Description |
  | --- | --- |
  | **Name** | `ro_core_news_lg` |
+ | **Version** | `3.3.0` |
+ | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
+ | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
+ | **Components** | `tok2vec`, `tagger`, `parser`, `lemmatizer`, `senter`, `attribute_ruler`, `ner` |
  | **Vectors** | 500000 keys, 500000 unique vectors (300 dimensions) |
+ | **Sources** | [UD Romanian RRT v2.8](https://github.com/UniversalDependencies/UD_Romanian-RRT) (Barbu Mititelu, Verginica; Irimia, Elena; Perez, Cenel-Augusto; Ion, Radu; Simionescu, Radu; Popel, Martin)<br />[RONEC - the Romanian Named Entity Corpus (ca9ce460)](https://github.com/dumitrescustefan/ronec) (Dumitrescu, Stefan Daniel; Avram, Andrei-Marius; Morogan, Luciana; Toma; Stefan)<br />[Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia)](https://spacy.io) (Explosion) |
  | **License** | `CC BY-SA 4.0` |
  | **Author** | [Explosion](https://explosion.ai) |
 
 
91
 
92
  <details>
93
 
94
+ <summary>View label scheme (539 labels for 3 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
  | **`tagger`** | `ARROW`, `Af`, `Afcfp-n`, `Afcfson`, `Afcfsrn`, `Afcmpoy`, `Afcms-n`, `Afp`, `Afp-p-n`, `Afp-poy`, `Afp-srn`, `Afpf--n`, `Afpfp-n`, `Afpfp-ny`, `Afpfpoy`, `Afpfpry`, `Afpfson`, `Afpfsoy`, `Afpfsrn`, `Afpfsry`, `Afpm--n`, `Afpmp-n`, `Afpmpoy`, `Afpmpry`, `Afpms-n`, `Afpmsoy`, `Afpmsry`, `Afsfp-n`, `Afsfsrn`, `BULLET`, `COLON`, `COMMA`, `Ccssp`, `Ccsspy`, `Crssp`, `Csssp`, `Cssspy`, `DASH`, `DBLQ`, `Dd3-po---e`, `Dd3-po---o`, `Dd3fpo`, `Dd3fpr`, `Dd3fpr---e`, `Dd3fpr---o`, `Dd3fpr--y`, `Dd3fso`, `Dd3fso---e`, `Dd3fsr`, `Dd3fsr---e`, `Dd3fsr---o`, `Dd3fsr--yo`, `Dd3mpo`, `Dd3mpr`, `Dd3mpr---e`, `Dd3mpr---o`, `Dd3mso---e`, `Dd3msr`, `Dd3msr---e`, `Dd3msr---o`, `Dh1ms`, `Dh3fp`, `Dh3fso`, `Dh3fsr`, `Dh3mp`, `Dh3ms`, `Di3`, `Di3-----y`, `Di3--r---e`, `Di3-po`, `Di3-po---e`, `Di3-sr`, `Di3-sr---e`, `Di3-sr--y`, `Di3fp`, `Di3fpr`, `Di3fpr---e`, `Di3fso`, `Di3fso---e`, `Di3fsr`, `Di3fsr---e`, `Di3mp`, `Di3mpr`, `Di3mpr---e`, `Di3ms`, `Di3ms----e`, `Di3mso---e`, `Di3msr`, `Di3msr---e`, `Ds1fp-p`, `Ds1fp-s`, `Ds1fsop`, `Ds1fsos`, `Ds1fsrp`, `Ds1fsrs`, `Ds1fsrs-y`, `Ds1mp-p`, `Ds1mp-s`, `Ds1ms-p`, `Ds1ms-s`, `Ds1msrs-y`, `Ds2---s`, `Ds2fp-p`, `Ds2fp-s`, `Ds2fsrp`, `Ds2fsrs`, `Ds2mp-p`, `Ds2mp-s`, `Ds2ms-p`, `Ds2ms-s`, `Ds3---p`, `Ds3---s`, `Ds3---sy`, `Ds3fp-s`, `Ds3fsos`, `Ds3fsrs`, `Ds3mp-s`, `Ds3ms-s`, `Dw3--r---e`, `Dw3-po---e`, `Dw3fpr`, `Dw3fso---e`, `Dw3fsr`, `Dw3mpr`, `Dw3mso---e`, `Dw3msr`, `Dz3fsr---e`, `Dz3mso---e`, `Dz3msr---e`, `EQUAL`, `EXCL`, `EXCLHELLIP`, `GE`, `GT`, `HELLIP`, `I`, `LCURL`, `LPAR`, `LSQR`, `LT`, `M`, `Mc-p-d`, `Mc-p-l`, `Mc-s-b`, `Mc-s-d`, `Mc-s-l`, `Mcfp-l`, `Mcfp-ln`, `Mcfprln`, `Mcfprly`, `Mcfsoln`, `Mcfsrl`, `Mcfsrln`, `Mcfsrly`, `Mcmp-l`, `Mcms-ln`, `Mcmsrl`, `Mcmsrln`, `Mcmsrly`, `Mffprln`, `Mffsrln`, `Mlfpo`, `Mlfpr`, `Mlmpr`, `Mo---l`, `Mo---ln`, `Mo-s-r`, `Mofp-ln`, `Mofpoly`, `Mofprly`, `Mofs-l`, `Mofsoln`, `Mofsoly`, `Mofsrln`, `Mofsrly`, `Mompoly`, `Momprly`, `Moms-l`, `Moms-ln`, `Momsoly`, `Momsrly`, `Nc`, 
`Nc---n`, `Ncf--n`, `Ncfp-n`, `Ncfpoy`, `Ncfpry`, `Ncfs-n`, `Ncfson`, `Ncfsoy`, `Ncfsrn`, `Ncfsry`, `Ncfsryy`, `Ncfsvy`, `Ncm--n`, `Ncmp-n`, `Ncmpoy`, `Ncmpry`, `Ncms-n`, `Ncms-ny`, `Ncms-y`, `Ncmsoy`, `Ncmsrn`, `Ncmsry`, `Ncmsryy`, `Ncmsvn`, `Ncmsvy`, `Np`, `Npfson`, `Npfsoy`, `Npfsrn`, `Npfsry`, `Npmpoy`, `Npmpry`, `Npms-n`, `Npmsoy`, `Npmsry`, `PERCENT`, `PERIOD`, `PLUS`, `PLUSMINUS`, `Pd3-po`, `Pd3fpr`, `Pd3fso`, `Pd3fsr`, `Pd3mpo`, `Pd3mpr`, `Pd3mpr--y`, `Pd3mso`, `Pd3msr`, `Pi3--r`, `Pi3-po`, `Pi3-so`, `Pi3-sr`, `Pi3fpr`, `Pi3fso`, `Pi3fsr`, `Pi3mpr`, `Pi3mso`, `Pi3msr`, `Pi3msr--y`, `Pp1-pa--------w`, `Pp1-pa--y-----w`, `Pp1-pd--------s`, `Pp1-pd--------w`, `Pp1-pd--y-----w`, `Pp1-pr--------s`, `Pp1-sa--------s`, `Pp1-sa--------w`, `Pp1-sa--y-----w`, `Pp1-sd--------s`, `Pp1-sd--------w`, `Pp1-sd--y-----w`, `Pp1-sn--------s`, `Pp2-----------s`, `Pp2-pa--------w`, `Pp2-pa--y-----w`, `Pp2-pd--------w`, `Pp2-pd--y-----w`, `Pp2-pr--------s`, `Pp2-sa--------s`, `Pp2-sa--------w`, `Pp2-sa--y-----w`, `Pp2-sd--------s`, `Pp2-sd--------w`, `Pp2-sd--y-----w`, `Pp2-sn--------s`, `Pp2-so--------s`, `Pp2-sr--------s`, `Pp3-p---------s`, `Pp3-pd--------w`, `Pp3-pd--y-----w`, `Pp3-po--------s`, `Pp3-sd--------w`, `Pp3-sd--y-----w`, `Pp3-so--------s`, `Pp3fpa--------w`, `Pp3fpa--y-----w`, `Pp3fpr--------s`, `Pp3fs---------s`, `Pp3fsa--------w`, `Pp3fsa--y-----w`, `Pp3fso--------s`, `Pp3fsr--------s`, `Pp3fsr--y-----s`, `Pp3mpa--------w`, `Pp3mpa--y-----w`, `Pp3mpr--------s`, `Pp3ms---------s`, `Pp3msa--------w`, `Pp3msa--y-----w`, `Pp3mso--------s`, `Pp3msr--------s`, `Pp3msr--y-----s`, `Ps1fp-s`, `Ps1fsrp`, `Ps1fsrs`, `Ps1mp-p`, `Ps1ms-p`, `Ps2fp-s`, `Ps2fsrp`, `Ps2fsrs`, `Ps3---p`, `Ps3---s`, `Ps3fp-s`, `Ps3fsrs`, `Ps3mp-s`, `Ps3ms-s`, `Pw3--r`, `Pw3-po`, `Pw3-so`, `Pw3fpr`, `Pw3fso`, `Pw3mpr`, `Pw3mso`, `Px3--a--------s`, `Px3--a--------w`, `Px3--a--y-----w`, `Px3--d--------w`, `Px3--d--y-----w`, `Pz3-sr`, `Pz3fsr`, `QUEST`, `QUOT`, `Qf`, `Qn`, `Qs`, `Qs-y`, `Qz`, `Qz-y`, 
`RCURL`, `RPAR`, `RSQR`, `Rc`, `Rgp`, `Rgpy`, `Rgs`, `Rp`, `Rw`, `Rw-y`, `Rz`, `SCOLON`, `SLASH`, `STAR`, `Sp`, `Spsa`, `Spsay`, `Spsd`, `Spsg`, `Td-po`, `Tdfpr`, `Tdfso`, `Tdfsr`, `Tdmpr`, `Tdmso`, `Tdmsr`, `Tf-so`, `Tffpoy`, `Tffpry`, `Tffs-y`, `Tfmpoy`, `Tfms-y`, `Tfmsoy`, `Tfmsry`, `Ti-po`, `Tifp-y`, `Tifso`, `Tifsr`, `Timso`, `Timsr`, `Tsfp`, `Tsfs`, `Tsmp`, `Tsms`, `UNDERSC`, `Va--1`, `Va--1-----y`, `Va--1p`, `Va--1s`, `Va--1s----y`, `Va--2p`, `Va--2p----y`, `Va--2s`, `Va--2s----y`, `Va--3`, `Va--3-----y`, `Va--3p`, `Va--3p----y`, `Va--3s`, `Va--3s----y`, `Vag`, `Vag-------y`, `Vaii1`, `Vaii2s`, `Vaii3p`, `Vaii3s`, `Vail3p`, `Vail3s`, `Vaip1p`, `Vaip1s`, `Vaip2p`, `Vaip2s`, `Vaip3p`, `Vaip3p----y`, `Vaip3s`, `Vaip3s----y`, `Vais3p`, `Vais3s`, `Vam-2s`, `Vanp`, `Vap--sm`, `Vasp1p`, `Vasp1s`, `Vasp2p`, `Vasp2s`, `Vasp3`, `Vmg`, `Vmg-------y`, `Vmii1`, `Vmii1-----y`, `Vmii2p`, `Vmii2s`, `Vmii3p`, `Vmii3p----y`, `Vmii3s`, `Vmii3s----y`, `Vmil1`, `Vmil1p`, `Vmil2s`, `Vmil3p`, `Vmil3p----y`, `Vmil3s`, `Vmil3s----y`, `Vmip1p`, `Vmip1p----y`, `Vmip1s`, `Vmip1s----y`, `Vmip2p`, `Vmip2s`, `Vmip2s----y`, `Vmip3`, `Vmip3-----y`, `Vmip3p`, `Vmip3s`, `Vmip3s----y`, `Vmis1p`, `Vmis1s`, `Vmis3p`, `Vmis3p----y`, `Vmis3s`, `Vmis3s----y`, `Vmm-2p`, `Vmm-2s`, `Vmnp`, `Vmnp------y`, `Vmp--pf`, `Vmp--pm`, `Vmp--sf`, `Vmp--sm`, `Vmp--sm---y`, `Vmsp1p`, `Vmsp2p`, `Vmsp2s`, `Vmsp3`, `Vmsp3-----y`, `X`, `Y`, `Ya`, `Yn`, `Ynfsoy`, `Ynfsry`, `Ynmsoy`, `Ynmsry`, `Yp`, `Yp,Yn`, `Yp-sr`, `Yr` |
99
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advcl:tcl`, `advmod`, `advmod:tmod`, `amod`, `appos`, `aux`, `aux:pass`, `case`, `cc`, `cc:preconj`, `ccomp`, `ccomp:pmod`, `compound`, `conj`, `cop`, `csubj`, `csubj:pass`, `dep`, `det`, `expl`, `expl:impers`, `expl:pass`, `expl:poss`, `expl:pv`, `fixed`, `flat`, `goeswith`, `iobj`, `mark`, `nmod`, `nmod:tmod`, `nsubj`, `nsubj:pass`, `nummod`, `obj`, `obl`, `obl:agent`, `obl:pmod`, `orphan`, `parataxis`, `punct`, `vocative`, `xcomp` |
 
100
  | **`ner`** | `DATETIME`, `EVENT`, `FACILITY`, `GPE`, `LANGUAGE`, `LOC`, `MONEY`, `NAT_REL_POL`, `NUMERIC_VALUE`, `ORDINAL`, `ORGANIZATION`, `PERIOD`, `PERSON`, `PRODUCT`, `QUANTITY`, `WORK_OF_ART` |
101
 
102
  </details>
 
  | `TOKEN_P` | 99.67 |
  | `TOKEN_R` | 99.57 |
  | `TOKEN_F` | 99.59 |
+ | `TAG_ACC` | 96.68 |
+ | `SENTS_P` | 97.45 |
+ | `SENTS_R` | 96.54 |
+ | `SENTS_F` | 96.99 |
+ | `DEP_UAS` | 88.82 |
+ | `DEP_LAS` | 83.59 |
+ | `LEMMA_ACC` | 95.85 |
+ | `POS_ACC` | 94.04 |
+ | `MORPH_ACC` | 95.12 |
+ | `MORPH_MICRO_P` | 98.85 |
+ | `MORPH_MICRO_R` | 95.86 |
+ | `MORPH_MICRO_F` | 97.12 |
+ | `ENTS_P` | 75.52 |
+ | `ENTS_R` | 77.76 |
+ | `ENTS_F` | 76.62 |
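
Reviewer note: the `*_F` rows in the table above appear to be the harmonic mean of the matching `*_P` and `*_R` rows. The sketch below checks that for the updated sentence-segmentation scores, using the unrounded `sents_p` and `sents_r` values from `accuracy.json` in this commit. The helper name `f_score` is an illustrative assumption, not part of spaCy's API.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded values taken from the updated accuracy.json in this commit.
SENTS_P = 0.9744966443
SENTS_R = 0.9654255319

SENTS_F = f_score(SENTS_P, SENTS_R)

# The README reports scores as percentages rounded to two decimals.
print(round(SENTS_F * 100, 2))  # prints 96.99
```

The result agrees with the `SENTS_F` row (96.99) and with `sents_f` in `accuracy.json` (0.9699398798).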
accuracy.json CHANGED
@@ -3,127 +3,127 @@
3
  "token_p": 0.9967350492,
4
  "token_r": 0.9957244934,
5
  "token_f": 0.9959492157,
6
- "tag_acc": 0.9664291788,
7
- "sents_p": 0.954787234,
8
- "sents_r": 0.954787234,
9
- "sents_f": 0.954787234,
10
- "dep_uas": 0.8897462438,
11
- "dep_las": 0.8389686971,
12
  "dep_las_per_type": {
13
  "root": {
14
- "p": 0.8786231884,
15
  "r": 0.9133709981,
16
- "f": 0.8956602031
17
  },
18
  "mark": {
19
- "p": 0.9288389513,
20
- "r": 0.9358490566,
21
- "f": 0.9323308271
22
  },
23
  "case": {
24
- "p": 0.9638554217,
25
- "r": 0.959880015,
26
- "f": 0.9618636107
27
  },
28
  "nmod:tmod": {
29
- "p": 0.6842105263,
30
- "r": 0.1092436975,
31
- "f": 0.1884057971
32
  },
33
  "amod": {
34
- "p": 0.9172297297,
35
- "r": 0.9250425894,
36
- "f": 0.9211195929
37
  },
38
  "nsubj": {
39
- "p": 0.8803986711,
40
- "r": 0.8372827804,
41
- "f": 0.8582995951
42
  },
43
  "nmod": {
44
- "p": 0.8218838527,
45
- "r": 0.8286326312,
46
- "f": 0.8252444444
47
  },
48
  "aux": {
49
- "p": 0.9867924528,
50
- "r": 0.9561243144,
51
- "f": 0.9712163417
52
  },
53
  "advcl": {
54
- "p": 0.5862068966,
55
- "r": 0.6390977444,
56
- "f": 0.6115107914
57
  },
58
  "obj": {
59
- "p": 0.8326180258,
60
- "r": 0.896073903,
61
- "f": 0.8631813126
62
  },
63
  "det": {
64
- "p": 0.9575688073,
65
- "r": 0.9456398641,
66
- "f": 0.9515669516
67
  },
68
  "cc": {
69
- "p": 0.9340425532,
70
- "r": 0.9164926931,
71
- "f": 0.9251844046
72
  },
73
  "conj": {
74
- "p": 0.6115288221,
75
- "r": 0.5654692932,
76
- "f": 0.5875978326
77
  },
78
  "nummod": {
79
- "p": 0.887675507,
80
- "r": 0.8835403727,
81
- "f": 0.8856031128
82
  },
83
  "acl": {
84
- "p": 0.8063583815,
85
- "r": 0.7209302326,
86
- "f": 0.761255116
87
  },
88
  "advmod": {
89
- "p": 0.8117048346,
90
- "r": 0.8416886544,
91
- "f": 0.8264248705
92
  },
93
  "obl": {
94
- "p": 0.6821052632,
95
- "r": 0.8223350254,
96
- "f": 0.7456846951
97
  },
98
  "expl:pass": {
99
- "p": 0.8085106383,
100
- "r": 0.7037037037,
101
- "f": 0.7524752475
102
  },
103
  "nsubj:pass": {
104
- "p": 0.8,
105
- "r": 0.756097561,
106
- "f": 0.7774294671
107
  },
108
  "fixed": {
109
- "p": 0.9,
110
- "r": 0.8562367865,
111
- "f": 0.8775731311
112
  },
113
  "appos": {
114
- "p": 0.4956896552,
115
- "r": 0.4389312977,
116
- "f": 0.4655870445
117
  },
118
  "parataxis": {
119
- "p": 0.1627906977,
120
- "r": 0.2,
121
- "f": 0.1794871795
122
  },
123
  "aux:pass": {
124
- "p": 0.9125,
125
- "r": 0.9733333333,
126
- "f": 0.9419354839
127
  },
128
  "nmod:agent": {
129
  "p": 0.0,
@@ -131,9 +131,9 @@
131
  "f": 0.0
132
  },
133
  "ccomp": {
134
- "p": 0.8759689922,
135
- "r": 0.8759689922,
136
- "f": 0.8759689922
137
  },
138
  "nmod:pmod": {
139
  "p": 0.0,
@@ -141,64 +141,74 @@
141
  "f": 0.0
142
  },
143
  "iobj": {
144
- "p": 0.8157894737,
145
- "r": 0.7654320988,
146
- "f": 0.7898089172
147
  },
148
  "flat": {
149
- "p": 0.7557251908,
150
- "r": 0.7815789474,
151
- "f": 0.7684346701
152
  },
153
  "cop": {
154
- "p": 0.8524590164,
155
- "r": 0.8387096774,
156
- "f": 0.8455284553
157
  },
158
  "csubj": {
159
- "p": 0.8235294118,
160
- "r": 0.6666666667,
161
- "f": 0.7368421053
162
  },
163
  "obl:agent": {
164
  "p": 0.0,
165
  "r": 0.0,
166
  "f": 0.0
167
  },
168
- "dep": {
169
  "p": 0.0,
170
  "r": 0.0,
171
  "f": 0.0
172
  },
173
  "expl:pv": {
174
- "p": 0.7564102564,
175
- "r": 0.8550724638,
176
- "f": 0.8027210884
177
  },
178
  "expl": {
179
- "p": 0.6875,
180
  "r": 0.8148148148,
181
- "f": 0.7457627119
182
  },
183
- "obl:pmod": {
184
  "p": 0.0,
185
  "r": 0.0,
186
  "f": 0.0
187
  },
188
  "expl:poss": {
189
- "p": 0.9655172414,
190
- "r": 0.9032258065,
191
- "f": 0.9333333333
192
  },
193
  "goeswith": {
194
  "p": 0.0,
195
  "r": 0.0,
196
  "f": 0.0
197
  },
198
  "xcomp": {
199
- "p": 0.5806451613,
200
- "r": 0.6666666667,
201
- "f": 0.6206896552
202
  },
203
  "orphan": {
204
  "p": 0.0,
@@ -210,132 +220,133 @@
210
  "r": 0.3333333333,
211
  "f": 0.5
212
  },
213
- "csubj:pass": {
214
  "p": 0.0,
215
  "r": 0.0,
216
  "f": 0.0
217
  },
218
- "compound": {
219
- "p": 0.5714285714,
220
- "r": 0.5714285714,
221
- "f": 0.5714285714
222
  },
223
- "list": {
224
  "p": 0.0,
225
  "r": 0.0,
226
  "f": 0.0
227
  },
228
- "ccomp:pmod": {
229
- "p": 0.25,
230
- "r": 0.3333333333,
231
- "f": 0.2857142857
232
  },
233
- "cc:preconj": {
234
  "p": 0.0,
235
  "r": 0.0,
236
  "f": 0.0
237
  }
238
  },
239
- "pos_acc": 0.9405873228,
240
- "morph_acc": 0.9510657636,
241
- "morph_micro_p": 0.9896160458,
242
- "morph_micro_r": 0.9582489383,
243
- "morph_micro_f": 0.9706797273,
 
244
  "morph_per_feat": {
245
  "Case": {
246
- "p": 0.9938697318,
247
- "r": 0.9896985883,
248
- "f": 0.9917797744
249
  },
250
  "Gender": {
251
- "p": 0.991821842,
252
- "r": 0.9854981873,
253
- "f": 0.9886499028
254
  },
255
  "Number": {
256
- "p": 0.9894903379,
257
- "r": 0.922363847,
258
- "f": 0.9547486643
259
  },
260
  "Person": {
261
- "p": 0.9911452184,
262
- "r": 0.9893930466,
263
- "f": 0.9902683574
264
  },
265
  "PronType": {
266
- "p": 0.9965349965,
267
- "r": 0.993780235,
268
- "f": 0.9951557093
269
  },
270
  "Polarity": {
271
- "p": 0.9918566775,
272
- "r": 0.9983606557,
273
- "f": 0.9950980392
274
  },
275
  "AdpType": {
276
- "p": 0.998982706,
277
  "r": 0.9969543147,
278
- "f": 0.9979674797
279
  },
280
  "Definite": {
281
- "p": 0.9886490807,
282
- "r": 0.9815873016,
283
- "f": 0.9851055356
284
  },
285
  "Degree": {
286
- "p": 0.9582772544,
287
- "r": 0.9563465413,
288
- "f": 0.9573109244
289
  },
290
  "VerbForm": {
291
- "p": 0.9774236388,
292
- "r": 0.9787234043,
293
- "f": 0.9780730897
294
  },
295
  "Abbr": {
296
- "p": 0.9538461538,
297
- "r": 0.8303571429,
298
- "f": 0.8878281623
299
  },
300
  "Poss": {
301
  "p": 1.0,
302
- "r": 0.9927710843,
303
- "f": 0.9963724305
304
  },
305
  "NumForm": {
306
- "p": 0.9871794872,
307
- "r": 0.3181818182,
308
- "f": 0.48125
309
  },
310
  "NumType": {
311
- "p": 0.9872881356,
312
- "r": 0.3200549451,
313
- "f": 0.4834024896
314
  },
315
  "Reflex": {
316
  "p": 1.0,
317
- "r": 1.0,
318
- "f": 1.0
319
  },
320
  "Strength": {
321
- "p": 0.9920318725,
322
- "r": 0.9880952381,
323
- "f": 0.9900596421
324
  },
325
  "Mood": {
326
- "p": 0.972826087,
327
- "r": 0.9853211009,
328
- "f": 0.9790337284
329
  },
330
  "Tense": {
331
- "p": 0.9725036179,
332
  "r": 0.976744186,
333
- "f": 0.9746192893
334
  },
335
  "Variant": {
336
- "p": 0.9932885906,
337
- "r": 0.9548387097,
338
- "f": 0.9736842105
339
  },
340
  "Position": {
341
  "p": 1.0,
@@ -358,91 +369,90 @@
358
  "f": 0.0
359
  }
360
  },
361
- "lemma_acc": 0.8183070924,
362
- "ents_p": 0.7550713749,
363
- "ents_r": 0.7721859393,
364
- "ents_f": 0.7635327635,
365
  "ents_per_type": {
366
  "DATETIME": {
367
- "p": 0.7818791946,
368
- "r": 0.8118466899,
369
- "f": 0.7965811966
370
  },
371
  "ORGANIZATION": {
372
- "p": 0.7076923077,
373
- "r": 0.7324840764,
374
- "f": 0.7198748044
375
  },
376
  "FACILITY": {
377
- "p": 0.5039370079,
378
- "r": 0.4885496183,
379
- "f": 0.496124031
380
- },
381
- "PRODUCT": {
382
- "p": 0.5590551181,
383
- "r": 0.5182481752,
384
- "f": 0.5378787879
385
  },
386
  "NUMERIC_VALUE": {
387
- "p": 0.8875502008,
388
- "r": 0.936440678,
389
- "f": 0.9113402062
390
  },
391
  "ORDINAL": {
392
- "p": 0.8214285714,
393
- "r": 0.8363636364,
394
- "f": 0.8288288288
395
  },
396
  "EVENT": {
397
- "p": 0.5151515152,
398
- "r": 0.4594594595,
399
- "f": 0.4857142857
400
  },
401
  "GPE": {
402
- "p": 0.8636363636,
403
- "r": 0.8735632184,
404
- "f": 0.8685714286
405
  },
406
  "PERSON": {
407
- "p": 0.7046153846,
408
- "r": 0.7684563758,
409
- "f": 0.735152488
410
  },
411
  "NAT_REL_POL": {
412
- "p": 0.9315068493,
413
  "r": 0.9066666667,
414
- "f": 0.9189189189
415
  },
416
  "MONEY": {
417
- "p": 0.9622641509,
418
- "r": 0.8793103448,
419
- "f": 0.9189189189
420
  },
421
  "LOC": {
422
- "p": 0.4864864865,
423
- "r": 0.4736842105,
424
- "f": 0.48
425
  },
426
  "WORK_OF_ART": {
427
- "p": 0.3571428571,
428
  "r": 0.2631578947,
429
- "f": 0.303030303
430
  },
431
  "QUANTITY": {
432
- "p": 0.962962963,
433
- "r": 1.0,
434
- "f": 0.9811320755
435
  },
436
  "LANGUAGE": {
437
- "p": 0.6666666667,
438
  "r": 1.0,
439
- "f": 0.8
440
- },
441
- "PERIOD": {
442
- "p": 0.8648648649,
443
- "r": 0.7619047619,
444
- "f": 0.8101265823
445
  }
446
  },
447
- "speed": 7699.716829035
448
  }
 
3
  "token_p": 0.9967350492,
4
  "token_r": 0.9957244934,
5
  "token_f": 0.9959492157,
6
+ "tag_acc": 0.9667810127,
7
+ "sents_p": 0.9744966443,
8
+ "sents_r": 0.9654255319,
9
+ "sents_f": 0.9699398798,
10
+ "dep_uas": 0.8881779116,
11
+ "dep_las": 0.8359210815,
12
  "dep_las_per_type": {
13
  "root": {
14
+ "p": 0.8738738739,
15
  "r": 0.9133709981,
16
+ "f": 0.8931860037
17
  },
18
  "mark": {
19
+ "p": 0.927756654,
20
+ "r": 0.920754717,
21
+ "f": 0.9242424242
22
  },
23
  "case": {
24
+ "p": 0.9589453861,
25
+ "r": 0.9546306712,
26
+ "f": 0.9567831642
27
  },
28
  "nmod:tmod": {
29
+ "p": 0.5853658537,
30
+ "r": 0.2016806723,
31
+ "f": 0.3
32
  },
33
  "amod": {
34
+ "p": 0.9114359415,
35
+ "r": 0.9028960818,
36
+ "f": 0.9071459136
37
  },
38
  "nsubj": {
39
+ "p": 0.8717532468,
40
+ "r": 0.8483412322,
41
+ "f": 0.8598879103
42
  },
43
  "nmod": {
44
+ "p": 0.8199643494,
45
+ "r": 0.8211353088,
46
+ "f": 0.8205494113
47
  },
48
  "aux": {
49
+ "p": 0.9776119403,
50
+ "r": 0.957952468,
51
+ "f": 0.9676823638
52
  },
53
  "advcl": {
54
+ "p": 0.5947712418,
55
+ "r": 0.6842105263,
56
+ "f": 0.6363636364
57
  },
58
  "obj": {
59
+ "p": 0.8274336283,
60
+ "r": 0.8637413395,
61
+ "f": 0.8451977401
62
  },
63
  "det": {
64
+ "p": 0.9667812142,
65
+ "r": 0.9558323896,
66
+ "f": 0.9612756264
67
  },
68
  "cc": {
69
+ "p": 0.9411764706,
70
+ "r": 0.9352818372,
71
+ "f": 0.9382198953
72
  },
73
  "conj": {
74
+ "p": 0.5930232558,
75
+ "r": 0.5318655852,
76
+ "f": 0.5607819181
77
  },
78
  "nummod": {
79
+ "p": 0.8809891808,
80
+ "r": 0.8850931677,
81
+ "f": 0.8830364059
82
  },
83
  "acl": {
84
+ "p": 0.8211143695,
85
+ "r": 0.7235142119,
86
+ "f": 0.7692307692
87
  },
88
  "advmod": {
89
+ "p": 0.818877551,
90
+ "r": 0.8469656992,
91
+ "f": 0.8326848249
92
  },
93
  "obl": {
94
+ "p": 0.6858359957,
95
+ "r": 0.8172588832,
96
+ "f": 0.7458019687
97
  },
98
  "expl:pass": {
99
+ "p": 0.7735849057,
100
+ "r": 0.7592592593,
101
+ "f": 0.7663551402
102
  },
103
  "nsubj:pass": {
104
+ "p": 0.8246753247,
105
+ "r": 0.7743902439,
106
+ "f": 0.7987421384
107
  },
108
  "fixed": {
109
+ "p": 0.8623655914,
110
+ "r": 0.8477801268,
111
+ "f": 0.855010661
112
  },
113
  "appos": {
114
+ "p": 0.5085470085,
115
+ "r": 0.4541984733,
116
+ "f": 0.4798387097
117
  },
118
  "parataxis": {
119
+ "p": 0.0909090909,
120
+ "r": 0.0571428571,
121
+ "f": 0.0701754386
122
  },
123
  "aux:pass": {
124
+ "p": 0.9215686275,
125
+ "r": 0.94,
126
+ "f": 0.9306930693
127
  },
128
  "nmod:agent": {
129
  "p": 0.0,
 
131
  "f": 0.0
132
  },
133
  "ccomp": {
134
+ "p": 0.873015873,
135
+ "r": 0.8527131783,
136
+ "f": 0.862745098
137
  },
138
  "nmod:pmod": {
139
  "p": 0.0,
 
141
  "f": 0.0
142
  },
143
  "iobj": {
144
+ "p": 0.7710843373,
145
+ "r": 0.7901234568,
146
+ "f": 0.7804878049
147
  },
148
  "flat": {
149
+ "p": 0.8034825871,
150
+ "r": 0.85,
151
+ "f": 0.8260869565
152
  },
153
  "cop": {
154
+ "p": 0.8512396694,
155
+ "r": 0.8306451613,
156
+ "f": 0.8408163265
157
  },
158
  "csubj": {
159
+ "p": 0.8571428571,
160
+ "r": 0.8571428571,
161
+ "f": 0.8571428571
162
+ },
163
+ "dep": {
164
+ "p": 0.0,
165
+ "r": 0.0,
166
+ "f": 0.0
167
  },
168
  "obl:agent": {
169
  "p": 0.0,
170
  "r": 0.0,
171
  "f": 0.0
172
  },
173
+ "obl:pmod": {
174
  "p": 0.0,
175
  "r": 0.0,
176
  "f": 0.0
177
  },
178
  "expl:pv": {
179
+ "p": 0.7777777778,
180
+ "r": 0.8115942029,
181
+ "f": 0.7943262411
182
  },
183
  "expl": {
184
+ "p": 0.6285714286,
185
  "r": 0.8148148148,
186
+ "f": 0.7096774194
187
  },
188
+ "vocative": {
189
  "p": 0.0,
190
  "r": 0.0,
191
  "f": 0.0
192
  },
193
  "expl:poss": {
194
+ "p": 1.0,
195
+ "r": 0.935483871,
196
+ "f": 0.9666666667
197
  },
198
  "goeswith": {
199
  "p": 0.0,
200
  "r": 0.0,
201
  "f": 0.0
202
  },
203
+ "compound": {
204
+ "p": 0.3,
205
+ "r": 0.4285714286,
206
+ "f": 0.3529411765
207
+ },
208
  "xcomp": {
209
+ "p": 0.5416666667,
210
+ "r": 0.4814814815,
211
+ "f": 0.5098039216
212
  },
213
  "orphan": {
214
  "p": 0.0,
 
220
  "r": 0.3333333333,
221
  "f": 0.5
222
  },
223
+ "list": {
224
  "p": 0.0,
225
  "r": 0.0,
226
  "f": 0.0
227
  },
228
+ "ccomp:pmod": {
229
+ "p": 0.3333333333,
230
+ "r": 0.3333333333,
231
+ "f": 0.3333333333
232
  },
233
+ "cc:preconj": {
234
  "p": 0.0,
235
  "r": 0.0,
236
  "f": 0.0
237
  },
238
+ "csubj:pass": {
239
+ "p": 0.0,
240
+ "r": 0.0,
241
+ "f": 0.0
242
  },
243
+ "advcl:tcl": {
244
  "p": 0.0,
245
  "r": 0.0,
246
  "f": 0.0
247
  }
248
  },
249
+ "lemma_acc": 0.9585129152,
250
+ "pos_acc": 0.9403951881,
251
+ "morph_acc": 0.9512416806,
252
+ "morph_micro_p": 0.9885162858,
253
+ "morph_micro_r": 0.9585538495,
254
+ "morph_micro_f": 0.9711599991,
255
  "morph_per_feat": {
256
  "Case": {
257
+ "p": 0.9923332481,
258
+ "r": 0.9876637416,
259
+ "f": 0.9899929887
260
  },
261
  "Gender": {
262
+ "p": 0.9918074111,
263
+ "r": 0.9837479685,
264
+ "f": 0.9877612502
265
  },
266
  "Number": {
267
+ "p": 0.9894855851,
268
+ "r": 0.9219424839,
269
+ "f": 0.9545206675
270
  },
271
  "Person": {
272
+ "p": 0.9853113984,
273
+ "r": 0.9882144962,
274
+ "f": 0.986760812
275
  },
276
  "PronType": {
277
+ "p": 0.9965373961,
278
+ "r": 0.99447132,
279
+ "f": 0.9955032861
280
  },
281
  "Polarity": {
282
+ "p": 0.9902597403,
283
+ "r": 1.0,
284
+ "f": 0.9951060359
285
  },
286
  "AdpType": {
287
+ "p": 0.9996606719,
288
  "r": 0.9969543147,
289
+ "f": 0.9983056591
290
  },
291
  "Definite": {
292
+ "p": 0.9890903257,
293
+ "r": 0.9785714286,
294
+ "f": 0.9838027607
295
  },
296
  "Degree": {
297
+ "p": 0.9554355165,
298
+ "r": 0.9503022163,
299
+ "f": 0.9528619529
300
  },
301
  "VerbForm": {
302
+ "p": 0.9728656519,
303
+ "r": 0.977393617,
304
+ "f": 0.9751243781
305
  },
306
  "Abbr": {
307
+ "p": 0.9653465347,
308
+ "r": 0.8705357143,
309
+ "f": 0.9154929577
310
  },
311
  "Poss": {
312
  "p": 1.0,
313
+ "r": 0.9975903614,
314
+ "f": 0.9987937274
315
  },
316
  "NumForm": {
317
+ "p": 0.9709543568,
318
+ "r": 0.3223140496,
319
+ "f": 0.4839710445
320
  },
321
  "NumType": {
322
+ "p": 0.9794238683,
323
+ "r": 0.3269230769,
324
+ "f": 0.4902162719
325
  },
326
  "Reflex": {
327
  "p": 1.0,
328
+ "r": 0.9935897436,
329
+ "f": 0.9967845659
330
  },
331
  "Strength": {
332
+ "p": 0.992,
333
+ "r": 0.9841269841,
334
+ "f": 0.9880478088
335
  },
336
  "Mood": {
337
+ "p": 0.9588550984,
338
+ "r": 0.9834862385,
339
+ "f": 0.9710144928
340
  },
341
  "Tense": {
342
+ "p": 0.9627507163,
343
  "r": 0.976744186,
344
+ "f": 0.9696969697
345
  },
346
  "Variant": {
347
+ "p": 0.9933774834,
348
+ "r": 0.9677419355,
349
+ "f": 0.9803921569
350
  },
351
  "Position": {
352
  "p": 1.0,
 
369
  "f": 0.0
370
  }
371
  },
+ "ents_p": 0.7552238806,
+ "ents_r": 0.7775643488,
+ "ents_f": 0.766231308,
  "ents_per_type": {
  "DATETIME": {
+ "p": 0.7781569966,
+ "r": 0.7944250871,
+ "f": 0.7862068966
  },
  "ORGANIZATION": {
+ "p": 0.6888217523,
+ "r": 0.7261146497,
+ "f": 0.7069767442
  },
  "FACILITY": {
+ "p": 0.5714285714,
+ "r": 0.5496183206,
+ "f": 0.560311284
  },
  "NUMERIC_VALUE": {
+ "p": 0.8953974895,
+ "r": 0.906779661,
+ "f": 0.9010526316
  },
  "ORDINAL": {
+ "p": 0.8103448276,
+ "r": 0.8545454545,
+ "f": 0.8318584071
  },
  "EVENT": {
+ "p": 0.5526315789,
+ "r": 0.5675675676,
+ "f": 0.56
  },
  "GPE": {
+ "p": 0.8464912281,
+ "r": 0.8873563218,
+ "f": 0.8664421998
  },
  "PERSON": {
+ "p": 0.7164869029,
+ "r": 0.7802013423,
+ "f": 0.7469879518
  },
  "NAT_REL_POL": {
+ "p": 0.925170068,
  "r": 0.9066666667,
+ "f": 0.9158249158
  },
  "MONEY": {
+ "p": 0.9038461538,
+ "r": 0.8103448276,
+ "f": 0.8545454545
+ },
+ "PRODUCT": {
+ "p": 0.608,
+ "r": 0.5547445255,
+ "f": 0.5801526718
  },
  "LOC": {
+ "p": 0.5256410256,
+ "r": 0.5394736842,
+ "f": 0.5324675325
  },
  "WORK_OF_ART": {
+ "p": 0.2631578947,
  "r": 0.2631578947,
+ "f": 0.2631578947
  },
  "QUANTITY": {
+ "p": 0.8,
+ "r": 0.9230769231,
+ "f": 0.8571428571
+ },
+ "PERIOD": {
+ "p": 0.8823529412,
+ "r": 0.7142857143,
+ "f": 0.7894736842
  },
  "LANGUAGE": {
+ "p": 0.8,
  "r": 1.0,
+ "f": 0.8888888889
  }
  },
+ "speed": 9115.098662697
  }
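
The `f` values in the metrics above are consistent with the usual F1 definition, the harmonic mean of precision `p` and recall `r`. The following is a minimal Python sketch that recomputes `f` for a few entries copied from this diff; the helper name `f_score` is an illustrative assumption and not part of spaCy.

```python
def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if p + r == 0.0:
        return 0.0
    return 2.0 * p * r / (p + r)


# (p, r, f) triples copied from the metrics in this diff.
reported = {
    "NumForm": (0.9709543568, 0.3223140496, 0.4839710445),
    "EVENT": (0.5526315789, 0.5675675676, 0.56),
    "ents (overall)": (0.7552238806, 0.7775643488, 0.766231308),
}

for name, (p, r, f) in reported.items():
    recomputed = f_score(p, r)
    # The diff rounds values to 10 decimals, so small differences are expected.
    print(f"{name}: reported={f:.6f} recomputed={recomputed:.6f}")
```

Because the harmonic mean is pulled toward the smaller of the two inputs, `NumForm` and `NumType` end up with `f` below 0.5 despite precision above 0.97: their recall is only about 0.32.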
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
 
config.cfg CHANGED
@@ -10,7 +10,7 @@ seed = 0

  [nlp]
  lang = "ro"
- pipeline = ["tok2vec","tagger","parser","senter","attribute_ruler","lemmatizer","ner"]
+ pipeline = ["tok2vec","tagger","parser","lemmatizer","senter","attribute_ruler","ner"]
  disabled = ["senter"]
  before_creation = null
  after_creation = null
@@ -26,11 +26,22 @@ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
  validate = false

  [components.lemmatizer]
- factory = "lemmatizer"
- mode = "lookup"
- model = null
+ factory = "trainable_lemmatizer"
+ backoff = "orth"
+ min_tree_freq = 3
  overwrite = false
  scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
+ top_k = 1
+
+ [components.lemmatizer.model]
+ @architectures = "spacy.Tagger.v2"
+ nO = null
+ normalize = false
+
+ [components.lemmatizer.model.tok2vec]
+ @architectures = "spacy.Tok2VecListener.v1"
+ width = ${components.tok2vec.model.encode:width}
+ upstream = "tok2vec"

  [components.ner]
  factory = "ner"
@@ -55,7 +66,7 @@ nO = null
  @architectures = "spacy.MultiHashEmbed.v2"
  width = 96
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
- rows = [5000,2500,2500,2500,100]
+ rows = [5000,1000,2500,2500,50]
  include_static_vectors = true

  [components.ner.model.tok2vec.encode]
@@ -93,8 +104,9 @@ overwrite = false
  scorer = {"@scorers":"spacy.senter_scorer.v1"}

  [components.senter.model]
- @architectures = "spacy.Tagger.v1"
+ @architectures = "spacy.Tagger.v2"
  nO = null
+ normalize = false

  [components.senter.model.tok2vec]
  @architectures = "spacy.Tok2Vec.v2"
@@ -115,12 +127,14 @@ maxout_pieces = 2

  [components.tagger]
  factory = "tagger"
+ neg_prefix = "!"
  overwrite = false
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}

  [components.tagger.model]
- @architectures = "spacy.Tagger.v1"
+ @architectures = "spacy.Tagger.v2"
  nO = null
+ normalize = false

  [components.tagger.model.tok2vec]
  @architectures = "spacy.Tok2VecListener.v1"
@@ -137,7 +151,7 @@ factory = "tok2vec"
  @architectures = "spacy.MultiHashEmbed.v2"
  width = ${components.tok2vec.model.encode:width}
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
- rows = [5000,2500,2500,2500,100]
+ rows = [5000,1000,2500,2500,50]
  include_static_vectors = true

  [components.tok2vec.model.encode]
@@ -174,7 +188,7 @@ dropout = 0.1
  accumulate_gradient = 1
  patience = 5000
  max_epochs = 0
- max_steps = 0
+ max_steps = 100000
  eval_frequency = 1000
  frozen_components = []
  before_to_disk = null
@@ -209,15 +223,15 @@ eps = 0.00000001
  learn_rate = 0.001

  [training.score_weights]
- tag_acc = 0.16
+ tag_acc = 0.29
  dep_uas = 0.0
- dep_las = 0.16
+ dep_las = 0.29
  dep_las_per_type = null
  sents_p = null
  sents_r = null
- sents_f = 0.02
- lemma_acc = 0.5
- ents_f = 0.16
+ sents_f = 0.04
+ lemma_acc = 0.1
+ ents_f = 0.29
  ents_p = 0.0
  ents_r = 0.0
  ents_per_type = null
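
In a spaCy training config, `[training.score_weights]` maps score names to their weight in the final weighted score. The sketch below uses only the non-zero weights visible in this diff to compare how much each metric contributes before and after the change. The helper `weight_shares` is hypothetical and written for illustration; entries set to `null` or `0.0` are left out.

```python
# Non-zero weights copied from the old and new [training.score_weights] blocks.
old_weights = {"tag_acc": 0.16, "dep_las": 0.16, "sents_f": 0.02,
               "lemma_acc": 0.5, "ents_f": 0.16}
new_weights = {"tag_acc": 0.29, "dep_las": 0.29, "sents_f": 0.04,
               "lemma_acc": 0.1, "ents_f": 0.29}


def weight_shares(weights: dict) -> dict:
    """Return each weight as a fraction of the total of all weights."""
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}


for label, weights in (("old", old_weights), ("new", new_weights)):
    shares = weight_shares(weights)
    summary = ", ".join(f"{name}={share:.1%}" for name, share in shares.items())
    print(f"{label}: sum={sum(weights.values()):.2f} -> {summary}")
```

The share of `lemma_acc` drops from half of the total to roughly a tenth, while tagging, parsing, and NER each rise to a bit under 29%. The new weights as written add up to 1.01 rather than 1.0, which looks like the effect of two-decimal rounding.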
lemmatizer/cfg ADDED
@@ -0,0 +1,1141 @@