EC2 Default User committed on
Commit
3f88081
1 Parent(s): ddcdcad

Update spaCy pipeline

.gitattributes CHANGED
@@ -19,3 +19,4 @@
19
  *strings.json filter=lfs diff=lfs merge=lfs -text
20
  vectors filter=lfs diff=lfs merge=lfs -text
21
  model filter=lfs diff=lfs merge=lfs -text
22
+ *key2row filter=lfs diff=lfs merge=lfs -text
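Note on the `.gitattributes` hunk above: the added rule routes any file whose name ends in `key2row` through Git LFS, alongside the three existing patterns. The following is a minimal sketch of how such rules could be checked, assuming simple `fnmatch`-style matching on the file's base name (real gitattributes matching has more rules than this). The helper names `lfs_patterns` and `is_lfs_tracked` and the example paths are hypothetical and are not part of the commit.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# The four rules in .gitattributes after this commit (lines 19-22 of the new file).
GITATTRIBUTES = """\
*strings.json filter=lfs diff=lfs merge=lfs -text
vectors filter=lfs diff=lfs merge=lfs -text
model filter=lfs diff=lfs merge=lfs -text
*key2row filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and comments.
        if not parts or parts[0].startswith("#"):
            continue
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate check: match each pattern against the base name only.

    Assumption: patterns contain no slash, so comparing against the
    base name is a reasonable simplification of git's behaviour.
    """
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    patterns = lfs_patterns(GITATTRIBUTES)
    # Illustrative paths, not taken from the commit.
    for path in ("vocab/key2row", "vocab/strings.json", "meta.json"):
        print(path, is_lfs_tracked(path, patterns))
```

For an authoritative answer in a real checkout, `git check-attr filter -- <path>` reports the attribute that git itself resolves.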
LICENSES_SOURCES CHANGED
@@ -878,554 +878,3 @@ Creative Commons may be contacted at creativecommons.org.
878
 
879
 
880
 
881
- # Lemmatization Lists
882
-
883
- * Author: Michal Měchura
884
- * URL: https://github.com/michmech/lemmatization-lists/
885
- * License: ODbL
886
-
887
- ```
888
- ## ODC Open Database License (ODbL)
889
-
890
- ### Preamble
891
-
892
- The Open Database License (ODbL) is a license agreement intended to
893
- allow users to freely share, modify, and use this Database while
894
- maintaining this same freedom for others. Many databases are covered by
895
- copyright, and therefore this document licenses these rights. Some
896
- jurisdictions, mainly in the European Union, have specific rights that
897
- cover databases, and so the ODbL addresses these rights, too. Finally,
898
- the ODbL is also an agreement in contract for users of this Database to
899
- act in certain ways in return for accessing this Database.
900
-
901
- Databases can contain a wide variety of types of content (images,
902
- audiovisual material, and sounds all in the same database, for example),
903
- and so the ODbL only governs the rights over the Database, and not the
904
- contents of the Database individually. Licensors should use the ODbL
905
- together with another license for the contents, if the contents have a
906
- single set of rights that uniformly covers all of the contents. If the
907
- contents have multiple sets of different rights, Licensors should
908
- describe what rights govern what contents together in the individual
909
- record or in some other way that clarifies what rights apply.
910
-
911
- Sometimes the contents of a database, or the database itself, can be
912
- covered by other rights not addressed here (such as private contracts,
913
- trade mark over the name, or privacy rights / data protection rights
914
- over information in the contents), and so you are advised that you may
915
- have to consult other documents or clear other rights before doing
916
- activities not covered by this License.
917
-
918
- ------
919
-
920
- The Licensor (as defined below)
921
-
922
- and
923
-
924
- You (as defined below)
925
-
926
- agree as follows:
927
-
928
- ### 1.0 Definitions of Capitalised Words
929
-
930
- "Collective Database" – Means this Database in unmodified form as part
931
- of a collection of independent databases in themselves that together are
932
- assembled into a collective whole. A work that constitutes a Collective
933
- Database will not be considered a Derivative Database.
934
-
935
- "Convey" – As a verb, means Using the Database, a Derivative Database,
936
- or the Database as part of a Collective Database in any way that enables
937
- a Person to make or receive copies of the Database or a Derivative
938
- Database. Conveying does not include interaction with a user through a
939
- computer network, or creating and Using a Produced Work, where no
940
- transfer of a copy of the Database or a Derivative Database occurs.
941
- "Contents" – The contents of this Database, which includes the
942
- information, independent works, or other material collected into the
943
- Database. For example, the contents of the Database could be factual
944
- data or works such as images, audiovisual material, text, or sounds.
945
-
946
- "Database" – A collection of material (the Contents) arranged in a
947
- systematic or methodical way and individually accessible by electronic
948
- or other means offered under the terms of this License.
949
-
950
- "Database Directive" – Means Directive 96/9/EC of the European
951
- Parliament and of the Council of 11 March 1996 on the legal protection
952
- of databases, as amended or succeeded.
953
-
954
- "Database Right" – Means rights resulting from the Chapter III ("sui
955
- generis") rights in the Database Directive (as amended and as transposed
956
- by member states), which includes the Extraction and Re-utilisation of
957
- the whole or a Substantial part of the Contents, as well as any similar
958
- rights available in the relevant jurisdiction under Section 10.4.
959
-
960
- "Derivative Database" – Means a database based upon the Database, and
961
- includes any translation, adaptation, arrangement, modification, or any
962
- other alteration of the Database or of a Substantial part of the
963
- Contents. This includes, but is not limited to, Extracting or
964
- Re-utilising the whole or a Substantial part of the Contents in a new
965
- Database.
966
-
967
- "Extraction" – Means the permanent or temporary transfer of all or a
968
- Substantial part of the Contents to another medium by any means or in
969
- any form.
970
-
971
- "License" – Means this license agreement and is both a license of rights
972
- such as copyright and Database Rights and an agreement in contract.
973
-
974
- "Licensor" – Means the Person that offers the Database under the terms
975
- of this License.
976
-
977
- "Person" – Means a natural or legal person or a body of persons
978
- corporate or incorporate.
979
-
980
- "Produced Work" – a work (such as an image, audiovisual material, text,
981
- or sounds) resulting from using the whole or a Substantial part of the
982
- Contents (via a search or other query) from this Database, a Derivative
983
- Database, or this Database as part of a Collective Database.
984
-
985
- "Publicly" – means to Persons other than You or under Your control by
986
- either more than 50% ownership or by the power to direct their
987
- activities (such as contracting with an independent consultant).
988
-
989
- "Re-utilisation" – means any form of making available to the public all
990
- or a Substantial part of the Contents by the distribution of copies, by
991
- renting, by online or other forms of transmission.
992
-
993
- "Substantial" – Means substantial in terms of quantity or quality or a
994
- combination of both. The repeated and systematic Extraction or
995
- Re-utilisation of insubstantial parts of the Contents may amount to the
996
- Extraction or Re-utilisation of a Substantial part of the Contents.
997
-
998
- "Use" – As a verb, means doing any act that is restricted by copyright
999
- or Database Rights whether in the original medium or any other; and
1000
- includes without limitation distributing, copying, publicly performing,
1001
- publicly displaying, and preparing derivative works of the Database, as
1002
- well as modifying the Database as may be technically necessary to use it
1003
- in a different mode or format.
1004
-
1005
- "You" – Means a Person exercising rights under this License who has not
1006
- previously violated the terms of this License with respect to the
1007
- Database, or who has received express permission from the Licensor to
1008
- exercise rights under this License despite a previous violation.
1009
-
1010
- Words in the singular include the plural and vice versa.
1011
-
1012
- ### 2.0 What this License covers
1013
-
1014
- 2.1. Legal effect of this document. This License is:
1015
-
1016
- a. A license of applicable copyright and neighbouring rights;
1017
-
1018
- b. A license of the Database Right; and
1019
-
1020
- c. An agreement in contract between You and the Licensor.
1021
-
1022
- 2.2 Legal rights covered. This License covers the legal rights in the
1023
- Database, including:
1024
-
1025
- a. Copyright. Any copyright or neighbouring rights in the Database.
1026
- The copyright licensed includes any individual elements of the
1027
- Database, but does not cover the copyright over the Contents
1028
- independent of this Database. See Section 2.4 for details. Copyright
1029
- law varies between jurisdictions, but is likely to cover: the Database
1030
- model or schema, which is the structure, arrangement, and organisation
1031
- of the Database, and can also include the Database tables and table
1032
- indexes; the data entry and output sheets; and the Field names of
1033
- Contents stored in the Database;
1034
-
1035
- b. Database Rights. Database Rights only extend to the Extraction and
1036
- Re-utilisation of the whole or a Substantial part of the Contents.
1037
- Database Rights can apply even when there is no copyright over the
1038
- Database. Database Rights can also apply when the Contents are removed
1039
- from the Database and are selected and arranged in a way that would
1040
- not infringe any applicable copyright; and
1041
-
1042
- c. Contract. This is an agreement between You and the Licensor for
1043
- access to the Database. In return you agree to certain conditions of
1044
- use on this access as outlined in this License.
1045
-
1046
- 2.3 Rights not covered.
1047
-
1048
- a. This License does not apply to computer programs used in the making
1049
- or operation of the Database;
1050
-
1051
- b. This License does not cover any patents over the Contents or the
1052
- Database; and
1053
-
1054
- c. This License does not cover any trademarks associated with the
1055
- Database.
1056
-
1057
- 2.4 Relationship to Contents in the Database. The individual items of
1058
- the Contents contained in this Database may be covered by other rights,
1059
- including copyright, patent, data protection, privacy, or personality
1060
- rights, and this License does not cover any rights (other than Database
1061
- Rights or in contract) in individual Contents contained in the Database.
1062
- For example, if used on a Database of images (the Contents), this
1063
- License would not apply to copyright over individual images, which could
1064
- have their own separate licenses, or one single license covering all of
1065
- the rights over the images.
1066
-
1067
- ### 3.0 Rights granted
1068
-
1069
- 3.1 Subject to the terms and conditions of this License, the Licensor
1070
- grants to You a worldwide, royalty-free, non-exclusive, terminable (but
1071
- only under Section 9) license to Use the Database for the duration of
1072
- any applicable copyright and Database Rights. These rights explicitly
1073
- include commercial use, and do not exclude any field of endeavour. To
1074
- the extent possible in the relevant jurisdiction, these rights may be
1075
- exercised in all media and formats whether now known or created in the
1076
- future.
1077
-
1078
- The rights granted cover, for example:
1079
-
1080
- a. Extraction and Re-utilisation of the whole or a Substantial part of
1081
- the Contents;
1082
-
1083
- b. Creation of Derivative Databases;
1084
-
1085
- c. Creation of Collective Databases;
1086
-
1087
- d. Creation of temporary or permanent reproductions by any means and
1088
- in any form, in whole or in part, including of any Derivative
1089
- Databases or as a part of Collective Databases; and
1090
-
1091
- e. Distribution, communication, display, lending, making available, or
1092
- performance to the public by any means and in any form, in whole or in
1093
- part, including of any Derivative Database or as a part of Collective
1094
- Databases.
1095
-
1096
- 3.2 Compulsory license schemes. For the avoidance of doubt:
1097
-
1098
- a. Non-waivable compulsory license schemes. In those jurisdictions in
1099
- which the right to collect royalties through any statutory or
1100
- compulsory licensing scheme cannot be waived, the Licensor reserves
1101
- the exclusive right to collect such royalties for any exercise by You
1102
- of the rights granted under this License;
1103
-
1104
- b. Waivable compulsory license schemes. In those jurisdictions in
1105
- which the right to collect royalties through any statutory or
1106
- compulsory licensing scheme can be waived, the Licensor waives the
1107
- exclusive right to collect such royalties for any exercise by You of
1108
- the rights granted under this License; and,
1109
-
1110
- c. Voluntary license schemes. The Licensor waives the right to collect
1111
- royalties, whether individually or, in the event that the Licensor is
1112
- a member of a collecting society that administers voluntary licensing
1113
- schemes, via that society, from any exercise by You of the rights
1114
- granted under this License.
1115
-
1116
- 3.3 The right to release the Database under different terms, or to stop
1117
- distributing or making available the Database, is reserved. Note that
1118
- this Database may be multiple-licensed, and so You may have the choice
1119
- of using alternative licenses for this Database. Subject to Section
1120
- 10.4, all other rights not expressly granted by Licensor are reserved.
1121
-
1122
- ### 4.0 Conditions of Use
1123
-
1124
- 4.1 The rights granted in Section 3 above are expressly made subject to
1125
- Your complying with the following conditions of use. These are important
1126
- conditions of this License, and if You fail to follow them, You will be
1127
- in material breach of its terms.
1128
-
1129
- 4.2 Notices. If You Publicly Convey this Database, any Derivative
1130
- Database, or the Database as part of a Collective Database, then You
1131
- must:
1132
-
1133
- a. Do so only under the terms of this License or another license
1134
- permitted under Section 4.4;
1135
-
1136
- b. Include a copy of this License (or, as applicable, a license
1137
- permitted under Section 4.4) or its Uniform Resource Identifier (URI)
1138
- with the Database or Derivative Database, including both in the
1139
- Database or Derivative Database and in any relevant documentation; and
1140
-
1141
- c. Keep intact any copyright or Database Right notices and notices
1142
- that refer to this License.
1143
-
1144
- d. If it is not possible to put the required notices in a particular
1145
- file due to its structure, then You must include the notices in a
1146
- location (such as a relevant directory) where users would be likely to
1147
- look for it.
1148
-
1149
- 4.3 Notice for using output (Contents). Creating and Using a Produced
1150
- Work does not require the notice in Section 4.2. However, if you
1151
- Publicly Use a Produced Work, You must include a notice associated with
1152
- the Produced Work reasonably calculated to make any Person that uses,
1153
- views, accesses, interacts with, or is otherwise exposed to the Produced
1154
- Work aware that Content was obtained from the Database, Derivative
1155
- Database, or the Database as part of a Collective Database, and that it
1156
- is available under this License.
1157
-
1158
- a. Example notice. The following text will satisfy notice under
1159
- Section 4.3:
1160
-
1161
- Contains information from DATABASE NAME, which is made available
1162
- here under the Open Database License (ODbL).
1163
-
1164
- DATABASE NAME should be replaced with the name of the Database and a
1165
- hyperlink to the URI of the Database. "Open Database License" should
1166
- contain a hyperlink to the URI of the text of this License. If
1167
- hyperlinks are not possible, You should include the plain text of the
1168
- required URI's with the above notice.
1169
-
1170
- 4.4 Share alike.
1171
-
1172
- a. Any Derivative Database that You Publicly Use must be only under
1173
- the terms of:
1174
-
1175
- i. This License;
1176
-
1177
- ii. A later version of this License similar in spirit to this
1178
- License; or
1179
-
1180
- iii. A compatible license.
1181
-
1182
- If You license the Derivative Database under one of the licenses
1183
- mentioned in (iii), You must comply with the terms of that license.
1184
-
1185
- b. For the avoidance of doubt, Extraction or Re-utilisation of the
1186
- whole or a Substantial part of the Contents into a new database is a
1187
- Derivative Database and must comply with Section 4.4.
1188
-
1189
- c. Derivative Databases and Produced Works. A Derivative Database is
1190
- Publicly Used and so must comply with Section 4.4. if a Produced Work
1191
- created from the Derivative Database is Publicly Used.
1192
-
1193
- d. Share Alike and additional Contents. For the avoidance of doubt,
1194
- You must not add Contents to Derivative Databases under Section 4.4 a
1195
- that are incompatible with the rights granted under this License.
1196
-
1197
- e. Compatible licenses. Licensors may authorise a proxy to determine
1198
- compatible licenses under Section 4.4 a iii. If they do so, the
1199
- authorised proxy's public statement of acceptance of a compatible
1200
- license grants You permission to use the compatible license.
1201
-
1202
-
1203
- 4.5 Limits of Share Alike. The requirements of Section 4.4 do not apply
1204
- in the following:
1205
-
1206
- a. For the avoidance of doubt, You are not required to license
1207
- Collective Databases under this License if You incorporate this
1208
- Database or a Derivative Database in the collection, but this License
1209
- still applies to this Database or a Derivative Database as a part of
1210
- the Collective Database;
1211
-
1212
- b. Using this Database, a Derivative Database, or this Database as
1213
- part of a Collective Database to create a Produced Work does not
1214
- create a Derivative Database for purposes of Section 4.4; and
1215
-
1216
- c. Use of a Derivative Database internally within an organisation is
1217
- not to the public and therefore does not fall under the requirements
1218
- of Section 4.4.
1219
-
1220
- 4.6 Access to Derivative Databases. If You Publicly Use a Derivative
1221
- Database or a Produced Work from a Derivative Database, You must also
1222
- offer to recipients of the Derivative Database or Produced Work a copy
1223
- in a machine readable form of:
1224
-
1225
- a. The entire Derivative Database; or
1226
-
1227
- b. A file containing all of the alterations made to the Database or
1228
- the method of making the alterations to the Database (such as an
1229
- algorithm), including any additional Contents, that make up all the
1230
- differences between the Database and the Derivative Database.
1231
-
1232
- The Derivative Database (under a.) or alteration file (under b.) must be
1233
- available at no more than a reasonable production cost for physical
1234
- distributions and free of charge if distributed over the internet.
1235
-
1236
- 4.7 Technological measures and additional terms
1237
-
1238
- a. This License does not allow You to impose (except subject to
1239
- Section 4.7 b.) any terms or any technological measures on the
1240
- Database, a Derivative Database, or the whole or a Substantial part of
1241
- the Contents that alter or restrict the terms of this License, or any
1242
- rights granted under it, or have the effect or intent of restricting
1243
- the ability of any person to exercise those rights.
1244
-
1245
- b. Parallel distribution. You may impose terms or technological
1246
- measures on the Database, a Derivative Database, or the whole or a
1247
- Substantial part of the Contents (a "Restricted Database") in
1248
- contravention of Section 4.74 a. only if You also make a copy of the
1249
- Database or a Derivative Database available to the recipient of the
1250
- Restricted Database:
1251
-
1252
- i. That is available without additional fee;
1253
-
1254
- ii. That is available in a medium that does not alter or restrict
1255
- the terms of this License, or any rights granted under it, or have
1256
- the effect or intent of restricting the ability of any person to
1257
- exercise those rights (an "Unrestricted Database"); and
1258
-
1259
- iii. The Unrestricted Database is at least as accessible to the
1260
- recipient as a practical matter as the Restricted Database.
1261
-
1262
- c. For the avoidance of doubt, You may place this Database or a
1263
- Derivative Database in an authenticated environment, behind a
1264
- password, or within a similar access control scheme provided that You
1265
- do not alter or restrict the terms of this License or any rights
1266
- granted under it or have the effect or intent of restricting the
1267
- ability of any person to exercise those rights.
1268
-
1269
- 4.8 Licensing of others. You may not sublicense the Database. Each time
1270
- You communicate the Database, the whole or Substantial part of the
1271
- Contents, or any Derivative Database to anyone else in any way, the
1272
- Licensor offers to the recipient a license to the Database on the same
1273
- terms and conditions as this License. You are not responsible for
1274
- enforcing compliance by third parties with this License, but You may
1275
- enforce any rights that You have over a Derivative Database. You are
1276
- solely responsible for any modifications of a Derivative Database made
1277
- by You or another Person at Your direction. You may not impose any
1278
- further restrictions on the exercise of the rights granted or affirmed
1279
- under this License.
1280
-
1281
- ### 5.0 Moral rights
1282
-
1283
- 5.1 Moral rights. This section covers moral rights, including any rights
1284
- to be identified as the author of the Database or to object to treatment
1285
- that would otherwise prejudice the author's honour and reputation, or
1286
- any other derogatory treatment:
1287
-
1288
- a. For jurisdictions allowing waiver of moral rights, Licensor waives
1289
- all moral rights that Licensor may have in the Database to the fullest
1290
- extent possible by the law of the relevant jurisdiction under Section
1291
- 10.4;
1292
-
1293
- b. If waiver of moral rights under Section 5.1 a in the relevant
1294
- jurisdiction is not possible, Licensor agrees not to assert any moral
1295
- rights over the Database and waives all claims in moral rights to the
1296
- fullest extent possible by the law of the relevant jurisdiction under
1297
- Section 10.4; and
1298
-
1299
- c. For jurisdictions not allowing waiver or an agreement not to assert
1300
- moral rights under Section 5.1 a and b, the author may retain their
1301
- moral rights over certain aspects of the Database.
1302
-
1303
- Please note that some jurisdictions do not allow for the waiver of moral
1304
- rights, and so moral rights may still subsist over the Database in some
1305
- jurisdictions.
1306
-
1307
- ### 6.0 Fair dealing, Database exceptions, and other rights not affected
1308
-
1309
- 6.1 This License does not affect any rights that You or anyone else may
1310
- independently have under any applicable law to make any use of this
1311
- Database, including without limitation:
1312
-
1313
- a. Exceptions to the Database Right including: Extraction of Contents
1314
- from non-electronic Databases for private purposes, Extraction for
1315
- purposes of illustration for teaching or scientific research, and
1316
- Extraction or Re-utilisation for public security or an administrative
1317
- or judicial procedure.
1318
-
1319
- b. Fair dealing, fair use, or any other legally recognised limitation
1320
- or exception to infringement of copyright or other applicable laws.
1321
-
1322
- 6.2 This License does not affect any rights of lawful users to Extract
1323
- and Re-utilise insubstantial parts of the Contents, evaluated
1324
- quantitatively or qualitatively, for any purposes whatsoever, including
1325
- creating a Derivative Database (subject to other rights over the
1326
- Contents, see Section 2.4). The repeated and systematic Extraction or
1327
- Re-utilisation of insubstantial parts of the Contents may however amount
1328
- to the Extraction or Re-utilisation of a Substantial part of the
1329
- Contents.
1330
-
1331
- ### 7.0 Warranties and Disclaimer
1332
-
1333
- 7.1 The Database is licensed by the Licensor "as is" and without any
1334
- warranty of any kind, either express, implied, or arising by statute,
1335
- custom, course of dealing, or trade usage. Licensor specifically
1336
- disclaims any and all implied warranties or conditions of title,
1337
- non-infringement, accuracy or completeness, the presence or absence of
1338
- errors, fitness for a particular purpose, merchantability, or otherwise.
1339
- Some jurisdictions do not allow the exclusion of implied warranties, so
1340
- this exclusion may not apply to You.
1341
-
1342
- ### 8.0 Limitation of liability
1343
-
1344
- 8.1 Subject to any liability that may not be excluded or limited by law,
1345
- the Licensor is not liable for, and expressly excludes, all liability
1346
- for loss or damage however and whenever caused to anyone by any use
1347
- under this License, whether by You or by anyone else, and whether caused
1348
- by any fault on the part of the Licensor or not. This exclusion of
1349
- liability includes, but is not limited to, any special, incidental,
1350
- consequential, punitive, or exemplary damages such as loss of revenue,
1351
- data, anticipated profits, and lost business. This exclusion applies
1352
- even if the Licensor has been advised of the possibility of such
1353
- damages.
1354
-
1355
- 8.2 If liability may not be excluded by law, it is limited to actual and
1356
- direct financial loss to the extent it is caused by proved negligence on
1357
- the part of the Licensor.
1358
-
1359
- ### 9.0 Termination of Your rights under this License
1360
-
1361
- 9.1 Any breach by You of the terms and conditions of this License
1362
- automatically terminates this License with immediate effect and without
1363
- notice to You. For the avoidance of doubt, Persons who have received the
1364
- Database, the whole or a Substantial part of the Contents, Derivative
1365
- Databases, or the Database as part of a Collective Database from You
1366
- under this License will not have their licenses terminated provided
1367
- their use is in full compliance with this License or a license granted
1368
- under Section 4.8 of this License. Sections 1, 2, 7, 8, 9 and 10 will
1369
- survive any termination of this License.
1370
-
1371
- 9.2 If You are not in breach of the terms of this License, the Licensor
1372
- will not terminate Your rights under it.
1373
-
1374
- 9.3 Unless terminated under Section 9.1, this License is granted to You
1375
- for the duration of applicable rights in the Database.
1376
-
1377
- 9.4 Reinstatement of rights. If you cease any breach of the terms and
1378
- conditions of this License, then your full rights under this License
1379
- will be reinstated:
1380
-
1381
- a. Provisionally and subject to permanent termination until the 60th
1382
- day after cessation of breach;
1383
-
1384
- b. Permanently on the 60th day after cessation of breach unless
1385
- otherwise reasonably notified by the Licensor; or
1386
-
1387
- c. Permanently if reasonably notified by the Licensor of the
1388
- violation, this is the first time You have received notice of
1389
- violation of this License from the Licensor, and You cure the
1390
- violation prior to 30 days after your receipt of the notice.
1391
-
1392
- Persons subject to permanent termination of rights are not eligible to
1393
- be a recipient and receive a license under Section 4.8.
1394
-
1395
- 9.5 Notwithstanding the above, Licensor reserves the right to release
1396
- the Database under different license terms or to stop distributing or
1397
- making available the Database. Releasing the Database under different
1398
- license terms or stopping the distribution of the Database will not
1399
- withdraw this License (or any other license that has been, or is
1400
- required to be, granted under the terms of this License), and this
1401
- License will continue in full force and effect unless terminated as
1402
- stated above.
1403
-
1404
- ### 10.0 General
1405
-
1406
- 10.1 If any provision of this License is held to be invalid or
1407
- unenforceable, that must not affect the validity or enforceability of
1408
- the remainder of the terms and conditions of this License and each
1409
- remaining provision of this License shall be valid and enforced to the
1410
- fullest extent permitted by law.
1411
-
1412
- 10.2 This License is the entire agreement between the parties with
1413
- respect to the rights granted here over the Database. It replaces any
1414
- earlier understandings, agreements or representations with respect to
1415
- the Database.
1416
-
1417
- 10.3 If You are in breach of the terms of this License, You will not be
1418
- entitled to rely on the terms of this License or to complain of any
1419
- breach by the Licensor.
1420
-
1421
- 10.4 Choice of law. This License takes effect in and will be governed by
1422
- the laws of the relevant jurisdiction in which the License terms are
1423
- sought to be enforced. If the standard suite of rights granted under
1424
- applicable copyright law and Database Rights in the relevant
1425
- jurisdiction includes additional rights not granted under this License,
1426
- these additional rights are granted in this License in order to meet the
1427
- terms of this License.```
1428
-
1429
-
1430
-
1431
-
README.md CHANGED
@@ -14,61 +14,76 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.7682119205
18
  - name: NER Recall
19
  type: recall
20
- value: 0.725
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.7459807074
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
- - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9507990315
31
  - task:
32
- name: SENTER
33
  type: token-classification
34
  metrics:
35
- - name: SENTER Precision
36
- type: precision
37
- value: 0.9045045045
38
- - name: SENTER Recall
39
- type: recall
40
- value: 0.890070922
41
- - name: SENTER F Score
42
- type: f_score
43
- value: 0.8972296693
44
  - task:
45
- name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
- - name: Unlabeled Dependencies Accuracy
49
  type: accuracy
50
- value: 0.8070861741
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
- - name: Labeled Dependencies Accuracy
56
- type: accuracy
57
- value: 0.8070861741
58
  ---
59
  ### Details: https://spacy.io/models/da#da_core_news_sm
60
 
61
- Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.
62
 
63
  | Feature | Description |
64
  | --- | --- |
65
  | **Name** | `da_core_news_sm` |
66
- | **Version** | `3.2.0` |
67
- | **spaCy** | `>=3.2.0,<3.3.0` |
68
- | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
69
- | **Components** | `tok2vec`, `morphologizer`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
70
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
71
- | **Sources** | [UD Danish DDT v2.8](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />[Lemmatization Lists](https://github.com/michmech/lemmatization-lists/) (Michal Měchura) |
72
  | **License** | `CC BY-SA 4.0` |
73
  | **Author** | [Explosion](https://explosion.ai) |
74
 
@@ -76,13 +91,12 @@ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
76
 
77
  <details>
78
 
79
- <summary>View label scheme (195 labels for 4 components)</summary>
80
 
81
  | Component | Labels |
82
  | --- | --- |
83
  | **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
84
  | **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `advmod:lmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:lmod`, `obl:tmod`, `punct`, `xcomp` |
85
- | **`senter`** | `I`, `S` |
86
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
87
 
88
  </details>
@@ -95,18 +109,18 @@ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, s
95
  | `TOKEN_P` | 99.78 |
96
  | `TOKEN_R` | 99.75 |
97
  | `TOKEN_F` | 99.76 |
98
- | `POS_ACC` | 95.08 |
99
  | `MORPH_ACC` | 93.71 |
100
- | `MORPH_MICRO_P` | 95.63 |
101
- | `MORPH_MICRO_R` | 94.83 |
102
- | `MORPH_MICRO_F` | 95.23 |
103
- | `SENTS_P` | 90.45 |
104
  | `SENTS_R` | 89.01 |
105
- | `SENTS_F` | 89.72 |
106
- | `DEP_UAS` | 80.71 |
107
- | `DEP_LAS` | 76.41 |
108
- | `TAG_ACC` | 95.08 |
109
- | `LEMMA_ACC` | 84.91 |
110
- | `ENTS_P` | 76.82 |
111
- | `ENTS_R` | 72.50 |
112
- | `ENTS_F` | 74.60 |
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.7450110865
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.7
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.7218045113
24
+ - task:
25
+ name: TAG
26
+ type: token-classification
27
+ metrics:
28
+ - name: TAG (XPOS) Accuracy
29
+ type: accuracy
30
+ value: 0.9506053269
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
+ - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9506053269
38
  - task:
39
+ name: MORPH
40
  type: token-classification
41
  metrics:
42
+ - name: Morph (UFeats) Accuracy
43
+ type: accuracy
44
+ value: 0.9371428571
 
 
 
 
 
 
45
  - task:
46
+ name: LEMMA
47
  type: token-classification
48
  metrics:
49
+ - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9430508475
52
+ - task:
53
+ name: UNLABELED_DEPENDENCIES
54
+ type: token-classification
55
+ metrics:
56
+ - name: Unlabeled Attachment Score (UAS)
57
+ type: f_score
58
+ value: 0.8080446927
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
+ - name: Labeled Attachment Score (LAS)
64
+ type: f_score
65
+ value: 0.7628624099
66
+ - task:
67
+ name: SENTS
68
+ type: token-classification
69
+ metrics:
70
+ - name: Sentences F-Score
71
+ type: f_score
72
+ value: 0.8956289028
73
  ---
74
  ### Details: https://spacy.io/models/da#da_core_news_sm
75
 
76
+ Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner, attribute_ruler.
77
 
78
  | Feature | Description |
79
  | --- | --- |
80
  | **Name** | `da_core_news_sm` |
81
+ | **Version** | `3.3.0` |
82
+ | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
83
+ | **Default Pipeline** | `tok2vec`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
84
+ | **Components** | `tok2vec`, `morphologizer`, `parser`, `lemmatizer`, `senter`, `attribute_ruler`, `ner` |
85
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
86
+ | **Sources** | [UD Danish DDT v2.8](https://github.com/UniversalDependencies/UD_Danish-DDT) (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />[DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane) (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard) |
87
  | **License** | `CC BY-SA 4.0` |
88
  | **Author** | [Explosion](https://explosion.ai) |
89
 
91
 
92
  <details>
93
 
94
+ <summary>View label scheme (193 labels for 3 components)</summary>
95
 
96
  | Component | Labels |
97
  | --- | --- |
98
  | **`morphologizer`** | `AdpType=Prep\|POS=ADP`, `Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=AUX\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=PROPN`, `Definite=Ind\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=SCONJ`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Act`, `POS=ADV`, `Number=Plur\|POS=DET\|PronType=Dem`, `Degree=Pos\|Number=Plur\|POS=ADJ`, `Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=PUNCT`, `POS=CCONJ`, `Definite=Ind\|Degree=Cmp\|Number=Sing\|POS=ADJ`, `Degree=Cmp\|POS=ADJ`, `POS=PRON\|PartType=Inf`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Definite=Ind\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Neut\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Dem`, `Degree=Pos\|POS=ADV`, `Definite=Def\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `POS=PRON\|PronType=Dem`, `NumType=Card\|POS=NUM`, `Definite=Ind\|Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs`, `NumType=Ord\|POS=ADJ`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Mood=Ind\|POS=AUX\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=VERB\|VerbForm=Inf\|Voice=Act`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Act`, `POS=NOUN`, `Mood=Ind\|POS=VERB\|Tense=Pres\|VerbForm=Fin\|Voice=Pass`, `POS=ADP\|PartType=Inf`, `Degree=Pos\|POS=ADJ`, `Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, 
`Case=Gen\|Definite=Def\|Gender=Com\|Number=Sing\|POS=NOUN`, `POS=AUX\|VerbForm=Inf\|Voice=Act`, `Definite=Ind\|Degree=Pos\|Gender=Com\|Number=Sing\|POS=ADJ`, `Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Number=Plur\|POS=DET\|PronType=Ind`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Ind`, `Case=Acc\|POS=PRON\|Person=3\|PronType=Prs\|Reflex=Yes`, `POS=PART\|PartType=Inf`, `Gender=Neut\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Acc\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Nom\|Gender=Com\|POS=PRON\|PronType=Ind`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Ind`, `Mood=Imp\|POS=VERB`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Definite=Ind\|Number=Sing\|POS=AUX\|Tense=Past\|VerbForm=Part`, `POS=X`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Def\|Gender=Com\|Number=Plur\|POS=NOUN`, `POS=VERB\|Tense=Pres\|VerbForm=Part`, `Number=Plur\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|VerbForm=Inf\|Voice=Pass`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Sing\|POS=NOUN`, `Degree=Cmp\|POS=ADV`, `POS=ADV\|PartType=Inf`, `Degree=Sup\|POS=ADV`, `Number=Plur\|POS=PRON\|PronType=Dem`, `Number=Plur\|POS=PRON\|PronType=Ind`, `Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|POS=PROPN`, `POS=ADP`, `Degree=Cmp\|Number=Plur\|POS=ADJ`, `Definite=Def\|Degree=Sup\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Gender=Com\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, 
`Number=Plur\|POS=PRON\|PronType=Rcp`, `Case=Gen\|Degree=Cmp\|POS=ADJ`, `Case=Gen\|Definite=Def\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=INTJ`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Degree=Pos\|Gender=Neut\|Number=Sing\|POS=ADJ`, `Gender=Neut\|Number=Sing\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Acc\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Plur\|POS=NOUN`, `Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Number=Plur\|Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `Definite=Def\|Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Nom\|Gender=Com\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Sing\|POS=NOUN`, `Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `POS=SYM`, `Case=Nom\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Degree=Sup\|POS=ADJ`, `Number=Plur\|POS=DET\|PronType=Ind\|Style=Arch`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Dem`, `Foreign=Yes\|POS=X`, `POS=DET\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|POS=PRON\|PronType=Dem`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs`, `Case=Gen\|Definite=Ind\|Gender=Neut\|Number=Sing\|POS=NOUN`, `Case=Gen\|POS=PRON\|PronType=Int,Rel`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Dem`, `Abbr=Yes\|POS=X`, `Case=Gen\|Definite=Ind\|Gender=Com\|Number=Plur\|POS=NOUN`, `Definite=Def\|Degree=Abs\|POS=ADJ`, `Definite=Ind\|Degree=Sup\|Number=Sing\|POS=ADJ`, 
`Definite=Ind\|POS=NOUN`, `Gender=Com\|Number=Plur\|POS=NOUN`, `Number[psor]=Plur\|POS=DET\|Person=1\|Poss=Yes\|PronType=Prs`, `Gender=Com\|POS=PRON\|PronType=Int,Rel`, `Case=Nom\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Degree=Abs\|POS=ADV`, `POS=VERB\|VerbForm=Ger`, `POS=VERB\|Tense=Past\|VerbForm=Part`, `Definite=Def\|Degree=Sup\|Number=Sing\|POS=ADJ`, `Number=Plur\|Number[psor]=Plur\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs\|Style=Form`, `Case=Gen\|Definite=Def\|Degree=Pos\|Number=Sing\|POS=ADJ`, `Case=Gen\|Degree=Pos\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|POS=PRON\|Person=2\|Polite=Form\|PronType=Prs`, `Gender=Com\|Number=Sing\|POS=PRON\|PronType=Int,Rel`, `POS=VERB\|Tense=Pres`, `Case=Gen\|Number=Plur\|POS=DET\|PronType=Ind`, `Number[psor]=Plur\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=PRON\|Person=2\|Polite=Form\|Poss=Yes\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `POS=AUX\|Tense=Pres\|VerbForm=Part`, `Mood=Ind\|POS=VERB\|Tense=Past\|VerbForm=Fin\|Voice=Pass`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Degree=Sup\|Number=Plur\|POS=ADJ`, `Case=Acc\|Gender=Com\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs`, `Gender=Neut\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs\|Reflex=Yes`, `Definite=Ind\|Number=Plur\|POS=NOUN`, `Case=Gen\|Number=Plur\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Mood=Imp\|POS=AUX`, `Gender=Com\|Number=Sing\|Number[psor]=Sing\|POS=PRON\|Person=1\|Poss=Yes\|PronType=Prs`, `Number[psor]=Sing\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `Definite=Def\|Gender=Com\|Number=Sing\|POS=VERB\|Tense=Past\|VerbForm=Part`, `Number=Plur\|Number[psor]=Sing\|POS=DET\|Person=2\|Poss=Yes\|PronType=Prs`, `Case=Gen\|Gender=Com\|Number=Sing\|POS=DET\|PronType=Ind`, `Case=Gen\|POS=NOUN`, `Number[psor]=Plur\|POS=PRON\|Person=3\|Poss=Yes\|PronType=Prs`, `POS=DET\|PronType=Dem`, 
`Definite=Def\|Number=Plur\|POS=NOUN` |
99
  | **`parser`** | `ROOT`, `acl:relcl`, `advcl`, `advmod`, `advmod:lmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound:prt`, `conj`, `cop`, `dep`, `det`, `expl`, `fixed`, `flat`, `iobj`, `list`, `mark`, `nmod`, `nmod:poss`, `nsubj`, `nummod`, `obj`, `obl`, `obl:lmod`, `obl:tmod`, `punct`, `xcomp` |
 
100
  | **`ner`** | `LOC`, `MISC`, `ORG`, `PER` |
101
 
102
  </details>
109
  | `TOKEN_P` | 99.78 |
110
  | `TOKEN_R` | 99.75 |
111
  | `TOKEN_F` | 99.76 |
112
+ | `POS_ACC` | 95.06 |
113
  | `MORPH_ACC` | 93.71 |
114
+ | `MORPH_MICRO_P` | 95.66 |
115
+ | `MORPH_MICRO_R` | 94.89 |
116
+ | `MORPH_MICRO_F` | 95.27 |
117
+ | `SENTS_P` | 90.13 |
118
  | `SENTS_R` | 89.01 |
119
+ | `SENTS_F` | 89.56 |
120
+ | `DEP_UAS` | 80.80 |
121
+ | `DEP_LAS` | 76.29 |
122
+ | `LEMMA_ACC` | 94.31 |
123
+ | `TAG_ACC` | 95.06 |
124
+ | `ENTS_P` | 74.50 |
125
+ | `ENTS_R` | 70.00 |
126
+ | `ENTS_F` | 72.18 |
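In the updated README table, `senter` is listed under Components but not under Default Pipeline. The sketch below, which assumes the two lists are copied by hand from the 3.3.0 table (the variable names are illustrative and not part of spaCy), derives which components are packaged but not run by default:

```python
# Component lists copied from the 3.3.0 README table (illustrative variables).
components = ["tok2vec", "morphologizer", "parser", "lemmatizer",
              "senter", "attribute_ruler", "ner"]
default_pipeline = ["tok2vec", "morphologizer", "parser", "lemmatizer",
                    "attribute_ruler", "ner"]

# Components that ship with the package but are absent from the default pipeline.
not_run_by_default = [name for name in components if name not in default_pipeline]
print(not_run_by_default)
```

In spaCy v3 a packaged but disabled component can usually be switched on after loading with `nlp.enable_pipe(...)`. Whether that applies to `senter` here depends on the `disabled` setting in `config.cfg`, which this excerpt does not show in full.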
accuracy.json CHANGED
@@ -3,51 +3,51 @@
3
  "token_p": 0.9977732598,
4
  "token_r": 0.9974835463,
5
  "token_f": 0.997628382,
6
- "pos_acc": 0.9507990315,
7
  "morph_acc": 0.9371428571,
8
- "morph_micro_p": 0.9563225412,
9
- "morph_micro_r": 0.9482610671,
10
- "morph_micro_f": 0.9522747434,
11
  "morph_per_feat": {
12
  "Mood": {
13
- "p": 0.9639468691,
14
- "r": 0.9685414681,
15
- "f": 0.9662387066
16
  },
17
  "Tense": {
18
- "p": 0.9529147982,
19
- "r": 0.9600903614,
20
- "f": 0.9564891223
21
  },
22
  "VerbForm": {
23
- "p": 0.9471419791,
24
- "r": 0.9430844553,
25
- "f": 0.9451088623
26
  },
27
  "Voice": {
28
- "p": 0.968492123,
29
- "r": 0.9648729447,
30
- "f": 0.9666791464
31
  },
32
  "Definite": {
33
- "p": 0.9468212715,
34
- "r": 0.9355985776,
35
- "f": 0.9411764706
36
  },
37
  "Gender": {
38
- "p": 0.9299128102,
39
- "r": 0.9215686275,
40
- "f": 0.9257219162
41
  },
42
  "Number": {
43
- "p": 0.9519535375,
44
- "r": 0.9405320814,
45
- "f": 0.9462083443
46
  },
47
  "AdpType": {
48
- "p": 1.0,
49
- "r": 0.9840848806,
50
- "f": 0.9919786096
51
  },
52
  "PartType": {
53
  "p": 1.0,
@@ -55,59 +55,59 @@
55
  "f": 0.9983739837
56
  },
57
  "Case": {
58
- "p": 0.9727126806,
59
- "r": 0.9573459716,
60
- "f": 0.9649681529
61
  },
62
  "Person": {
63
- "p": 0.9770723104,
64
- "r": 0.9840142096,
65
- "f": 0.9805309735
66
  },
67
  "PronType": {
68
- "p": 0.9826875515,
69
- "r": 0.9802631579,
70
- "f": 0.9814738576
71
  },
72
  "NumType": {
73
- "p": 0.9793103448,
74
  "r": 0.940397351,
75
- "f": 0.9594594595
76
  },
77
  "Degree": {
78
- "p": 0.9402985075,
79
- "r": 0.9108433735,
80
- "f": 0.9253365973
81
  },
82
  "Reflex": {
83
  "p": 1.0,
84
  "r": 1.0,
85
  "f": 1.0
86
  },
87
- "Polite": {
88
- "p": 0.75,
89
- "r": 0.75,
90
- "f": 0.75
91
- },
92
  "Number[psor]": {
93
- "p": 0.9885057471,
94
  "r": 1.0,
95
- "f": 0.9942196532
96
  },
97
  "Poss": {
98
- "p": 1.0,
99
  "r": 1.0,
100
- "f": 1.0
 
 
 
 
 
101
  },
102
  "Foreign": {
103
- "p": 0.8333333333,
104
- "r": 0.5,
105
- "f": 0.625
106
  },
107
  "Abbr": {
108
- "p": 1.0,
109
- "r": 0.2,
110
- "f": 0.3333333333
111
  },
112
  "Style": {
113
  "p": 1.0,
@@ -115,141 +115,141 @@
115
  "f": 1.0
116
  }
117
  },
118
- "sents_p": 0.9045045045,
119
  "sents_r": 0.890070922,
120
- "sents_f": 0.8972296693,
121
- "dep_uas": 0.8070861741,
122
- "dep_las": 0.7640549905,
123
  "dep_las_per_type": {
124
  "advmod": {
125
- "p": 0.6948682386,
126
- "r": 0.7076271186,
127
- "f": 0.7011896431
128
  },
129
  "root": {
130
- "p": 0.8078994614,
131
- "r": 0.7978723404,
132
- "f": 0.8028545941
133
  },
134
  "nsubj": {
135
- "p": 0.8367129136,
136
- "r": 0.8270042194,
137
- "f": 0.8318302387
138
  },
139
  "case": {
140
- "p": 0.8841584158,
141
- "r": 0.8806706114,
142
- "f": 0.8824110672
143
  },
144
  "obl": {
145
- "p": 0.6810207337,
146
- "r": 0.6630434783,
147
- "f": 0.6719118804
148
  },
149
  "cc": {
150
- "p": 0.7620396601,
151
- "r": 0.7819767442,
152
- "f": 0.7718794835
153
  },
154
  "conj": {
155
- "p": 0.6066481994,
156
- "r": 0.584,
157
- "f": 0.5951086957
158
  },
159
  "obj": {
160
- "p": 0.7678244973,
161
- "r": 0.8155339806,
162
- "f": 0.790960452
163
  },
164
  "aux": {
165
- "p": 0.875,
166
- "r": 0.8571428571,
167
- "f": 0.8659793814
168
  },
169
  "acl:relcl": {
170
- "p": 0.5852272727,
171
- "r": 0.5567567568,
172
- "f": 0.5706371191
173
  },
174
  "advmod:lmod": {
175
- "p": 0.7042253521,
176
- "r": 0.7462686567,
177
- "f": 0.7246376812
178
  },
179
  "det": {
180
- "p": 0.903814262,
181
- "r": 0.8978583196,
182
- "f": 0.9008264463
183
  },
184
  "amod": {
185
- "p": 0.7807757167,
186
- "r": 0.7901023891,
187
- "f": 0.7854113656
188
  },
189
  "nmod:poss": {
190
- "p": 0.7083333333,
191
- "r": 0.6732673267,
192
- "f": 0.6903553299
193
  },
194
  "ccomp": {
195
- "p": 0.6212121212,
196
- "r": 0.6612903226,
197
- "f": 0.640625
198
  },
199
  "nummod": {
200
- "p": 0.7868852459,
201
- "r": 0.8,
202
- "f": 0.7933884298
203
  },
204
  "flat": {
205
- "p": 0.7804878049,
206
  "r": 0.8476821192,
207
- "f": 0.8126984127
208
  },
209
  "compound:prt": {
210
- "p": 0.652173913,
211
- "r": 0.3658536585,
212
- "f": 0.46875
213
  },
214
  "advcl": {
215
- "p": 0.6339285714,
216
- "r": 0.6120689655,
217
- "f": 0.6228070175
218
  },
219
  "mark": {
220
- "p": 0.8773784355,
221
- "r": 0.8521560575,
222
- "f": 0.8645833333
223
  },
224
  "cop": {
225
- "p": 0.7700534759,
226
- "r": 0.8228571429,
227
- "f": 0.7955801105
228
  },
229
  "dep": {
230
- "p": 0.1647058824,
231
  "r": 0.2641509434,
232
- "f": 0.2028985507
233
  },
234
  "nmod": {
235
- "p": 0.6058252427,
236
- "r": 0.609375,
237
- "f": 0.6075949367
238
  },
239
  "iobj": {
240
- "p": 0.8181818182,
241
  "r": 0.4090909091,
242
- "f": 0.5454545455
243
  },
244
  "xcomp": {
245
- "p": 0.431372549,
246
- "r": 0.3728813559,
247
- "f": 0.4
248
  },
249
  "list": {
250
- "p": 0.3,
251
- "r": 0.1666666667,
252
- "f": 0.2142857143
253
  },
254
  "vocative": {
255
  "p": 0.0,
@@ -257,62 +257,62 @@
257
  "f": 0.0
258
  },
259
  "fixed": {
260
- "p": 0.85,
261
- "r": 0.8292682927,
262
- "f": 0.8395061728
263
- },
264
- "obl:lmod": {
265
- "p": 0.0,
266
- "r": 0.0,
267
- "f": 0.0
268
  },
269
  "expl": {
270
- "p": 0.7941176471,
271
  "r": 0.7941176471,
272
- "f": 0.7941176471
273
  },
274
  "appos": {
275
- "p": 0.4333333333,
276
- "r": 0.3939393939,
277
- "f": 0.4126984127
278
  },
279
  "obl:tmod": {
280
- "p": 0.8571428571,
281
- "r": 0.3333333333,
282
- "f": 0.48
283
  },
284
  "discourse": {
285
  "p": 0.0,
286
  "r": 0.0,
287
  "f": 0.0
 
 
 
 
 
288
  }
289
  },
290
- "tag_acc": 0.9507990315,
291
- "lemma_acc": 0.8491041162,
292
- "ents_p": 0.7682119205,
293
- "ents_r": 0.725,
294
- "ents_f": 0.7459807074,
295
  "ents_per_type": {
296
  "PER": {
297
- "p": 0.8089171975,
298
- "r": 0.765060241,
299
- "f": 0.786377709
300
  },
301
  "ORG": {
302
- "p": 0.7215189873,
303
- "r": 0.6333333333,
304
- "f": 0.674556213
305
  },
306
  "MISC": {
307
- "p": 0.6782608696,
308
- "r": 0.6902654867,
309
- "f": 0.6842105263
310
  },
311
  "LOC": {
312
- "p": 0.8431372549,
313
- "r": 0.7747747748,
314
- "f": 0.8075117371
315
  }
316
  },
317
- "speed": 10057.2129225514
318
  }
3
  "token_p": 0.9977732598,
4
  "token_r": 0.9974835463,
5
  "token_f": 0.997628382,
6
+ "pos_acc": 0.9506053269,
7
  "morph_acc": 0.9371428571,
8
+ "morph_micro_p": 0.9565925398,
9
+ "morph_micro_r": 0.9488667912,
10
+ "morph_micro_f": 0.9527140033,
11
  "morph_per_feat": {
12
  "Mood": {
13
+ "p": 0.9636711281,
14
+ "r": 0.9609151573,
15
+ "f": 0.9622911695
16
  },
17
  "Tense": {
18
+ "p": 0.9503759398,
19
+ "r": 0.9518072289,
20
+ "f": 0.9510910459
21
  },
22
  "VerbForm": {
23
+ "p": 0.9457125231,
24
+ "r": 0.9381884945,
25
+ "f": 0.9419354839
26
  },
27
  "Voice": {
28
+ "p": 0.9713423831,
29
+ "r": 0.9626307922,
30
+ "f": 0.966966967
31
  },
32
  "Definite": {
33
+ "p": 0.950039968,
34
+ "r": 0.9391544844,
35
+ "f": 0.9445658653
36
  },
37
  "Gender": {
38
+ "p": 0.9317497491,
39
+ "r": 0.9255566633,
40
+ "f": 0.928642881
41
  },
42
  "Number": {
43
+ "p": 0.9502107482,
44
+ "r": 0.9407929056,
45
+ "f": 0.9454783748
46
  },
47
  "AdpType": {
48
+ "p": 0.9991087344,
49
+ "r": 0.991158267,
50
+ "f": 0.9951176209
51
  },
52
  "PartType": {
53
  "p": 1.0,
55
  "f": 0.9983739837
56
  },
57
  "Case": {
58
+ "p": 0.9774557166,
59
+ "r": 0.9589257504,
60
+ "f": 0.9681020734
61
  },
62
  "Person": {
63
+ "p": 0.9771528998,
64
+ "r": 0.9875666075,
65
+ "f": 0.9823321555
66
  },
67
  "PronType": {
68
+ "p": 0.9851851852,
69
+ "r": 0.984375,
70
+ "f": 0.984779926
71
  },
72
  "NumType": {
73
+ "p": 0.9726027397,
74
  "r": 0.940397351,
75
+ "f": 0.9562289562
76
  },
77
  "Degree": {
78
+ "p": 0.9394313968,
79
+ "r": 0.9156626506,
80
+ "f": 0.9273947529
81
  },
82
  "Reflex": {
83
  "p": 1.0,
84
  "r": 1.0,
85
  "f": 1.0
86
  },
 
 
 
 
 
87
  "Number[psor]": {
88
+ "p": 0.9772727273,
89
  "r": 1.0,
90
+ "f": 0.9885057471
91
  },
92
  "Poss": {
93
+ "p": 0.9887640449,
94
  "r": 1.0,
95
+ "f": 0.9943502825
96
+ },
97
+ "Polite": {
98
+ "p": 0.6,
99
+ "r": 0.75,
100
+ "f": 0.6666666667
101
  },
102
  "Foreign": {
103
+ "p": 1.0,
104
+ "r": 0.4,
105
+ "f": 0.5714285714
106
  },
107
  "Abbr": {
108
+ "p": 0.6666666667,
109
+ "r": 0.4,
110
+ "f": 0.5
111
  },
112
  "Style": {
113
  "p": 1.0,
115
  "f": 1.0
116
  }
117
  },
118
+ "sents_p": 0.9012567325,
119
  "sents_r": 0.890070922,
120
+ "sents_f": 0.8956289028,
121
+ "dep_uas": 0.8080446927,
122
+ "dep_las": 0.7628624099,
123
  "dep_las_per_type": {
124
  "advmod": {
125
+ "p": 0.6630434783,
126
+ "r": 0.6892655367,
127
+ "f": 0.675900277
128
  },
129
  "root": {
130
+ "p": 0.8085867621,
131
+ "r": 0.8014184397,
132
+ "f": 0.8049866429
133
  },
134
  "nsubj": {
135
+ "p": 0.8269639066,
136
+ "r": 0.8217299578,
137
+ "f": 0.8243386243
138
  },
139
  "case": {
140
+ "p": 0.881372549,
141
+ "r": 0.8865877712,
142
+ "f": 0.883972468
143
  },
144
  "obl": {
145
+ "p": 0.6786885246,
146
+ "r": 0.6428571429,
147
+ "f": 0.6602870813
148
  },
149
  "cc": {
150
+ "p": 0.7614942529,
151
+ "r": 0.7703488372,
152
+ "f": 0.7658959538
153
  },
154
  "conj": {
155
+ "p": 0.6264044944,
156
+ "r": 0.5946666667,
157
+ "f": 0.610123119
158
  },
159
  "obj": {
160
+ "p": 0.761732852,
161
+ "r": 0.8194174757,
162
+ "f": 0.7895229186
163
  },
164
  "aux": {
165
+ "p": 0.8795180723,
166
+ "r": 0.8513119534,
167
+ "f": 0.8651851852
168
  },
169
  "acl:relcl": {
170
+ "p": 0.6054054054,
171
+ "r": 0.6054054054,
172
+ "f": 0.6054054054
173
  },
174
  "advmod:lmod": {
175
+ "p": 0.6875,
176
+ "r": 0.6567164179,
177
+ "f": 0.6717557252
178
  },
179
  "det": {
180
+ "p": 0.9126853377,
181
+ "r": 0.9126853377,
182
+ "f": 0.9126853377
183
  },
184
  "amod": {
185
+ "p": 0.7852348993,
186
+ "r": 0.7986348123,
187
+ "f": 0.7918781726
188
  },
189
  "nmod:poss": {
190
+ "p": 0.6767676768,
191
+ "r": 0.6633663366,
192
+ "f": 0.67
193
  },
194
  "ccomp": {
195
+ "p": 0.6290322581,
196
+ "r": 0.6290322581,
197
+ "f": 0.6290322581
198
  },
199
  "nummod": {
200
+ "p": 0.824,
201
+ "r": 0.8583333333,
202
+ "f": 0.8408163265
203
  },
204
  "flat": {
205
+ "p": 0.7901234568,
206
  "r": 0.8476821192,
207
+ "f": 0.8178913738
208
  },
209
  "compound:prt": {
210
+ "p": 0.44,
211
+ "r": 0.2682926829,
212
+ "f": 0.3333333333
213
  },
214
  "advcl": {
215
+ "p": 0.5877192982,
216
+ "r": 0.5775862069,
217
+ "f": 0.5826086957
218
  },
219
  "mark": {
220
+ "p": 0.872651357,
221
+ "r": 0.8583162218,
222
+ "f": 0.8654244306
223
  },
224
  "cop": {
225
+ "p": 0.752688172,
226
+ "r": 0.8,
227
+ "f": 0.7756232687
228
  },
229
  "dep": {
230
+ "p": 0.1707317073,
231
  "r": 0.2641509434,
232
+ "f": 0.2074074074
233
  },
234
  "nmod": {
235
+ "p": 0.6122840691,
236
+ "r": 0.623046875,
237
+ "f": 0.6176185866
238
  },
239
  "iobj": {
240
+ "p": 0.6923076923,
241
  "r": 0.4090909091,
242
+ "f": 0.5142857143
243
  },
244
  "xcomp": {
245
+ "p": 0.5161290323,
246
+ "r": 0.2711864407,
247
+ "f": 0.3555555556
248
  },
249
  "list": {
250
+ "p": 0.3636363636,
251
+ "r": 0.2222222222,
252
+ "f": 0.275862069
253
  },
254
  "vocative": {
255
  "p": 0.0,
257
  "f": 0.0
258
  },
259
  "fixed": {
260
+ "p": 0.8684210526,
261
+ "r": 0.8048780488,
262
+ "f": 0.835443038
 
 
 
 
 
263
  },
264
  "expl": {
265
+ "p": 0.84375,
266
  "r": 0.7941176471,
267
+ "f": 0.8181818182
268
  },
269
  "appos": {
270
+ "p": 0.4666666667,
271
+ "r": 0.4242424242,
272
+ "f": 0.4444444444
273
  },
274
  "obl:tmod": {
275
+ "p": 0.7777777778,
276
+ "r": 0.3888888889,
277
+ "f": 0.5185185185
278
  },
279
  "discourse": {
280
  "p": 0.0,
281
  "r": 0.0,
282
  "f": 0.0
283
+ },
284
+ "obl:lmod": {
285
+ "p": 0.0,
286
+ "r": 0.0,
287
+ "f": 0.0
288
  }
289
  },
290
+ "lemma_acc": 0.9430508475,
291
+ "tag_acc": 0.9506053269,
292
+ "ents_p": 0.7450110865,
293
+ "ents_r": 0.7,
294
+ "ents_f": 0.7218045113,
295
  "ents_per_type": {
296
  "PER": {
297
+ "p": 0.7818181818,
298
+ "r": 0.7771084337,
299
+ "f": 0.7794561934
300
  },
301
  "ORG": {
302
+ "p": 0.6463414634,
303
+ "r": 0.5888888889,
304
+ "f": 0.6162790698
305
  },
306
  "MISC": {
307
+ "p": 0.6698113208,
308
+ "r": 0.6283185841,
309
+ "f": 0.6484018265
310
  },
311
  "LOC": {
312
+ "p": 0.8469387755,
313
+ "r": 0.7477477477,
314
+ "f": 0.7942583732
315
  }
316
  },
317
+ "speed": 12430.3213337348
318
  }
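The `f` values in the metrics above are consistent with the usual F1 definition, the harmonic mean of `p` and `r`. The sketch below recomputes two of them from the precision and recall figures in this diff. The helper name `f_score` is illustrative and is not taken from spaCy's scorer.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision/recall pairs copied from the updated metrics in this diff.
ents_f = f_score(0.7450110865, 0.7)           # overall NER
loc_f = f_score(0.8469387755, 0.7477477477)   # LOC entities
zero_f = f_score(0.0, 0.0)                    # e.g. "discourse", "obl:lmod"

print(round(ents_f, 6), round(loc_f, 6), zero_f)
```

The zero guard covers labels such as `discourse` and `obl:lmod`, where both precision and recall are reported as 0.0.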
attribute_ruler/patterns CHANGED
Binary files a/attribute_ruler/patterns and b/attribute_ruler/patterns differ
config.cfg CHANGED
@@ -10,7 +10,7 @@ seed = 0
10
 
11
  [nlp]
12
  lang = "da"
13
- pipeline = ["tok2vec","morphologizer","parser","senter","attribute_ruler","lemmatizer","ner"]
14
  disabled = ["senter"]
15
  before_creation = null
16
  after_creation = null
@@ -26,11 +26,22 @@ scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"}
26
  validate = false
27
 
28
  [components.lemmatizer]
29
- factory = "lemmatizer"
30
- mode = "lookup"
31
- model = null
32
  overwrite = false
33
  scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
 
35
  [components.morphologizer]
36
  factory = "morphologizer"
@@ -39,8 +50,9 @@ overwrite = true
39
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
40
 
41
  [components.morphologizer.model]
42
- @architectures = "spacy.Tagger.v1"
43
  nO = null
 
44
 
45
  [components.morphologizer.model.tok2vec]
46
  @architectures = "spacy.Tok2VecListener.v1"
@@ -70,7 +82,7 @@ nO = null
70
  @architectures = "spacy.MultiHashEmbed.v2"
71
  width = 96
72
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
73
- rows = [5000,2500,2500,2500,100]
74
  include_static_vectors = false
75
 
76
  [components.ner.model.tok2vec.encode]
@@ -108,8 +120,9 @@ overwrite = false
108
  scorer = {"@scorers":"spacy.senter_scorer.v1"}
109
 
110
  [components.senter.model]
111
- @architectures = "spacy.Tagger.v1"
112
  nO = null
 
113
 
114
  [components.senter.model.tok2vec]
115
  @architectures = "spacy.Tok2Vec.v2"
@@ -138,7 +151,7 @@ factory = "tok2vec"
138
  @architectures = "spacy.MultiHashEmbed.v2"
139
  width = ${components.tok2vec.model.encode:width}
140
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
141
- rows = [5000,2500,2500,2500,100]
142
  include_static_vectors = false
143
 
144
  [components.tok2vec.model.encode]
@@ -175,7 +188,7 @@ dropout = 0.1
175
  accumulate_gradient = 1
176
  patience = 5000
177
  max_epochs = 0
178
- max_steps = 0
179
  eval_frequency = 1000
180
  frozen_components = []
181
  before_to_disk = null
@@ -210,17 +223,17 @@ eps = 0.00000001
210
  learn_rate = 0.001
211
 
212
  [training.score_weights]
213
- pos_acc = 0.08
214
- morph_acc = 0.08
215
  morph_per_feat = null
216
  dep_uas = 0.0
217
- dep_las = 0.16
218
  dep_las_per_type = null
219
  sents_p = null
220
  sents_r = null
221
- sents_f = 0.02
222
- lemma_acc = 0.5
223
- ents_f = 0.16
224
  ents_p = 0.0
225
  ents_r = 0.0
226
  ents_per_type = null
@@ -237,6 +250,13 @@ after_init = null
237
 
238
  [initialize.components]
239
 
 
 
 
 
 
 
 
240
  [initialize.components.morphologizer]
241
 
242
  [initialize.components.morphologizer.labels]
10
 
11
  [nlp]
12
  lang = "da"
13
+ pipeline = ["tok2vec","morphologizer","parser","lemmatizer","senter","attribute_ruler","ner"]
14
  disabled = ["senter"]
15
  before_creation = null
16
  after_creation = null
26
  validate = false
27
 
28
  [components.lemmatizer]
29
+ factory = "trainable_lemmatizer"
30
+ backoff = "orth"
31
+ min_tree_freq = 3
32
  overwrite = false
33
  scorer = {"@scorers":"spacy.lemmatizer_scorer.v1"}
34
+ top_k = 1
35
+
36
+ [components.lemmatizer.model]
37
+ @architectures = "spacy.Tagger.v2"
38
+ nO = null
39
+ normalize = false
40
+
41
+ [components.lemmatizer.model.tok2vec]
42
+ @architectures = "spacy.Tok2VecListener.v1"
43
+ width = ${components.tok2vec.model.encode:width}
44
+ upstream = "tok2vec"
45
 
46
  [components.morphologizer]
47
  factory = "morphologizer"
50
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
51
 
52
  [components.morphologizer.model]
53
+ @architectures = "spacy.Tagger.v2"
54
  nO = null
55
+ normalize = false
56
 
57
  [components.morphologizer.model.tok2vec]
58
  @architectures = "spacy.Tok2VecListener.v1"
82
  @architectures = "spacy.MultiHashEmbed.v2"
83
  width = 96
84
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
85
+ rows = [5000,1000,2500,2500,50]
86
  include_static_vectors = false
87
 
88
  [components.ner.model.tok2vec.encode]
120
  scorer = {"@scorers":"spacy.senter_scorer.v1"}
121
 
122
  [components.senter.model]
123
+ @architectures = "spacy.Tagger.v2"
124
  nO = null
125
+ normalize = false
126
 
127
  [components.senter.model.tok2vec]
128
  @architectures = "spacy.Tok2Vec.v2"
151
  @architectures = "spacy.MultiHashEmbed.v2"
152
  width = ${components.tok2vec.model.encode:width}
153
  attrs = ["NORM","PREFIX","SUFFIX","SHAPE","SPACY"]
154
+ rows = [5000,1000,2500,2500,50]
155
  include_static_vectors = false
156
 
157
  [components.tok2vec.model.encode]
188
  accumulate_gradient = 1
189
  patience = 5000
190
  max_epochs = 0
191
+ max_steps = 100000
192
  eval_frequency = 1000
193
  frozen_components = []
194
  before_to_disk = null
223
  learn_rate = 0.001
224
 
225
  [training.score_weights]
226
+ pos_acc = 0.14
227
+ morph_acc = 0.14
228
  morph_per_feat = null
229
  dep_uas = 0.0
230
+ dep_las = 0.29
231
  dep_las_per_type = null
232
  sents_p = null
233
  sents_r = null
234
+ sents_f = 0.04
235
+ lemma_acc = 0.1
236
+ ents_f = 0.29
237
  ents_p = 0.0
238
  ents_r = 0.0
239
  ents_per_type = null
250
 
251
  [initialize.components]
252
 
253
+ [initialize.components.lemmatizer]
254
+
255
+ [initialize.components.lemmatizer.labels]
256
+ @readers = "spacy.read_labels.v1"
257
+ path = "corpus/labels/trainable_lemmatizer.json"
258
+ require = false
259
+
260
  [initialize.components.morphologizer]
261
 
262
  [initialize.components.morphologizer.labels]
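The `[training.score_weights]` hunk rebalances the weights: the share given to `lemma_acc` drops from 0.5 to 0.1 and the other non-zero weights grow. The sketch below copies the non-zero weights from both sides of the diff and checks that each set sums to 1.0. How spaCy combines these weights into a final score is not shown in this diff, so the check only concerns the numbers themselves.

```python
import math

# Non-zero weights copied from the removed (old) and added (new) lines.
old_weights = {
    "pos_acc": 0.08, "morph_acc": 0.08, "dep_las": 0.16,
    "sents_f": 0.02, "lemma_acc": 0.5, "ents_f": 0.16,
}
new_weights = {
    "pos_acc": 0.14, "morph_acc": 0.14, "dep_las": 0.29,
    "sents_f": 0.04, "lemma_acc": 0.1, "ents_f": 0.29,
}

old_total = sum(old_weights.values())
new_total = sum(new_weights.values())

# Which metrics gained weight relative to the old configuration.
gained = sorted(k for k in new_weights if new_weights[k] > old_weights[k])

print(old_total, new_total, gained)
```

Entries set to `null` or `0.0` in the config (for example `dep_uas`, `ents_p`, `ents_r`) are left out because they contribute nothing to either sum.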
da_core_news_sm-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:bfbc3ee87da2c0ae1523f78ce34ff0713684928bac1e0d450598725555acaf5d
3
- size 19129449
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ce68d469f3fee83ffd81e65031b80db2b6e671d300bc14e76f0d08cf50cfa6a3
3
+ size 12379452
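The wheel is stored as a Git LFS pointer, so the diff shows only the `oid` and `size` fields changing. As a minimal sketch, the pointer text can be split into key/value pairs to compare the two sizes. The function name `parse_lfs_pointer` is hypothetical and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into a dict; 'size' is converted to int."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


# Pointer contents copied from the removed and added sides of the diff.
old = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:bfbc3ee87da2c0ae1523f78ce34ff0713684928bac1e0d450598725555acaf5d\n"
    "size 19129449\n"
)
new = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:ce68d469f3fee83ffd81e65031b80db2b6e671d300bc14e76f0d08cf50cfa6a3\n"
    "size 12379452\n"
)

saved_bytes = old["size"] - new["size"]
shrink = saved_bytes / old["size"]
print(saved_bytes, f"{shrink:.1%}")
```

On these figures the packaged wheel is roughly a third smaller after the update.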
lemmatizer/cfg ADDED
@@ -0,0 +1,457 @@
1
+ {
2
+ "labels":[
3
+ 1,
4
+ 2,
5
+ 4,
6
+ 6,
7
+ 8,
8
+ 10,
9
+ 12,
10
+ 14,
11
+ 16,
12
+ 18,
13
+ 20,
14
+ 24,
15
+ 28,
16
+ 30,
17
+ 32,
18
+ 34,
19
+ 36,
20
+ 39,
21
+ 41,
22
+ 42,
23
+ 43,
24
+ 45,
25
+ 47,
26
+ 49,
27
+ 51,
28
+ 53,
29
+ 55,
30
+ 57,
31
+ 61,
32
+ 65,
33
+ 67,
34
+ 71,
35
+ 73,
36
+ 75,
37
+ 77,
38
+ 79,
39
+ 81,
40
+ 83,
41
+ 85,
42
+ 87,
43
+ 89,
44
+ 91,
45
+ 93,
46
+ 95,
47
+ 99,
48
+ 101,
49
+ 102,
50
+ 104,
51
+ 107,
52
+ 111,
53
+ 113,
54
+ 116,
55
+ 118,
56
+ 121,
57
+ 124,
58
+ 127,
59
+ 128,
60
+ 131,
61
+ 133,
62
+ 134,
63
+ 136,
64
+ 138,
65
+ 140,
66
+ 142,
67
+ 144,
68
+ 145,
69
+ 147,
70
+ 148,
71
+ 149,
72
+ 153,
73
+ 155,
74
+ 158,
75
+ 161,
76
+ 164,
77
+ 166,
78
+ 168,
79
+ 170,
80
+ 172,
81
+ 174,
82
+ 175,
83
+ 177,
84
+ 179,
85
+ 182,
86
+ 184,
87
+ 186,
88
+ 188,
89
+ 190,
90
+ 192,
91
+ 194,
92
+ 196,
93
+ 199,
94
+ 201,
95
+ 203,
96
+ 204,
97
+ 207,
98
+ 208,
99
+ 209,
100
+ 211,
101
+ 213,
102
+ 214,
103
+ 216,
104
+ 218,
105
+ 220,
106
+ 222,
107
+ 224,
108
+ 226,
109
+ 229,
110
+ 231,
111
+ 232,
112
+ 233,
113
+ 235,
114
+ 236,
115
+ 238,
116
+ 239,
117
+ 243,
118
+ 249,
119
+ 253,
120
+ 255,
121
+ 257,
122
+ 259,
123
+ 261,
124
+ 262,
125
+ 263,
126
+ 264,
127
+ 267,
128
+ 269,
129
+ 270,
130
+ 272,
131
+ 274,
132
+ 276,
133
+ 278,
134
+ 280,
135
+ 282,
136
+ 284,
137
+ 286,
138
+ 290,
139
+ 291,
140
+ 293,
141
+ 295,
142
+ 297,
143
+ 299,
144
+ 300,
145
+ 302,
146
+ 303,
147
+ 304,
148
+ 306,
149
+ 308,
150
+ 311,
151
+ 314,
152
+ 315,
153
+ 317,
154
+ 320,
155
+ 321,
156
+ 323,
157
+ 324,
158
+ 326,
159
+ 327,
160
+ 328,
161
+ 330,
162
+ 331,
163
+ 333,
164
+ 337,
165
+ 339,
166
+ 340,
167
+ 344,
168
+ 346,
169
+ 350,
170
+ 353,
171
+ 354,
172
+ 355,
173
+ 358,
174
+ 360,
175
+ 361,
176
+ 363,
177
+ 365,
178
+ 366,
179
+ 369,
180
+ 372,
181
+ 373,
182
+ 376,
183
+ 380,
184
+ 382,
185
+ 383,
186
+ 384,
187
+ 386,
188
+ 387,
189
+ 389,
190
+ 391,
191
+ 392,
192
+ 394,
193
+ 395,
194
+ 398,
195
+ 400,
196
+ 402,
197
+ 404,
198
+ 406,
199
+ 409,
200
+ 411,
201
+ 412,
202
+ 413,
203
+ 415,
204
+ 417,
205
+ 420,
206
+ 421,
207
+ 423,
208
+ 424,
209
+ 425,
210
+ 427,
211
+ 429,
212
+ 431,
213
+ 433,
214
+ 434,
215
+ 436,
216
+ 437,
217
+ 439,
218
+ 440,
219
+ 442,
220
+ 444,
221
+ 445,
222
+ 449,
223
+ 450,
224
+ 452,
225
+ 454,
226
+ 457,
227
+ 459,
228
+ 462,
229
+ 465,
230
+ 466,
231
+ 468,
232
+ 470,
233
+ 471,
234
+ 474,
235
+ 475,
236
+ 478,
237
+ 480,
238
+ 483,
239
+ 485,
240
+ 486,
241
+ 487,
242
+ 489,
243
+ 491,
244
+ 492,
245
+ 493,
246
+ 495,
247
+ 496,
248
+ 498,
249
+ 500,
250
+ 501,
251
+ 502,
252
+ 503,
253
+ 504,
254
+ 505,
255
+ 507,
256
+ 508,
257
+ 509,
258
+ 510,
259
+ 511,
260
+ 512,
261
+ 514,
262
+ 515,
263
+ 516,
264
+ 518,
265
+ 519,
266
+ 520,
267
+ 521,
268
+ 523,
269
+ 525,
270
+ 526,
271
+ 528,
272
+ 531,
273
+ 533,
274
+ 535,
275
+ 453,
276
+ 536,
277
+ 538,
278
+ 539,
279
+ 541,
280
+ 545,
281
+ 547,
282
+ 548,
283
+ 549,
284
+ 550,
285
+ 551,
286
+ 553,
287
+ 554,
288
+ 555,
289
+ 557,
290
+ 559,
291
+ 560,
292
+ 561,
293
+ 563,
294
+ 565,
295
+ 566,
296
+ 567,
297
+ 568,
298
+ 570,
299
+ 571,
300
+ 575,
301
+ 577,
302
+ 578,
303
+ 579,
304
+ 582,
305
+ 585,
306
+ 587,
307
+ 589,
308
+ 593,
309
+ 594,
310
+ 596,
311
+ 597,
312
+ 601,
313
+ 603,
314
+ 605,
315
+ 609,
316
+ 611,
317
+ 612,
318
+ 613,
319
+ 614,
320
+ 615,
321
+ 616,
322
+ 617,
323
+ 619,
324
+ 621,
325
+ 622,
326
+ 624,
327
+ 625,
328
+ 627,
329
+ 628,
330
+ 629,
331
+ 632,
332
+ 634,
333
+ 638,
334
+ 639,
335
+ 640,
336
+ 642,
337
+ 644,
338
+ 647,
339
+ 649,
340
+ 650,
341
+ 651,
342
+ 653,
343
+ 654,
344
+ 655,
345
+ 657,
346
+ 658,
347
+ 659,
348
+ 661,
349
+ 663,
350
+ 665,
351
+ 667,
352
+ 669,
353
+ 670,
354
+ 672,
355
+ 674,
356
+ 676,
357
+ 677,
358
+ 678,
359
+ 680,
360
+ 682,
361
+ 683,
362
+ 685,
363
+ 686,
364
+ 688,
365
+ 689,
366
+ 690,
367
+ 691,
368
+ 694,
369
+ 695,
370
+ 696,
371
+ 697,
372
+ 699,
373
+ 700,
374
+ 701,
375
+ 703,
376
+ 705,
377
+ 706,
378
+ 707,
379
+ 708,
380
+ 712,
381
+ 715,
382
+ 716,
383
+ 718,
384
+ 720,
385
+ 724,
386
+ 726,
387
+ 729,
388
+ 730,
389
+ 732,
390
+ 733,
391
+ 734,
392
+ 736,
393
+ 738,
394
+ 739,
395
+ 740,
396
+ 741,
397
+ 742,
398
+ 743,
399
+ 744,
400
+ 747,
401
+ 749,
402
+ 753,
403
+ 756,
404
+ 758,
405
+ 759,
406
+ 761,
407
+ 762,
408
+ 763,
409
+ 764,
410
+ 766,
411
+ 768,
412
+ 769,
413
+ 771,
414
+ 773,
415
+ 774,
416
+ 775,
417
+ 776,
418
+ 777,
419
+ 781,
420
+ 783,
421
+ 784,
422
+ 785,
423
+ 788,
424
+ 791,
425
+ 792,
426
+ 794,
427
+ 796,
428
+ 797,
429
+ 798,
430
+ 799,
431
+ 800,
432
+ 802,
433
+ 803,
434
+ 804,
435
+ 805,
436
+ 806,
437
+ 808,
438
+ 809,
439
+ 810,
440
+ 811,
441
+ 812,
442
+ 814,
443
+ 815,
444
+ 817,
445
+ 819,
446
+ 820,
447
+ 822,
448
+ 824,
449
+ 825,
450
+ 827,
451
+ 829,
452
+ 831,
453
+ 833,
454
+ 835,
455
+ 837
456
+ ]
457
+ }
lemmatizer/{lookups/lookups.bin → model} RENAMED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:6864ce8705293ba1b6dcf349ec133cdc33db3ba57f6e9337458cfe5073b6f103
3
- size 11537995
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f53e8bdea1eb3bca6208e92fe31944d15dd562ecd791ba6559e12e713fd5f0c1
3
+ size 176206
lemmatizer/trees ADDED
Binary file (89.9 kB). View file
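This commit replaces the lookup table (`lemmatizer/lookups/lookups.bin`, 11537995 bytes) with a trained model (`lemmatizer/model`, 176206 bytes) plus the `trees` file. The sketch below tabulates a few before/after figures reported in the `meta.json` diff of the same commit to show the trade-off. The metric selection is illustrative, and nothing here attributes a change in any single metric to a specific config change.

```python
# (old, new) pairs copied from the meta.json diff in this commit.
metrics = {
    "lemma_acc": (0.8491041162, 0.9430508475),
    "dep_las": (0.7640549905, 0.7628624099),
    "ents_f": (0.7459807074, 0.7218045113),
    "speed": (10057.2129225514, 12430.3213337348),
}

# Absolute change per metric: new minus old.
deltas = {name: new - old for name, (old, new) in metrics.items()}

# Size of the main lemmatizer artifact before and after, in bytes.
lookup_bytes, model_bytes = 11537995, 176206
size_ratio = lookup_bytes / model_bytes

for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")
print(f"lookup table is about {size_ratio:.0f}x larger than the trained model")
```

Read this way, lemmatization accuracy rises by about 9 points while the NER F-score drops by about 2.4 points between the two releases.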
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"da",
3
  "name":"core_news_sm",
4
- "version":"3.2.0",
5
- "description":"Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, senter, ner, attribute_ruler, lemmatizer.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.2.0,<3.3.0",
11
- "spacy_git_version":"bb26550e2",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -212,15 +212,8 @@
212
  "punct",
213
  "xcomp"
214
  ],
215
- "senter":[
216
- "I",
217
- "S"
218
- ],
219
  "attribute_ruler":[
220
 
221
- ],
222
- "lemmatizer":[
223
-
224
  ],
225
  "ner":[
226
  "LOC",
@@ -233,17 +226,17 @@
233
  "tok2vec",
234
  "morphologizer",
235
  "parser",
236
- "attribute_ruler",
237
  "lemmatizer",
 
238
  "ner"
239
  ],
240
  "components":[
241
  "tok2vec",
242
  "morphologizer",
243
  "parser",
 
244
  "senter",
245
  "attribute_ruler",
246
- "lemmatizer",
247
  "ner"
248
  ],
249
  "disabled":[
@@ -254,51 +247,51 @@
254
  "token_p":0.9977732598,
255
  "token_r":0.9974835463,
256
  "token_f":0.997628382,
257
- "pos_acc":0.9507990315,
258
  "morph_acc":0.9371428571,
259
- "morph_micro_p":0.9563225412,
260
- "morph_micro_r":0.9482610671,
261
- "morph_micro_f":0.9522747434,
262
  "morph_per_feat":{
263
  "Mood":{
264
- "p":0.9639468691,
265
- "r":0.9685414681,
266
- "f":0.9662387066
267
  },
268
  "Tense":{
269
- "p":0.9529147982,
270
- "r":0.9600903614,
271
- "f":0.9564891223
272
  },
273
  "VerbForm":{
274
- "p":0.9471419791,
275
- "r":0.9430844553,
276
- "f":0.9451088623
277
  },
278
  "Voice":{
279
- "p":0.968492123,
280
- "r":0.9648729447,
281
- "f":0.9666791464
282
  },
283
  "Definite":{
284
- "p":0.9468212715,
285
- "r":0.9355985776,
286
- "f":0.9411764706
287
  },
288
  "Gender":{
289
- "p":0.9299128102,
290
- "r":0.9215686275,
291
- "f":0.9257219162
292
  },
293
  "Number":{
294
- "p":0.9519535375,
295
- "r":0.9405320814,
296
- "f":0.9462083443
297
  },
298
  "AdpType":{
299
- "p":1.0,
300
- "r":0.9840848806,
301
- "f":0.9919786096
302
  },
303
  "PartType":{
304
  "p":1.0,
@@ -306,59 +299,59 @@
306
  "f":0.9983739837
307
  },
308
  "Case":{
309
- "p":0.9727126806,
310
- "r":0.9573459716,
311
- "f":0.9649681529
312
  },
313
  "Person":{
314
- "p":0.9770723104,
315
- "r":0.9840142096,
316
- "f":0.9805309735
317
  },
318
  "PronType":{
319
- "p":0.9826875515,
320
- "r":0.9802631579,
321
- "f":0.9814738576
322
  },
323
  "NumType":{
324
- "p":0.9793103448,
325
  "r":0.940397351,
326
- "f":0.9594594595
327
  },
328
  "Degree":{
329
- "p":0.9402985075,
330
- "r":0.9108433735,
331
- "f":0.9253365973
332
  },
333
  "Reflex":{
334
  "p":1.0,
335
  "r":1.0,
336
  "f":1.0
337
  },
338
- "Polite":{
339
- "p":0.75,
340
- "r":0.75,
341
- "f":0.75
342
- },
343
  "Number[psor]":{
344
- "p":0.9885057471,
345
  "r":1.0,
346
- "f":0.9942196532
347
  },
348
  "Poss":{
349
- "p":1.0,
350
  "r":1.0,
351
- "f":1.0
 
 
 
 
 
352
  },
353
  "Foreign":{
354
- "p":0.8333333333,
355
- "r":0.5,
356
- "f":0.625
357
  },
358
  "Abbr":{
359
- "p":1.0,
360
- "r":0.2,
361
- "f":0.3333333333
362
  },
363
  "Style":{
364
  "p":1.0,
@@ -366,141 +359,141 @@
366
  "f":1.0
367
  }
368
  },
369
- "sents_p":0.9045045045,
370
  "sents_r":0.890070922,
371
- "sents_f":0.8972296693,
372
- "dep_uas":0.8070861741,
373
- "dep_las":0.7640549905,
374
  "dep_las_per_type":{
375
  "advmod":{
376
- "p":0.6948682386,
377
- "r":0.7076271186,
378
- "f":0.7011896431
379
  },
380
  "root":{
381
- "p":0.8078994614,
382
- "r":0.7978723404,
383
- "f":0.8028545941
384
  },
385
  "nsubj":{
386
- "p":0.8367129136,
387
- "r":0.8270042194,
388
- "f":0.8318302387
389
  },
390
  "case":{
391
- "p":0.8841584158,
392
- "r":0.8806706114,
393
- "f":0.8824110672
394
  },
395
  "obl":{
396
- "p":0.6810207337,
397
- "r":0.6630434783,
398
- "f":0.6719118804
399
  },
400
  "cc":{
401
- "p":0.7620396601,
402
- "r":0.7819767442,
403
- "f":0.7718794835
404
  },
405
  "conj":{
406
- "p":0.6066481994,
407
- "r":0.584,
408
- "f":0.5951086957
409
  },
410
  "obj":{
411
- "p":0.7678244973,
412
- "r":0.8155339806,
413
- "f":0.790960452
414
  },
415
  "aux":{
416
- "p":0.875,
417
- "r":0.8571428571,
418
- "f":0.8659793814
419
  },
420
  "acl:relcl":{
421
- "p":0.5852272727,
422
- "r":0.5567567568,
423
- "f":0.5706371191
424
  },
425
  "advmod:lmod":{
426
- "p":0.7042253521,
427
- "r":0.7462686567,
428
- "f":0.7246376812
429
  },
430
  "det":{
431
- "p":0.903814262,
432
- "r":0.8978583196,
433
- "f":0.9008264463
434
  },
435
  "amod":{
436
- "p":0.7807757167,
437
- "r":0.7901023891,
438
- "f":0.7854113656
439
  },
440
  "nmod:poss":{
441
- "p":0.7083333333,
442
- "r":0.6732673267,
443
- "f":0.6903553299
444
  },
445
  "ccomp":{
446
- "p":0.6212121212,
447
- "r":0.6612903226,
448
- "f":0.640625
449
  },
450
  "nummod":{
451
- "p":0.7868852459,
452
- "r":0.8,
453
- "f":0.7933884298
454
  },
455
  "flat":{
456
- "p":0.7804878049,
457
  "r":0.8476821192,
458
- "f":0.8126984127
459
  },
460
  "compound:prt":{
461
- "p":0.652173913,
462
- "r":0.3658536585,
463
- "f":0.46875
464
  },
465
  "advcl":{
466
- "p":0.6339285714,
467
- "r":0.6120689655,
468
- "f":0.6228070175
469
  },
470
  "mark":{
471
- "p":0.8773784355,
472
- "r":0.8521560575,
473
- "f":0.8645833333
474
  },
475
  "cop":{
476
- "p":0.7700534759,
477
- "r":0.8228571429,
478
- "f":0.7955801105
479
  },
480
  "dep":{
481
- "p":0.1647058824,
482
  "r":0.2641509434,
483
- "f":0.2028985507
484
  },
485
  "nmod":{
486
- "p":0.6058252427,
487
- "r":0.609375,
488
- "f":0.6075949367
489
  },
490
  "iobj":{
491
- "p":0.8181818182,
492
  "r":0.4090909091,
493
- "f":0.5454545455
494
  },
495
  "xcomp":{
496
- "p":0.431372549,
497
- "r":0.3728813559,
498
- "f":0.4
499
  },
500
  "list":{
501
- "p":0.3,
502
- "r":0.1666666667,
503
- "f":0.2142857143
504
  },
505
  "vocative":{
506
  "p":0.0,
@@ -508,64 +501,64 @@
508
  "f":0.0
509
  },
510
  "fixed":{
511
- "p":0.85,
512
- "r":0.8292682927,
513
- "f":0.8395061728
514
- },
515
- "obl:lmod":{
516
- "p":0.0,
517
- "r":0.0,
518
- "f":0.0
519
  },
520
  "expl":{
521
- "p":0.7941176471,
522
  "r":0.7941176471,
523
- "f":0.7941176471
524
  },
525
  "appos":{
526
- "p":0.4333333333,
527
- "r":0.3939393939,
528
- "f":0.4126984127
529
  },
530
  "obl:tmod":{
531
- "p":0.8571428571,
532
- "r":0.3333333333,
533
- "f":0.48
534
  },
535
  "discourse":{
536
  "p":0.0,
537
  "r":0.0,
538
  "f":0.0
 
 
 
 
 
539
  }
540
  },
541
- "tag_acc":0.9507990315,
542
- "lemma_acc":0.8491041162,
543
- "ents_p":0.7682119205,
544
- "ents_r":0.725,
545
- "ents_f":0.7459807074,
546
  "ents_per_type":{
547
  "PER":{
548
- "p":0.8089171975,
549
- "r":0.765060241,
550
- "f":0.786377709
551
  },
552
  "ORG":{
553
- "p":0.7215189873,
554
- "r":0.6333333333,
555
- "f":0.674556213
556
  },
557
  "MISC":{
558
- "p":0.6782608696,
559
- "r":0.6902654867,
560
- "f":0.6842105263
561
  },
562
  "LOC":{
563
- "p":0.8431372549,
564
- "r":0.7747747748,
565
- "f":0.8075117371
566
  }
567
  },
568
- "speed":10057.2129225514
569
  },
570
  "sources":[
571
  {
@@ -579,12 +572,6 @@
579
  "url":"https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane",
580
  "license":"CC BY-SA 4.0",
581
  "author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
582
- },
583
- {
584
- "name":"Lemmatization Lists",
585
- "url":"https://github.com/michmech/lemmatization-lists/",
586
- "license":"ODbL",
587
- "author":"Michal M\u011bchura"
588
  }
589
  ],
590
  "requirements":[
1
  {
2
  "lang":"da",
3
  "name":"core_news_sm",
4
+ "version":"3.3.0",
5
+ "description":"Danish pipeline optimized for CPU. Components: tok2vec, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner, attribute_ruler.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.3.0.dev0,<3.4.0",
11
+ "spacy_git_version":"849bef2de",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
212
  "punct",
213
  "xcomp"
214
  ],
 
 
 
 
215
  "attribute_ruler":[
216
 
 
 
 
217
  ],
218
  "ner":[
219
  "LOC",
226
  "tok2vec",
227
  "morphologizer",
228
  "parser",
 
229
  "lemmatizer",
230
+ "attribute_ruler",
231
  "ner"
232
  ],
233
  "components":[
234
  "tok2vec",
235
  "morphologizer",
236
  "parser",
237
+ "lemmatizer",
238
  "senter",
239
  "attribute_ruler",
 
240
  "ner"
241
  ],
242
  "disabled":[
247
  "token_p":0.9977732598,
248
  "token_r":0.9974835463,
249
  "token_f":0.997628382,
250
+ "pos_acc":0.9506053269,
251
  "morph_acc":0.9371428571,
252
+ "morph_micro_p":0.9565925398,
253
+ "morph_micro_r":0.9488667912,
254
+ "morph_micro_f":0.9527140033,
255
  "morph_per_feat":{
256
  "Mood":{
257
+ "p":0.9636711281,
258
+ "r":0.9609151573,
259
+ "f":0.9622911695
260
  },
261
  "Tense":{
262
+ "p":0.9503759398,
263
+ "r":0.9518072289,
264
+ "f":0.9510910459
265
  },
266
  "VerbForm":{
267
+ "p":0.9457125231,
268
+ "r":0.9381884945,
269
+ "f":0.9419354839
270
  },
271
  "Voice":{
272
+ "p":0.9713423831,
273
+ "r":0.9626307922,
274
+ "f":0.966966967
275
  },
276
  "Definite":{
277
+ "p":0.950039968,
278
+ "r":0.9391544844,
279
+ "f":0.9445658653
280
  },
281
  "Gender":{
282
+ "p":0.9317497491,
283
+ "r":0.9255566633,
284
+ "f":0.928642881
285
  },
286
  "Number":{
287
+ "p":0.9502107482,
288
+ "r":0.9407929056,
289
+ "f":0.9454783748
290
  },
291
  "AdpType":{
292
+ "p":0.9991087344,
293
+ "r":0.991158267,
294
+ "f":0.9951176209
295
  },
296
  "PartType":{
297
  "p":1.0,
299
  "f":0.9983739837
300
  },
301
  "Case":{
302
+ "p":0.9774557166,
303
+ "r":0.9589257504,
304
+ "f":0.9681020734
305
  },
306
  "Person":{
307
+ "p":0.9771528998,
308
+ "r":0.9875666075,
309
+ "f":0.9823321555
310
  },
311
  "PronType":{
312
+ "p":0.9851851852,
313
+ "r":0.984375,
314
+ "f":0.984779926
315
  },
316
  "NumType":{
317
+ "p":0.9726027397,
318
  "r":0.940397351,
319
+ "f":0.9562289562
320
  },
321
  "Degree":{
322
+ "p":0.9394313968,
323
+ "r":0.9156626506,
324
+ "f":0.9273947529
325
  },
326
  "Reflex":{
327
  "p":1.0,
328
  "r":1.0,
329
  "f":1.0
330
  },
 
 
 
 
 
331
  "Number[psor]":{
332
+ "p":0.9772727273,
333
  "r":1.0,
334
+ "f":0.9885057471
335
  },
336
  "Poss":{
337
+ "p":0.9887640449,
338
  "r":1.0,
339
+ "f":0.9943502825
340
+ },
341
+ "Polite":{
342
+ "p":0.6,
343
+ "r":0.75,
344
+ "f":0.6666666667
345
  },
346
  "Foreign":{
347
+ "p":1.0,
348
+ "r":0.4,
349
+ "f":0.5714285714
350
  },
351
  "Abbr":{
352
+ "p":0.6666666667,
353
+ "r":0.4,
354
+ "f":0.5
355
  },
356
  "Style":{
357
  "p":1.0,
359
  "f":1.0
360
  }
361
  },
362
+ "sents_p":0.9012567325,
363
  "sents_r":0.890070922,
364
+ "sents_f":0.8956289028,
365
+ "dep_uas":0.8080446927,
366
+ "dep_las":0.7628624099,
367
  "dep_las_per_type":{
368
  "advmod":{
369
+ "p":0.6630434783,
370
+ "r":0.6892655367,
371
+ "f":0.675900277
372
  },
373
  "root":{
374
+ "p":0.8085867621,
375
+ "r":0.8014184397,
376
+ "f":0.8049866429
377
  },
378
  "nsubj":{
379
+ "p":0.8269639066,
380
+ "r":0.8217299578,
381
+ "f":0.8243386243
382
  },
383
  "case":{
384
+ "p":0.881372549,
385
+ "r":0.8865877712,
386
+ "f":0.883972468
387
  },
388
  "obl":{
389
+ "p":0.6786885246,
390
+ "r":0.6428571429,
391
+ "f":0.6602870813
392
  },
393
  "cc":{
394
+ "p":0.7614942529,
395
+ "r":0.7703488372,
396
+ "f":0.7658959538
397
  },
398
  "conj":{
399
+ "p":0.6264044944,
400
+ "r":0.5946666667,
401
+ "f":0.610123119
402
  },
403
  "obj":{
404
+ "p":0.761732852,
405
+ "r":0.8194174757,
406
+ "f":0.7895229186
407
  },
408
  "aux":{
409
+ "p":0.8795180723,
410
+ "r":0.8513119534,
411
+ "f":0.8651851852
412
  },
413
  "acl:relcl":{
414
+ "p":0.6054054054,
415
+ "r":0.6054054054,
416
+ "f":0.6054054054
417
  },
418
  "advmod:lmod":{
419
+ "p":0.6875,
420
+ "r":0.6567164179,
421
+ "f":0.6717557252
422
  },
423
  "det":{
424
+ "p":0.9126853377,
425
+ "r":0.9126853377,
426
+ "f":0.9126853377
427
  },
428
  "amod":{
429
+ "p":0.7852348993,
430
+ "r":0.7986348123,
431
+ "f":0.7918781726
432
  },
433
  "nmod:poss":{
434
+ "p":0.6767676768,
435
+ "r":0.6633663366,
436
+ "f":0.67
437
  },
438
  "ccomp":{
439
+ "p":0.6290322581,
440
+ "r":0.6290322581,
441
+ "f":0.6290322581
442
  },
443
  "nummod":{
444
+ "p":0.824,
445
+ "r":0.8583333333,
446
+ "f":0.8408163265
447
  },
448
  "flat":{
449
+ "p":0.7901234568,
450
  "r":0.8476821192,
451
+ "f":0.8178913738
452
  },
453
  "compound:prt":{
454
+ "p":0.44,
455
+ "r":0.2682926829,
456
+ "f":0.3333333333
457
  },
458
  "advcl":{
459
+ "p":0.5877192982,
460
+ "r":0.5775862069,
461
+ "f":0.5826086957
462
  },
463
  "mark":{
464
+ "p":0.872651357,
465
+ "r":0.8583162218,
466
+ "f":0.8654244306
467
  },
468
  "cop":{
469
+ "p":0.752688172,
470
+ "r":0.8,
471
+ "f":0.7756232687
472
  },
473
  "dep":{
474
+ "p":0.1707317073,
475
  "r":0.2641509434,
476
+ "f":0.2074074074
477
  },
478
  "nmod":{
479
+ "p":0.6122840691,
480
+ "r":0.623046875,
481
+ "f":0.6176185866
482
  },
483
  "iobj":{
484
+ "p":0.6923076923,
485
  "r":0.4090909091,
486
+ "f":0.5142857143
487
  },
488
  "xcomp":{
489
+ "p":0.5161290323,
490
+ "r":0.2711864407,
491
+ "f":0.3555555556
492
  },
493
  "list":{
494
+ "p":0.3636363636,
495
+ "r":0.2222222222,
496
+ "f":0.275862069
497
  },
498
  "vocative":{
499
  "p":0.0,
501
  "f":0.0
502
  },
503
  "fixed":{
504
+ "p":0.8684210526,
505
+ "r":0.8048780488,
506
+ "f":0.835443038
 
 
 
 
 
507
  },
508
  "expl":{
509
+ "p":0.84375,
510
  "r":0.7941176471,
511
+ "f":0.8181818182
512
  },
513
  "appos":{
514
+ "p":0.4666666667,
515
+ "r":0.4242424242,
516
+ "f":0.4444444444
517
  },
518
  "obl:tmod":{
519
+ "p":0.7777777778,
520
+ "r":0.3888888889,
521
+ "f":0.5185185185
522
  },
523
  "discourse":{
524
  "p":0.0,
525
  "r":0.0,
526
  "f":0.0
527
+ },
528
+ "obl:lmod":{
529
+ "p":0.0,
530
+ "r":0.0,
531
+ "f":0.0
532
  }
533
  },
534
+ "lemma_acc":0.9430508475,
535
+ "tag_acc":0.9506053269,
536
+ "ents_p":0.7450110865,
537
+ "ents_r":0.7,
538
+ "ents_f":0.7218045113,
539
  "ents_per_type":{
540
  "PER":{
541
+ "p":0.7818181818,
542
+ "r":0.7771084337,
543
+ "f":0.7794561934
544
  },
545
  "ORG":{
546
+ "p":0.6463414634,
547
+ "r":0.5888888889,
548
+ "f":0.6162790698
549
  },
550
  "MISC":{
551
+ "p":0.6698113208,
552
+ "r":0.6283185841,
553
+ "f":0.6484018265
554
  },
555
  "LOC":{
556
+ "p":0.8469387755,
557
+ "r":0.7477477477,
558
+ "f":0.7942583732
559
  }
560
  },
561
+ "speed":12430.3213337348
562
  },
563
  "sources":[
564
  {
572
  "url":"https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane",
573
  "license":"CC BY-SA 4.0",
574
  "author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders S\u00f8gaard"
 
 
 
 
 
 
575
  }
576
  ],
577
  "requirements":[
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:fc827bd86802a4c5ddbcecf523334d45998608ab95bb7863e0571e05d76a1acb
3
- size 61299
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:736b0297474d1c5d0be1e66c6d23e8c384d1bf28b10f4e1e95e3fb7764a92c34
3
+ size 61351
ner/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:99839f30744b1a951b0b091026a35ea17c069fdfc3dab9547ba53df02ac282f7
3
- size 6865402
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:79b97fd3fce37d0b85461d97c019507cf25c5b151f9e7abc0e77c2196cd129dd
3
+ size 6270202
parser/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ffd6867cd67551ffc79830cb1c78f31df311fc1173b0eb4ac6de1ddb040a5698
3
  size 308728
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:798992f319db8baad07af96a89069a6ae7c4c3119c34414dbb8913fe28316ded
3
  size 308728
parser/moves CHANGED
@@ -1 +1 @@
1
- moves {"0":{"":41514},"1":{"":34295},"2":{"case":7489,"nsubj":6009,"det":4334,"amod":3968,"advmod":3657,"mark":3529,"aux":2432,"cc":2261,"punct":2182,"cop":1329,"obl":894,"nummod":799,"nmod:poss":651,"nmod":460,"expl":291,"ccomp":202,"obj":195,"xcomp":122,"case||nmod":73,"obl:tmod":53,"dep":49,"acl:relcl":43},"3":{"punct":8601,"obl":3949,"obj":3758,"nmod":3565,"conj":2745,"advmod":2095,"flat":1295,"nsubj":1172,"acl:relcl":1131,"advcl":808,"amod":628,"advmod:lmod":423,"fixed":390,"dep":322,"xcomp":272,"appos":268,"compound:prt":261,"ccomp":252,"acl:relcl||nsubj":237,"case":202,"nummod":167,"list":161,"nmod:poss":156,"punct||conj":151,"mark":137,"cc":135,"iobj":107,"expl":77,"cop":69,"nmod||case":60,"aux":48,"obl:tmod":45,"obl:lmod":44,"cc||case":43,"advcl||advmod":43,"cc||conj":40,"case||obl":38,"punct||case":33},"4":{"ROOT":4367}} cfg neg_key
1
+ moves {"0":{"":41615},"1":{"":34382},"2":{"case":7526,"nsubj":6005,"det":4341,"amod":3967,"advmod":3662,"mark":3530,"aux":2436,"cc":2264,"punct":2187,"cop":1330,"obl":894,"nummod":834,"nmod:poss":656,"nmod":463,"expl":291,"ccomp":203,"obj":195,"xcomp":122,"case||nmod":73,"obl:tmod":53,"dep":48,"acl:relcl":43},"3":{"punct":8693,"obl":3951,"obj":3760,"nmod":3569,"conj":2747,"advmod":2087,"flat":1302,"nsubj":1169,"acl:relcl":1132,"advcl":809,"amod":622,"advmod:lmod":423,"fixed":390,"dep":322,"xcomp":272,"appos":268,"compound:prt":261,"ccomp":252,"acl:relcl||nsubj":237,"case":202,"nummod":168,"list":159,"nmod:poss":156,"punct||conj":151,"cc":135,"mark":133,"iobj":107,"expl":77,"cop":69,"nmod||case":60,"aux":48,"obl:tmod":45,"obl:lmod":44,"cc||case":43,"advcl||advmod":43,"cc||conj":40,"case||obl":38,"punct||case":33},"4":{"ROOT":4383}} cfg neg_key
senter/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:4c548261cd1397f13fc5ae1c09dd73fb4524fd4b9bc71b7a723c4bade8bc684e
3
- size 197037
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5abe471624831d024895d5002a4be6534d1f42f6e3de13b7ca64640817a1217b
3
+ size 197089
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:cf1b9c713729c38573ce5323323a04711a79737dce823eb894a8dc6fb41c8c19
3
- size 6734429
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d982b57011d1906e0a52edfa4b89a060ce1218934e614dea14994e8d92184a71
3
+ size 6139229
tokenizer CHANGED
The diff for this file is too large to render. See raw diff
vocab/key2row CHANGED
@@ -1 +1,3 @@
1
-
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:76be8b528d0075f7aae98d6fa57a6d3c83ae480a8469e668d7b0af968995ac71
3
+ size 1
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:786ff7139c6dd7568c66e2ae810f42fa3afc33860aa3d81bd5dfeb263295d80c
3
- size 459696
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9888253982b5db9c45575c9444fc932377cfb0dc94bb1426febaa4678402755f
3
+ size 474473