oroszgy committed on
Commit
a3a6ae5
1 Parent(s): 0d89649

Update spacy pipeline to 0.4.1

README.md CHANGED
@@ -14,61 +14,69 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.8510250569
18
  - name: NER Recall
19
  type: recall
20
- value: 0.852189781
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8516070207
 
 
 
 
 
 
 
24
  - task:
25
  name: POS
26
  type: token-classification
27
  metrics:
28
- - name: POS Accuracy
29
  type: accuracy
30
- value: 0.9634910761
31
  - task:
32
- name: SENTER
33
  type: token-classification
34
  metrics:
35
- - name: SENTER Precision
36
- type: precision
37
- value: 0.9776785714
38
- - name: SENTER Recall
39
- type: recall
40
- value: 0.9755011136
41
- - name: SENTER F Score
42
- type: f_score
43
- value: 0.9765886288
44
  - task:
45
  name: UNLABELED_DEPENDENCIES
46
  type: token-classification
47
  metrics:
48
- - name: Unlabeled Dependencies Accuracy
49
- type: accuracy
50
- value: 0.8213306474
51
  - task:
52
  name: LABELED_DEPENDENCIES
53
  type: token-classification
54
  metrics:
55
- - name: Labeled Dependencies Accuracy
56
- type: accuracy
57
- value: 0.8213306474
 
 
 
 
 
 
 
58
  ---
59
- Hungarian Spacy pipeline. Components: tok2vec, hun_sentencizer, tagger, morphologizer, hun_lemmy, parser, ner
60
 
61
  | Feature | Description |
62
  | --- | --- |
63
  | **Name** | `hu_core_news_lg` |
64
- | **Version** | `0.4.0` |
65
- | **spaCy** | `>=3.2.0,<3.3.0` |
66
- | **Default Pipeline** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `hun_lemmy`, `parser`, `ner` |
67
- | **Components** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `hun_lemmy`, `parser`, `ner` |
68
  | **Vectors** | 1140008 keys, 1140008 unique vectors (300 dimensions) |
69
- | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze ((MTA-SZTE Research Group on Artificial Intelligence)))<br />[NYTK-NerKor corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon & Noémi Vadász (Department of Language Technology and Applied Linguistics))<br />[hunNERwiki](http://hlt.sztaki.hu/resources/hunnerwiki.html) (Eszter Simon)<br />[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (Richárd Farkas (MTA-SZTE Research Group on Artificial Intelligence))<br />[Webcorpuswiki word2vec model](https://github.com/oroszgy/hunlp-resources/releases/tag/webcorpuswiki_word2vec_v0.1) (György Orosz) |
70
  | **License** | `cc-by-sa-4.0` |
71
- | **Author** | [MILAB Spacy Research Group](https://github.com/spacy-hu/spacy-hungarian-models) |
72
 
73
  ### Label Scheme
74
 
@@ -94,17 +102,17 @@ Hungarian Spacy pipeline. Components: tok2vec, hun_sentencizer, tagger, morpholo
94
  | `TOKEN_P` | 99.86 |
95
  | `TOKEN_R` | 99.93 |
96
  | `TOKEN_F` | 99.89 |
97
- | `SENTS_P` | 97.77 |
98
  | `SENTS_R` | 97.55 |
99
- | `SENTS_F` | 97.66 |
100
- | `TAG_ACC` | 96.35 |
101
- | `POS_ACC` | 96.32 |
102
- | `MORPH_ACC` | 92.64 |
103
- | `MORPH_MICRO_P` | 96.80 |
104
- | `MORPH_MICRO_R` | 95.27 |
105
- | `MORPH_MICRO_F` | 96.03 |
106
- | `DEP_UAS` | 82.13 |
107
- | `DEP_LAS` | 74.72 |
108
- | `ENTS_P` | 85.10 |
109
- | `ENTS_R` | 85.22 |
110
- | `ENTS_F` | 85.16 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.856968588
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.854622871
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8557941221
24
+ - task:
25
+ name: TAG
26
+ type: token-classification
27
+ metrics:
28
+ - name: TAG (XPOS) Accuracy
29
+ type: accuracy
30
+ value: 0.9643523614
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
+ - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9621512991
38
  - task:
39
+ name: MORPH
40
  type: token-classification
41
  metrics:
42
+ - name: Morph (UFeats) Accuracy
43
+ type: accuracy
44
+ value: 0.9253517083
 
 
 
 
 
 
45
  - task:
46
  name: UNLABELED_DEPENDENCIES
47
  type: token-classification
48
  metrics:
49
+ - name: Unlabeled Attachment Score (UAS)
50
+ type: f_score
51
+ value: 0.8193704057
52
  - task:
53
  name: LABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
+ - name: Labeled Attachment Score (LAS)
57
+ type: f_score
58
+ value: 0.7497475031
59
+ - task:
60
+ name: SENTS
61
+ type: token-classification
62
+ metrics:
63
+ - name: Sentences F-Score
64
+ type: f_score
65
+ value: 0.9755011136
66
  ---
67
+ Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner
68
 
69
  | Feature | Description |
70
  | --- | --- |
71
  | **Name** | `hu_core_news_lg` |
72
+ | **Version** | `0.4.1` |
73
+ | **spaCy** | `>=3.2.1,<3.3.0` |
74
+ | **Default Pipeline** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lemmatizer`, `parser`, `ner` |
75
+ | **Components** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lemmatizer`, `parser`, `ner` |
76
  | **Vectors** | 1140008 keys, 1140008 unique vectors (300 dimensions) |
77
+ | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence))<br />[NYTK-NerKor corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon, Noémi Vadász (Department of Language Technology and Applied Linguistics))<br />[hunNERwiki](http://hlt.sztaki.hu/resources/hunnerwiki.html) (Eszter Simon, Dávid Márk Nemeskey (HLT Group, Budapest University of Technology and Economics))<br />[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (György Szarvas, Richárd Farkas, László Felföldi, András Kocsor, János Csirik (MTA-SZTE Research Group on Artificial Intelligence))<br />[Webcorpuswiki word2vec model](https://github.com/oroszgy/hunlp-resources/releases/tag/webcorpuswiki_word2vec_v0.1) (György Orosz) |
78
  | **License** | `cc-by-sa-4.0` |
79
+ | **Author** | [MILAB Spacy Research Group](https://github.com/huspacy/huspacy) |
80
 
81
  ### Label Scheme
82
 
 
102
  | `TOKEN_P` | 99.86 |
103
  | `TOKEN_R` | 99.93 |
104
  | `TOKEN_F` | 99.89 |
105
+ | `SENTS_P` | 97.55 |
106
  | `SENTS_R` | 97.55 |
107
+ | `SENTS_F` | 97.55 |
108
+ | `TAG_ACC` | 96.44 |
109
+ | `POS_ACC` | 96.22 |
110
+ | `MORPH_ACC` | 92.54 |
111
+ | `MORPH_MICRO_P` | 96.66 |
112
+ | `MORPH_MICRO_R` | 95.38 |
113
+ | `MORPH_MICRO_F` | 96.02 |
114
+ | `DEP_UAS` | 81.94 |
115
+ | `DEP_LAS` | 74.97 |
116
+ | `ENTS_P` | 85.70 |
117
+ | `ENTS_R` | 85.46 |
118
+ | `ENTS_F` | 85.58 |
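The precision, recall and F-score triples in the README metrics can be cross-checked against each other. A minimal sketch, assuming the reported F-score is the plain harmonic mean of precision and recall (the helper name `f_score` is hypothetical and not part of the model package); the numbers are copied from the 0.4.1 side of the diff:

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the 0.4.1 metrics.
reported = {
    "ENTS": (0.856968588, 0.854622871, 0.8557941221),
    "SENTS": (0.9755011136, 0.9755011136, 0.9755011136),
    "MORPH_MICRO": (0.9666376307, 0.9537602063, 0.960155743),
}

for name, (p, r, f) in reported.items():
    computed = f_score(p, r)
    # A difference far above rounding noise would point to a copy error.
    print(f"{name}: computed={computed:.6f} reported={f:.6f}")
```

When precision and recall are equal, as for `SENTS` in 0.4.1, the F-score equals that shared value, which is why all three sentence metrics read 97.55.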
config.cfg CHANGED
@@ -1,7 +1,7 @@
 [paths]
-parser_model = "../models/hu_core_news_lg-parser-0.4.0/model-best"
-lemmy_model = "../models/lemmy-0.4.0.bin"
-ner_model = "../models/hu_core_news_lg-ner_merged-0.4.0/model-best"
+parser_model = "../models/hu_core_news_lg-parser-0.4.1/model-best"
+lemmy_model = "../models/lemmy-0.4.1.bin"
+ner_model = "../models/hu_core_news_lg-ner_merged-0.4.1/model-best"
 train = null
 dev = null
 vectors = null
@@ -114,6 +114,7 @@ upstream = "*"
 
 [components.tagger]
 factory = "tagger"
+neg_prefix = "!"
 overwrite = false
 scorer = {"@scorers":"spacy.tagger_scorer.v1"}
 
hu_core_news_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0b0f3b738813413de5fb22eb46d0b68500b598b21d4840e5ec5c56c56e9fad1b
-size 1421621826
+oid sha256:554b8935082f9351924a4e9142a1c7341187dd21550c719af8d3ae4d96d556fb
+size 1419956175
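The wheel is tracked through a Git LFS pointer, so the diff shows only the `version`, `oid` and `size` fields instead of the binary. A minimal sketch of reading two such pointers and comparing them; `parse_lfs_pointer` is a hypothetical helper written for illustration, and the pointer text is copied from the hunk above:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its space-separated key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The size field is a byte count, so store it as an integer.
    fields["size"] = int(fields["size"])
    return fields


old = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:0b0f3b738813413de5fb22eb46d0b68500b598b21d4840e5ec5c56c56e9fad1b\n"
    "size 1421621826\n"
)
new = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:554b8935082f9351924a4e9142a1c7341187dd21550c719af8d3ae4d96d556fb\n"
    "size 1419956175\n"
)

size_delta = new["size"] - old["size"]
print(f"content changed: {old['oid'] != new['oid']}, size delta: {size_delta} bytes")
```

The same comparison applied to the component model pointers in this commit gives a changed `oid` with an identical `size`, meaning the weights differ but the serialized files are the same length.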
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"hu",
3
  "name":"core_news_lg",
4
- "version":"0.4.0",
5
- "description":"Hungarian Spacy pipeline. Components: tok2vec, hun_sentencizer, tagger, morphologizer, hun_lemmy, parser, ner",
6
  "author":"MILAB Spacy Research Group",
7
  "email":"gyorgy@orosz.link",
8
- "url":"https://github.com/spacy-hu/spacy-hungarian-models",
9
  "license":"cc-by-sa-4.0",
10
- "spacy_version":">=3.2.0,<3.3.0",
11
- "spacy_git_version":"0fc3dee77",
12
  "vectors":{
13
  "width":300,
14
  "vectors":1140008,
@@ -1271,85 +1271,80 @@
1271
  "token_p":0.998565417,
1272
  "token_r":0.9993300153,
1273
  "token_f":0.9989475698,
1274
- "sents_p":0.9776785714,
1275
  "sents_r":0.9755011136,
1276
- "sents_f":0.9765886288,
1277
- "tag_acc":0.9634910761,
1278
- "pos_acc":0.9632039811,
1279
- "morph_acc":0.9264044406,
1280
- "morph_micro_p":0.9680363303,
1281
- "morph_micro_r":0.9526858616,
1282
- "morph_micro_f":0.9602997553,
1283
  "morph_per_feat":{
1284
  "Definite":{
1285
- "p":0.9690531178,
1286
- "r":0.9790013999,
1287
- "f":0.974001857
1288
  },
1289
  "PronType":{
1290
- "p":0.9713656388,
1291
- "r":0.9735099338,
1292
- "f":0.9724366042
1293
  },
1294
  "Case":{
1295
- "p":0.9669059011,
1296
- "r":0.9583086347,
1297
- "f":0.9625880718
1298
  },
1299
  "Degree":{
1300
- "p":0.9285059578,
1301
- "r":0.8427620632,
1302
- "f":0.8835586568
1303
  },
1304
  "Number":{
1305
- "p":0.9790398918,
1306
- "r":0.9706720295,
1307
- "f":0.9748380039
1308
  },
1309
  "Mood":{
1310
- "p":0.9530726257,
1311
- "r":0.9456762749,
1312
- "f":0.9493600445
1313
  },
1314
  "Person":{
1315
- "p":0.9689336692,
1316
- "r":0.9490131579,
1317
- "f":0.9588699626
1318
  },
1319
  "Tense":{
1320
- "p":0.9787946429,
1321
- "r":0.9690607735,
1322
- "f":0.973903387
1323
  },
1324
  "VerbForm":{
1325
- "p":0.9698239732,
1326
- "r":0.9278267843,
1327
- "f":0.9483606557
1328
  },
1329
  "Voice":{
1330
- "p":0.9762886598,
1331
- "r":0.9683026585,
1332
- "f":0.9722792608
1333
  },
1334
  "Number[psor]":{
1335
- "p":0.9551569507,
1336
- "r":0.9102564103,
1337
- "f":0.932166302
1338
  },
1339
  "Person[psor]":{
1340
- "p":0.9596412556,
1341
- "r":0.9158345221,
1342
- "f":0.9372262774
1343
  },
1344
  "NumType":{
1345
- "p":0.927680798,
1346
- "r":0.9073170732,
1347
- "f":0.9173859433
1348
- },
1349
- "Poss":{
1350
- "p":0.6,
1351
- "r":1.0,
1352
- "f":0.75
1353
  },
1354
  "Reflex":{
1355
  "p":1.0,
@@ -1362,118 +1357,123 @@
1362
  "f":0.0
1363
  },
1364
  "Number[psed]":{
 
 
 
 
 
1365
  "p":1.0,
1366
- "r":0.1111111111,
1367
- "f":0.2
1368
  }
1369
  },
1370
- "dep_uas":0.8213306474,
1371
- "dep_las":0.7472023277,
1372
  "dep_las_per_type":{
1373
  "det":{
1374
- "p":0.8512585812,
1375
- "r":0.8885350318,
1376
- "f":0.8694974679
1377
  },
1378
  "amod:att":{
1379
- "p":0.8627787307,
1380
- "r":0.8225674571,
1381
- "f":0.8421933864
1382
  },
1383
  "nsubj":{
1384
- "p":0.7560192616,
1385
- "r":0.7359375,
1386
- "f":0.7458432304
1387
  },
1388
  "advmod:mode":{
1389
- "p":0.5863746959,
1390
- "r":0.5906862745,
1391
- "f":0.5885225885
1392
  },
1393
  "nmod:att":{
1394
- "p":0.8050541516,
1395
- "r":0.7559322034,
1396
- "f":0.7797202797
1397
  },
1398
  "obl":{
1399
- "p":0.7108626198,
1400
- "r":0.801080108,
1401
- "f":0.7532797292
1402
  },
1403
  "obj":{
1404
- "p":0.8677884615,
1405
- "r":0.8112359551,
1406
- "f":0.8385598142
1407
  },
1408
  "root":{
1409
- "p":0.8080357143,
1410
- "r":0.8062360802,
1411
- "f":0.8071348941
1412
  },
1413
  "cc":{
1414
- "p":0.6709129512,
1415
- "r":0.6652631579,
1416
- "f":0.6680761099
1417
  },
1418
  "conj":{
1419
- "p":0.5074626866,
1420
- "r":0.5666666667,
1421
- "f":0.5354330709
1422
  },
1423
  "advmod":{
1424
- "p":0.79,
1425
- "r":0.8315789474,
1426
- "f":0.8102564103
1427
  },
1428
  "flat:name":{
1429
- "p":0.8867924528,
1430
- "r":0.8785046729,
1431
- "f":0.882629108
1432
  },
1433
  "appos":{
1434
- "p":0.3979591837,
1435
- "r":0.414893617,
1436
- "f":0.40625
1437
  },
1438
  "advcl":{
1439
- "p":0.2842105263,
1440
- "r":0.2755102041,
1441
- "f":0.2797927461
1442
  },
1443
  "advmod:tlocy":{
1444
- "p":0.7348837209,
1445
- "r":0.6869565217,
1446
- "f":0.7101123596
1447
  },
1448
  "ccomp:obj":{
1449
- "p":0.3023255814,
1450
- "r":0.3939393939,
1451
- "f":0.3421052632
1452
  },
1453
  "mark":{
1454
- "p":0.8447204969,
1455
- "r":0.8607594937,
1456
- "f":0.8526645768
1457
  },
1458
  "compound:preverb":{
1459
- "p":0.9345794393,
1460
- "r":0.9174311927,
1461
- "f":0.9259259259
1462
  },
1463
  "advmod:locy":{
1464
- "p":0.85,
1465
- "r":0.53125,
1466
- "f":0.6538461538
1467
  },
1468
  "cop":{
1469
- "p":0.6764705882,
1470
- "r":0.5609756098,
1471
- "f":0.6133333333
1472
  },
1473
  "nmod:obl":{
1474
- "p":0.2941176471,
1475
  "r":0.125,
1476
- "f":0.1754385965
1477
  },
1478
  "advmod:to":{
1479
  "p":0.0,
@@ -1481,85 +1481,90 @@
1481
  "f":0.0
1482
  },
1483
  "obj:lvc":{
1484
- "p":0.2,
1485
- "r":0.0833333333,
1486
- "f":0.1176470588
1487
  },
1488
  "ccomp:obl":{
1489
- "p":0.5833333333,
1490
- "r":0.4375,
1491
- "f":0.5
1492
  },
1493
  "iobj":{
1494
- "p":0.3,
1495
- "r":0.2,
1496
- "f":0.24
 
 
 
 
 
1497
  },
1498
  "case":{
1499
- "p":0.9210526316,
1500
- "r":0.8928571429,
1501
- "f":0.9067357513
1502
  },
1503
  "csubj":{
1504
- "p":0.56,
1505
- "r":0.3783783784,
1506
- "f":0.4516129032
1507
- },
1508
- "parataxis":{
1509
- "p":0.0769230769,
1510
- "r":0.0136986301,
1511
- "f":0.023255814
1512
  },
1513
  "xcomp":{
1514
- "p":0.858974359,
1515
- "r":0.9054054054,
1516
- "f":0.8815789474
1517
  },
1518
  "nummod":{
1519
- "p":0.5321100917,
1520
- "r":0.623655914,
1521
- "f":0.5742574257
1522
  },
1523
  "acl":{
1524
- "p":0.3384615385,
1525
- "r":0.3055555556,
1526
- "f":0.3211678832
1527
  },
1528
  "advmod:tto":{
1529
- "p":0.75,
1530
- "r":0.3,
1531
- "f":0.4285714286
1532
  },
1533
  "nmod":{
1534
- "p":0.0,
1535
- "r":0.0,
1536
- "f":0.0
 
 
 
 
 
1537
  },
1538
  "aux":{
1539
- "p":0.8181818182,
1540
- "r":0.75,
1541
- "f":0.7826086957
1542
  },
1543
  "advmod:tfrom":{
1544
- "p":1.0,
1545
  "r":0.1666666667,
1546
- "f":0.2857142857
1547
  },
1548
- "goeswith":{
1549
  "p":0.0,
1550
  "r":0.0,
1551
  "f":0.0
1552
  },
1553
- "compound":{
1554
- "p":0.7959183673,
1555
- "r":0.975,
1556
- "f":0.8764044944
1557
- },
1558
- "dep":{
1559
  "p":0.0,
1560
  "r":0.0,
1561
  "f":0.0
1562
  },
 
 
 
 
 
1563
  "obl:lvc":{
1564
  "p":0.0,
1565
  "r":0.0,
@@ -1570,25 +1575,20 @@
1570
  "r":0.0,
1571
  "f":0.0
1572
  },
1573
- "ccomp":{
1574
- "p":0.0,
1575
- "r":0.0,
1576
- "f":0.0
1577
- },
1578
  "nsubj:lvc":{
1579
  "p":0.0,
1580
  "r":0.0,
1581
  "f":0.0
1582
  },
1583
  "list":{
1584
- "p":0.1666666667,
1585
  "r":0.1666666667,
1586
- "f":0.1666666667
1587
  },
1588
  "advmod:que":{
1589
  "p":1.0,
1590
- "r":0.75,
1591
- "f":0.8571428571
1592
  },
1593
  "ccomp:pred":{
1594
  "p":0.0,
@@ -1596,57 +1596,57 @@
1596
  "f":0.0
1597
  }
1598
  },
1599
- "ents_p":0.8510250569,
1600
- "ents_r":0.852189781,
1601
- "ents_f":0.8516070207,
1602
  "ents_per_type":{
1603
  "ORG":{
1604
- "p":0.8780487805,
1605
- "r":0.888018794,
1606
- "f":0.8830056453
1607
  },
1608
  "LOC":{
1609
- "p":0.8424110385,
1610
- "r":0.874811463,
1611
- "f":0.8583055864
1612
  },
1613
  "MISC":{
1614
- "p":0.6740088106,
1615
- "r":0.5884615385,
1616
- "f":0.6283367556
1617
  },
1618
  "PER":{
1619
- "p":0.8832304527,
1620
- "r":0.8961377871,
1621
- "f":0.8896373057
1622
  }
1623
  },
1624
- "speed":1540.3572169653
1625
  },
1626
  "sources":[
1627
  {
1628
  "name":"UD Hungarian Szeged",
1629
  "url":"https://universaldependencies.org/treebanks/hu_szeged/index.html",
1630
  "license":"CC-BY-NC-SA-3.0",
1631
- "author":"Rich\u00e1rd Farkas, Katalin Simk\u00f3, Zsolt Sz\u00e1nt\u00f3, Viktor Varga, Veronika Vincze ((MTA-SZTE Research Group on Artificial Intelligence))"
1632
  },
1633
  {
1634
  "name":"NYTK-NerKor corpus",
1635
  "url":"https://github.com/nytud/NYTK-NerKor",
1636
  "license":"CC BY-SA 4.0",
1637
- "author":"Eszter Simon & No\u00e9mi Vad\u00e1sz (Department of Language Technology and Applied Linguistics)"
1638
  },
1639
  {
1640
  "name":"hunNERwiki",
1641
  "url":"http://hlt.sztaki.hu/resources/hunnerwiki.html",
1642
- "license":"CC-BY-SA-4.0",
1643
- "author":"Eszter Simon"
1644
  },
1645
  {
1646
  "name":"Szeged NER Corpus",
1647
  "url":"https://rgai.inf.u-szeged.hu/node/130",
1648
  "license":"CC-BY-NC-SA-3.0",
1649
- "author":"Rich\u00e1rd Farkas (MTA-SZTE Research Group on Artificial Intelligence)"
1650
  },
1651
  {
1652
  "name":"Webcorpuswiki word2vec model",
 
1
  {
2
  "lang":"hu",
3
  "name":"core_news_lg",
4
+ "version":"0.4.1",
5
+ "description":"Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner",
6
  "author":"MILAB Spacy Research Group",
7
  "email":"gyorgy@orosz.link",
8
+ "url":"https://github.com/huspacy/huspacy",
9
  "license":"cc-by-sa-4.0",
10
+ "spacy_version":">=3.2.1,<3.3.0",
11
+ "spacy_git_version":"800737b41",
12
  "vectors":{
13
  "width":300,
14
  "vectors":1140008,
 
1271
  "token_p":0.998565417,
1272
  "token_r":0.9993300153,
1273
  "token_f":0.9989475698,
1274
+ "sents_p":0.9755011136,
1275
  "sents_r":0.9755011136,
1276
+ "sents_f":0.9755011136,
1277
+ "tag_acc":0.9643523614,
1278
+ "pos_acc":0.9621512991,
1279
+ "morph_acc":0.9253517083,
1280
+ "morph_micro_p":0.9666376307,
1281
+ "morph_micro_r":0.9537602063,
1282
+ "morph_micro_f":0.960155743,
1283
  "morph_per_feat":{
1284
  "Definite":{
1285
+ "p":0.9669269637,
1286
+ "r":0.9822678488,
1287
+ "f":0.974537037
1288
  },
1289
  "PronType":{
1290
+ "p":0.9739900387,
1291
+ "r":0.9713024283,
1292
+ "f":0.9726443769
1293
  },
1294
  "Case":{
1295
+ "p":0.9718731299,
1296
+ "r":0.9626556017,
1297
+ "f":0.9672424062
1298
  },
1299
  "Degree":{
1300
+ "p":0.9248395967,
1301
+ "r":0.8394342762,
1302
+ "f":0.8800697776
1303
  },
1304
  "Number":{
1305
+ "p":0.9785617826,
1306
+ "r":0.9715099715,
1307
+ "f":0.9750231267
1308
  },
1309
  "Mood":{
1310
+ "p":0.936123348,
1311
+ "r":0.9423503326,
1312
+ "f":0.9392265193
1313
  },
1314
  "Person":{
1315
+ "p":0.9641666667,
1316
+ "r":0.9514802632,
1317
+ "f":0.957781457
1318
  },
1319
  "Tense":{
1320
+ "p":0.9702643172,
1321
+ "r":0.973480663,
1322
+ "f":0.971869829
1323
  },
1324
  "VerbForm":{
1325
+ "p":0.9615705931,
1326
+ "r":0.9230152366,
1327
+ "f":0.941898527
1328
  },
1329
  "Voice":{
1330
+ "p":0.967413442,
1331
+ "r":0.9713701431,
1332
+ "f":0.9693877551
1333
  },
1334
  "Number[psor]":{
1335
+ "p":0.9506726457,
1336
+ "r":0.905982906,
1337
+ "f":0.9277899344
1338
  },
1339
  "Person[psor]":{
1340
+ "p":0.9506726457,
1341
+ "r":0.907275321,
1342
+ "f":0.9284671533
1343
  },
1344
  "NumType":{
1345
+ "p":0.9382716049,
1346
+ "r":0.9268292683,
1347
+ "f":0.9325153374
 
 
 
 
 
1348
  },
1349
  "Reflex":{
1350
  "p":1.0,
 
1357
  "f":0.0
1358
  },
1359
  "Number[psed]":{
1360
+ "p":0.0,
1361
+ "r":0.0,
1362
+ "f":0.0
1363
+ },
1364
+ "Poss":{
1365
  "p":1.0,
1366
+ "r":1.0,
1367
+ "f":1.0
1368
  }
1369
  },
1370
+ "dep_uas":0.8193704057,
1371
+ "dep_las":0.7497475031,
1372
  "dep_las_per_type":{
1373
  "det":{
1374
+ "p":0.8756841282,
1375
+ "r":0.8917197452,
1376
+ "f":0.8836291913
1377
  },
1378
  "amod:att":{
1379
+ "p":0.8552522746,
1380
+ "r":0.8454619787,
1381
+ "f":0.8503289474
1382
  },
1383
  "nsubj":{
1384
+ "p":0.7605863192,
1385
+ "r":0.7296875,
1386
+ "f":0.7448165869
1387
  },
1388
  "advmod:mode":{
1389
+ "p":0.6159793814,
1390
+ "r":0.5857843137,
1391
+ "f":0.6005025126
1392
  },
1393
  "nmod:att":{
1394
+ "p":0.7643207856,
1395
+ "r":0.7915254237,
1396
+ "f":0.7776852623
1397
  },
1398
  "obl":{
1399
+ "p":0.7665198238,
1400
+ "r":0.7830783078,
1401
+ "f":0.7747105966
1402
  },
1403
  "obj":{
1404
+ "p":0.8306997743,
1405
+ "r":0.8269662921,
1406
+ "f":0.8288288288
1407
  },
1408
  "root":{
1409
+ "p":0.8106904232,
1410
+ "r":0.8106904232,
1411
+ "f":0.8106904232
1412
  },
1413
  "cc":{
1414
+ "p":0.6974248927,
1415
+ "r":0.6842105263,
1416
+ "f":0.6907545165
1417
  },
1418
  "conj":{
1419
+ "p":0.4454545455,
1420
+ "r":0.5104166667,
1421
+ "f":0.4757281553
1422
  },
1423
  "advmod":{
1424
+ "p":0.7843137255,
1425
+ "r":0.8421052632,
1426
+ "f":0.8121827411
1427
  },
1428
  "flat:name":{
1429
+ "p":0.8362831858,
1430
+ "r":0.8831775701,
1431
+ "f":0.8590909091
1432
  },
1433
  "appos":{
1434
+ "p":0.4444444444,
1435
+ "r":0.2978723404,
1436
+ "f":0.3566878981
1437
  },
1438
  "advcl":{
1439
+ "p":0.3974358974,
1440
+ "r":0.3163265306,
1441
+ "f":0.3522727273
1442
  },
1443
  "advmod:tlocy":{
1444
+ "p":0.6538461538,
1445
+ "r":0.6652173913,
1446
+ "f":0.6594827586
1447
  },
1448
  "ccomp:obj":{
1449
+ "p":0.2545454545,
1450
+ "r":0.4242424242,
1451
+ "f":0.3181818182
1452
  },
1453
  "mark":{
1454
+ "p":0.825,
1455
+ "r":0.835443038,
1456
+ "f":0.8301886792
1457
  },
1458
  "compound:preverb":{
1459
+ "p":0.8717948718,
1460
+ "r":0.9357798165,
1461
+ "f":0.9026548673
1462
  },
1463
  "advmod:locy":{
1464
+ "p":0.7894736842,
1465
+ "r":0.46875,
1466
+ "f":0.5882352941
1467
  },
1468
  "cop":{
1469
+ "p":0.7222222222,
1470
+ "r":0.6341463415,
1471
+ "f":0.6753246753
1472
  },
1473
  "nmod:obl":{
1474
+ "p":0.2380952381,
1475
  "r":0.125,
1476
+ "f":0.1639344262
1477
  },
1478
  "advmod:to":{
1479
  "p":0.0,
 
1481
  "f":0.0
1482
  },
1483
  "obj:lvc":{
1484
+ "p":0.3333333333,
1485
+ "r":0.1666666667,
1486
+ "f":0.2222222222
1487
  },
1488
  "ccomp:obl":{
1489
+ "p":0.5789473684,
1490
+ "r":0.34375,
1491
+ "f":0.431372549
1492
  },
1493
  "iobj":{
1494
+ "p":0.4666666667,
1495
+ "r":0.4666666667,
1496
+ "f":0.4666666667
1497
+ },
1498
+ "parataxis":{
1499
+ "p":0.1724137931,
1500
+ "r":0.0684931507,
1501
+ "f":0.0980392157
1502
  },
1503
  "case":{
1504
+ "p":0.9315789474,
1505
+ "r":0.9030612245,
1506
+ "f":0.9170984456
1507
  },
1508
  "csubj":{
1509
+ "p":0.5416666667,
1510
+ "r":0.3513513514,
1511
+ "f":0.4262295082
 
 
 
 
 
1512
  },
1513
  "xcomp":{
1514
+ "p":0.8181818182,
1515
+ "r":0.8513513514,
1516
+ "f":0.8344370861
1517
  },
1518
  "nummod":{
1519
+ "p":0.5247524752,
1520
+ "r":0.5698924731,
1521
+ "f":0.5463917526
1522
  },
1523
  "acl":{
1524
+ "p":0.36,
1525
+ "r":0.25,
1526
+ "f":0.2950819672
1527
  },
1528
  "advmod:tto":{
1529
+ "p":0.5714285714,
1530
+ "r":0.4,
1531
+ "f":0.4705882353
1532
  },
1533
  "nmod":{
1534
+ "p":0.3333333333,
1535
+ "r":0.0909090909,
1536
+ "f":0.1428571429
1537
+ },
1538
+ "ccomp":{
1539
+ "p":0.1,
1540
+ "r":0.0769230769,
1541
+ "f":0.0869565217
1542
  },
1543
  "aux":{
1544
+ "p":0.9090909091,
1545
+ "r":0.8333333333,
1546
+ "f":0.8695652174
1547
  },
1548
  "advmod:tfrom":{
1549
+ "p":0.3333333333,
1550
  "r":0.1666666667,
1551
+ "f":0.2222222222
1552
  },
1553
+ "dep":{
1554
  "p":0.0,
1555
  "r":0.0,
1556
  "f":0.0
1557
  },
1558
+ "goeswith":{
 
 
 
 
 
1559
  "p":0.0,
1560
  "r":0.0,
1561
  "f":0.0
1562
  },
1563
+ "compound":{
1564
+ "p":1.0,
1565
+ "r":0.975,
1566
+ "f":0.9873417722
1567
+ },
1568
  "obl:lvc":{
1569
  "p":0.0,
1570
  "r":0.0,
 
1575
  "r":0.0,
1576
  "f":0.0
1577
  },
 
 
 
 
 
1578
  "nsubj:lvc":{
1579
  "p":0.0,
1580
  "r":0.0,
1581
  "f":0.0
1582
  },
1583
  "list":{
1584
+ "p":0.2,
1585
  "r":0.1666666667,
1586
+ "f":0.1818181818
1587
  },
1588
  "advmod:que":{
1589
  "p":1.0,
1590
+ "r":0.25,
1591
+ "f":0.4
1592
  },
1593
  "ccomp:pred":{
1594
  "p":0.0,
 
1596
  "f":0.0
1597
  }
1598
  },
1599
+ "ents_p":0.856968588,
1600
+ "ents_r":0.854622871,
1601
+ "ents_f":0.8557941221,
1602
  "ents_per_type":{
1603
  "ORG":{
1604
+ "p":0.8991157556,
1605
+ "r":0.875880971,
1606
+ "f":0.8873462912
1607
  },
1608
  "LOC":{
1609
+ "p":0.8272980501,
1610
+ "r":0.8959276018,
1611
+ "f":0.8602461984
1612
  },
1613
  "MISC":{
1614
+ "p":0.684287812,
1615
+ "r":0.5974358974,
1616
+ "f":0.6379192334
1617
  },
1618
  "PER":{
1619
+ "p":0.8853046595,
1620
+ "r":0.9024008351,
1621
+ "f":0.8937710003
1622
  }
1623
  },
1624
+ "speed":1560.7043552634
1625
  },
1626
  "sources":[
1627
  {
1628
  "name":"UD Hungarian Szeged",
1629
  "url":"https://universaldependencies.org/treebanks/hu_szeged/index.html",
1630
  "license":"CC-BY-NC-SA-3.0",
1631
+ "author":"Rich\u00e1rd Farkas, Katalin Simk\u00f3, Zsolt Sz\u00e1nt\u00f3, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence)"
1632
  },
1633
  {
1634
  "name":"NYTK-NerKor corpus",
1635
  "url":"https://github.com/nytud/NYTK-NerKor",
1636
  "license":"CC BY-SA 4.0",
1637
+ "author":"Eszter Simon, No\u00e9mi Vad\u00e1sz (Department of Language Technology and Applied Linguistics)"
1638
  },
1639
  {
1640
  "name":"hunNERwiki",
1641
  "url":"http://hlt.sztaki.hu/resources/hunnerwiki.html",
1642
+ "license":"CC-BY-SA-3.0",
1643
+ "author":"Eszter Simon, D\u00e1vid M\u00e1rk Nemeskey (HLT Group, Budapest University of Technology and Economics)"
1644
  },
1645
  {
1646
  "name":"Szeged NER Corpus",
1647
  "url":"https://rgai.inf.u-szeged.hu/node/130",
1648
  "license":"CC-BY-NC-SA-3.0",
1649
+ "author":"Gy\u00f6rgy Szarvas, Rich\u00e1rd Farkas, L\u00e1szl\u00f3 Felf\u00f6ldi, Andr\u00e1s Kocsor, J\u00e1nos Csirik (MTA-SZTE Research Group on Artificial Intelligence)"
1650
  },
1651
  {
1652
  "name":"Webcorpuswiki word2vec model",
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2e75b87033ee7bc70eafa5e4585a5cee7bb044e461e97b985fc95776d34c6363
+oid sha256:d516c696a86b9e1c23a34c6d429c62e2b2c5f28a81813ed13863a41afeb06dc6
 size 1383794
ner/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ff08ccbb0c86101c70a87a1b5d2c66bd1af919dedf048df64317a3d707ed53c5
+oid sha256:e376b09a99732aa3ba59acafa63af2a9b8f6b256932d708976fa37729c04eaaa
 size 56989356
parser/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:64b27d45504e6afb5e4005ac5fd149e9881112c231075d036930a57d0d542053
+oid sha256:2f333789b506d2d673146abf86c3b0a27ac002e9e4c28b16d696c42733a28c53
 size 26010735
senter/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3a5461e00e2f1fc35eee273420586815615642d962d3ad46da4f6b5eeb53afac
+oid sha256:667d590c7593b4eb3e061dd5ba3e1b946368f28f9d84fd33310efa0a249ff5a5
 size 2793
tagger/cfg CHANGED
@@ -18,5 +18,6 @@
 "VERB",
 "X"
 ],
+"neg_prefix":"!",
 "overwrite":false
 }
tagger/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:255929182a17aa054048268b8ad2b4099e7ec4c246c22211f5c8ebb0113e451e
+oid sha256:8f1a2d4d6db8ad9155d22ae9ec9cd4156fb9bd4a559f200c98bca3060a1fff05
 size 20853
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6456c23979d1fb5ebca0b0ec34280ff99e7954c7dee5700a8b32a95a067acd00
+oid sha256:3362ed475999bb8ed77600b04f30d01bcf99f9ad2bae95eba9bb4074390e985b
 size 56806592
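This commit also raises the `spacy_version` constraint in meta.json and the README from `>=3.2.0,<3.3.0` to `>=3.2.1,<3.3.0`. A minimal sketch of what that range admits, assuming plain dotted numeric versions only; `parse_version` and `satisfies` are hypothetical helpers written for illustration and do not cover the full Python version specifier syntax (pre-releases, wildcards, `~=`):

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted numeric version such as '3.2.1' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


# Two-character operators come first so ">=" is not mistaken for ">".
OPERATORS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


def satisfies(version: str, spec: str) -> bool:
    """Return True if the version meets every comma-separated clause in spec."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                if not compare(candidate, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


new_spec = ">=3.2.1,<3.3.0"
for candidate in ("3.2.0", "3.2.1", "3.2.4", "3.3.0"):
    print(candidate, satisfies(candidate, new_spec))
```

Under the new range, spaCy 3.2.0 no longer qualifies, while the upper bound still excludes 3.3.0.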