oroszgy committed on
Commit f050265
1 Parent(s): 633d5a9

Update spacy pipeline to 3.6.0

README.md CHANGED
@@ -14,74 +14,74 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.8625558534
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8484528833
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8554462466
24
  - task:
25
  name: TAG
26
  type: token-classification
27
  metrics:
28
  - name: TAG (XPOS) Accuracy
29
  type: accuracy
30
- value: 0.9631543688
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
- value: 0.9651641305
38
  - task:
39
  name: MORPH
40
  type: token-classification
41
  metrics:
42
  - name: Morph (UFeats) Accuracy
43
  type: accuracy
44
- value: 0.9281270935
45
  - task:
46
  name: LEMMA
47
  type: token-classification
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
- value: 0.9750263133
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
- value: 0.8058213942
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
- value: 0.7387315968
66
  - task:
67
  name: SENTS
68
  type: token-classification
69
  metrics:
70
  - name: Sentences F-Score
71
  type: f_score
72
- value: 0.9753914989
73
  ---
74
  Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner
75
 
76
  | Feature | Description |
77
  | --- | --- |
78
  | **Name** | `hu_core_news_lg` |
79
- | **Version** | `3.5.2` |
80
- | **spaCy** | `>=3.5.0,<3.6.0` |
81
  | **Default Pipeline** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
82
  | **Components** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
83
  | **Vectors** | -1 keys, 200000 unique vectors (300 dimensions) |
84
- | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence))<br />[NYTK-NerKor Corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon, Noémi Vadász (Department of Language Technology and Applied Linguistics))<br />[hunNERwiki](http://hlt.sztaki.hu/resources/hunnerwiki.html) (Eszter Simon, Dávid Márk Nemeskey (HLT Group, Budapest University of Technology and Economics))<br />[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (György Szarvas, Richárd Farkas, László Felföldi, András Kocsor, János Csirik (MTA-SZTE Research Group on Artificial Intelligence))<br />[Webcorpuswiki word2vec model](https://github.com/oroszgy/hunlp-resources/releases/tag/webcorpuswiki_word2vec_v0.1) (György Orosz) |
85
  | **License** | `cc-by-sa-4.0` |
86
  | **Author** | [SzegedAI, MILAB](https://github.com/huspacy/huspacy) |
87
 
@@ -108,18 +108,18 @@ Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morpholog
108
  | `TOKEN_P` | 99.86 |
109
  | `TOKEN_R` | 99.93 |
110
  | `TOKEN_F` | 99.89 |
111
- | `SENTS_P` | 97.98 |
112
- | `SENTS_R` | 97.10 |
113
- | `SENTS_F` | 97.54 |
114
- | `TAG_ACC` | 96.32 |
115
- | `POS_ACC` | 96.52 |
116
- | `MORPH_ACC` | 92.81 |
117
- | `MORPH_MICRO_P` | 96.62 |
118
- | `MORPH_MICRO_R` | 95.86 |
119
- | `MORPH_MICRO_F` | 96.24 |
120
- | `LEMMA_ACC` | 97.50 |
121
- | `DEP_UAS` | 80.58 |
122
- | `DEP_LAS` | 73.87 |
123
- | `ENTS_P` | 86.26 |
124
- | `ENTS_R` | 84.85 |
125
- | `ENTS_F` | 85.54 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.8636042403
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8593530239
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8614733874
24
  - task:
25
  name: TAG
26
  type: token-classification
27
  metrics:
28
  - name: TAG (XPOS) Accuracy
29
  type: accuracy
30
+ value: 0.964256663
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.9640652663
38
  - task:
39
  name: MORPH
40
  type: token-classification
41
  metrics:
42
  - name: Morph (UFeats) Accuracy
43
  type: accuracy
44
+ value: 0.9316681022
45
  - task:
46
  name: LEMMA
47
  type: token-classification
48
  metrics:
49
  - name: Lemma Accuracy
50
  type: accuracy
51
+ value: 0.9736867285
52
  - task:
53
  name: UNLABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Unlabeled Attachment Score (UAS)
57
  type: f_score
58
+ value: 0.8163795538
59
  - task:
60
  name: LABELED_DEPENDENCIES
61
  type: token-classification
62
  metrics:
63
  - name: Labeled Attachment Score (LAS)
64
  type: f_score
65
+ value: 0.7454391415
66
  - task:
67
  name: SENTS
68
  type: token-classification
69
  metrics:
70
  - name: Sentences F-Score
71
  type: f_score
72
+ value: 0.9776286353
73
  ---
74
  Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner
75
 
76
  | Feature | Description |
77
  | --- | --- |
78
  | **Name** | `hu_core_news_lg` |
79
+ | **Version** | `3.6.0` |
80
+ | **spaCy** | `>=3.6.0,<3.7.0` |
81
  | **Default Pipeline** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
82
  | **Components** | `tok2vec`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `parser`, `ner` |
83
  | **Vectors** | -1 keys, 200000 unique vectors (300 dimensions) |
84
+ | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence))<br />[NYTK-NerKor Corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon, Noémi Vadász (Department of Language Technology and Applied Linguistics))<br />[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (György Szarvas, Richárd Farkas, László Felföldi, András Kocsor, János Csirik (MTA-SZTE Research Group on Artificial Intelligence))<br />[Hungarian lg Floret vectors](https://huggingface.co/huspacy/hu_vectors_web_lg) (Szeged AI) |
85
  | **License** | `cc-by-sa-4.0` |
86
  | **Author** | [SzegedAI, MILAB](https://github.com/huspacy/huspacy) |
87
 
 
108
  | `TOKEN_P` | 99.86 |
109
  | `TOKEN_R` | 99.93 |
110
  | `TOKEN_F` | 99.89 |
111
+ | `SENTS_P` | 98.20 |
112
+ | `SENTS_R` | 97.33 |
113
+ | `SENTS_F` | 97.76 |
114
+ | `TAG_ACC` | 96.43 |
115
+ | `POS_ACC` | 96.41 |
116
+ | `MORPH_ACC` | 93.17 |
117
+ | `MORPH_MICRO_P` | 96.48 |
118
+ | `MORPH_MICRO_R` | 95.78 |
119
+ | `MORPH_MICRO_F` | 96.13 |
120
+ | `LEMMA_ACC` | 97.37 |
121
+ | `DEP_UAS` | 81.64 |
122
+ | `DEP_LAS` | 74.54 |
123
+ | `ENTS_P` | 86.36 |
124
+ | `ENTS_R` | 85.94 |
125
+ | `ENTS_F` | 86.15 |
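
The F-scores in the updated README can be cross-checked against the precision and recall values reported alongside them. The sketch below assumes the reported F-score is the harmonic mean of precision and recall (F1); the helper name `f_score` is hypothetical and is not part of spaCy or HuSpaCy.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision/recall pairs copied from the 3.6.0 side of this diff.
reported = {
    "NER": (0.8636042403, 0.8593530239),
    "SENTS": (0.9820224719, 0.9732739421),
}

for name, (p, r) in reported.items():
    # Print as a percentage, matching the README accuracy table.
    print(f"{name}: F = {f_score(p, r) * 100:.2f}")
```

The same function also covers the per-label entries in `meta.json`, where labels with zero precision and zero recall are reported with an F-score of 0.0.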
config.cfg CHANGED
@@ -1,8 +1,8 @@
1
  [paths]
2
- parser_model = "models/hu_core_news_lg-parser-3.5.2/model-best"
3
- ner_model = "models/hu_core_news_lg-ner-3.5.2/model-best"
4
- lemmatizer_lookups = "models/hu_core_news_lg-lookup-lemmatizer-3.5.2"
5
- tagger_model = "models/hu_core_news_lg-tagger-3.5.2/model-best"
6
  train = null
7
  dev = null
8
  vectors = null
@@ -32,6 +32,7 @@ source = ${paths.lemmatizer_lookups}
32
  [components.morphologizer]
33
  factory = "morphologizer"
34
  extend = false
 
35
  overwrite = true
36
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
37
 
@@ -118,6 +119,7 @@ upstream = "*"
118
 
119
  [components.tagger]
120
  factory = "tagger"
 
121
  neg_prefix = "!"
122
  overwrite = false
123
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
 
1
  [paths]
2
+ parser_model = "models/hu_core_news_lg-parser-3.6.0/model-best"
3
+ ner_model = "models/hu_core_news_lg-ner-3.6.0/model-best"
4
+ lemmatizer_lookups = "models/hu_core_news_lg-lookup-lemmatizer-3.6.0"
5
+ tagger_model = "models/hu_core_news_lg-tagger-3.6.0/model-best"
6
  train = null
7
  dev = null
8
  vectors = null
 
32
  [components.morphologizer]
33
  factory = "morphologizer"
34
  extend = false
35
+ label_smoothing = 0.0
36
  overwrite = true
37
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}
38
 
 
119
 
120
  [components.tagger]
121
  factory = "tagger"
122
+ label_smoothing = 0.0
123
  neg_prefix = "!"
124
  overwrite = false
125
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
hu_core_news_lg-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:770c036d6223475a79c96dff245a193f4c9df56c8951fd5c1d7c0be59032124b
3
- size 401397054
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8f18cfe459ea0cbdccee0dcb624defd4cc23459940d4ef1803e6a24fb0f76d6d
3
+ size 401395351
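
Since this commit moves the declared spaCy range from `>=3.5.0,<3.6.0` to `>=3.6.0,<3.7.0`, it can help to check an installed spaCy version against the range before installing the new wheel. The following is a minimal standard-library sketch; `parse_version` and `satisfies` are hypothetical helpers, and it assumes plain three-part numeric versions with no pre-release suffixes. For real use, `packaging.specifiers.SpecifierSet` is likely the more robust choice.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.6.0' into (3, 6, 0). Assumes purely numeric components."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec like '>=3.6.0,<3.7.0'."""
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Two-character operators are tried before one-character ones.
        for op in (">=", "<=", "==", ">", "<"):
            if clause.startswith(op):
                bound = parse_version(clause[len(op):])
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
        checks = {
            ">=": current >= bound,
            "<=": current <= bound,
            "==": current == bound,
            ">": current > bound,
            "<": current < bound,
        }
        if not checks[op]:
            return False
    return True


NEW_SPEC = ">=3.6.0,<3.7.0"  # spaCy range declared by this commit

for candidate in ("3.5.2", "3.6.0", "3.6.1", "3.7.0"):
    print(candidate, satisfies(candidate, NEW_SPEC))
```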
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"hu",
3
  "name":"core_news_lg",
4
- "version":"3.5.2",
5
  "description":"Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner",
6
  "author":"SzegedAI, MILAB",
7
  "email":"gyorgy@orosz.link",
8
  "url":"https://github.com/huspacy/huspacy",
9
  "license":"cc-by-sa-4.0",
10
- "spacy_version":">=3.5.0,<3.6.0",
11
- "spacy_git_version":"Unknown",
12
  "vectors":{
13
  "width":300,
14
  "vectors":200000,
@@ -1268,85 +1268,90 @@
1268
  "token_p":0.998565417,
1269
  "token_r":0.9993300153,
1270
  "token_f":0.9989475698,
1271
- "sents_p":0.9797752809,
1272
- "sents_r":0.9710467706,
1273
- "sents_f":0.9753914989,
1274
- "tag_acc":0.9631543688,
1275
- "pos_acc":0.9651641305,
1276
- "morph_acc":0.9281270935,
1277
- "morph_micro_p":0.9661729037,
1278
- "morph_micro_r":0.9586162441,
1279
- "morph_micro_f":0.9623797403,
1280
  "morph_per_feat":{
1281
  "Definite":{
1282
- "p":0.9589416058,
1283
- "r":0.9808679421,
1284
- "f":0.9697808535
1285
  },
1286
  "PronType":{
1287
- "p":0.9690778575,
1288
- "r":0.9685430464,
1289
- "f":0.9688103781
1290
  },
1291
  "Case":{
1292
- "p":0.9735523943,
1293
- "r":0.9600869393,
1294
- "f":0.9667727815
1295
  },
1296
  "Degree":{
1297
- "p":0.9171075838,
1298
- "r":0.8652246256,
1299
- "f":0.8904109589
1300
  },
1301
  "Number":{
1302
- "p":0.9839418526,
1303
- "r":0.9755320932,
1304
- "f":0.9797189262
1305
  },
1306
  "Mood":{
1307
- "p":0.9307359307,
1308
- "r":0.9534368071,
1309
- "f":0.9419496166
1310
  },
1311
  "Person":{
1312
- "p":0.9507389163,
1313
- "r":0.9523026316,
1314
- "f":0.9515201315
1315
  },
1316
  "Tense":{
1317
- "p":0.961038961,
1318
- "r":0.9812154696,
1319
- "f":0.9710224166
1320
  },
1321
  "VerbForm":{
1322
- "p":0.9555555556,
1323
- "r":0.9310344828,
1324
- "f":0.9431356621
1325
  },
1326
  "Voice":{
1327
- "p":0.96,
1328
- "r":0.981595092,
1329
- "f":0.970677452
1330
  },
1331
  "Number[psor]":{
1332
- "p":0.9733333333,
1333
- "r":0.9358974359,
1334
- "f":0.954248366
1335
  },
1336
  "Person[psor]":{
1337
- "p":0.9777777778,
1338
- "r":0.9415121255,
1339
- "f":0.9593023256
1340
  },
1341
  "NumType":{
1342
- "p":0.9331683168,
1343
- "r":0.9195121951,
1344
- "f":0.9262899263
1345
  },
1346
  "Reflex":{
1347
  "p":1.0,
1348
- "r":0.75,
1349
- "f":0.8571428571
1350
  },
1351
  "Aspect":{
1352
  "p":0.0,
@@ -1357,121 +1362,116 @@
1357
  "p":0.0,
1358
  "r":0.0,
1359
  "f":0.0
1360
- },
1361
- "Poss":{
1362
- "p":1.0,
1363
- "r":1.0,
1364
- "f":1.0
1365
  }
1366
  },
1367
- "lemma_acc":0.9750263133,
1368
- "dep_uas":0.8058213942,
1369
- "dep_las":0.7387315968,
1370
  "dep_las_per_type":{
1371
  "det":{
1372
- "p":0.8639562158,
1373
- "r":0.8797770701,
1374
- "f":0.8717948718
1375
  },
1376
  "amod:att":{
1377
- "p":0.8263565891,
1378
- "r":0.8716271464,
1379
- "f":0.8483883804
1380
  },
1381
  "nsubj":{
1382
- "p":0.7944444444,
1383
- "r":0.6703125,
1384
- "f":0.7271186441
1385
  },
1386
  "advmod:mode":{
1387
- "p":0.5974025974,
1388
- "r":0.5637254902,
1389
- "f":0.580075662
1390
  },
1391
  "nmod:att":{
1392
- "p":0.8021390374,
1393
- "r":0.7627118644,
1394
- "f":0.7819287576
1395
  },
1396
  "obl":{
1397
- "p":0.7427055703,
1398
- "r":0.7560756076,
1399
- "f":0.7493309545
1400
  },
1401
  "obj":{
1402
- "p":0.8490153173,
1403
- "r":0.8719101124,
1404
- "f":0.8603104213
1405
  },
1406
  "root":{
1407
- "p":0.7617977528,
1408
- "r":0.7550111359,
1409
- "f":0.7583892617
1410
  },
1411
  "cc":{
1412
- "p":0.6495726496,
1413
- "r":0.64,
1414
- "f":0.6447507953
1415
  },
1416
  "conj":{
1417
- "p":0.4672489083,
1418
- "r":0.4458333333,
1419
- "f":0.4562899787
1420
  },
1421
  "advmod":{
1422
- "p":0.8333333333,
1423
- "r":0.8421052632,
1424
- "f":0.8376963351
1425
  },
1426
  "flat:name":{
1427
- "p":0.865470852,
1428
- "r":0.9018691589,
1429
- "f":0.8832951945
1430
  },
1431
  "appos":{
1432
- "p":0.5357142857,
1433
- "r":0.3191489362,
1434
- "f":0.4
1435
  },
1436
  "advcl":{
1437
- "p":0.2459016393,
1438
- "r":0.306122449,
1439
- "f":0.2727272727
1440
  },
1441
  "advmod:tlocy":{
1442
- "p":0.7323943662,
1443
- "r":0.6782608696,
1444
- "f":0.7042889391
1445
  },
1446
  "ccomp:obj":{
1447
- "p":0.2258064516,
1448
- "r":0.2121212121,
1449
- "f":0.21875
1450
  },
1451
  "mark":{
1452
- "p":0.8108108108,
1453
- "r":0.7594936709,
1454
- "f":0.7843137255
1455
  },
1456
  "compound:preverb":{
1457
- "p":0.9142857143,
1458
- "r":0.880733945,
1459
- "f":0.8971962617
1460
  },
1461
  "advmod:locy":{
1462
- "p":0.9230769231,
1463
- "r":0.375,
1464
- "f":0.5333333333
1465
  },
1466
  "cop":{
1467
- "p":0.7142857143,
1468
- "r":0.7317073171,
1469
- "f":0.7228915663
1470
  },
1471
  "nmod:obl":{
1472
- "p":0.25,
1473
- "r":0.075,
1474
- "f":0.1153846154
1475
  },
1476
  "advmod:to":{
1477
  "p":0.0,
@@ -1479,54 +1479,54 @@
1479
  "f":0.0
1480
  },
1481
  "obj:lvc":{
1482
- "p":0.3333333333,
1483
- "r":0.0833333333,
1484
- "f":0.1333333333
1485
  },
1486
  "ccomp:obl":{
1487
- "p":0.3111111111,
1488
- "r":0.4375,
1489
- "f":0.3636363636
1490
  },
1491
- "iobj":{
1492
- "p":0.3846153846,
1493
- "r":0.3333333333,
1494
- "f":0.3571428571
1495
  },
1496
  "csubj":{
1497
- "p":0.4583333333,
1498
- "r":0.2972972973,
1499
- "f":0.3606557377
1500
  },
1501
  "parataxis":{
1502
- "p":0.2051282051,
1503
- "r":0.1095890411,
1504
- "f":0.1428571429
1505
  },
1506
  "dep":{
1507
  "p":0.0,
1508
  "r":0.0,
1509
  "f":0.0
1510
  },
1511
- "case":{
1512
- "p":0.9090909091,
1513
- "r":0.9183673469,
1514
- "f":0.9137055838
1515
- },
1516
- "xcomp":{
1517
- "p":0.7948717949,
1518
- "r":0.8378378378,
1519
- "f":0.8157894737
1520
- },
1521
- "nummod":{
1522
- "p":0.5421686747,
1523
- "r":0.4838709677,
1524
- "f":0.5113636364
1525
- },
1526
  "acl":{
1527
- "p":0.4561403509,
1528
- "r":0.3611111111,
1529
- "f":0.4031007752
1530
  },
1531
  "advmod:tto":{
1532
  "p":0.6666666667,
@@ -1534,14 +1534,14 @@
1534
  "f":0.3076923077
1535
  },
1536
  "nmod":{
1537
- "p":0.25,
1538
  "r":0.0909090909,
1539
- "f":0.1333333333
1540
  },
1541
  "aux":{
1542
- "p":0.875,
1543
- "r":0.5833333333,
1544
- "f":0.7
1545
  },
1546
  "advmod:tfrom":{
1547
  "p":0.0,
@@ -1554,9 +1554,9 @@
1554
  "f":0.0
1555
  },
1556
  "compound":{
1557
- "p":0.9736842105,
1558
- "r":0.925,
1559
- "f":0.9487179487
1560
  },
1561
  "obl:lvc":{
1562
  "p":0.0,
@@ -1584,9 +1584,9 @@
1584
  "f":0.0
1585
  },
1586
  "advmod:que":{
1587
- "p":1.0,
1588
- "r":0.25,
1589
- "f":0.4
1590
  },
1591
  "ccomp:pred":{
1592
  "p":0.0,
@@ -1594,32 +1594,32 @@
1594
  "f":0.0
1595
  }
1596
  },
1597
- "ents_p":0.8625558534,
1598
- "ents_r":0.8484528833,
1599
- "ents_f":0.8554462466,
1600
  "ents_per_type":{
1601
  "ORG":{
1602
- "p":0.8912839738,
1603
- "r":0.8817802503,
1604
- "f":0.8865066418
1605
  },
1606
  "PER":{
1607
- "p":0.8894202032,
1608
- "r":0.8888888889,
1609
- "f":0.8891544667
1610
  },
1611
  "LOC":{
1612
- "p":0.8619173263,
1613
  "r":0.8506944444,
1614
- "f":0.8562691131
1615
  },
1616
  "MISC":{
1617
- "p":0.7004608295,
1618
- "r":0.6468085106,
1619
- "f":0.6725663717
1620
  }
1621
  },
1622
- "speed":869.7495281146
1623
  },
1624
  "sources":[
1625
  {
@@ -1634,12 +1634,6 @@
1634
  "license":"CC BY-SA 4.0",
1635
  "author":"Eszter Simon, No\u00e9mi Vad\u00e1sz (Department of Language Technology and Applied Linguistics)"
1636
  },
1637
- {
1638
- "name":"hunNERwiki",
1639
- "url":"http://hlt.sztaki.hu/resources/hunnerwiki.html",
1640
- "license":"CC-BY-SA-3.0",
1641
- "author":"Eszter Simon, D\u00e1vid M\u00e1rk Nemeskey (HLT Group, Budapest University of Technology and Economics)"
1642
- },
1643
  {
1644
  "name":"Szeged NER Corpus",
1645
  "url":"https://rgai.inf.u-szeged.hu/node/130",
@@ -1647,10 +1641,10 @@
1647
  "author":"Gy\u00f6rgy Szarvas, Rich\u00e1rd Farkas, L\u00e1szl\u00f3 Felf\u00f6ldi, Andr\u00e1s Kocsor, J\u00e1nos Csirik (MTA-SZTE Research Group on Artificial Intelligence)"
1648
  },
1649
  {
1650
- "name":"Webcorpuswiki word2vec model",
1651
- "url":"https://github.com/oroszgy/hunlp-resources/releases/tag/webcorpuswiki_word2vec_v0.1",
1652
  "license":"CC-BY-SA-4.0",
1653
- "author":"Gy\u00f6rgy Orosz"
1654
  }
1655
  ],
1656
  "requirements":[
 
1
  {
2
  "lang":"hu",
3
  "name":"core_news_lg",
4
+ "version":"3.6.0",
5
  "description":"Core Hungarian model for HuSpaCy. Components: tok2vec, senter, tagger, morphologizer, lemmatizer, parser, ner",
6
  "author":"SzegedAI, MILAB",
7
  "email":"gyorgy@orosz.link",
8
  "url":"https://github.com/huspacy/huspacy",
9
  "license":"cc-by-sa-4.0",
10
+ "spacy_version":">=3.6.0,<3.7.0",
11
+ "spacy_git_version":"6fc153a26",
12
  "vectors":{
13
  "width":300,
14
  "vectors":200000,
 
1268
  "token_p":0.998565417,
1269
  "token_r":0.9993300153,
1270
  "token_f":0.9989475698,
1271
+ "sents_p":0.9820224719,
1272
+ "sents_r":0.9732739421,
1273
+ "sents_f":0.9776286353,
1274
+ "tag_acc":0.964256663,
1275
+ "pos_acc":0.9640652663,
1276
+ "morph_acc":0.9316681022,
1277
+ "morph_micro_p":0.9648484848,
1278
+ "morph_micro_r":0.9577997422,
1279
+ "morph_micro_f":0.9613111926,
1280
  "morph_per_feat":{
1281
  "Definite":{
1282
+ "p":0.9579908676,
1283
+ "r":0.9790013999,
1284
+ "f":0.9683821832
1285
  },
1286
  "PronType":{
1287
+ "p":0.9712707182,
1288
+ "r":0.9701986755,
1289
+ "f":0.9707344009
1290
  },
1291
  "Case":{
1292
+ "p":0.9725835501,
1293
+ "r":0.9602845287,
1294
+ "f":0.9663949095
1295
  },
1296
  "Degree":{
1297
+ "p":0.9126637555,
1298
+ "r":0.8693843594,
1299
+ "f":0.8904985087
1300
  },
1301
  "Number":{
1302
+ "p":0.9851351351,
1303
+ "r":0.9773755656,
1304
+ "f":0.9812400101
1305
  },
1306
  "Mood":{
1307
+ "p":0.9326818675,
1308
+ "r":0.9523281596,
1309
+ "f":0.942402633
1310
  },
1311
  "Person":{
1312
+ "p":0.9488026424,
1313
+ "r":0.9449013158,
1314
+ "f":0.9468479604
1315
  },
1316
  "Tense":{
1317
+ "p":0.9543973941,
1318
+ "r":0.9712707182,
1319
+ "f":0.9627601314
1320
  },
1321
  "VerbForm":{
1322
+ "p":0.950166113,
1323
+ "r":0.9174017642,
1324
+ "f":0.933496532
1325
  },
1326
  "Voice":{
1327
+ "p":0.9508525577,
1328
+ "r":0.9693251534,
1329
+ "f":0.96
1330
  },
1331
  "Number[psor]":{
1332
+ "p":0.9708029197,
1333
+ "r":0.9472934473,
1334
+ "f":0.9589041096
1335
  },
1336
  "Person[psor]":{
1337
+ "p":0.9722627737,
1338
+ "r":0.9500713267,
1339
+ "f":0.961038961
1340
  },
1341
  "NumType":{
1342
+ "p":0.9305210918,
1343
+ "r":0.9146341463,
1344
+ "f":0.9225092251
1345
+ },
1346
+ "Poss":{
1347
+ "p":0.75,
1348
+ "r":1.0,
1349
+ "f":0.8571428571
1350
  },
1351
  "Reflex":{
1352
  "p":1.0,
1353
+ "r":0.875,
1354
+ "f":0.9333333333
1355
  },
1356
  "Aspect":{
1357
  "p":0.0,
 
1362
  "p":0.0,
1363
  "r":0.0,
1364
  "f":0.0
1365
  }
1366
  },
1367
+ "lemma_acc":0.9736867285,
1368
+ "dep_uas":0.8163795538,
1369
+ "dep_las":0.7454391415,
1370
  "dep_las_per_type":{
1371
  "det":{
1372
+ "p":0.8554125662,
1373
+ "r":0.8996815287,
1374
+ "f":0.8769887466
1375
  },
1376
  "amod:att":{
1377
+ "p":0.8253968254,
1378
+ "r":0.8503679477,
1379
+ "f":0.8376963351
1380
  },
1381
  "nsubj":{
1382
+ "p":0.7557755776,
1383
+ "r":0.715625,
1384
+ "f":0.735152488
1385
  },
1386
  "advmod:mode":{
1387
+ "p":0.6124031008,
1388
+ "r":0.5808823529,
1389
+ "f":0.5962264151
1390
  },
1391
  "nmod:att":{
1392
+ "p":0.8036697248,
1393
+ "r":0.7423728814,
1394
+ "f":0.7718061674
1395
  },
1396
  "obl":{
1397
+ "p":0.7622005324,
1398
+ "r":0.7731773177,
1399
+ "f":0.7676496872
1400
  },
1401
  "obj":{
1402
+ "p":0.842920354,
1403
+ "r":0.8561797753,
1404
+ "f":0.8494983278
1405
  },
1406
  "root":{
1407
+ "p":0.806741573,
1408
+ "r":0.7995545657,
1409
+ "f":0.8031319911
1410
  },
1411
  "cc":{
1412
+ "p":0.6993318486,
1413
+ "r":0.6610526316,
1414
+ "f":0.6796536797
1415
  },
1416
  "conj":{
1417
+ "p":0.5103448276,
1418
+ "r":0.4625,
1419
+ "f":0.4852459016
1420
  },
1421
  "advmod":{
1422
+ "p":0.7572815534,
1423
+ "r":0.8210526316,
1424
+ "f":0.7878787879
1425
  },
1426
  "flat:name":{
1427
+ "p":0.7791164659,
1428
+ "r":0.9065420561,
1429
+ "f":0.838012959
1430
  },
1431
  "appos":{
1432
+ "p":0.4426229508,
1433
+ "r":0.2872340426,
1434
+ "f":0.3483870968
1435
  },
1436
  "advcl":{
1437
+ "p":0.3186813187,
1438
+ "r":0.2959183673,
1439
+ "f":0.3068783069
1440
  },
1441
  "advmod:tlocy":{
1442
+ "p":0.7327586207,
1443
+ "r":0.7391304348,
1444
+ "f":0.7359307359
1445
  },
1446
  "ccomp:obj":{
1447
+ "p":0.2105263158,
1448
+ "r":0.3636363636,
1449
+ "f":0.2666666667
1450
  },
1451
  "mark":{
1452
+ "p":0.7701863354,
1453
+ "r":0.7848101266,
1454
+ "f":0.7774294671
1455
  },
1456
  "compound:preverb":{
1457
+ "p":0.9026548673,
1458
+ "r":0.9357798165,
1459
+ "f":0.9189189189
1460
  },
1461
  "advmod:locy":{
1462
+ "p":0.652173913,
1463
+ "r":0.46875,
1464
+ "f":0.5454545455
1465
  },
1466
  "cop":{
1467
+ "p":0.6666666667,
1468
+ "r":0.6829268293,
1469
+ "f":0.6746987952
1470
  },
1471
  "nmod:obl":{
1472
+ "p":0.2962962963,
1473
+ "r":0.2,
1474
+ "f":0.2388059701
1475
  },
1476
  "advmod:to":{
1477
  "p":0.0,
 
1479
  "f":0.0
1480
  },
1481
  "obj:lvc":{
1482
+ "p":0.0,
1483
+ "r":0.0,
1484
+ "f":0.0
1485
  },
1486
  "ccomp:obl":{
1487
+ "p":0.3684210526,
1488
+ "r":0.21875,
1489
+ "f":0.2745098039
1490
  },
1491
+ "case":{
1492
+ "p":0.9468085106,
1493
+ "r":0.9081632653,
1494
+ "f":0.9270833333
1495
  },
1496
  "csubj":{
1497
+ "p":0.4375,
1498
+ "r":0.1891891892,
1499
+ "f":0.2641509434
1500
  },
1501
  "parataxis":{
1502
+ "p":0.1612903226,
1503
+ "r":0.0684931507,
1504
+ "f":0.0961538462
1505
+ },
1506
+ "xcomp":{
1507
+ "p":0.8472222222,
1508
+ "r":0.8243243243,
1509
+ "f":0.8356164384
1510
+ },
1511
+ "nummod":{
1512
+ "p":0.6111111111,
1513
+ "r":0.4731182796,
1514
+ "f":0.5333333333
1515
  },
1516
  "dep":{
1517
  "p":0.0,
1518
  "r":0.0,
1519
  "f":0.0
1520
  },
1521
  "acl":{
1522
+ "p":0.3043478261,
1523
+ "r":0.2916666667,
1524
+ "f":0.2978723404
1525
+ },
1526
+ "iobj":{
1527
+ "p":0.0,
1528
+ "r":0.0,
1529
+ "f":0.0
1530
  },
1531
  "advmod:tto":{
1532
  "p":0.6666666667,
 
1534
  "f":0.3076923077
1535
  },
1536
  "nmod":{
1537
+ "p":0.1666666667,
1538
  "r":0.0909090909,
1539
+ "f":0.1176470588
1540
  },
1541
  "aux":{
1542
+ "p":1.0,
1543
+ "r":0.6666666667,
1544
+ "f":0.8
1545
  },
1546
  "advmod:tfrom":{
1547
  "p":0.0,
 
1554
  "f":0.0
1555
  },
1556
  "compound":{
1557
+ "p":0.8666666667,
1558
+ "r":0.975,
1559
+ "f":0.9176470588
1560
  },
1561
  "obl:lvc":{
1562
  "p":0.0,
 
1584
  "f":0.0
1585
  },
1586
  "advmod:que":{
1587
+ "p":0.0,
1588
+ "r":0.0,
1589
+ "f":0.0
1590
  },
1591
  "ccomp:pred":{
1592
  "p":0.0,
 
1594
  "f":0.0
1595
  }
1596
  },
1597
+ "ents_p":0.8636042403,
1598
+ "ents_r":0.8593530239,
1599
+ "ents_f":0.8614733874,
1600
  "ents_per_type":{
1601
  "ORG":{
1602
+ "p":0.8953974895,
1603
+ "r":0.892906815,
1604
+ "f":0.8941504178
1605
  },
1606
  "PER":{
1607
+ "p":0.8699830413,
1608
+ "r":0.9193548387,
1609
+ "f":0.8939878013
1610
  },
1611
  "LOC":{
1612
+ "p":0.8781362007,
1613
  "r":0.8506944444,
1614
+ "f":0.8641975309
1615
  },
1616
  "MISC":{
1617
+ "p":0.7099358974,
1618
+ "r":0.6283687943,
1619
+ "f":0.6666666667
1620
  }
1621
  },
1622
+ "speed":901.0325291331
1623
  },
1624
  "sources":[
1625
  {
 
1634
  "license":"CC BY-SA 4.0",
1635
  "author":"Eszter Simon, No\u00e9mi Vad\u00e1sz (Department of Language Technology and Applied Linguistics)"
1636
  },
1637
  {
1638
  "name":"Szeged NER Corpus",
1639
  "url":"https://rgai.inf.u-szeged.hu/node/130",
 
1641
  "author":"Gy\u00f6rgy Szarvas, Rich\u00e1rd Farkas, L\u00e1szl\u00f3 Felf\u00f6ldi, Andr\u00e1s Kocsor, J\u00e1nos Csirik (MTA-SZTE Research Group on Artificial Intelligence)"
1642
  },
1643
  {
1644
+ "name":"Hungarian lg Floret vectors",
1645
+ "url":"https://huggingface.co/huspacy/hu_vectors_web_lg",
1646
  "license":"CC-BY-SA-4.0",
1647
+ "author":"Szeged AI"
1648
  }
1649
  ],
1650
  "requirements":[
morphologizer/cfg CHANGED
@@ -1,5 +1,6 @@
1
  {
2
  "extend":false,
 
3
  "labels_morph":{
4
  "Definite=Def|POS=DET|PronType=Art":"Definite=Def|PronType=Art",
5
  "Case=Ine|Number=Sing|POS=NOUN":"Case=Ine|Number=Sing",
 
1
  {
2
  "extend":false,
3
+ "label_smoothing":0.0,
4
  "labels_morph":{
5
  "Definite=Def|POS=DET|PronType=Art":"Definite=Def|PronType=Art",
6
  "Case=Ine|Number=Sing|POS=NOUN":"Case=Ine|Number=Sing",
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:81dd0c7d95f61db5a02b45252a60262bd43f82cfddf57926cc4f51fc711e2e87
3
  size 1379030
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:19646b3038a0758c374f45ec4672ed54cdc468fbeacbad4d3b9075092a5c8529
3
  size 1379030
ner/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:67053f5ddd197bbe6da50d1c3d6dc7f16fbea7972fb9ce2c6cc098d583f085a4
3
  size 56989063
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0a2ea09876719885709fb906c812cbfcc4ed4549056b415d6ab627f3811dbaa1
3
  size 56989063
parser/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5069eb5a7525340710678346a2d99a2d85995cccba9421cb37ad6799241f697d
3
  size 26010735
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3c9faebc55d312e2e1b98b2b118e47145563123f788c619cf3c6301c7ddd31e0
3
  size 26010735
senter/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:bf184bac777da8b0bcbe134f241efe0c178723d4bcf170dc914e82fda3029e5f
3
  size 2845
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f028f27d316a6a9d513f27769200316f1d691a112a4b16253592dbd10789158d
3
  size 2845
tagger/cfg CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "labels":[
3
  "ADJ",
4
  "ADP",
 
1
  {
2
+ "label_smoothing":0.0,
3
  "labels":[
4
  "ADJ",
5
  "ADP",
tagger/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e7ba90d7dd109b57031df893dc0c918ddb54fb9594a3c7f67fa60d0170c18bf6
3
  size 20905
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5343a2575e4e3f902f4753fd6a6b8bc61258b2d83f47d57baba684f1b71084e3
3
  size 20905
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:1a8d6df82fe78e90f71b04506b779522c126239fc1c3c96870cb8d9741575ad2
3
  size 56806299
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:55093574bbfc26486020ca0de33e8e1e92f6f58ea68e58b086d20eb79fa55ac5
3
  size 56806299
trainable_lemmatizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2437c4135a099927993ed593d33d38563cc3ca389b42c6c1e8278945b61ea3c3
3
  size 61643136
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9b88a18f7e6e0950d64532696fbeb0da25566cc4a4c2a6370bd333b26980377d
3
  size 61643136
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c40d7446c11ac590183dda3c5e311b20ef1c4b76f691884a30db78453b3789db
3
- size 6402547
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:578a7114ebd95230499087da9ca620a5ddfb83a4d47155461531e1871435a3a6
3
+ size 6402680
vocab/vectors.cfg CHANGED
@@ -5,5 +5,6 @@
5
  "hash_count":2,
6
  "hash_seed":2166136261,
7
  "bow":"<",
8
- "eow":">"
 
9
  }
 
5
  "hash_count":2,
6
  "hash_seed":2166136261,
7
  "bow":"<",
8
+ "eow":">",
9
+ "attr":65
10
  }