oroszgy committed on
Commit
00d3554
1 Parent(s): fba6ab3

Update spacy pipeline to 3.7.0

README.md CHANGED
@@ -14,74 +14,74 @@ model-index:
  metrics:
  - name: NER Precision
  type: precision
- value: 0.9096153846
+ value: 0.9119332986
  - name: NER Recall
  type: recall
- value: 0.9147327707
+ value: 0.9229957806
  - name: NER F Score
  type: f_score
- value: 0.9121669004
+ value: 0.9174311927
  - task:
  name: TAG
  type: token-classification
  metrics:
  - name: TAG (XPOS) Accuracy
  type: accuracy
- value: 0.9840662233
+ value: 0.9823906594
  - task:
  name: POS
  type: token-classification
  metrics:
  - name: POS (UPOS) Accuracy
  type: accuracy
- value: 0.983204938
+ value: 0.9820078476
  - task:
  name: MORPH
  type: token-classification
  metrics:
  - name: Morph (UFeats) Accuracy
  type: accuracy
- value: 0.9666953775
+ value: 0.9658340511
  - task:
  name: LEMMA
  type: token-classification
  metrics:
  - name: Lemma Accuracy
  type: accuracy
- value: 0.9864127835
+ value: 0.9861257296
  - task:
  name: UNLABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Unlabeled Attachment Score (UAS)
  type: f_score
- value: 0.9010957462
+ value: 0.9000861326
  - task:
  name: LABELED_DEPENDENCIES
  type: token-classification
  metrics:
  - name: Labeled Attachment Score (LAS)
  type: f_score
- value: 0.8617769485
+ value: 0.8568421053
  - task:
  name: SENTS
  type: token-classification
  metrics:
  - name: Sentences F-Score
  type: f_score
- value: 0.9966555184
+ value: 0.9899888765
  ---
  Hungarian transformer pipeline (huBERT) for HuSpaCy. Components: transformer, senter, tagger, morphologizer, lemmatizer, parser, ner

  | Feature | Description |
  | --- | --- |
  | **Name** | `hu_core_news_trf` |
- | **Version** | `3.5.4` |
- | **spaCy** | `>=3.5.0,<3.6.0` |
+ | **Version** | `3.7.0` |
+ | **spaCy** | `>=3.7.0,<3.8.0` |
  | **Default Pipeline** | `transformer`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `experimental_arc_predicter`, `experimental_arc_labeler`, `ner` |
  | **Components** | `transformer`, `senter`, `tagger`, `morphologizer`, `lookup_lemmatizer`, `trainable_lemmatizer`, `experimental_arc_predicter`, `experimental_arc_labeler`, `ner` |
  | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
- | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence))<br />[NYTK-NerKor Corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon, Noémi Vadász (Department of Language Technology and Applied Linguistics))<br />[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (György Szarvas, Richárd Farkas, László Felföldi, András Kocsor, János Csirik (MTA-SZTE Research Group on Artificial Intelligence))<br />[huBERT base model (cased)](https://huggingface.co/SZTAKI-HLT/hubert-base-cc) (Dávid Márk Nemeskey (SZTAKI-HLT)) |
+ | **Sources** | [UD Hungarian Szeged](https://universaldependencies.org/treebanks/hu_szeged/index.html) (Richárd Farkas, Katalin Simkó, Zsolt Szántó, Viktor Varga, Veronika Vincze (MTA-SZTE Research Group on Artificial Intelligence))<br>[NYTK-NerKor Corpus](https://github.com/nytud/NYTK-NerKor) (Eszter Simon, Noémi Vadász (Department of Language Technology and Applied Linguistics))<br>[Szeged NER Corpus](https://rgai.inf.u-szeged.hu/node/130) (György Szarvas, Richárd Farkas, László Felföldi, András Kocsor, János Csirik (MTA-SZTE Research Group on Artificial Intelligence))<br>[huBERT base model (cased)](https://huggingface.co/SZTAKI-HLT/hubert-base-cc) (Dávid Márk Nemeskey (SZTAKI-HLT)) |
  | **License** | `cc-by-sa-4.0` |
  | **Author** | [SzegedAI, MILAB](https://github.com/huspacy/huspacy) |

@@ -108,20 +108,20 @@ Hungarian transformer pipeline (huBERT) for HuSpaCy. Components: transformer, se
  | `TOKEN_P` | 99.86 |
  | `TOKEN_R` | 99.93 |
  | `TOKEN_F` | 99.89 |
- | `SENTS_P` | 99.78 |
- | `SENTS_R` | 99.55 |
- | `SENTS_F` | 99.67 |
- | `TAG_ACC` | 98.41 |
- | `POS_ACC` | 98.32 |
- | `MORPH_ACC` | 96.67 |
- | `MORPH_MICRO_P` | 98.82 |
- | `MORPH_MICRO_R` | 98.53 |
- | `MORPH_MICRO_F` | 98.67 |
- | `LEMMA_ACC` | 98.64 |
- | `BOUND_DEP_LAS` | 86.17 |
- | `BOUND_DEP_UAS` | 90.11 |
- | `DEP_UAS` | 90.11 |
- | `DEP_LAS` | 86.18 |
- | `ENTS_P` | 90.96 |
- | `ENTS_R` | 91.47 |
- | `ENTS_F` | 91.22 |
+ | `SENTS_P` | 98.89 |
+ | `SENTS_R` | 99.11 |
+ | `SENTS_F` | 99.00 |
+ | `TAG_ACC` | 98.24 |
+ | `POS_ACC` | 98.20 |
+ | `MORPH_ACC` | 96.58 |
+ | `MORPH_MICRO_P` | 98.81 |
+ | `MORPH_MICRO_R` | 98.40 |
+ | `MORPH_MICRO_F` | 98.60 |
+ | `LEMMA_ACC` | 98.61 |
+ | `BOUND_DEP_LAS` | 85.75 |
+ | `BOUND_DEP_UAS` | 90.08 |
+ | `DEP_UAS` | 90.01 |
+ | `DEP_LAS` | 85.68 |
+ | `ENTS_P` | 91.19 |
+ | `ENTS_R` | 92.30 |
+ | `ENTS_F` | 91.74 |
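The NER precision, recall, and F-score values in the README front matter can be cross-checked against each other. The sketch below assumes the reported F-score is the usual harmonic mean of precision and recall; `f1` is a hypothetical helper written for this note, not a spaCy function.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# NER values copied from the 3.5.4 (removed) front matter lines.
old_ner_f = f1(0.9096153846, 0.9147327707)
# NER values copied from the 3.7.0 (added) front matter lines.
new_ner_f = f1(0.9119332986, 0.9229957806)

print(f"3.5.4 NER F: {old_ner_f:.4f}")
print(f"3.7.0 NER F: {new_ner_f:.4f}")
print(f"change:      {new_ner_f - old_ner_f:+.4f}")
```

Under that assumption both recomputed values agree with the `NER F Score` entries in the diff, so the NER gain comes from both precision and recall rising, with recall contributing more.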
config.cfg CHANGED
@@ -1,8 +1,8 @@
  [paths]
- tagger_model = "models/hu_core_news_trf-tagger-3.5.4/model-best"
- parser_model = "models/hu_core_news_trf-parser-3.5.4/model-best"
- ner_model = "models/hu_core_news_trf-ner-3.5.4/model-best"
- lemmatizer_lookups = "models/hu_core_news_trf-lookup-lemmatizer-3.5.4"
+ tagger_model = "models/hu_core_news_trf-tagger-3.7.0/model-best"
+ parser_model = "models/hu_core_news_trf-parser-3.7.0/model-best"
+ ner_model = "models/hu_core_news_trf-ner-3.7.0/model-best"
+ lemmatizer_lookups = "models/hu_core_news_trf-lookup-lemmatizer-3.7.0"
  train = null
  dev = null
  vectors = null
@@ -21,6 +21,7 @@ before_creation = null
  after_creation = null
  after_pipeline_creation = null
  batch_size = 1000
+ vectors = {"@vectors":"spacy.Vectors.v1"}

  [components]

@@ -68,6 +69,7 @@ source = ${paths.lemmatizer_lookups}
  [components.morphologizer]
  factory = "morphologizer"
  extend = false
+ label_smoothing = 0.0
  overwrite = true
  scorer = {"@scorers":"spacy.morphologizer_scorer.v1"}

@@ -137,6 +139,7 @@ pooling = {"@layers":"reduce_mean.v1"}

  [components.tagger]
  factory = "tagger"
+ label_smoothing = 0.0
  neg_prefix = "!"
  overwrite = false
  scorer = {"@scorers":"spacy.tagger_scorer.v1"}
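The new `label_smoothing = 0.0` entries for the morphologizer and tagger leave training targets unsmoothed. The sketch below illustrates why, using one common textbook formulation of label smoothing (mixing the one-hot target with a uniform distribution). This is an assumption for illustration only: spaCy's own implementation may distribute the smoothing mass differently, and `smooth_one_hot` is a hypothetical helper.

```python
from typing import List


def smooth_one_hot(num_classes: int, gold: int, smoothing: float) -> List[float]:
    """Mix a one-hot target with a uniform distribution over all classes.

    Illustrative formulation only; not taken from spaCy's source.
    """
    if not 0.0 <= smoothing < 1.0:
        raise ValueError("smoothing must be in [0, 1)")
    uniform_share = smoothing / num_classes
    target = [uniform_share] * num_classes
    target[gold] += 1.0 - smoothing
    return target


# With 0.0 (the value in this config) the target stays a plain one-hot vector.
print(smooth_one_hot(4, 2, 0.0))
# A non-zero value moves some probability mass onto the other classes.
print(smooth_one_hot(4, 2, 0.1))
```

Whatever the exact formula, a smoothing value of 0.0 is a no-op, so these added keys should not by themselves change how the tagger or morphologizer is trained.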
experimental_arc_labeler/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cb59cc0205b86a94e103b7d2bbd865ad4922d3b91f186799cb03f80fb3577fdf
+ oid sha256:d37c8f0d5f91c56787bd856d7a12eee7247f72238fb7217b2aa12ac8ff2cc39f
  size 14947179
experimental_arc_predicter/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e46abc869cabbac21a82fad9264db23dc394088b627e22ccfa90d95eabd3713f
+ oid sha256:ebd608dfc960e4eb7b25e9f8a776cc9efd36943a0f1aaee03333d44ff81062de
  size 413192
hu_core_news_trf-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6528cf2995cdd4eaed50ece5eab4e3134b57bb283f9c8b57c15feca4caf79888
- size 1266409271
+ oid sha256:2cc6fe0a266c90737e20c9506be8d6a2f525dd610d559136270dae1dcc82c755
+ size 1266392839
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"hu",
3
  "name":"core_news_trf",
4
- "version":"3.5.4",
5
  "description":"Hungarian transformer pipeline (huBERT) for HuSpaCy. Components: transformer, senter, tagger, morphologizer, lemmatizer, parser, ner",
6
  "author":"SzegedAI, MILAB",
7
  "email":"gyorgy@orosz.link",
8
  "url":"https://github.com/huspacy/huspacy",
9
  "license":"cc-by-sa-4.0",
10
- "spacy_version":">=3.5.0,<3.6.0",
11
- "spacy_git_version":"Unknown",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
@@ -1281,85 +1281,80 @@
1281
  "token_p":0.998565417,
1282
  "token_r":0.9993300153,
1283
  "token_f":0.9989475698,
1284
- "sents_p":0.9977678571,
1285
- "sents_r":0.995545657,
1286
- "sents_f":0.9966555184,
1287
- "tag_acc":0.9840662233,
1288
- "pos_acc":0.983204938,
1289
- "morph_acc":0.9666953775,
1290
- "morph_micro_p":0.9881901642,
1291
- "morph_micro_r":0.9852599914,
1292
- "morph_micro_f":0.9867229025,
1293
  "morph_per_feat":{
1294
  "Definite":{
1295
- "p":0.9897531439,
1296
- "r":0.99160056,
1297
- "f":0.9906759907
1298
  },
1299
  "PronType":{
1300
- "p":0.9905921417,
1301
- "r":0.9878587196,
1302
- "f":0.9892235424
1303
  },
1304
  "Case":{
1305
- "p":0.9934614623,
1306
- "r":0.9907132978,
1307
- "f":0.9920854769
1308
  },
1309
  "Degree":{
1310
- "p":0.9576490925,
1311
- "r":0.921797005,
1312
- "f":0.9393810937
1313
  },
1314
  "Number":{
1315
- "p":0.9949655983,
1316
- "r":0.9936316407,
1317
- "f":0.9942981721
1318
  },
1319
  "Mood":{
1320
- "p":0.9812568908,
1321
- "r":0.9866962306,
1322
- "f":0.9839690437
1323
  },
1324
  "Person":{
1325
- "p":0.9803761243,
1326
- "r":0.9860197368,
1327
- "f":0.9831898319
1328
  },
1329
  "Tense":{
1330
- "p":0.9911894273,
1331
- "r":0.9944751381,
1332
- "f":0.9928295643
1333
  },
1334
  "VerbForm":{
1335
- "p":0.9876644737,
1336
- "r":0.9631114675,
1337
- "f":0.9752334551
1338
  },
1339
  "Voice":{
1340
- "p":0.983723296,
1341
- "r":0.9887525562,
1342
- "f":0.9862315145
1343
  },
1344
  "Number[psor]":{
1345
- "p":0.9872159091,
1346
- "r":0.99002849,
1347
- "f":0.9886201991
1348
  },
1349
  "Person[psor]":{
1350
- "p":0.9914772727,
1351
- "r":0.9957203994,
1352
- "f":0.993594306
1353
- },
1354
- "Number[psed]":{
1355
- "p":0.8,
1356
- "r":0.4444444444,
1357
- "f":0.5714285714
1358
  },
1359
  "NumType":{
1360
- "p":0.9410377358,
1361
- "r":0.9731707317,
1362
- "f":0.9568345324
1363
  },
1364
  "Reflex":{
1365
  "p":0.875,
@@ -1371,192 +1366,187 @@
1371
  "r":0.0,
1372
  "f":0.0
1373
  },
 
 
 
 
 
1374
  "Poss":{
1375
- "p":0.5,
1376
- "r":0.3333333333,
1377
- "f":0.4
1378
  }
1379
  },
1380
- "lemma_acc":0.9864127835,
1381
- "bound_dep_las":0.8616532721,
1382
- "bound_dep_uas":0.9010715653,
1383
- "dep_uas":0.9010957462,
1384
- "dep_las":0.8617769485,
1385
  "dep_las_per_type":{
1386
  "415":{
1387
- "p":0.9372517871,
1388
- "r":0.9394904459,
1389
- "f":0.9383697813
1390
  },
1391
  "7411097074813287689":{
1392
- "p":0.921221865,
1393
- "r":0.9370400654,
1394
- "f":0.92906364
1395
  },
1396
  "429":{
1397
- "p":0.9165354331,
1398
- "r":0.909375,
1399
- "f":0.9129411765
1400
  },
1401
  "15861261214731031920":{
1402
- "p":0.7475490196,
1403
- "r":0.7475490196,
1404
- "f":0.7475490196
1405
  },
1406
  "991268021520064439":{
1407
- "p":0.8752079867,
1408
- "r":0.8915254237,
1409
- "f":0.8832913518
1410
  },
1411
  "435":{
1412
- "p":0.8851590106,
1413
- "r":0.901890189,
1414
- "f":0.8934462773
1415
  },
1416
  "434":{
1417
- "p":0.9467849224,
1418
  "r":0.9595505618,
1419
- "f":0.953125
1420
  },
1421
  "8206900633647566924":{
1422
- "p":0.825095057,
1423
- "r":0.9665924276,
1424
- "f":0.8902564103
1425
  },
1426
  "407":{
1427
- "p":0.8295218295,
1428
- "r":0.84,
1429
- "f":0.8347280335
1430
  },
1431
  "410":{
1432
- "p":0.74375,
1433
- "r":0.74375,
1434
- "f":0.74375
1435
  },
1436
  "445":{
1437
- "p":0.8616734143,
1438
- "r":0.8634212306,
1439
- "f":0.862546437
1440
  },
1441
  "400":{
1442
- "p":0.8556701031,
1443
- "r":0.8736842105,
1444
- "f":0.8645833333
1445
  },
1446
  "17772752594865228322":{
1447
- "p":0.9619047619,
1448
- "r":0.9439252336,
1449
- "f":0.9528301887
1450
  },
1451
  "403":{
1452
- "p":0.8,
1453
- "r":0.5106382979,
1454
- "f":0.6233766234
1455
  },
1456
  "399":{
1457
- "p":0.4563106796,
1458
  "r":0.4795918367,
1459
- "f":0.4676616915
1460
  },
1461
  "3143985677199705895":{
1462
- "p":0.7983193277,
1463
- "r":0.8260869565,
1464
- "f":0.811965812
1465
  },
1466
  "9241468201421778905":{
1467
- "p":0.4117647059,
1468
- "r":0.4242424242,
1469
- "f":0.4179104478
1470
  },
1471
  "423":{
1472
- "p":0.9240506329,
1473
- "r":0.9240506329,
1474
- "f":0.9240506329
1475
  },
1476
  "13543738850102096385":{
1477
- "p":0.9357798165,
1478
  "r":0.9357798165,
1479
- "f":0.9357798165
1480
  },
1481
  "10901028881100056900":{
1482
- "p":0.8461538462,
1483
- "r":0.6875,
1484
- "f":0.7586206897
1485
  },
1486
  "411":{
1487
- "p":0.8823529412,
1488
- "r":0.7317073171,
1489
- "f":0.8
1490
  },
1491
  "12549387360942434255":{
1492
- "p":0.5151515152,
1493
- "r":0.425,
1494
- "f":0.4657534247
1495
  },
1496
  "303601073839818384":{
1497
- "p":0.3333333333,
1498
- "r":0.25,
1499
- "f":0.2857142857
1500
  },
1501
  "8884235091647096537":{
1502
- "p":0.3333333333,
1503
- "r":0.1666666667,
1504
- "f":0.2222222222
1505
  },
1506
  "2249809950233855422":{
1507
- "p":0.3225806452,
1508
  "r":0.3125,
1509
- "f":0.3174603175
1510
  },
1511
  "422":{
1512
- "p":0.5238095238,
1513
- "r":0.7333333333,
1514
- "f":0.6111111111
 
 
 
 
 
1515
  },
1516
  "8110129090154140942":{
1517
- "p":0.9639175258,
1518
- "r":0.9540816327,
1519
- "f":0.958974359
1520
  },
1521
  "412":{
1522
- "p":0.4193548387,
1523
- "r":0.3513513514,
1524
- "f":0.3823529412
1525
- },
1526
- "436":{
1527
- "p":0.4333333333,
1528
- "r":0.1780821918,
1529
- "f":0.2524271845
1530
  },
1531
  "450":{
1532
- "p":0.9210526316,
1533
- "r":0.9459459459,
1534
- "f":0.9333333333
1535
  },
1536
  "12837356684637874264":{
1537
- "p":0.6860465116,
1538
- "r":0.6344086022,
1539
- "f":0.6592178771
1540
- },
1541
- "408":{
1542
- "p":0.0,
1543
- "r":0.0,
1544
- "f":0.0
1545
- },
1546
- "2203702860706368571":{
1547
- "p":0.0,
1548
- "r":0.0,
1549
- "f":0.0
1550
  },
1551
  "3350290345017230236":{
1552
- "p":0.0,
1553
- "r":0.0,
1554
- "f":0.0
1555
  },
1556
  "451":{
1557
- "p":0.5074626866,
1558
- "r":0.4722222222,
1559
- "f":0.4892086331
1560
  },
1561
  "7349492218059511525":{
1562
  "p":0.6666666667,
@@ -1564,14 +1554,14 @@
1564
  "f":0.7272727273
1565
  },
1566
  "426":{
1567
- "p":0.5,
1568
  "r":0.3636363636,
1569
- "f":0.4210526316
1570
  },
1571
  "405":{
1572
- "p":0.9166666667,
1573
- "r":0.9166666667,
1574
- "f":0.9166666667
1575
  },
1576
  "17865338459503383721":{
1577
  "p":1.0,
@@ -1584,24 +1574,29 @@
1584
  "f":0.0
1585
  },
1586
  "7037928807040764755":{
1587
- "p":0.9756097561,
1588
  "r":1.0,
1589
- "f":0.987654321
1590
  },
1591
  "11190527879068114961":{
1592
  "p":0.0,
1593
  "r":0.0,
1594
  "f":0.0
1595
  },
 
 
 
 
 
1596
  "10069665988847657778":{
1597
  "p":0.0,
1598
  "r":0.0,
1599
  "f":0.0
1600
  },
1601
  "17473201795025412735":{
1602
- "p":1.0,
1603
  "r":0.1666666667,
1604
- "f":0.2857142857
1605
  },
1606
  "6522094215780122214":{
1607
  "p":1.0,
@@ -1614,32 +1609,32 @@
1614
  "f":0.0
1615
  }
1616
  },
1617
- "ents_p":0.9096153846,
1618
- "ents_r":0.9147327707,
1619
- "ents_f":0.9121669004,
1620
  "ents_per_type":{
1621
  "ORG":{
1622
- "p":0.9121621622,
1623
  "r":0.9388038943,
1624
- "f":0.9252912954
1625
  },
1626
  "PER":{
1627
- "p":0.937866354,
1628
- "r":0.9557945042,
1629
- "f":0.9467455621
1630
  },
1631
  "LOC":{
1632
- "p":0.9426751592,
1633
- "r":0.8993055556,
1634
- "f":0.9204797868
1635
  },
1636
  "MISC":{
1637
- "p":0.7798561151,
1638
- "r":0.7687943262,
1639
- "f":0.7742857143
1640
  }
1641
  },
1642
- "speed":3559.6059310631
1643
  },
1644
  "sources":[
1645
  {
1
  {
2
  "lang":"hu",
3
  "name":"core_news_trf",
4
+ "version":"3.7.0",
5
  "description":"Hungarian transformer pipeline (huBERT) for HuSpaCy. Components: transformer, senter, tagger, morphologizer, lemmatizer, parser, ner",
6
  "author":"SzegedAI, MILAB",
7
  "email":"gyorgy@orosz.link",
8
  "url":"https://github.com/huspacy/huspacy",
9
  "license":"cc-by-sa-4.0",
10
+ "spacy_version":">=3.7.0,<3.8.0",
11
+ "spacy_git_version":"a89eae928",
12
  "vectors":{
13
  "width":0,
14
  "vectors":0,
1281
  "token_p":0.998565417,
1282
  "token_r":0.9993300153,
1283
  "token_f":0.9989475698,
1284
+ "sents_p":0.9888888889,
1285
+ "sents_r":0.991091314,
1286
+ "sents_f":0.9899888765,
1287
+ "tag_acc":0.9823906594,
1288
+ "pos_acc":0.9820078476,
1289
+ "morph_acc":0.9658340511,
1290
+ "morph_micro_p":0.988090101,
1291
+ "morph_micro_r":0.9840137516,
1292
+ "morph_micro_f":0.9860477134,
1293
  "morph_per_feat":{
1294
  "Definite":{
1295
+ "p":0.9852125693,
1296
+ "r":0.9948670089,
1297
+ "f":0.9900162526
1298
  },
1299
  "PronType":{
1300
+ "p":0.9867768595,
1301
+ "r":0.988410596,
1302
+ "f":0.9875930521
1303
  },
1304
  "Case":{
1305
+ "p":0.9934458788,
1306
+ "r":0.9883422249,
1307
+ "f":0.9908874802
1308
  },
1309
  "Degree":{
1310
+ "p":0.9623137599,
1311
+ "r":0.9134775374,
1312
+ "f":0.9372599232
1313
  },
1314
  "Number":{
1315
+ "p":0.9966375252,
1316
+ "r":0.9934640523,
1317
+ "f":0.9950482585
1318
  },
1319
  "Mood":{
1320
+ "p":0.9801980198,
1321
+ "r":0.987804878,
1322
+ "f":0.9839867477
1323
  },
1324
  "Person":{
1325
+ "p":0.9819078947,
1326
+ "r":0.9819078947,
1327
+ "f":0.9819078947
1328
  },
1329
  "Tense":{
1330
+ "p":0.9901098901,
1331
+ "r":0.9955801105,
1332
+ "f":0.9928374656
1333
  },
1334
  "VerbForm":{
1335
+ "p":0.9868095631,
1336
+ "r":0.959903769,
1337
+ "f":0.9731707317
1338
  },
1339
  "Voice":{
1340
+ "p":0.9796954315,
1341
+ "r":0.9867075665,
1342
+ "f":0.9831889964
1343
  },
1344
  "Number[psor]":{
1345
+ "p":0.9857752489,
1346
+ "r":0.9871794872,
1347
+ "f":0.9864768683
1348
  },
1349
  "Person[psor]":{
1350
+ "p":0.9914651494,
1351
+ "r":0.9942938659,
1352
+ "f":0.9928774929
 
 
 
 
 
1353
  },
1354
  "NumType":{
1355
+ "p":0.9473684211,
1356
+ "r":0.9658536585,
1357
+ "f":0.9565217391
1358
  },
1359
  "Reflex":{
1360
  "p":0.875,
1366
  "r":0.0,
1367
  "f":0.0
1368
  },
1369
+ "Number[psed]":{
1370
+ "p":1.0,
1371
+ "r":0.4444444444,
1372
+ "f":0.6153846154
1373
+ },
1374
  "Poss":{
1375
+ "p":1.0,
1376
+ "r":0.6666666667,
1377
+ "f":0.8
1378
  }
1379
  },
1380
+ "lemma_acc":0.9861257296,
1381
+ "bound_dep_las":0.8574985635,
1382
+ "bound_dep_uas":0.9007852902,
1383
+ "dep_uas":0.9000861326,
1384
+ "dep_las":0.8568421053,
1385
  "dep_las_per_type":{
1386
  "415":{
1387
+ "p":0.9286833856,
1388
+ "r":0.9434713376,
1389
+ "f":0.9360189573
1390
  },
1391
  "7411097074813287689":{
1392
+ "p":0.9288107203,
1393
+ "r":0.9067865904,
1394
+ "f":0.9176665288
1395
  },
1396
  "429":{
1397
+ "p":0.9156441718,
1398
+ "r":0.9328125,
1399
+ "f":0.9241486068
1400
  },
1401
  "15861261214731031920":{
1402
+ "p":0.6926713948,
1403
+ "r":0.7181372549,
1404
+ "f":0.7051744886
1405
  },
1406
  "991268021520064439":{
1407
+ "p":0.8718801997,
1408
+ "r":0.8881355932,
1409
+ "f":0.8799328296
1410
  },
1411
  "435":{
1412
+ "p":0.8917480035,
1413
+ "r":0.904590459,
1414
+ "f":0.8981233244
1415
  },
1416
  "434":{
1417
+ "p":0.9343544858,
1418
  "r":0.9595505618,
1419
+ "f":0.9467849224
1420
  },
1421
  "8206900633647566924":{
1422
+ "p":0.8216318786,
1423
+ "r":0.9643652561,
1424
+ "f":0.887295082
1425
  },
1426
  "407":{
1427
+ "p":0.8141962422,
1428
+ "r":0.8210526316,
1429
+ "f":0.8176100629
1430
  },
1431
  "410":{
1432
+ "p":0.737394958,
1433
+ "r":0.73125,
1434
+ "f":0.7343096234
1435
  },
1436
  "445":{
1437
+ "p":0.8593644354,
1438
+ "r":0.8593644354,
1439
+ "f":0.8593644354
1440
  },
1441
  "400":{
1442
+ "p":0.8865979381,
1443
+ "r":0.9052631579,
1444
+ "f":0.8958333333
1445
  },
1446
  "17772752594865228322":{
1447
+ "p":0.9624413146,
1448
+ "r":0.9579439252,
1449
+ "f":0.9601873536
1450
  },
1451
  "403":{
1452
+ "p":0.7419354839,
1453
+ "r":0.4893617021,
1454
+ "f":0.5897435897
1455
  },
1456
  "399":{
1457
+ "p":0.4947368421,
1458
  "r":0.4795918367,
1459
+ "f":0.4870466321
1460
  },
1461
  "3143985677199705895":{
1462
+ "p":0.8032128514,
1463
+ "r":0.8695652174,
1464
+ "f":0.8350730689
1465
  },
1466
  "9241468201421778905":{
1467
+ "p":0.32,
1468
+ "r":0.4848484848,
1469
+ "f":0.3855421687
1470
  },
1471
  "423":{
1472
+ "p":0.9166666667,
1473
+ "r":0.9050632911,
1474
+ "f":0.9108280255
1475
  },
1476
  "13543738850102096385":{
1477
+ "p":0.9189189189,
1478
  "r":0.9357798165,
1479
+ "f":0.9272727273
1480
  },
1481
  "10901028881100056900":{
1482
+ "p":0.875,
1483
+ "r":0.65625,
1484
+ "f":0.75
1485
  },
1486
  "411":{
1487
+ "p":0.8,
1488
+ "r":0.7804878049,
1489
+ "f":0.7901234568
1490
  },
1491
  "12549387360942434255":{
1492
+ "p":0.4285714286,
1493
+ "r":0.375,
1494
+ "f":0.4
1495
  },
1496
  "303601073839818384":{
1497
+ "p":0.0,
1498
+ "r":0.0,
1499
+ "f":0.0
1500
  },
1501
  "8884235091647096537":{
1502
+ "p":0.5,
1503
+ "r":0.0833333333,
1504
+ "f":0.1428571429
1505
  },
1506
  "2249809950233855422":{
1507
+ "p":0.3703703704,
1508
  "r":0.3125,
1509
+ "f":0.3389830508
1510
  },
1511
  "422":{
1512
+ "p":0.3461538462,
1513
+ "r":0.6,
1514
+ "f":0.4390243902
1515
+ },
1516
+ "436":{
1517
+ "p":0.15625,
1518
+ "r":0.0684931507,
1519
+ "f":0.0952380952
1520
  },
1521
  "8110129090154140942":{
1522
+ "p":0.9591836735,
1523
+ "r":0.9591836735,
1524
+ "f":0.9591836735
1525
  },
1526
  "412":{
1527
+ "p":0.5882352941,
1528
+ "r":0.2702702703,
1529
+ "f":0.3703703704
 
 
 
 
 
1530
  },
1531
  "450":{
1532
+ "p":0.96,
1533
+ "r":0.972972973,
1534
+ "f":0.966442953
1535
  },
1536
  "12837356684637874264":{
1537
+ "p":0.6489361702,
1538
+ "r":0.6559139785,
1539
+ "f":0.6524064171
 
 
 
 
 
 
 
 
 
 
1540
  },
1541
  "3350290345017230236":{
1542
+ "p":0.0588235294,
1543
+ "r":0.0416666667,
1544
+ "f":0.0487804878
1545
  },
1546
  "451":{
1547
+ "p":0.5636363636,
1548
+ "r":0.4305555556,
1549
+ "f":0.4881889764
1550
  },
1551
  "7349492218059511525":{
1552
  "p":0.6666666667,
1554
  "f":0.7272727273
1555
  },
1556
  "426":{
1557
+ "p":0.6666666667,
1558
  "r":0.3636363636,
1559
+ "f":0.4705882353
1560
  },
1561
  "405":{
1562
+ "p":0.9090909091,
1563
+ "r":0.8333333333,
1564
+ "f":0.8695652174
1565
  },
1566
  "17865338459503383721":{
1567
  "p":1.0,
1574
  "f":0.0
1575
  },
1576
  "7037928807040764755":{
1577
+ "p":1.0,
1578
  "r":1.0,
1579
+ "f":1.0
1580
  },
1581
  "11190527879068114961":{
1582
  "p":0.0,
1583
  "r":0.0,
1584
  "f":0.0
1585
  },
1586
+ "408":{
1587
+ "p":0.0,
1588
+ "r":0.0,
1589
+ "f":0.0
1590
+ },
1591
  "10069665988847657778":{
1592
  "p":0.0,
1593
  "r":0.0,
1594
  "f":0.0
1595
  },
1596
  "17473201795025412735":{
1597
+ "p":0.5,
1598
  "r":0.1666666667,
1599
+ "f":0.25
1600
  },
1601
  "6522094215780122214":{
1602
  "p":1.0,
1609
  "f":0.0
1610
  }
1611
  },
1612
+ "ents_p":0.9119332986,
1613
+ "ents_r":0.9229957806,
1614
+ "ents_f":0.9174311927,
1615
  "ents_per_type":{
1616
  "ORG":{
1617
+ "p":0.9221311475,
1618
  "r":0.9388038943,
1619
+ "f":0.9303928325
1620
  },
1621
  "PER":{
1622
+ "p":0.9537640782,
1623
+ "r":0.9611708483,
1624
+ "f":0.9574531389
1625
  },
1626
  "LOC":{
1627
+ "p":0.952079566,
1628
+ "r":0.9140625,
1629
+ "f":0.932683791
1630
  },
1631
  "MISC":{
1632
+ "p":0.7330729167,
1633
+ "r":0.7985815603,
1634
+ "f":0.7644263408
1635
  }
1636
  },
1637
+ "speed":2821.7094967965
1638
  },
1639
  "sources":[
1640
  {
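The `spacy_version` field in `meta.json` now reads `>=3.7.0,<3.8.0`. The sketch below shows which installed versions that range admits. It is a simplified stand-in that handles only plain dotted numeric versions; `parse_version` and `satisfies` are hypothetical helpers, and real installers apply fuller version-matching rules (pre-releases, wildcards, and so on).

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '3.7.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against comma-separated >=, <=, ==, >, < clauses."""
    # Two-character operators come first so ">=" is not mistaken for ">".
    operators = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    installed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in operators.items():
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(installed, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


NEW_RANGE = ">=3.7.0,<3.8.0"  # copied from the updated meta.json
for candidate in ("3.5.4", "3.7.0", "3.7.2", "3.8.0"):
    print(candidate, satisfies(candidate, NEW_RANGE))
```

The practical consequence is that an environment still pinned to a spaCy 3.5.x release falls outside the declared range for this pipeline version.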
morphologizer/cfg CHANGED
@@ -1,5 +1,6 @@
  {
  "extend":false,
+ "label_smoothing":0.0,
  "labels_morph":{
  "Definite=Def|POS=DET|PronType=Art":"Definite=Def|PronType=Art",
  "Case=Ine|Number=Sing|POS=NOUN":"Case=Ine|Number=Sing",
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:27ab24cfac026c32a10861669e716af00ea48f283b93f051ca0c37bb0e98c216
+ oid sha256:4a240ce87872810a3b97b037e172335d29a5b6435855ed334a098f02e36ceced
  size 3522673
ner/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:20a5b8e858df786f2a71cbdf6ac23a785d27f67c9031a681f967011cba11d961
+ oid sha256:bea64fc6d6f08d9295f6693c280c6ffcfac59579ad32cd810e30ede19d1176c2
  size 443884420
senter/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a57c60560a7245599c8a79ef85d4a16ceeb7a30f33a2612f18cf1f391f5b1374
+ oid sha256:5d2d998d2a2ae4176fed4cb1105a3a22ce542a23ac6c2e67832482f4d0f16418
  size 6792
tagger/cfg CHANGED
@@ -1,4 +1,5 @@
  {
+ "label_smoothing":0.0,
  "labels":[
  "ADJ",
  "ADP",
tagger/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:dc357fa3e92c695224dd310ef6c2fe9c9e11981d9acfe8846eca141bc64f7050
+ oid sha256:48c734dcdc44e9b766f8cbded37a797ecf622132e2fc4cd5346fac7f68ca59a3
  size 52932
trainable_lemmatizer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0f8c6bd62be4a9a776c8b11254fcf93a0757707f4cf2ed0365851a5b7a550ae8
+ oid sha256:c59b756bf3a956e678bd0ff951a1d839a6911824363c64805e9079aafd2be2c7
  size 455946865
transformer/model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:93dba2735ef5f21353aa67cfb381761c679916c8460a00f1e5e209b300c59495
+ oid sha256:2dd66e9dcbd0011b4bbb5a73019c00c66583084e6bd221699f5d600f8ddd9242
  size 443602220
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f8838525e60d6735d68084aaebbcee8797079852831e4301da84bd183dbab205
- size 6387261
+ oid sha256:8c7637aca86998aeff3df061da292f20c4ea01c2f6d72badfc8172dd79da7e8c
+ size 6387165
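The binary files in this commit appear only as Git LFS pointer files, each with three `key value` lines (`version`, `oid`, `size`). A minimal sketch of reading such a pointer follows, handy for comparing sizes across a commit; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Old and new pointers for the wheel, copied from the diff above.
old_wheel = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:6528cf2995cdd4eaed50ece5eab4e3134b57bb283f9c8b57c15feca4caf79888
size 1266409271
""")
new_wheel = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:2cc6fe0a266c90737e20c9506be8d6a2f525dd610d559136270dae1dcc82c755
size 1266392839
""")

print("size change in bytes:", new_wheel["size"] - old_wheel["size"])
```

Most model files in this commit change their `oid` but keep the same `size`, which is what one would expect from retrained weights of identical shape. Only the wheel and `vocab/strings.json` change size.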