adrianeboyd committed on
Commit
f989ad0
1 Parent(s): cb7be32

Update spaCy pipeline

LICENSES_SOURCES CHANGED
@@ -878,7 +878,7 @@ Creative Commons may be contacted at creativecommons.org.
 
 
 
- # Explosion floret Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl)
+ # Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl)
 
   * Author: Explosion
   * URL: https://github.com/explosion/spacy-vectors-builder
README.md CHANGED
@@ -14,55 +14,55 @@ model-index:
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
- value: 0.850539861
18
  - name: NER Recall
19
  type: recall
20
- value: 0.8125044154
21
  - name: NER F Score
22
  type: f_score
23
- value: 0.8310871843
24
  - task:
25
  name: TAG
26
  type: token-classification
27
  metrics:
28
  - name: TAG (XPOS) Accuracy
29
  type: accuracy
30
- value: 0.8341706555
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
- value: 0.9467937814
38
  - task:
39
  name: LEMMA
40
  type: token-classification
41
  metrics:
42
  - name: Lemma Accuracy
43
  type: accuracy
44
- value: 0.89985957
45
  - task:
46
  name: UNLABELED_DEPENDENCIES
47
  type: token-classification
48
  metrics:
49
  - name: Unlabeled Attachment Score (UAS)
50
  type: f_score
51
- value: 0.8385098703
52
  - task:
53
  name: LABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Labeled Attachment Score (LAS)
57
  type: f_score
58
- value: 0.8093222227
59
  - task:
60
  name: SENTS
61
  type: token-classification
62
  metrics:
63
  - name: Sentences F-Score
64
  type: f_score
65
- value: 1.0
66
  ---
67
  ### Details: https://spacy.io/models/ko#ko_core_news_md
68
 
@@ -71,12 +71,12 @@ Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, p
71
  | Feature | Description |
72
  | --- | --- |
73
  | **Name** | `ko_core_news_md` |
74
- | **Version** | `3.3.0` |
75
- | **spaCy** | `>=3.3.0.dev0,<3.4.0` |
76
  | **Default Pipeline** | `tok2vec`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
77
  | **Components** | `tok2vec`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `senter`, `attribute_ruler`, `ner` |
78
  | **Vectors** | floret (50000, 300) |
79
- | **Sources** | [UD Korean Kaist v2.8](https://github.com/UniversalDependencies/UD_Korean-Kaist) (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol)<br />[KLUE v1.1.0](https://github.com/KLUE-benchmark/KLUE) (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho)<br />[Explosion floret Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl)](https://github.com/explosion/spacy-vectors-builder) (Explosion) |
80
  | **License** | `CC BY-SA 4.0` |
81
  | **Author** | [Explosion](https://explosion.ai) |
82
 
@@ -84,12 +84,12 @@ Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, p
84
 
85
  <details>
86
 
87
- <summary>View label scheme (2026 labels for 4 components)</summary>
88
 
89
  | Component | Labels |
90
  | --- | --- |
91
- | **`tagger`** | `ecs`, `etm`, `f`, `f+f+jcj`, `f+f+jcs`, `f+f+jct`, `f+f+jxt`, `f+jca`, `f+jca+jp+ecc`, `f+jca+jp+ep+ef`, `f+jca+jxc`, `f+jca+jxc+jcm`, `f+jca+jxt`, `f+jcj`, `f+jcm`, `f+jco`, `f+jcs`, `f+jct`, `f+jct+jcm`, `f+jp+ef`, `f+jp+ep+ef`, `f+jp+etm`, `f+jxc`, `f+jxt`, `f+ncn`, `f+ncn+jcm`, `f+ncn+jcs`, `f+ncn+jp+ecc`, `f+ncn+jxt`, `f+ncpa+jcm`, `f+npp+jcs`, `f+nq`, `f+xsn`, `f+xsn+jco`, `f+xsn+jxt`, `ii`, `jca`, `jca+jcm`, `jca+jxc`, `jca+jxt`, `jcc`, `jcj`, `jcm`, `jco`, `jcr`, `jcr+jxc`, `jcs`, `jct`, `jct+jcm`, `jct+jxt`, `jp+ecc`, `jp+ecs`, `jp+ef`, `jp+ef+jcr`, `jp+ef+jcr+jxc`, `jp+ep+ecs`, `jp+ep+ef`, `jp+ep+etm`, `jp+ep+etn`, `jp+etm`, `jp+etn`, `jp+etn+jco`, `jp+etn+jxc`, `jxc`, `jxc+jca`, `jxc+jco`, `jxc+jcs`, `jxt`, `mad`, `mad+jxc`, `mad+jxt`, `mag`, `mag+jca`, `mag+jcm`, `mag+jcs`, `mag+jp+ef+jcr`, `mag+jxc`, `mag+jxc+jxc`, `mag+jxt`, `mag+xsn`, `maj`, `maj+jxc`, `maj+jxt`, `mma`, `mmd`, `nbn`, `nbn+jca`, `nbn+jca+jcj`, `nbn+jca+jcm`, `nbn+jca+jp+ef`, `nbn+jca+jxc`, `nbn+jca+jxt`, `nbn+jcc`, `nbn+jcj`, `nbn+jcm`, `nbn+jco`, `nbn+jcr`, `nbn+jcs`, `nbn+jct`, `nbn+jct+jcm`, `nbn+jct+jxt`, `nbn+jp+ecc`, `nbn+jp+ecs`, `nbn+jp+ecs+jca`, `nbn+jp+ecs+jcm`, `nbn+jp+ecs+jco`, `nbn+jp+ecs+jxc`, `nbn+jp+ecs+jxt`, `nbn+jp+ecx`, `nbn+jp+ef`, `nbn+jp+ef+jca`, `nbn+jp+ef+jco`, `nbn+jp+ef+jcr`, `nbn+jp+ef+jcr+jxc`, `nbn+jp+ef+jcr+jxt`, `nbn+jp+ef+jcs`, `nbn+jp+ef+jxc`, `nbn+jp+ef+jxc+jco`, `nbn+jp+ef+jxf`, `nbn+jp+ef+jxt`, `nbn+jp+ep+ecc`, `nbn+jp+ep+ecs`, `nbn+jp+ep+ecs+jxc`, `nbn+jp+ep+ef`, `nbn+jp+ep+ef+jcr`, `nbn+jp+ep+etm`, `nbn+jp+ep+etn`, `nbn+jp+ep+etn+jco`, `nbn+jp+ep+etn+jcs`, `nbn+jp+etm`, `nbn+jp+etn`, `nbn+jp+etn+jca`, `nbn+jp+etn+jca+jxt`, `nbn+jp+etn+jco`, `nbn+jp+etn+jcs`, `nbn+jp+etn+jxc`, `nbn+jp+etn+jxt`, `nbn+jxc`, `nbn+jxc+jca`, `nbn+jxc+jca+jxc`, `nbn+jxc+jca+jxt`, `nbn+jxc+jcc`, `nbn+jxc+jcm`, `nbn+jxc+jco`, `nbn+jxc+jcs`, `nbn+jxc+jp+ef`, `nbn+jxc+jxc`, `nbn+jxc+jxt`, `nbn+jxt`, `nbn+nbn`, `nbn+nbn+jp+ef`, `nbn+xsm+ecs`, `nbn+xsm+ef`, 
`nbn+xsm+ep+ef`, `nbn+xsm+ep+ef+jcr`, `nbn+xsm+etm`, `nbn+xsn`, `nbn+xsn+jca`, `nbn+xsn+jca+jp+ef+jcr`, `nbn+xsn+jca+jxc`, `nbn+xsn+jca+jxt`, `nbn+xsn+jcm`, `nbn+xsn+jco`, `nbn+xsn+jcs`, `nbn+xsn+jct`, `nbn+xsn+jp+ecc`, `nbn+xsn+jp+ecs`, `nbn+xsn+jp+ef`, `nbn+xsn+jp+ef+jcr`, `nbn+xsn+jp+ep+ef`, `nbn+xsn+jxc`, `nbn+xsn+jxt`, `nbn+xsv+etm`, `nbu`, `nbu+jca`, `nbu+jca+jxc`, `nbu+jca+jxt`, `nbu+jcc`, `nbu+jcc+jxc`, `nbu+jcj`, `nbu+jcm`, `nbu+jco`, `nbu+jcs`, `nbu+jct`, `nbu+jct+jxc`, `nbu+jp+ecc`, `nbu+jp+ecs`, `nbu+jp+ef`, `nbu+jp+ef+jcr`, `nbu+jp+ef+jxc`, `nbu+jp+ep+ecc`, `nbu+jp+ep+ecs`, `nbu+jp+ep+ef`, `nbu+jp+ep+ef+jcr`, `nbu+jp+ep+etm`, `nbu+jp+ep+etn+jco`, `nbu+jp+etm`, `nbu+jxc`, `nbu+jxc+jca`, `nbu+jxc+jcs`, `nbu+jxc+jp+ef`, `nbu+jxc+jp+ep+ef`, `nbu+jxc+jxt`, `nbu+jxt`, `nbu+ncn`, `nbu+ncn+jca`, `nbu+ncn+jcm`, `nbu+xsn`, `nbu+xsn+jca`, `nbu+xsn+jca+jxc`, `nbu+xsn+jca+jxt`, `nbu+xsn+jcm`, `nbu+xsn+jco`, `nbu+xsn+jcs`, `nbu+xsn+jp+ecs`, `nbu+xsn+jp+ep+ef`, `nbu+xsn+jxc`, `nbu+xsn+jxc+jxt`, `nbu+xsn+jxt`, `nbu+xsv+ecc`, `nbu+xsv+etm`, `ncn`, `ncn+f+ncpa+jco`, `ncn+jca`, `ncn+jca+jca`, `ncn+jca+jcc`, `ncn+jca+jcj`, `ncn+jca+jcm`, `ncn+jca+jcs`, `ncn+jca+jct`, `ncn+jca+jp+ecc`, `ncn+jca+jp+ecs`, `ncn+jca+jp+ef`, `ncn+jca+jp+ep+ef`, `ncn+jca+jp+etm`, `ncn+jca+jp+etn+jxt`, `ncn+jca+jxc`, `ncn+jca+jxc+jcc`, `ncn+jca+jxc+jcm`, `ncn+jca+jxc+jxc`, `ncn+jca+jxc+jxt`, `ncn+jca+jxt`, `ncn+jcc`, `ncn+jcc+jxc`, `ncn+jcj`, `ncn+jcj+jxt`, `ncn+jcm`, `ncn+jco`, `ncn+jcr`, `ncn+jcr+jxc`, `ncn+jcs`, `ncn+jcs+jxt`, `ncn+jct`, `ncn+jct+jcm`, `ncn+jct+jxc`, `ncn+jct+jxt`, `ncn+jcv`, `ncn+jp+ecc`, `ncn+jp+ecc+jct`, `ncn+jp+ecc+jxc`, `ncn+jp+ecs`, `ncn+jp+ecs+jcm`, `ncn+jp+ecs+jco`, `ncn+jp+ecs+jxc`, `ncn+jp+ecs+jxt`, `ncn+jp+ecx`, `ncn+jp+ef`, `ncn+jp+ef+jca`, `ncn+jp+ef+jcm`, `ncn+jp+ef+jco`, `ncn+jp+ef+jcr`, `ncn+jp+ef+jcr+jxc`, `ncn+jp+ef+jcr+jxt`, `ncn+jp+ef+jp+etm`, `ncn+jp+ef+jxc`, `ncn+jp+ef+jxf`, `ncn+jp+ef+jxt`, `ncn+jp+ep+ecc`, `ncn+jp+ep+ecs`, `ncn+jp+ep+ecs+jxc`, 
`ncn+jp+ep+ecx`, `ncn+jp+ep+ef`, `ncn+jp+ep+ef+jcr`, `ncn+jp+ep+ef+jcr+jxc`, `ncn+jp+ep+ef+jxc`, `ncn+jp+ep+ef+jxf`, `ncn+jp+ep+ef+jxt`, `ncn+jp+ep+ep+etm`, `ncn+jp+ep+etm`, `ncn+jp+ep+etn`, `ncn+jp+ep+etn+jca`, `ncn+jp+ep+etn+jca+jxc`, `ncn+jp+ep+etn+jco`, `ncn+jp+ep+etn+jcs`, `ncn+jp+ep+etn+jxt`, `ncn+jp+etm`, `ncn+jp+etn`, `ncn+jp+etn+jca`, `ncn+jp+etn+jca+jxc`, `ncn+jp+etn+jca+jxt`, `ncn+jp+etn+jco`, `ncn+jp+etn+jcs`, `ncn+jp+etn+jct`, `ncn+jp+etn+jxc`, `ncn+jp+etn+jxt`, `ncn+jxc`, `ncn+jxc+jca`, `ncn+jxc+jca+jxc`, `ncn+jxc+jca+jxt`, `ncn+jxc+jcc`, `ncn+jxc+jcm`, `ncn+jxc+jco`, `ncn+jxc+jcs`, `ncn+jxc+jct+jxt`, `ncn+jxc+jp+ef`, `ncn+jxc+jp+ef+jcr`, `ncn+jxc+jp+ep+ecs`, `ncn+jxc+jp+ep+ef`, `ncn+jxc+jp+etm`, `ncn+jxc+jxc`, `ncn+jxc+jxt`, `ncn+jxt`, `ncn+jxt+jcm`, `ncn+jxt+jxc`, `ncn+nbn`, `ncn+nbn+jca`, `ncn+nbn+jcm`, `ncn+nbn+jcs`, `ncn+nbn+jp+ecc`, `ncn+nbn+jp+ep+ef`, `ncn+nbn+jxc`, `ncn+nbn+jxt`, `ncn+nbu`, `ncn+nbu+jca`, `ncn+nbu+jcm`, `ncn+nbu+jco`, `ncn+nbu+jp+ef`, `ncn+nbu+jxc`, `ncn+nbu+ncn`, `ncn+ncn`, `ncn+ncn+jca`, `ncn+ncn+jca+jcc`, `ncn+ncn+jca+jcm`, `ncn+ncn+jca+jxc`, `ncn+ncn+jca+jxc+jcm`, `ncn+ncn+jca+jxc+jxc`, `ncn+ncn+jca+jxt`, `ncn+ncn+jcc`, `ncn+ncn+jcj`, `ncn+ncn+jcm`, `ncn+ncn+jco`, `ncn+ncn+jcr`, `ncn+ncn+jcs`, `ncn+ncn+jct`, `ncn+ncn+jct+jcm`, `ncn+ncn+jct+jxc`, `ncn+ncn+jct+jxt`, `ncn+ncn+jp+ecc`, `ncn+ncn+jp+ecs`, `ncn+ncn+jp+ef`, `ncn+ncn+jp+ef+jcm`, `ncn+ncn+jp+ef+jcr`, `ncn+ncn+jp+ef+jcs`, `ncn+ncn+jp+ep+ecc`, `ncn+ncn+jp+ep+ecs`, `ncn+ncn+jp+ep+ef`, `ncn+ncn+jp+ep+ef+jcr`, `ncn+ncn+jp+ep+ep+etm`, `ncn+ncn+jp+ep+etm`, `ncn+ncn+jp+ep+etn`, `ncn+ncn+jp+etm`, `ncn+ncn+jp+etn`, `ncn+ncn+jp+etn+jca`, `ncn+ncn+jp+etn+jco`, `ncn+ncn+jp+etn+jxc`, `ncn+ncn+jxc`, `ncn+ncn+jxc+jca`, `ncn+ncn+jxc+jcc`, `ncn+ncn+jxc+jcm`, `ncn+ncn+jxc+jco`, `ncn+ncn+jxc+jcs`, `ncn+ncn+jxc+jxc`, `ncn+ncn+jxt`, `ncn+ncn+nbn`, `ncn+ncn+ncn`, `ncn+ncn+ncn+jca`, `ncn+ncn+ncn+jca+jcm`, `ncn+ncn+ncn+jca+jxt`, `ncn+ncn+ncn+jcj`, `ncn+ncn+ncn+jcm`, `ncn+ncn+ncn+jco`, 
`ncn+ncn+ncn+jcs`, `ncn+ncn+ncn+jct+jxt`, `ncn+ncn+ncn+jp+etn+jxc`, `ncn+ncn+ncn+jxt`, `ncn+ncn+ncn+ncn+jca`, `ncn+ncn+ncn+ncn+jca+jxt`, `ncn+ncn+ncn+ncn+jco`, `ncn+ncn+ncn+xsn+jp+etm`, `ncn+ncn+ncpa`, `ncn+ncn+ncpa+jca`, `ncn+ncn+ncpa+jcm`, `ncn+ncn+ncpa+jco`, `ncn+ncn+ncpa+jcs`, `ncn+ncn+ncpa+jxc`, `ncn+ncn+ncpa+jxt`, `ncn+ncn+ncpa+ncn`, `ncn+ncn+ncpa+ncn+jca`, `ncn+ncn+ncpa+ncn+jcj`, `ncn+ncn+ncpa+ncn+jcm`, `ncn+ncn+ncpa+ncn+jxt`, `ncn+ncn+xsn`, `ncn+ncn+xsn+jca`, `ncn+ncn+xsn+jca+jxt`, `ncn+ncn+xsn+jcj`, `ncn+ncn+xsn+jcm`, `ncn+ncn+xsn+jco`, `ncn+ncn+xsn+jcs`, `ncn+ncn+xsn+jct`, `ncn+ncn+xsn+jp+ecs`, `ncn+ncn+xsn+jp+ep+ef`, `ncn+ncn+xsn+jp+etm`, `ncn+ncn+xsn+jxc`, `ncn+ncn+xsn+jxc+jcs`, `ncn+ncn+xsn+jxt`, `ncn+ncn+xsv+ecc`, `ncn+ncn+xsv+etm`, `ncn+ncpa`, `ncn+ncpa+jca`, `ncn+ncpa+jca+jcm`, `ncn+ncpa+jca+jxc`, `ncn+ncpa+jca+jxt`, `ncn+ncpa+jcc`, `ncn+ncpa+jcj`, `ncn+ncpa+jcm`, `ncn+ncpa+jco`, `ncn+ncpa+jcr`, `ncn+ncpa+jcs`, `ncn+ncpa+jct`, `ncn+ncpa+jct+jcm`, `ncn+ncpa+jct+jxt`, `ncn+ncpa+jp+ecc`, `ncn+ncpa+jp+ecc+jxc`, `ncn+ncpa+jp+ecs`, `ncn+ncpa+jp+ecs+jxc`, `ncn+ncpa+jp+ef`, `ncn+ncpa+jp+ef+jcr`, `ncn+ncpa+jp+ef+jcr+jxc`, `ncn+ncpa+jp+ep+ef`, `ncn+ncpa+jp+ep+etm`, `ncn+ncpa+jp+ep+etn`, `ncn+ncpa+jp+etm`, `ncn+ncpa+jxc`, `ncn+ncpa+jxc+jca+jxc`, `ncn+ncpa+jxc+jco`, `ncn+ncpa+jxc+jcs`, `ncn+ncpa+jxt`, `ncn+ncpa+nbn+jcs`, `ncn+ncpa+ncn`, `ncn+ncpa+ncn+jca`, `ncn+ncpa+ncn+jca+jcm`, `ncn+ncpa+ncn+jca+jxc`, `ncn+ncpa+ncn+jca+jxt`, `ncn+ncpa+ncn+jcj`, `ncn+ncpa+ncn+jcm`, `ncn+ncpa+ncn+jco`, `ncn+ncpa+ncn+jcs`, `ncn+ncpa+ncn+jct`, `ncn+ncpa+ncn+jct+jcm`, `ncn+ncpa+ncn+jp+ef+jcr`, `ncn+ncpa+ncn+jp+ep+etm`, `ncn+ncpa+ncn+jxc`, `ncn+ncpa+ncn+jxt`, `ncn+ncpa+ncn+xsn+jcm`, `ncn+ncpa+ncn+xsn+jxt`, `ncn+ncpa+ncpa`, `ncn+ncpa+ncpa+jca`, `ncn+ncpa+ncpa+jcj`, `ncn+ncpa+ncpa+jcm`, `ncn+ncpa+ncpa+jco`, `ncn+ncpa+ncpa+jcs`, `ncn+ncpa+ncpa+jp+ep+ef`, `ncn+ncpa+ncpa+jxt`, `ncn+ncpa+ncpa+ncn`, `ncn+ncpa+xsn`, `ncn+ncpa+xsn+jcm`, `ncn+ncpa+xsn+jco`, `ncn+ncpa+xsn+jcs`, 
`ncn+ncpa+xsn+jp+ecc`, `ncn+ncpa+xsn+jp+etm`, `ncn+ncpa+xsn+jxt`, `ncn+ncpa+xsv+ecc`, `ncn+ncpa+xsv+ecs`, `ncn+ncpa+xsv+ecx`, `ncn+ncpa+xsv+ecx+px+etm`, `ncn+ncpa+xsv+ef`, `ncn+ncpa+xsv+ef+jcm`, `ncn+ncpa+xsv+ef+jcr`, `ncn+ncpa+xsv+etm`, `ncn+ncpa+xsv+etn`, _(truncated: full list in pipeline meta)_ |
92
- | **`morphologizer`** | `POS=CCONJ`, `POS=ADV`, `POS=SCONJ`, `POS=DET`, `POS=NOUN`, `POS=VERB`, `POS=ADJ`, `POS=PUNCT`, `POS=AUX`, `POS=PRON`, `POS=PROPN`, `POS=NUM`, `POS=INTJ`, `POS=PART`, `POS=X`, `POS=ADP`, `POS=SYM` |
93
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound`, `conj`, `cop`, `csubj`, `dep`, `det`, `dislocated`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `punct`, `xcomp` |
94
  | **`ner`** | `DT`, `LC`, `OG`, `PS`, `QT`, `TI` |
95
 
@@ -103,14 +103,14 @@ Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, p
103
  | `TOKEN_P` | 100.00 |
104
  | `TOKEN_R` | 100.00 |
105
  | `TOKEN_F` | 100.00 |
106
- | `TAG_ACC` | 83.42 |
107
- | `POS_ACC` | 94.68 |
108
- | `SENTS_P` | 100.00 |
109
- | `SENTS_R` | 100.00 |
110
- | `SENTS_F` | 100.00 |
111
- | `DEP_UAS` | 83.85 |
112
- | `DEP_LAS` | 80.93 |
113
- | `LEMMA_ACC` | 89.99 |
114
- | `ENTS_P` | 85.05 |
115
- | `ENTS_R` | 81.25 |
116
- | `ENTS_F` | 83.11 |
 
14
  metrics:
15
  - name: NER Precision
16
  type: precision
17
+ value: 0.8490038962
18
  - name: NER Recall
19
  type: recall
20
+ value: 0.8158954433
21
  - name: NER F Score
22
  type: f_score
23
+ value: 0.8321204698
24
  - task:
25
  name: TAG
26
  type: token-classification
27
  metrics:
28
  - name: TAG (XPOS) Accuracy
29
  type: accuracy
30
+ value: 0.8336399059
31
  - task:
32
  name: POS
33
  type: token-classification
34
  metrics:
35
  - name: POS (UPOS) Accuracy
36
  type: accuracy
37
+ value: 0.946239962
38
  - task:
39
  name: LEMMA
40
  type: token-classification
41
  metrics:
42
  - name: Lemma Accuracy
43
  type: accuracy
44
+ value: 0.8970905279
45
  - task:
46
  name: UNLABELED_DEPENDENCIES
47
  type: token-classification
48
  metrics:
49
  - name: Unlabeled Attachment Score (UAS)
50
  type: f_score
51
+ value: 0.83875855
52
  - task:
53
  name: LABELED_DEPENDENCIES
54
  type: token-classification
55
  metrics:
56
  - name: Labeled Attachment Score (LAS)
57
  type: f_score
58
+ value: 0.8092235713
59
  - task:
60
  name: SENTS
61
  type: token-classification
62
  metrics:
63
  - name: Sentences F-Score
64
  type: f_score
65
+ value: 0.9985486212
66
  ---
67
  ### Details: https://spacy.io/models/ko#ko_core_news_md
68
 
 
71
  | Feature | Description |
72
  | --- | --- |
73
  | **Name** | `ko_core_news_md` |
74
+ | **Version** | `3.4.0` |
75
+ | **spaCy** | `>=3.4.0,<3.5.0` |
76
  | **Default Pipeline** | `tok2vec`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `attribute_ruler`, `ner` |
77
  | **Components** | `tok2vec`, `tagger`, `morphologizer`, `parser`, `lemmatizer`, `senter`, `attribute_ruler`, `ner` |
78
  | **Vectors** | floret (50000, 300) |
79
+ | **Sources** | [UD Korean Kaist v2.8](https://github.com/UniversalDependencies/UD_Korean-Kaist) (Choi, Jinho; Han, Na-Rae; Hwang, Jena; Chun, Jayeol)<br />[KLUE v1.1.0](https://github.com/KLUE-benchmark/KLUE) (Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho)<br />[Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl)](https://github.com/explosion/spacy-vectors-builder) (Explosion) |
80
  | **License** | `CC BY-SA 4.0` |
81
  | **Author** | [Explosion](https://explosion.ai) |
82
 
 
84
 
85
  <details>
86
 
87
+ <summary>View label scheme (2028 labels for 4 components)</summary>
88
 
89
  | Component | Labels |
90
  | --- | --- |
91
+ | **`tagger`** | `_SP`, `ecs`, `etm`, `f`, `f+f+jcj`, `f+f+jcs`, `f+f+jct`, `f+f+jxt`, `f+jca`, `f+jca+jp+ecc`, `f+jca+jp+ep+ef`, `f+jca+jxc`, `f+jca+jxc+jcm`, `f+jca+jxt`, `f+jcj`, `f+jcm`, `f+jco`, `f+jcs`, `f+jct`, `f+jct+jcm`, `f+jp+ef`, `f+jp+ep+ef`, `f+jp+etm`, `f+jxc`, `f+jxt`, `f+ncn`, `f+ncn+jcm`, `f+ncn+jcs`, `f+ncn+jp+ecc`, `f+ncn+jxt`, `f+ncpa+jcm`, `f+npp+jcs`, `f+nq`, `f+xsn`, `f+xsn+jco`, `f+xsn+jxt`, `ii`, `jca`, `jca+jcm`, `jca+jxc`, `jca+jxt`, `jcc`, `jcj`, `jcm`, `jco`, `jcr`, `jcr+jxc`, `jcs`, `jct`, `jct+jcm`, `jct+jxt`, `jp+ecc`, `jp+ecs`, `jp+ef`, `jp+ef+jcr`, `jp+ef+jcr+jxc`, `jp+ep+ecs`, `jp+ep+ef`, `jp+ep+etm`, `jp+ep+etn`, `jp+etm`, `jp+etn`, `jp+etn+jco`, `jp+etn+jxc`, `jxc`, `jxc+jca`, `jxc+jco`, `jxc+jcs`, `jxt`, `mad`, `mad+jxc`, `mad+jxt`, `mag`, `mag+jca`, `mag+jcm`, `mag+jcs`, `mag+jp+ef+jcr`, `mag+jxc`, `mag+jxc+jxc`, `mag+jxt`, `mag+xsn`, `maj`, `maj+jxc`, `maj+jxt`, `mma`, `mmd`, `nbn`, `nbn+jca`, `nbn+jca+jcj`, `nbn+jca+jcm`, `nbn+jca+jp+ef`, `nbn+jca+jxc`, `nbn+jca+jxt`, `nbn+jcc`, `nbn+jcj`, `nbn+jcm`, `nbn+jco`, `nbn+jcr`, `nbn+jcs`, `nbn+jct`, `nbn+jct+jcm`, `nbn+jct+jxt`, `nbn+jp+ecc`, `nbn+jp+ecs`, `nbn+jp+ecs+jca`, `nbn+jp+ecs+jcm`, `nbn+jp+ecs+jco`, `nbn+jp+ecs+jxc`, `nbn+jp+ecs+jxt`, `nbn+jp+ecx`, `nbn+jp+ef`, `nbn+jp+ef+jca`, `nbn+jp+ef+jco`, `nbn+jp+ef+jcr`, `nbn+jp+ef+jcr+jxc`, `nbn+jp+ef+jcr+jxt`, `nbn+jp+ef+jcs`, `nbn+jp+ef+jxc`, `nbn+jp+ef+jxc+jco`, `nbn+jp+ef+jxf`, `nbn+jp+ef+jxt`, `nbn+jp+ep+ecc`, `nbn+jp+ep+ecs`, `nbn+jp+ep+ecs+jxc`, `nbn+jp+ep+ef`, `nbn+jp+ep+ef+jcr`, `nbn+jp+ep+etm`, `nbn+jp+ep+etn`, `nbn+jp+ep+etn+jco`, `nbn+jp+ep+etn+jcs`, `nbn+jp+etm`, `nbn+jp+etn`, `nbn+jp+etn+jca`, `nbn+jp+etn+jca+jxt`, `nbn+jp+etn+jco`, `nbn+jp+etn+jcs`, `nbn+jp+etn+jxc`, `nbn+jp+etn+jxt`, `nbn+jxc`, `nbn+jxc+jca`, `nbn+jxc+jca+jxc`, `nbn+jxc+jca+jxt`, `nbn+jxc+jcc`, `nbn+jxc+jcm`, `nbn+jxc+jco`, `nbn+jxc+jcs`, `nbn+jxc+jp+ef`, `nbn+jxc+jxc`, `nbn+jxc+jxt`, `nbn+jxt`, `nbn+nbn`, `nbn+nbn+jp+ef`, `nbn+xsm+ecs`, 
`nbn+xsm+ef`, `nbn+xsm+ep+ef`, `nbn+xsm+ep+ef+jcr`, `nbn+xsm+etm`, `nbn+xsn`, `nbn+xsn+jca`, `nbn+xsn+jca+jp+ef+jcr`, `nbn+xsn+jca+jxc`, `nbn+xsn+jca+jxt`, `nbn+xsn+jcm`, `nbn+xsn+jco`, `nbn+xsn+jcs`, `nbn+xsn+jct`, `nbn+xsn+jp+ecc`, `nbn+xsn+jp+ecs`, `nbn+xsn+jp+ef`, `nbn+xsn+jp+ef+jcr`, `nbn+xsn+jp+ep+ef`, `nbn+xsn+jxc`, `nbn+xsn+jxt`, `nbn+xsv+etm`, `nbu`, `nbu+jca`, `nbu+jca+jxc`, `nbu+jca+jxt`, `nbu+jcc`, `nbu+jcc+jxc`, `nbu+jcj`, `nbu+jcm`, `nbu+jco`, `nbu+jcs`, `nbu+jct`, `nbu+jct+jxc`, `nbu+jp+ecc`, `nbu+jp+ecs`, `nbu+jp+ef`, `nbu+jp+ef+jcr`, `nbu+jp+ef+jxc`, `nbu+jp+ep+ecc`, `nbu+jp+ep+ecs`, `nbu+jp+ep+ef`, `nbu+jp+ep+ef+jcr`, `nbu+jp+ep+etm`, `nbu+jp+ep+etn+jco`, `nbu+jp+etm`, `nbu+jxc`, `nbu+jxc+jca`, `nbu+jxc+jcs`, `nbu+jxc+jp+ef`, `nbu+jxc+jp+ep+ef`, `nbu+jxc+jxt`, `nbu+jxt`, `nbu+ncn`, `nbu+ncn+jca`, `nbu+ncn+jcm`, `nbu+xsn`, `nbu+xsn+jca`, `nbu+xsn+jca+jxc`, `nbu+xsn+jca+jxt`, `nbu+xsn+jcm`, `nbu+xsn+jco`, `nbu+xsn+jcs`, `nbu+xsn+jp+ecs`, `nbu+xsn+jp+ep+ef`, `nbu+xsn+jxc`, `nbu+xsn+jxc+jxt`, `nbu+xsn+jxt`, `nbu+xsv+ecc`, `nbu+xsv+etm`, `ncn`, `ncn+f+ncpa+jco`, `ncn+jca`, `ncn+jca+jca`, `ncn+jca+jcc`, `ncn+jca+jcj`, `ncn+jca+jcm`, `ncn+jca+jcs`, `ncn+jca+jct`, `ncn+jca+jp+ecc`, `ncn+jca+jp+ecs`, `ncn+jca+jp+ef`, `ncn+jca+jp+ep+ef`, `ncn+jca+jp+etm`, `ncn+jca+jp+etn+jxt`, `ncn+jca+jxc`, `ncn+jca+jxc+jcc`, `ncn+jca+jxc+jcm`, `ncn+jca+jxc+jxc`, `ncn+jca+jxc+jxt`, `ncn+jca+jxt`, `ncn+jcc`, `ncn+jcc+jxc`, `ncn+jcj`, `ncn+jcj+jxt`, `ncn+jcm`, `ncn+jco`, `ncn+jcr`, `ncn+jcr+jxc`, `ncn+jcs`, `ncn+jcs+jxt`, `ncn+jct`, `ncn+jct+jcm`, `ncn+jct+jxc`, `ncn+jct+jxt`, `ncn+jcv`, `ncn+jp+ecc`, `ncn+jp+ecc+jct`, `ncn+jp+ecc+jxc`, `ncn+jp+ecs`, `ncn+jp+ecs+jcm`, `ncn+jp+ecs+jco`, `ncn+jp+ecs+jxc`, `ncn+jp+ecs+jxt`, `ncn+jp+ecx`, `ncn+jp+ef`, `ncn+jp+ef+jca`, `ncn+jp+ef+jcm`, `ncn+jp+ef+jco`, `ncn+jp+ef+jcr`, `ncn+jp+ef+jcr+jxc`, `ncn+jp+ef+jcr+jxt`, `ncn+jp+ef+jp+etm`, `ncn+jp+ef+jxc`, `ncn+jp+ef+jxf`, `ncn+jp+ef+jxt`, `ncn+jp+ep+ecc`, `ncn+jp+ep+ecs`, 
`ncn+jp+ep+ecs+jxc`, `ncn+jp+ep+ecx`, `ncn+jp+ep+ef`, `ncn+jp+ep+ef+jcr`, `ncn+jp+ep+ef+jcr+jxc`, `ncn+jp+ep+ef+jxc`, `ncn+jp+ep+ef+jxf`, `ncn+jp+ep+ef+jxt`, `ncn+jp+ep+ep+etm`, `ncn+jp+ep+etm`, `ncn+jp+ep+etn`, `ncn+jp+ep+etn+jca`, `ncn+jp+ep+etn+jca+jxc`, `ncn+jp+ep+etn+jco`, `ncn+jp+ep+etn+jcs`, `ncn+jp+ep+etn+jxt`, `ncn+jp+etm`, `ncn+jp+etn`, `ncn+jp+etn+jca`, `ncn+jp+etn+jca+jxc`, `ncn+jp+etn+jca+jxt`, `ncn+jp+etn+jco`, `ncn+jp+etn+jcs`, `ncn+jp+etn+jct`, `ncn+jp+etn+jxc`, `ncn+jp+etn+jxt`, `ncn+jxc`, `ncn+jxc+jca`, `ncn+jxc+jca+jxc`, `ncn+jxc+jca+jxt`, `ncn+jxc+jcc`, `ncn+jxc+jcm`, `ncn+jxc+jco`, `ncn+jxc+jcs`, `ncn+jxc+jct+jxt`, `ncn+jxc+jp+ef`, `ncn+jxc+jp+ef+jcr`, `ncn+jxc+jp+ep+ecs`, `ncn+jxc+jp+ep+ef`, `ncn+jxc+jp+etm`, `ncn+jxc+jxc`, `ncn+jxc+jxt`, `ncn+jxt`, `ncn+jxt+jcm`, `ncn+jxt+jxc`, `ncn+nbn`, `ncn+nbn+jca`, `ncn+nbn+jcm`, `ncn+nbn+jcs`, `ncn+nbn+jp+ecc`, `ncn+nbn+jp+ep+ef`, `ncn+nbn+jxc`, `ncn+nbn+jxt`, `ncn+nbu`, `ncn+nbu+jca`, `ncn+nbu+jcm`, `ncn+nbu+jco`, `ncn+nbu+jp+ef`, `ncn+nbu+jxc`, `ncn+nbu+ncn`, `ncn+ncn`, `ncn+ncn+jca`, `ncn+ncn+jca+jcc`, `ncn+ncn+jca+jcm`, `ncn+ncn+jca+jxc`, `ncn+ncn+jca+jxc+jcm`, `ncn+ncn+jca+jxc+jxc`, `ncn+ncn+jca+jxt`, `ncn+ncn+jcc`, `ncn+ncn+jcj`, `ncn+ncn+jcm`, `ncn+ncn+jco`, `ncn+ncn+jcr`, `ncn+ncn+jcs`, `ncn+ncn+jct`, `ncn+ncn+jct+jcm`, `ncn+ncn+jct+jxc`, `ncn+ncn+jct+jxt`, `ncn+ncn+jp+ecc`, `ncn+ncn+jp+ecs`, `ncn+ncn+jp+ef`, `ncn+ncn+jp+ef+jcm`, `ncn+ncn+jp+ef+jcr`, `ncn+ncn+jp+ef+jcs`, `ncn+ncn+jp+ep+ecc`, `ncn+ncn+jp+ep+ecs`, `ncn+ncn+jp+ep+ef`, `ncn+ncn+jp+ep+ef+jcr`, `ncn+ncn+jp+ep+ep+etm`, `ncn+ncn+jp+ep+etm`, `ncn+ncn+jp+ep+etn`, `ncn+ncn+jp+etm`, `ncn+ncn+jp+etn`, `ncn+ncn+jp+etn+jca`, `ncn+ncn+jp+etn+jco`, `ncn+ncn+jp+etn+jxc`, `ncn+ncn+jxc`, `ncn+ncn+jxc+jca`, `ncn+ncn+jxc+jcc`, `ncn+ncn+jxc+jcm`, `ncn+ncn+jxc+jco`, `ncn+ncn+jxc+jcs`, `ncn+ncn+jxc+jxc`, `ncn+ncn+jxt`, `ncn+ncn+nbn`, `ncn+ncn+ncn`, `ncn+ncn+ncn+jca`, `ncn+ncn+ncn+jca+jcm`, `ncn+ncn+ncn+jca+jxt`, `ncn+ncn+ncn+jcj`, `ncn+ncn+ncn+jcm`, 
`ncn+ncn+ncn+jco`, `ncn+ncn+ncn+jcs`, `ncn+ncn+ncn+jct+jxt`, `ncn+ncn+ncn+jp+etn+jxc`, `ncn+ncn+ncn+jxt`, `ncn+ncn+ncn+ncn+jca`, `ncn+ncn+ncn+ncn+jca+jxt`, `ncn+ncn+ncn+ncn+jco`, `ncn+ncn+ncn+xsn+jp+etm`, `ncn+ncn+ncpa`, `ncn+ncn+ncpa+jca`, `ncn+ncn+ncpa+jcm`, `ncn+ncn+ncpa+jco`, `ncn+ncn+ncpa+jcs`, `ncn+ncn+ncpa+jxc`, `ncn+ncn+ncpa+jxt`, `ncn+ncn+ncpa+ncn`, `ncn+ncn+ncpa+ncn+jca`, `ncn+ncn+ncpa+ncn+jcj`, `ncn+ncn+ncpa+ncn+jcm`, `ncn+ncn+ncpa+ncn+jxt`, `ncn+ncn+xsn`, `ncn+ncn+xsn+jca`, `ncn+ncn+xsn+jca+jxt`, `ncn+ncn+xsn+jcj`, `ncn+ncn+xsn+jcm`, `ncn+ncn+xsn+jco`, `ncn+ncn+xsn+jcs`, `ncn+ncn+xsn+jct`, `ncn+ncn+xsn+jp+ecs`, `ncn+ncn+xsn+jp+ep+ef`, `ncn+ncn+xsn+jp+etm`, `ncn+ncn+xsn+jxc`, `ncn+ncn+xsn+jxc+jcs`, `ncn+ncn+xsn+jxt`, `ncn+ncn+xsv+ecc`, `ncn+ncn+xsv+etm`, `ncn+ncpa`, `ncn+ncpa+jca`, `ncn+ncpa+jca+jcm`, `ncn+ncpa+jca+jxc`, `ncn+ncpa+jca+jxt`, `ncn+ncpa+jcc`, `ncn+ncpa+jcj`, `ncn+ncpa+jcm`, `ncn+ncpa+jco`, `ncn+ncpa+jcr`, `ncn+ncpa+jcs`, `ncn+ncpa+jct`, `ncn+ncpa+jct+jcm`, `ncn+ncpa+jct+jxt`, `ncn+ncpa+jp+ecc`, `ncn+ncpa+jp+ecc+jxc`, `ncn+ncpa+jp+ecs`, `ncn+ncpa+jp+ecs+jxc`, `ncn+ncpa+jp+ef`, `ncn+ncpa+jp+ef+jcr`, `ncn+ncpa+jp+ef+jcr+jxc`, `ncn+ncpa+jp+ep+ef`, `ncn+ncpa+jp+ep+etm`, `ncn+ncpa+jp+ep+etn`, `ncn+ncpa+jp+etm`, `ncn+ncpa+jxc`, `ncn+ncpa+jxc+jca+jxc`, `ncn+ncpa+jxc+jco`, `ncn+ncpa+jxc+jcs`, `ncn+ncpa+jxt`, `ncn+ncpa+nbn+jcs`, `ncn+ncpa+ncn`, `ncn+ncpa+ncn+jca`, `ncn+ncpa+ncn+jca+jcm`, `ncn+ncpa+ncn+jca+jxc`, `ncn+ncpa+ncn+jca+jxt`, `ncn+ncpa+ncn+jcj`, `ncn+ncpa+ncn+jcm`, `ncn+ncpa+ncn+jco`, `ncn+ncpa+ncn+jcs`, `ncn+ncpa+ncn+jct`, `ncn+ncpa+ncn+jct+jcm`, `ncn+ncpa+ncn+jp+ef+jcr`, `ncn+ncpa+ncn+jp+ep+etm`, `ncn+ncpa+ncn+jxc`, `ncn+ncpa+ncn+jxt`, `ncn+ncpa+ncn+xsn+jcm`, `ncn+ncpa+ncn+xsn+jxt`, `ncn+ncpa+ncpa`, `ncn+ncpa+ncpa+jca`, `ncn+ncpa+ncpa+jcj`, `ncn+ncpa+ncpa+jcm`, `ncn+ncpa+ncpa+jco`, `ncn+ncpa+ncpa+jcs`, `ncn+ncpa+ncpa+jp+ep+ef`, `ncn+ncpa+ncpa+jxt`, `ncn+ncpa+ncpa+ncn`, `ncn+ncpa+xsn`, `ncn+ncpa+xsn+jcm`, `ncn+ncpa+xsn+jco`, 
`ncn+ncpa+xsn+jcs`, `ncn+ncpa+xsn+jp+ecc`, `ncn+ncpa+xsn+jp+etm`, `ncn+ncpa+xsn+jxt`, `ncn+ncpa+xsv+ecc`, `ncn+ncpa+xsv+ecs`, `ncn+ncpa+xsv+ecx`, `ncn+ncpa+xsv+ecx+px+etm`, `ncn+ncpa+xsv+ef`, `ncn+ncpa+xsv+ef+jcm`, `ncn+ncpa+xsv+ef+jcr`, `ncn+ncpa+xsv+etm`, _(truncated: full list in pipeline meta)_ |
92
+ | **`morphologizer`** | `POS=CCONJ`, `POS=ADV`, `POS=SCONJ`, `POS=DET`, `POS=NOUN`, `POS=VERB`, `POS=ADJ`, `POS=PUNCT`, `POS=SPACE`, `POS=AUX`, `POS=PRON`, `POS=PROPN`, `POS=NUM`, `POS=INTJ`, `POS=PART`, `POS=X`, `POS=ADP`, `POS=SYM` |
93
  | **`parser`** | `ROOT`, `acl`, `advcl`, `advmod`, `amod`, `appos`, `aux`, `case`, `cc`, `ccomp`, `compound`, `conj`, `cop`, `csubj`, `dep`, `det`, `dislocated`, `fixed`, `flat`, `iobj`, `mark`, `nmod`, `nsubj`, `nummod`, `obj`, `obl`, `punct`, `xcomp` |
94
  | **`ner`** | `DT`, `LC`, `OG`, `PS`, `QT`, `TI` |
95
 
 
103
  | `TOKEN_P` | 100.00 |
104
  | `TOKEN_R` | 100.00 |
105
  | `TOKEN_F` | 100.00 |
106
+ | `TAG_ACC` | 83.36 |
107
+ | `POS_ACC` | 94.62 |
108
+ | `SENTS_P` | 99.81 |
109
+ | `SENTS_R` | 99.90 |
110
+ | `SENTS_F` | 99.85 |
111
+ | `DEP_UAS` | 83.88 |
112
+ | `DEP_LAS` | 80.92 |
113
+ | `LEMMA_ACC` | 89.71 |
114
+ | `ENTS_P` | 84.90 |
115
+ | `ENTS_R` | 81.59 |
116
+ | `ENTS_F` | 83.21 |
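
The README hunk above moves the compatibility pin from `>=3.3.0.dev0,<3.4.0` to `>=3.4.0,<3.5.0`. The following is a minimal sketch of what the new range admits. It uses a hand-rolled tuple comparison rather than a real PEP 440 parser. The helper names `parse_release` and `in_range` are hypothetical and not part of spaCy, and pre-release suffixes such as `.dev0` are deliberately not handled.

```python
def parse_release(version: str) -> tuple:
    """Turn a plain release string such as '3.4.1' into (3, 4, 1)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper, comparing numeric release tuples."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


# Range declared by the updated model card: >=3.4.0,<3.5.0
for candidate in ("3.3.1", "3.4.0", "3.4.4", "3.5.0"):
    print(candidate, in_range(candidate, "3.4.0", "3.5.0"))
```

Under this reading, any 3.4.x release of spaCy falls inside the range, while 3.3.x and 3.5.0 fall outside it. For real dependency resolution a full version-specifier implementation should be preferred over this sketch.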
accuracy.json CHANGED
@@ -3,113 +3,113 @@
3
  "token_p": 1.0,
4
  "token_r": 1.0,
5
  "token_f": 1.0,
6
- "tag_acc": 0.8341706555,
7
- "pos_acc": 0.9467937814,
8
- "sents_p": 1.0,
9
- "sents_r": 1.0,
10
- "sents_f": 1.0,
11
- "dep_uas": 0.8385098703,
12
- "dep_las": 0.8093222227,
13
  "dep_las_per_type": {
14
  "amod": {
15
- "p": 0.8346456693,
16
- "r": 0.8353765324,
17
- "f": 0.8350109409
18
  },
19
  "dislocated": {
20
- "p": 0.7644762524,
21
- "r": 0.7639791938,
22
- "f": 0.7642276423
23
  },
24
  "root": {
25
- "p": 0.8644724105,
26
- "r": 0.8644724105,
27
- "f": 0.8644724105
28
  },
29
  "nmod": {
30
- "p": 0.8822129685,
31
- "r": 0.8801186944,
32
- "f": 0.881164587
33
  },
34
  "nsubj": {
35
- "p": 0.8091844814,
36
- "r": 0.8149920255,
37
- "f": 0.8120778705
38
  },
39
  "advmod": {
40
- "p": 0.7287607171,
41
- "r": 0.7339089482,
42
- "f": 0.7313257724
43
  },
44
  "dep": {
45
- "p": 0.5090909091,
46
- "r": 0.4640883978,
47
- "f": 0.4855491329
48
  },
49
  "conj": {
50
- "p": 0.6899676375,
51
- "r": 0.6833333333,
52
- "f": 0.6866344605
53
  },
54
  "xcomp": {
55
- "p": 0.6398601399,
56
- "r": 0.6288659794,
57
- "f": 0.6343154246
58
  },
59
  "flat": {
60
- "p": 0.7058823529,
61
- "r": 0.2242990654,
62
- "f": 0.3404255319
63
  },
64
  "obj": {
65
- "p": 0.920212766,
66
- "r": 0.9296077378,
67
- "f": 0.924886394
68
  },
69
  "acl": {
70
- "p": 0.7983243567,
71
- "r": 0.800239952,
72
- "f": 0.7992810066
73
  },
74
  "advcl": {
75
- "p": 0.7842211733,
76
- "r": 0.789545146,
77
- "f": 0.7868741543
78
  },
79
  "det": {
80
- "p": 0.7737704918,
81
- "r": 0.8872180451,
82
- "f": 0.826619965
83
  },
84
  "compound": {
85
- "p": 0.8033980583,
86
- "r": 0.8083028083,
87
- "f": 0.8058429702
88
  },
89
  "ccomp": {
90
- "p": 0.6655433867,
91
- "r": 0.6556016598,
92
- "f": 0.6605351171
93
  },
94
  "obl": {
95
- "p": 0.8721399731,
96
- "r": 0.8840381992,
97
- "f": 0.8780487805
98
  },
99
  "aux": {
100
- "p": 0.9294605809,
101
- "r": 0.9424964937,
102
- "f": 0.9359331476
103
  },
104
  "cc": {
105
- "p": 0.8231511254,
106
- "r": 0.8476821192,
107
- "f": 0.8352365416
108
  },
109
  "nummod": {
110
- "p": 0.8502673797,
111
- "r": 0.832460733,
112
- "f": 0.8412698413
113
  },
114
  "discourse": {
115
  "p": 0.0,
@@ -117,34 +117,34 @@
117
  "f": 0.0
118
  },
119
  "fixed": {
120
- "p": 0.9407114625,
121
- "r": 0.9958158996,
122
- "f": 0.9674796748
123
  },
124
  "csubj": {
125
- "p": 0.6464646465,
126
- "r": 0.5818181818,
127
- "f": 0.6124401914
128
  },
129
  "mark": {
130
- "p": 0.66,
131
- "r": 0.5076923077,
132
- "f": 0.5739130435
133
  },
134
  "iobj": {
135
- "p": 0.8028169014,
136
- "r": 0.8028169014,
137
- "f": 0.8028169014
138
  },
139
  "case": {
140
- "p": 0.803030303,
141
- "r": 0.7794117647,
142
- "f": 0.7910447761
143
  },
144
  "cop": {
145
- "p": 0.6875,
146
- "r": 0.7333333333,
147
- "f": 0.7096774194
148
  },
149
  "vocative": {
150
  "p": 0.0,
@@ -152,46 +152,46 @@
152
  "f": 0.0
153
  },
154
  "appos": {
155
- "p": 0.625,
156
  "r": 0.7142857143,
157
- "f": 0.6666666667
158
  }
159
  },
160
- "lemma_acc": 0.89985957,
161
- "speed": 7799.3668113624,
162
- "ents_p": 0.850539861,
163
- "ents_r": 0.8125044154,
164
- "ents_f": 0.8310871843,
165
  "ents_per_type": {
166
  "OG": {
167
- "p": 0.776907001,
168
- "r": 0.6893834029,
169
- "f": 0.7305330386
170
  },
171
  "PS": {
172
- "p": 0.8594288528,
173
- "r": 0.7932288116,
174
- "f": 0.825002954
175
  },
176
  "QT": {
177
- "p": 0.9113005051,
178
- "r": 0.925617185,
179
- "f": 0.9184030539
180
  },
181
  "DT": {
182
- "p": 0.8889875666,
183
- "r": 0.8677936714,
184
- "f": 0.8782627769
185
  },
186
  "LC": {
187
- "p": 0.713747646,
188
- "r": 0.6992619926,
189
- "f": 0.7064305685
190
  },
191
  "TI": {
192
- "p": 0.9324577861,
193
- "r": 0.9119266055,
194
- "f": 0.9220779221
195
  }
196
  }
197
  }
 
3
  "token_p": 1.0,
4
  "token_r": 1.0,
5
  "token_f": 1.0,
6
+ "tag_acc": 0.8336399059,
7
+ "pos_acc": 0.946239962,
8
+ "sents_p": 0.998065764,
9
+ "sents_r": 0.9990319458,
10
+ "sents_f": 0.9985486212,
11
+ "dep_uas": 0.83875855,
12
+ "dep_las": 0.8092235713,
13
  "dep_las_per_type": {
14
  "amod": {
15
+ "p": 0.8202443281,
16
+ "r": 0.823117338,
17
+ "f": 0.8216783217
18
  },
19
  "dislocated": {
20
+ "p": 0.7434725849,
21
+ "r": 0.7405721717,
22
+ "f": 0.742019544
23
  },
24
  "root": {
25
+ "p": 0.8689555126,
26
+ "r": 0.8697967086,
27
+ "f": 0.8693759071
28
  },
29
  "nmod": {
30
+ "p": 0.8883610451,
31
+ "r": 0.8878338279,
32
+ "f": 0.8880973583
33
  },
34
  "nsubj": {
35
+ "p": 0.7961089494,
36
+ "r": 0.8157894737,
37
+ "f": 0.8058290666
38
  },
39
  "advmod": {
40
+ "p": 0.7191097467,
41
+ "r": 0.7354788069,
42
+ "f": 0.7272021731
43
  },
44
  "dep": {
45
+ "p": 0.6642335766,
46
+ "r": 0.5027624309,
47
+ "f": 0.572327044
48
  },
49
  "conj": {
50
+ "p": 0.7157057654,
51
+ "r": 0.6923076923,
52
+ "f": 0.7038123167
53
  },
54
  "xcomp": {
55
+ "p": 0.6344827586,
56
+ "r": 0.6323024055,
57
+ "f": 0.6333907057
58
  },
59
  "flat": {
60
+ "p": 0.625,
61
+ "r": 0.1869158879,
62
+ "f": 0.2877697842
63
  },
64
  "obj": {
65
+ "p": 0.9232832618,
66
+ "r": 0.9247716282,
67
+ "f": 0.9240268456
68
  },
69
  "acl": {
70
+ "p": 0.8018072289,
71
+ "r": 0.7984403119,
72
+ "f": 0.8001202284
73
  },
74
  "advcl": {
75
+ "p": 0.7902571042,
76
+ "r": 0.7929395791,
77
+ "f": 0.7915960691
78
  },
79
  "det": {
80
+ "p": 0.7652733119,
81
+ "r": 0.8947368421,
82
+ "f": 0.8249566724
83
  },
84
  "compound": {
85
+ "p": 0.7985823981,
86
+ "r": 0.8253968254,
87
+ "f": 0.8117682378
88
  },
89
  "ccomp": {
90
+ "p": 0.6488925349,
91
+ "r": 0.6564315353,
92
+ "f": 0.652640264
93
  },
94
  "obl": {
95
+ "p": 0.8827493261,
96
+ "r": 0.8935879945,
97
+ "f": 0.8881355932
98
  },
99
  "aux": {
100
+ "p": 0.9286713287,
101
+ "r": 0.9312762973,
102
+ "f": 0.9299719888
103
  },
104
  "cc": {
105
+ "p": 0.8193548387,
106
+ "r": 0.8410596026,
107
+ "f": 0.8300653595
108
  },
109
  "nummod": {
110
+ "p": 0.8540540541,
111
+ "r": 0.8272251309,
112
+ "f": 0.8404255319
113
  },
114
  "discourse": {
115
  "p": 0.0,
 
117
  "f": 0.0
118
  },
119
  "fixed": {
120
+ "p": 0.956,
121
+ "r": 1.0,
122
+ "f": 0.9775051125
123
  },
124
  "csubj": {
125
+ "p": 0.5393258427,
126
+ "r": 0.4363636364,
127
+ "f": 0.4824120603
128
  },
129
  "mark": {
130
+ "p": 0.6923076923,
131
+ "r": 0.5538461538,
132
+ "f": 0.6153846154
133
  },
134
  "iobj": {
135
+ "p": 0.8169014085,
136
+ "r": 0.8169014085,
137
+ "f": 0.8169014085
138
  },
139
  "case": {
140
+ "p": 0.8260869565,
141
+ "r": 0.8382352941,
142
+ "f": 0.8321167883
143
  },
144
  "cop": {
145
+ "p": 0.6666666667,
146
+ "r": 0.6666666667,
147
+ "f": 0.6666666667
148
  },
149
  "vocative": {
150
  "p": 0.0,
 
152
  "f": 0.0
153
  },
154
  "appos": {
155
+ "p": 0.7142857143,
156
  "r": 0.7142857143,
157
+ "f": 0.7142857143
158
  }
159
  },
160
+ "lemma_acc": 0.8970905279,
161
+ "speed": 5125.497744352,
162
+ "ents_p": 0.8490038962,
163
+ "ents_r": 0.8158954433,
164
+ "ents_f": 0.8321204698,
165
  "ents_per_type": {
166
  "OG": {
167
+ "p": 0.7671443193,
168
+ "r": 0.6949466852,
169
+ "f": 0.7292629531
170
  },
171
  "PS": {
172
+ "p": 0.859375,
173
+ "r": 0.7998182231,
174
+ "f": 0.8285277157
175
  },
176
  "QT": {
177
+ "p": 0.9160861305,
178
+ "r": 0.9275408785,
179
+ "f": 0.9217779194
180
  },
181
  "DT": {
182
+ "p": 0.8884462151,
183
+ "r": 0.8699609883,
184
+ "f": 0.8791064389
185
  },
186
  "LC": {
187
+ "p": 0.7082294264,
188
+ "r": 0.6986469865,
189
+ "f": 0.7034055728
190
  },
191
  "TI": {
192
+ "p": 0.9285714286,
193
+ "r": 0.9064220183,
194
+ "f": 0.9173630455
195
  }
196
  }
197
  }
ko_core_news_md-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:831de6925e663ec0e0ed26984756dc3ef5a694619dc63ffe6d4a0f7fc56b0267
3
- size 69047809
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5a92de93f560fa9352b87c71a93da1462c042752ab6dee2578b073c19f9c3559
3
+ size 69054189
lemmatizer/cfg CHANGED
@@ -903,250 +903,243 @@
903
  1708,
904
  1709,
905
  1711,
906
- 1713,
907
  1716,
908
- 1718,
909
- 1721,
910
- 1722,
911
- 1725,
912
- 1728,
913
  1729,
 
914
  1731,
915
- 1732,
916
- 1733,
917
- 1736,
918
  1739,
919
- 1741,
920
  1744,
921
  1746,
922
- 1748,
923
- 1751,
924
  1754,
925
- 1756,
926
- 1761,
927
  1764,
928
- 1766,
929
  1769,
930
  1771,
931
  1773,
932
- 1775,
933
  120,
934
- 1777,
 
935
  1778,
936
- 1780,
937
  1783,
938
- 1785,
939
  1786,
940
  1788,
 
941
  1790,
942
- 1791,
943
  1792,
944
  1794,
945
  1796,
946
  1798,
947
- 1800,
948
  1803,
949
- 1805,
950
  1806,
951
- 1808,
952
- 1811,
953
  1812,
954
- 1814,
955
  888,
 
 
956
  1816,
957
  1817,
958
- 1818,
959
- 1819,
960
- 1822,
961
- 1825,
962
- 1828,
963
  1700,
964
- 1830,
965
- 1833,
966
- 1834,
967
- 1837,
 
968
  1840,
969
- 1842,
970
- 1845,
971
  1848,
972
  1850,
973
- 1853,
974
  1855,
 
975
  1857,
976
- 1860,
977
- 1861,
978
  1862,
979
- 1863,
980
  1865,
981
  1867,
982
- 1868,
983
  1870,
984
  1872,
985
- 1873,
986
- 1875,
987
- 1877,
988
- 1879,
989
  1881,
990
- 1884,
991
  1885,
992
- 1888,
993
  1889,
994
- 1892,
995
- 1894,
 
996
  1898,
997
  1900,
998
- 1901,
999
- 1903,
1000
- 1905,
1001
  1907,
 
1002
  1910,
1003
- 1911,
1004
  1913,
1005
- 1915,
1006
- 1916,
1007
  1920,
1008
- 1922,
1009
  1923,
 
1010
  1926,
1011
  1928,
1012
  1929,
1013
- 1931,
1014
  1932,
1015
- 1935,
1016
  1936,
1017
- 1939,
1018
  1941,
1019
- 1944,
1020
- 1946,
1021
- 1948,
1022
  1950,
1023
- 1953,
 
1024
  1955,
1025
- 1957,
1026
  1958,
1027
- 1959,
1028
  1961,
 
1029
  1964,
1030
- 1966,
1031
  1967,
 
1032
  1970,
1033
  1971,
1034
  1973,
1035
- 1974,
1036
- 1976,
1037
- 1980,
1038
- 1981,
1039
  1982,
1040
- 1985,
1041
- 1987,
1042
- 1989,
1043
- 1991,
1044
- 1993,
1045
- 1995,
1046
  109,
1047
- 1997,
 
1048
  1999,
 
1049
  2002,
1050
  2004,
1051
- 2005,
1052
  2007,
1053
- 2010,
 
1054
  2011,
1055
- 2012,
1056
- 2014,
1057
- 2016,
1058
  2018,
 
1059
  2021,
1060
- 2023,
1061
  2024,
1062
- 2025,
1063
  2027,
1064
- 2030,
1065
- 2031,
 
1066
  2035,
1067
- 2036,
1068
  2038,
 
1069
  2041,
1070
  2042,
1071
- 2044,
1072
  2045,
 
1073
  2048,
1074
  2049,
1075
  2051,
1076
- 2052,
1077
- 2054,
1078
  2056,
1079
  2058,
1080
- 2059,
1081
- 2061,
1082
- 2063,
1083
- 2065,
1084
  2067,
1085
  2069,
1086
- 2070,
1087
- 2072,
1088
  2074,
1089
- 2077,
1090
  2079,
1091
- 2081,
1092
- 2084,
1093
- 2085,
1094
- 2088,
1095
  911,
 
 
1096
  2089,
1097
- 2092,
 
1098
  2094,
1099
- 2096,
1100
- 2098,
1101
- 2099,
1102
- 2102,
1103
- 2105,
1104
- 2109,
1105
- 2113,
1106
- 2115,
1107
  2117,
1108
- 2119,
1109
- 2122,
1110
- 2125,
1111
  2126,
1112
- 2128,
1113
- 2131,
1114
  2132,
1115
  2133,
1116
- 2136,
1117
- 2138,
 
 
1118
  2139,
1119
  2140,
1120
- 2141,
1121
  2143,
1122
- 611,
1123
  2145,
1124
- 2146,
1125
  2149,
1126
  2151,
1127
  2153,
1128
- 2155,
1129
  2157,
 
1130
  2159,
1131
- 2161,
1132
  2164,
1133
- 2165,
1134
- 2166,
1135
  2167,
1136
- 2171,
1137
  2172,
1138
  2175,
1139
  2178,
1140
  2180,
 
1141
  2183,
 
1142
  2186,
1143
- 2188,
1144
- 2190,
1145
- 2191,
1146
- 2193,
1147
- 2194,
1148
- 2196,
1149
- 2197,
1150
- 2199
1151
  ]
1152
  }
 
903
  1708,
904
  1709,
905
  1711,
906
+ 1714,
907
  1716,
908
+ 1719,
909
+ 1720,
910
+ 1723,
911
+ 1726,
912
+ 1727,
913
  1729,
914
+ 1730,
915
  1731,
916
+ 1734,
917
+ 1737,
 
918
  1739,
919
+ 1742,
920
  1744,
921
  1746,
922
+ 1749,
923
+ 1752,
924
  1754,
925
+ 1759,
926
+ 1762,
927
  1764,
928
+ 1767,
929
  1769,
930
  1771,
931
  1773,
 
932
  120,
933
+ 1775,
934
+ 1776,
935
  1778,
936
+ 1781,
937
  1783,
938
+ 1784,
939
  1786,
940
  1788,
941
+ 1789,
942
  1790,
 
943
  1792,
944
  1794,
945
  1796,
946
  1798,
947
+ 1801,
948
  1803,
949
+ 1804,
950
  1806,
951
+ 1809,
952
+ 1810,
953
  1812,
 
954
  888,
955
+ 1814,
956
+ 1815,
957
  1816,
958
  1817,
959
+ 1820,
960
+ 1823,
961
+ 1826,
 
 
962
  1700,
963
+ 1828,
964
+ 1831,
965
+ 1832,
966
+ 1835,
967
+ 1838,
968
  1840,
969
+ 1843,
970
+ 1846,
971
  1848,
972
  1850,
973
+ 1852,
974
  1855,
975
+ 1856,
976
  1857,
977
+ 1859,
 
978
  1862,
979
+ 1864,
980
  1865,
981
  1867,
982
+ 1869,
983
  1870,
984
  1872,
985
+ 1874,
986
+ 1876,
987
+ 1878,
 
988
  1881,
989
+ 1882,
990
  1885,
991
+ 1886,
992
  1889,
993
+ 1891,
994
+ 1895,
995
+ 1897,
996
  1898,
997
  1900,
998
+ 1902,
999
+ 1904,
 
1000
  1907,
1001
+ 1908,
1002
  1910,
1003
+ 1912,
1004
  1913,
1005
+ 1917,
1006
+ 1919,
1007
  1920,
 
1008
  1923,
1009
+ 1925,
1010
  1926,
1011
  1928,
1012
  1929,
 
1013
  1932,
1014
+ 1933,
1015
  1936,
1016
+ 1938,
1017
  1941,
1018
+ 1943,
1019
+ 1945,
1020
+ 1947,
1021
  1950,
1022
+ 1952,
1023
+ 1954,
1024
  1955,
1025
+ 1956,
1026
  1958,
 
1027
  1961,
1028
+ 1963,
1029
  1964,
 
1030
  1967,
1031
+ 1968,
1032
  1970,
1033
  1971,
1034
  1973,
1035
+ 1977,
1036
+ 1978,
1037
+ 1979,
 
1038
  1982,
1039
+ 1984,
1040
+ 1986,
1041
+ 1988,
1042
+ 1990,
1043
+ 1992,
 
1044
  109,
1045
+ 1994,
1046
+ 1996,
1047
  1999,
1048
+ 2001,
1049
  2002,
1050
  2004,
 
1051
  2007,
1052
+ 2008,
1053
+ 2009,
1054
  2011,
1055
+ 2013,
1056
+ 2015,
 
1057
  2018,
1058
+ 2020,
1059
  2021,
1060
+ 2022,
1061
  2024,
 
1062
  2027,
1063
+ 2028,
1064
+ 2032,
1065
+ 2033,
1066
  2035,
 
1067
  2038,
1068
+ 2039,
1069
  2041,
1070
  2042,
 
1071
  2045,
1072
+ 2046,
1073
  2048,
1074
  2049,
1075
  2051,
1076
+ 2053,
1077
+ 2055,
1078
  2056,
1079
  2058,
1080
+ 2060,
1081
+ 2062,
1082
+ 2064,
1083
+ 2066,
1084
  2067,
1085
  2069,
1086
+ 2071,
 
1087
  2074,
1088
+ 2076,
1089
  2079,
1090
+ 2080,
1091
+ 2083,
 
 
1092
  911,
1093
+ 2084,
1094
+ 2087,
1095
  2089,
1096
+ 2091,
1097
+ 2093,
1098
  2094,
1099
+ 2097,
1100
+ 2100,
1101
+ 2104,
1102
+ 2108,
1103
+ 2110,
1104
+ 2112,
1105
+ 2114,
 
1106
  2117,
1107
+ 2120,
1108
+ 2121,
1109
+ 2123,
1110
  2126,
1111
+ 2127,
1112
+ 2130,
1113
  2132,
1114
  2133,
1115
+ 2134,
1116
+ 2135,
1117
+ 2137,
1118
+ 611,
1119
  2139,
1120
  2140,
 
1121
  2143,
 
1122
  2145,
1123
+ 2147,
1124
  2149,
1125
  2151,
1126
  2153,
1127
+ 2156,
1128
  2157,
1129
+ 2158,
1130
  2159,
1131
+ 2163,
1132
  2164,
 
 
1133
  2167,
1134
+ 2170,
1135
  2172,
1136
  2175,
1137
  2178,
1138
  2180,
1139
+ 2182,
1140
  2183,
1141
+ 2185,
1142
  2186,
1143
+ 2188
 
 
 
 
 
 
 
1144
  ]
1145
  }
lemmatizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:610f98c803e83a29c52dfe6522c27b18db35bf79b7769294d5087978ea3dfa6a
3
- size 445866
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e184e0f2adc0f8513bb75dbcd61dc981ca0f9f94a96ba73bdc72b2000b36f879
3
+ size 443150
lemmatizer/trees CHANGED
Binary files a/lemmatizer/trees and b/lemmatizer/trees differ
 
meta.json CHANGED
@@ -1,14 +1,14 @@
1
  {
2
  "lang":"ko",
3
  "name":"core_news_md",
4
- "version":"3.3.0",
5
  "description":"Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
- "spacy_version":">=3.3.0.dev0,<3.4.0",
11
- "spacy_git_version":"849bef2de",
12
  "vectors":{
13
  "width":300,
14
  "vectors":50000,
@@ -20,6 +20,7 @@
20
 
21
  ],
22
  "tagger":[
 
23
  "ecs",
24
  "etm",
25
  "f",
@@ -2005,6 +2006,7 @@
2005
  "POS=VERB",
2006
  "POS=ADJ",
2007
  "POS=PUNCT",
 
2008
  "POS=AUX",
2009
  "POS=PRON",
2010
  "POS=PROPN",
@@ -2084,113 +2086,113 @@
2084
  "token_p":1.0,
2085
  "token_r":1.0,
2086
  "token_f":1.0,
2087
- "tag_acc":0.8341706555,
2088
- "pos_acc":0.9467937814,
2089
- "sents_p":1.0,
2090
- "sents_r":1.0,
2091
- "sents_f":1.0,
2092
- "dep_uas":0.8385098703,
2093
- "dep_las":0.8093222227,
2094
  "dep_las_per_type":{
2095
  "amod":{
2096
- "p":0.8346456693,
2097
- "r":0.8353765324,
2098
- "f":0.8350109409
2099
  },
2100
  "dislocated":{
2101
- "p":0.7644762524,
2102
- "r":0.7639791938,
2103
- "f":0.7642276423
2104
  },
2105
  "root":{
2106
- "p":0.8644724105,
2107
- "r":0.8644724105,
2108
- "f":0.8644724105
2109
  },
2110
  "nmod":{
2111
- "p":0.8822129685,
2112
- "r":0.8801186944,
2113
- "f":0.881164587
2114
  },
2115
  "nsubj":{
2116
- "p":0.8091844814,
2117
- "r":0.8149920255,
2118
- "f":0.8120778705
2119
  },
2120
  "advmod":{
2121
- "p":0.7287607171,
2122
- "r":0.7339089482,
2123
- "f":0.7313257724
2124
  },
2125
  "dep":{
2126
- "p":0.5090909091,
2127
- "r":0.4640883978,
2128
- "f":0.4855491329
2129
  },
2130
  "conj":{
2131
- "p":0.6899676375,
2132
- "r":0.6833333333,
2133
- "f":0.6866344605
2134
  },
2135
  "xcomp":{
2136
- "p":0.6398601399,
2137
- "r":0.6288659794,
2138
- "f":0.6343154246
2139
  },
2140
  "flat":{
2141
- "p":0.7058823529,
2142
- "r":0.2242990654,
2143
- "f":0.3404255319
2144
  },
2145
  "obj":{
2146
- "p":0.920212766,
2147
- "r":0.9296077378,
2148
- "f":0.924886394
2149
  },
2150
  "acl":{
2151
- "p":0.7983243567,
2152
- "r":0.800239952,
2153
- "f":0.7992810066
2154
  },
2155
  "advcl":{
2156
- "p":0.7842211733,
2157
- "r":0.789545146,
2158
- "f":0.7868741543
2159
  },
2160
  "det":{
2161
- "p":0.7737704918,
2162
- "r":0.8872180451,
2163
- "f":0.826619965
2164
  },
2165
  "compound":{
2166
- "p":0.8033980583,
2167
- "r":0.8083028083,
2168
- "f":0.8058429702
2169
  },
2170
  "ccomp":{
2171
- "p":0.6655433867,
2172
- "r":0.6556016598,
2173
- "f":0.6605351171
2174
  },
2175
  "obl":{
2176
- "p":0.8721399731,
2177
- "r":0.8840381992,
2178
- "f":0.8780487805
2179
  },
2180
  "aux":{
2181
- "p":0.9294605809,
2182
- "r":0.9424964937,
2183
- "f":0.9359331476
2184
  },
2185
  "cc":{
2186
- "p":0.8231511254,
2187
- "r":0.8476821192,
2188
- "f":0.8352365416
2189
  },
2190
  "nummod":{
2191
- "p":0.8502673797,
2192
- "r":0.832460733,
2193
- "f":0.8412698413
2194
  },
2195
  "discourse":{
2196
  "p":0.0,
@@ -2198,34 +2200,34 @@
2198
  "f":0.0
2199
  },
2200
  "fixed":{
2201
- "p":0.9407114625,
2202
- "r":0.9958158996,
2203
- "f":0.9674796748
2204
  },
2205
  "csubj":{
2206
- "p":0.6464646465,
2207
- "r":0.5818181818,
2208
- "f":0.6124401914
2209
  },
2210
  "mark":{
2211
- "p":0.66,
2212
- "r":0.5076923077,
2213
- "f":0.5739130435
2214
  },
2215
  "iobj":{
2216
- "p":0.8028169014,
2217
- "r":0.8028169014,
2218
- "f":0.8028169014
2219
  },
2220
  "case":{
2221
- "p":0.803030303,
2222
- "r":0.7794117647,
2223
- "f":0.7910447761
2224
  },
2225
  "cop":{
2226
- "p":0.6875,
2227
- "r":0.7333333333,
2228
- "f":0.7096774194
2229
  },
2230
  "vocative":{
2231
  "p":0.0,
@@ -2233,46 +2235,46 @@
2233
  "f":0.0
2234
  },
2235
  "appos":{
2236
- "p":0.625,
2237
  "r":0.7142857143,
2238
- "f":0.6666666667
2239
  }
2240
  },
2241
- "lemma_acc":0.89985957,
2242
- "speed":7799.3668113624,
2243
- "ents_p":0.850539861,
2244
- "ents_r":0.8125044154,
2245
- "ents_f":0.8310871843,
2246
  "ents_per_type":{
2247
  "OG":{
2248
- "p":0.776907001,
2249
- "r":0.6893834029,
2250
- "f":0.7305330386
2251
  },
2252
  "PS":{
2253
- "p":0.8594288528,
2254
- "r":0.7932288116,
2255
- "f":0.825002954
2256
  },
2257
  "QT":{
2258
- "p":0.9113005051,
2259
- "r":0.925617185,
2260
- "f":0.9184030539
2261
  },
2262
  "DT":{
2263
- "p":0.8889875666,
2264
- "r":0.8677936714,
2265
- "f":0.8782627769
2266
  },
2267
  "LC":{
2268
- "p":0.713747646,
2269
- "r":0.6992619926,
2270
- "f":0.7064305685
2271
  },
2272
  "TI":{
2273
- "p":0.9324577861,
2274
- "r":0.9119266055,
2275
- "f":0.9220779221
2276
  }
2277
  }
2278
  },
@@ -2290,7 +2292,7 @@
2290
  "author":"Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho"
2291
  },
2292
  {
2293
- "name":"Explosion floret Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl)",
2294
  "url":"https://github.com/explosion/spacy-vectors-builder",
2295
  "license":"CC0",
2296
  "author":"Explosion"
 
1
  {
2
  "lang":"ko",
3
  "name":"core_news_md",
4
+ "version":"3.4.0",
5
  "description":"Korean pipeline optimized for CPU. Components: tok2vec, tagger, morphologizer, parser, lemmatizer (trainable_lemmatizer), senter, ner.",
6
  "author":"Explosion",
7
  "email":"contact@explosion.ai",
8
  "url":"https://explosion.ai",
9
  "license":"CC BY-SA 4.0",
10
+ "spacy_version":">=3.4.0,<3.5.0",
11
+ "spacy_git_version":"dd038b536",
12
  "vectors":{
13
  "width":300,
14
  "vectors":50000,
 
20
 
21
  ],
22
  "tagger":[
23
+ "_SP",
24
  "ecs",
25
  "etm",
26
  "f",
 
2006
  "POS=VERB",
2007
  "POS=ADJ",
2008
  "POS=PUNCT",
2009
+ "POS=SPACE",
2010
  "POS=AUX",
2011
  "POS=PRON",
2012
  "POS=PROPN",
 
2086
  "token_p":1.0,
2087
  "token_r":1.0,
2088
  "token_f":1.0,
2089
+ "tag_acc":0.8336399059,
2090
+ "pos_acc":0.946239962,
2091
+ "sents_p":0.998065764,
2092
+ "sents_r":0.9990319458,
2093
+ "sents_f":0.9985486212,
2094
+ "dep_uas":0.83875855,
2095
+ "dep_las":0.8092235713,
2096
  "dep_las_per_type":{
2097
  "amod":{
2098
+ "p":0.8202443281,
2099
+ "r":0.823117338,
2100
+ "f":0.8216783217
2101
  },
2102
  "dislocated":{
2103
+ "p":0.7434725849,
2104
+ "r":0.7405721717,
2105
+ "f":0.742019544
2106
  },
2107
  "root":{
2108
+ "p":0.8689555126,
2109
+ "r":0.8697967086,
2110
+ "f":0.8693759071
2111
  },
2112
  "nmod":{
2113
+ "p":0.8883610451,
2114
+ "r":0.8878338279,
2115
+ "f":0.8880973583
2116
  },
2117
  "nsubj":{
2118
+ "p":0.7961089494,
2119
+ "r":0.8157894737,
2120
+ "f":0.8058290666
2121
  },
2122
  "advmod":{
2123
+ "p":0.7191097467,
2124
+ "r":0.7354788069,
2125
+ "f":0.7272021731
2126
  },
2127
  "dep":{
2128
+ "p":0.6642335766,
2129
+ "r":0.5027624309,
2130
+ "f":0.572327044
2131
  },
2132
  "conj":{
2133
+ "p":0.7157057654,
2134
+ "r":0.6923076923,
2135
+ "f":0.7038123167
2136
  },
2137
  "xcomp":{
2138
+ "p":0.6344827586,
2139
+ "r":0.6323024055,
2140
+ "f":0.6333907057
2141
  },
2142
  "flat":{
2143
+ "p":0.625,
2144
+ "r":0.1869158879,
2145
+ "f":0.2877697842
2146
  },
2147
  "obj":{
2148
+ "p":0.9232832618,
2149
+ "r":0.9247716282,
2150
+ "f":0.9240268456
2151
  },
2152
  "acl":{
2153
+ "p":0.8018072289,
2154
+ "r":0.7984403119,
2155
+ "f":0.8001202284
2156
  },
2157
  "advcl":{
2158
+ "p":0.7902571042,
2159
+ "r":0.7929395791,
2160
+ "f":0.7915960691
2161
  },
2162
  "det":{
2163
+ "p":0.7652733119,
2164
+ "r":0.8947368421,
2165
+ "f":0.8249566724
2166
  },
2167
  "compound":{
2168
+ "p":0.7985823981,
2169
+ "r":0.8253968254,
2170
+ "f":0.8117682378
2171
  },
2172
  "ccomp":{
2173
+ "p":0.6488925349,
2174
+ "r":0.6564315353,
2175
+ "f":0.652640264
2176
  },
2177
  "obl":{
2178
+ "p":0.8827493261,
2179
+ "r":0.8935879945,
2180
+ "f":0.8881355932
2181
  },
2182
  "aux":{
2183
+ "p":0.9286713287,
2184
+ "r":0.9312762973,
2185
+ "f":0.9299719888
2186
  },
2187
  "cc":{
2188
+ "p":0.8193548387,
2189
+ "r":0.8410596026,
2190
+ "f":0.8300653595
2191
  },
2192
  "nummod":{
2193
+ "p":0.8540540541,
2194
+ "r":0.8272251309,
2195
+ "f":0.8404255319
2196
  },
2197
  "discourse":{
2198
  "p":0.0,
 
2200
  "f":0.0
2201
  },
2202
  "fixed":{
2203
+ "p":0.956,
2204
+ "r":1.0,
2205
+ "f":0.9775051125
2206
  },
2207
  "csubj":{
2208
+ "p":0.5393258427,
2209
+ "r":0.4363636364,
2210
+ "f":0.4824120603
2211
  },
2212
  "mark":{
2213
+ "p":0.6923076923,
2214
+ "r":0.5538461538,
2215
+ "f":0.6153846154
2216
  },
2217
  "iobj":{
2218
+ "p":0.8169014085,
2219
+ "r":0.8169014085,
2220
+ "f":0.8169014085
2221
  },
2222
  "case":{
2223
+ "p":0.8260869565,
2224
+ "r":0.8382352941,
2225
+ "f":0.8321167883
2226
  },
2227
  "cop":{
2228
+ "p":0.6666666667,
2229
+ "r":0.6666666667,
2230
+ "f":0.6666666667
2231
  },
2232
  "vocative":{
2233
  "p":0.0,
 
2235
  "f":0.0
2236
  },
2237
  "appos":{
2238
+ "p":0.7142857143,
2239
  "r":0.7142857143,
2240
+ "f":0.7142857143
2241
  }
2242
  },
2243
+ "lemma_acc":0.8970905279,
2244
+ "speed":5125.497744352,
2245
+ "ents_p":0.8490038962,
2246
+ "ents_r":0.8158954433,
2247
+ "ents_f":0.8321204698,
2248
  "ents_per_type":{
2249
  "OG":{
2250
+ "p":0.7671443193,
2251
+ "r":0.6949466852,
2252
+ "f":0.7292629531
2253
  },
2254
  "PS":{
2255
+ "p":0.859375,
2256
+ "r":0.7998182231,
2257
+ "f":0.8285277157
2258
  },
2259
  "QT":{
2260
+ "p":0.9160861305,
2261
+ "r":0.9275408785,
2262
+ "f":0.9217779194
2263
  },
2264
  "DT":{
2265
+ "p":0.8884462151,
2266
+ "r":0.8699609883,
2267
+ "f":0.8791064389
2268
  },
2269
  "LC":{
2270
+ "p":0.7082294264,
2271
+ "r":0.6986469865,
2272
+ "f":0.7034055728
2273
  },
2274
  "TI":{
2275
+ "p":0.9285714286,
2276
+ "r":0.9064220183,
2277
+ "f":0.9173630455
2278
  }
2279
  }
2280
  },
 
2292
  "author":"Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Ryu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho"
2293
  },
2294
  {
2295
+ "name":"Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl)",
2296
  "url":"https://github.com/explosion/spacy-vectors-builder",
2297
  "license":"CC0",
2298
  "author":"Explosion"
morphologizer/cfg CHANGED
@@ -9,6 +9,7 @@
9
  "POS=VERB":"",
10
  "POS=ADJ":"",
11
  "POS=PUNCT":"",
 
12
  "POS=AUX":"",
13
  "POS=PRON":"",
14
  "POS=PROPN":"",
@@ -28,6 +29,7 @@
28
  "POS=VERB":100,
29
  "POS=ADJ":84,
30
  "POS=PUNCT":97,
 
31
  "POS=AUX":87,
32
  "POS=PRON":95,
33
  "POS=PROPN":96,
 
9
  "POS=VERB":"",
10
  "POS=ADJ":"",
11
  "POS=PUNCT":"",
12
+ "POS=SPACE":"",
13
  "POS=AUX":"",
14
  "POS=PRON":"",
15
  "POS=PROPN":"",
 
29
  "POS=VERB":100,
30
  "POS=ADJ":84,
31
  "POS=PUNCT":97,
32
+ "POS=SPACE":103,
33
  "POS=AUX":87,
34
  "POS=PRON":95,
35
  "POS=PROPN":96,
morphologizer/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:8c2ae1ba9d29fb717d790832099bb54fcea0ad10956fdd6f5ced3acb926751e7
3
- size 7025
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9f20366c12c4f0c5f7cf7451500b669897b71f7b2e12cfb2e52efcd89a4acdfd
3
+ size 7413
ner/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:dfd7cd4d4e162097d1ada384513ce1eb0acd9392ded778d500c73b4b96dcd662
3
  size 6498672
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2898e2dc21075fd89f19e3dfe5a71950b02d1a0923ef3811dcd486d7fed76938
3
  size 6498672
ner/moves CHANGED
@@ -1 +1 @@
1
- moves {"0":{},"1":{"PS":16127,"QT":13487,"DT":10090,"OG":9793,"LC":8337,"TI":2926},"2":{"PS":16127,"QT":13487,"DT":10090,"OG":9793,"LC":8337,"TI":2926},"3":{"PS":16127,"QT":13487,"DT":10090,"OG":9793,"LC":8337,"TI":2926},"4":{"PS":16127,"QT":13487,"DT":10090,"OG":9793,"LC":8337,"TI":2926,"":1},"5":{"":1}} cfg neg_key
 
1
+ moves {"0":{},"1":{"PS":16134,"QT":13491,"DT":10094,"OG":9801,"LC":8341,"TI":2933},"2":{"PS":16134,"QT":13491,"DT":10094,"OG":9801,"LC":8341,"TI":2933},"3":{"PS":16134,"QT":13491,"DT":10094,"OG":9801,"LC":8341,"TI":2933},"4":{"PS":16134,"QT":13491,"DT":10094,"OG":9801,"LC":8341,"TI":2933,"":1},"5":{"":1}} cfg neg_key
parser/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:9db8d56b0f20a9a07d3303f08af6eb6195d17534b741f5a871344e5c261b583c
3
  size 305088
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3d503264583f74dad9660b68b8520590a98018d588a90f8a5bbf5e32b1ac8c87
3
  size 305088
parser/moves CHANGED
@@ -1 +1 @@
1
- moves {"0":{"":202496},"1":{"":70942},"2":{"compound":20632,"obj":19819,"nmod":17977,"acl":17852,"advcl":16547,"dislocated":15591,"advmod":15157,"nsubj":14107,"amod":13975,"ccomp":12588,"obl":9740,"det":4235,"cc":3779,"xcomp":3414,"nummod":2654,"dep":2533,"dislocated||conj":2002,"punct":1803,"csubj":1008,"iobj":823,"advmod||conj":769,"ccomp||conj":717,"cc||conj":705,"mark":635,"nmod||conj":606,"advcl||conj":586,"nsubj||conj":521,"acl||conj":425,"obj||conj":287,"amod||conj":278,"compound||conj":184,"xcomp||conj":169,"obl||conj":153,"dep||conj":64,"det||conj":51,"iobj||conj":31},"3":{"punct":31170,"conj":17608,"aux":16034,"fixed":2776,"case":1120,"appos":956,"flat":595,"advmod":311,"cop":268,"dep":0},"4":{"ROOT":23010}} cfg neg_key
 
1
+ moves {"0":{"":202502},"1":{"":72363},"2":{"compound":20632,"obj":19819,"nmod":17977,"acl":17852,"advcl":16547,"dislocated":15591,"advmod":15157,"nsubj":14107,"amod":13975,"ccomp":12588,"obl":9740,"det":4235,"cc":3779,"xcomp":3414,"nummod":2654,"dep":2539,"dislocated||conj":2002,"punct":1803,"csubj":1008,"iobj":823,"advmod||conj":769,"ccomp||conj":717,"cc||conj":705,"mark":635,"nmod||conj":606,"advcl||conj":586,"nsubj||conj":521,"acl||conj":425,"obj||conj":287,"amod||conj":278,"compound||conj":184,"xcomp||conj":169,"obl||conj":153,"dep||conj":64,"det||conj":51,"iobj||conj":31},"3":{"punct":31170,"conj":17608,"aux":16034,"fixed":2776,"dep":1443,"case":1120,"appos":956,"flat":595,"advmod":311,"cop":268},"4":{"ROOT":23010}} cfg neg_key
senter/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:a9e2869ae93bad2143792ee0cf797bc426911b072501a6123add592bff34380b
3
  size 219953
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:866dcfd531f5295c04fdbe3c1530253ad2ced34da3f8b72a5f1fa7e910fa9849
3
  size 219953
tagger/cfg CHANGED
@@ -1,5 +1,6 @@
1
  {
2
  "labels":[
 
3
  "ecs",
4
  "etm",
5
  "f",
 
1
  {
2
  "labels":[
3
+ "_SP",
4
  "ecs",
5
  "etm",
6
  "f",
tagger/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:6b006bb4f904849345d73ed1c32e2ae8752434feb5aab3db22b24983d76b6eb5
3
- size 766742
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:30eb986406370bc15c5ac2aa36a69283e65f5da92b608a7533f5e58fc2b2cabe
3
+ size 767130
tok2vec/model CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e2dca74026bd9983c45f1ef1861e1369954d038a4f0dcf91c15907e5a407262f
3
  size 6365604
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e1de95f6c5dcc381c5cb1a5c50b1840c1c4bc1a06a3a3a5a9bcbb75fa806490e
3
  size 6365604
vocab/strings.json CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:714088dede2bb5bd98640b938d6deaf0af668344611311bffb7ee591d329c691
3
- size 9962460
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6629d48d9e2b106d9424c6dc70e5979416c14410ee42e8d42f7f3234f9229f35
3
+ size 9944642