etikaj-digital committed on
Commit
936fd34
1 Parent(s): 554cbc5

Update spaCy pipeline

Files changed (3)
  1. README.md +7 -23
  2. en_statistics-any-py3-none-any.whl +2 -2
  3. meta.json +2 -2
README.md CHANGED
@@ -1,7 +1,7 @@
  ---
  tags:
  - spacy
- - text-classification
+ - token-classification
  language:
  - en
  license: mit
@@ -9,46 +9,30 @@ model-index:
  - name: en_statistics
  results: []
  ---
- English pipeline that provides statistics, readability and formality scores.
+ Text statistics including readability and formality.

  | Feature | Description |
  | --- | --- |
  | **Name** | `en_statistics` |
  | **Version** | `0.0.1` |
  | **spaCy** | `>=3.1.1,<3.2.0` |
- | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner`, `syllables`, `formality`, `readability` |
- | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner`, `syllables`, `formality`, `readability` |
+ | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `syllables`, `formality`, `readability` |
+ | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `syllables`, `formality`, `readability` |
  | **Vectors** | 684830 keys, 20000 unique vectors (300 dimensions) |
  | **Sources** | [OntoNotes 5](https://catalog.ldc.upenn.edu/LDC2013T19) (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston)<br />[ClearNLP Constituent-to-Dependency Conversion](https://github.com/clir/clearnlp-guidelines/blob/master/md/components/dependency_conversion.md) (Emory University)<br />[WordNet 3.0](https://wordnet.princeton.edu/) (Princeton University)<br />[GloVe Common Crawl](https://nlp.stanford.edu/projects/glove/) (Jeffrey Pennington, Richard Socher, and Christopher D. Manning) |
  | **License** | `MIT` |
- | **Author** | [Contentologie]() |
+ | **Author** | [Chris Knowles](https://explosion.ai) |

  ### Label Scheme

  <details>

- <summary>View label scheme (114 labels for 4 components)</summary>
+ <summary>View label scheme (96 labels for 3 components)</summary>

  | Component | Labels |
  | --- | --- |
  | **`tagger`** | `$`, `''`, `,`, `-LRB-`, `-RRB-`, `.`, `:`, `ADD`, `AFX`, `CC`, `CD`, `DT`, `EX`, `FW`, `HYPH`, `IN`, `JJ`, `JJR`, `JJS`, `LS`, `MD`, `NFP`, `NN`, `NNP`, `NNPS`, `NNS`, `PDT`, `POS`, `PRP`, `PRP$`, `RB`, `RBR`, `RBS`, `RP`, `SYM`, `TO`, `UH`, `VB`, `VBD`, `VBG`, `VBN`, `VBP`, `VBZ`, `WDT`, `WP`, `WP$`, `WRB`, `XX`, ```` |
  | **`parser`** | `ROOT`, `acl`, `acomp`, `advcl`, `advmod`, `agent`, `amod`, `appos`, `attr`, `aux`, `auxpass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `csubj`, `csubjpass`, `dative`, `dep`, `det`, `dobj`, `expl`, `intj`, `mark`, `meta`, `neg`, `nmod`, `npadvmod`, `nsubj`, `nsubjpass`, `nummod`, `oprd`, `parataxis`, `pcomp`, `pobj`, `poss`, `preconj`, `predet`, `prep`, `prt`, `punct`, `quantmod`, `relcl`, `xcomp` |
  | **`senter`** | `I`, `S` |
- | **`ner`** | `CARDINAL`, `DATE`, `EVENT`, `FAC`, `GPE`, `LANGUAGE`, `LAW`, `LOC`, `MONEY`, `NORP`, `ORDINAL`, `ORG`, `PERCENT`, `PERSON`, `PRODUCT`, `QUANTITY`, `TIME`, `WORK_OF_ART` |

- </details>
-
- ### Accuracy
-
- | Type | Score |
- | --- | --- |
- | `TOKEN_ACC` | 99.93 |
- | `TAG_ACC` | 97.28 |
- | `DEP_UAS` | 91.87 |
- | `DEP_LAS` | 90.05 |
- | `ENTS_P` | 85.37 |
- | `ENTS_R` | 84.57 |
- | `ENTS_F` | 84.97 |
- | `SENTS_P` | 90.49 |
- | `SENTS_R` | 88.01 |
- | `SENTS_F` | 89.24 |
+ </details>
en_statistics-any-py3-none-any.whl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b73ccfb1dd1ed29dd8ae29fe0627a9da0201f674787cb1470b84a82e904a8669
- size 45406670
+ oid sha256:8cb9bd80efd692442ffe255f592520008145d7915bf6078b1adcbbb4744ed50a
+ size 38942853
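
The wheel itself is stored in Git LFS, so the diff above only shows the pointer file: a spec version line, a SHA-256 object ID, and a byte size. A minimal sketch of reading those two pointers and comparing their sizes follows. The helper name `parse_lfs_pointer` is hypothetical and is not part of any Git LFS tooling, and linking the smaller wheel to the `ner` component dropped from the README in this commit is an assumption, not something the diff states.

```python
# Sketch only: parse_lfs_pointer is a hypothetical helper, not a Git LFS API.

def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid value has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the old and new sides of the diff.
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b73ccfb1dd1ed29dd8ae29fe0627a9da0201f674787cb1470b84a82e904a8669
size 45406670
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8cb9bd80efd692442ffe255f592520008145d7915bf6078b1adcbbb4744ed50a
size 38942853
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

# A negative delta means the new wheel is smaller than the old one.
size_delta = new["size"] - old["size"]
print(f"wheel size changed by {size_delta} bytes")
```

Because the pointer carries only a hash and a size, checking what actually changed inside the wheel would require downloading both LFS objects.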
meta.json CHANGED
@@ -181,7 +181,7 @@
  "author":"Jeffrey Pennington, Richard Socher, and Christopher D. Manning"
  }
  ],
- "author_email":"knowles.chris.d@gmail.com",
+ "author_email":"hello@valurank.com",
  "performance":{

  },
@@ -189,4 +189,4 @@
  "requirements":[

  ]
- }
+ }