metadata
tags:
  - spacy
  - token-classification
language:
  - ja
license: CC-BY-SA-4.0
model-index:
  - name: ja_gsd_bert_wwm_unidic_lite
    results:
      - tasks:
          name: NER
          type: token-classification
          metrics:
            - name: Precision
              type: precision
              value: 0.8496143959
            - name: Recall
              type: recall
              value: 0.8314465409
            - name: F Score
              type: f_score
              value: 0.840432295
      - tasks:
          name: POS
          type: token-classification
          metrics:
            - name: Accuracy
              type: accuracy
              value: 0
      - tasks:
          name: SENTER
          type: token-classification
          metrics:
            - name: Precision
              type: precision
              value: 0.9201520913
            - name: Recall
              type: recall
              value: 0.9546351085
            - name: F Score
              type: f_score
              value: 0.9370764763
      - tasks:
          name: UNLABELED_DEPENDENCIES
          type: token-classification
          metrics:
            - name: Accuracy
              type: accuracy
              value: 0.9367795389
      - tasks:
          name: LABELED_DEPENDENCIES
          type: token-classification
          metrics:
            - name: Accuracy
              type: accuracy
              value: 0.9261

Japanese transformer pipeline (bert-base). Components: transformer, parser, ner.
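As a rough illustration of how the three components fit together, the sketch below loads the pipeline through the standard `spacy.load` entry point and collects what the parser and ner components produce. It assumes the `ja_gsd_bert_wwm_unidic_lite` package and its dependencies are already installed; the helper names `summarize` and `analyze` are hypothetical and not part of this package.

```python
def summarize(doc):
    """Collect sentences, entities and dependency arcs from a spaCy Doc."""
    return {
        # Sentence boundaries come from the dependency parser.
        "sentences": [sent.text for sent in doc.sents],
        # Named entities come from the ner component.
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
        # One (token, relation, head) triple per token.
        "arcs": [(tok.text, tok.dep_, tok.head.text) for tok in doc],
    }


def analyze(text, model_name="ja_gsd_bert_wwm_unidic_lite"):
    """Run the pipeline on `text` (assumes the model package is installed)."""
    # Imported lazily so summarize() stays usable without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    return summarize(nlp(text))
```

Once the package is installed, a call such as `analyze("銀座でランチをご一緒しましょう。")` should return a dictionary with the sentence list, the recognized entities, and the dependency arcs for that input.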

| Feature | Description |
| --- | --- |
| Name | ja_gsd_bert_wwm_unidic_lite |
| Version | 3.1.1 |
| spaCy | >=3.1.0,<3.2.0 |
| Default Pipeline | transformer, parser, ner |
| Components | transformer, parser, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | UD_Japanese-GSD, UD_Japanese-GSD r2.8+NE, SudachiDict_core, cl-tohoku/bert-base-japanese-whole-word-masking, unidic_lite |
| License | CC BY-SA 4.0 |
| Author | Megagon Labs Tokyo. |
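The spaCy requirement above is a half-open version range. A minimal sketch of checking a version string against it, using only the standard library, might look like this; `parse_version` and `is_supported` are hypothetical helpers, and the parsing assumes plain numeric release strings such as `3.1.4`.

```python
def parse_version(version):
    """Turn '3.1.4' into (3, 1, 4); assumes a plain numeric release string."""
    return tuple(int(part) for part in version.split("."))


def is_supported(spacy_version, lower="3.1.0", upper="3.2.0"):
    """Return True if lower <= spacy_version < upper (the range listed above)."""
    return parse_version(lower) <= parse_version(spacy_version) < parse_version(upper)


for candidate in ("3.0.6", "3.1.0", "3.1.4", "3.2.0"):
    print(candidate, is_supported(candidate))
```

In practice the argument would be `spacy.__version__`. Pre-release or development version strings are not handled by this sketch; a full specifier parser such as the one in the `packaging` library is the more robust choice for those.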

Label Scheme

45 labels for 2 components:

| Component | Labels |
| --- | --- |
| parser | ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct |
| ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART |

Accuracy

| Type | Score |
| --- | --- |
| DEP_UAS | 93.68 |
| DEP_LAS | 92.61 |
| SENTS_P | 92.02 |
| SENTS_R | 95.46 |
| SENTS_F | 93.71 |
| ENTS_F | 84.04 |
| ENTS_P | 84.96 |
| ENTS_R | 83.14 |
| TAG_ACC | 0.00 |
| TRANSFORMER_LOSS | 28861.67 |
| PARSER_LOSS | 1306248.63 |
| NER_LOSS | 13993.36 |
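The F scores reported here are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The sketch below recomputes them from the unrounded precision and recall values in the metadata block; the function name `f_score` is a hypothetical helper, and the assumption is that the reported scores are balanced (F1) scores.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) taken from the metadata in this card.
reported = {
    "NER": (0.8496143959, 0.8314465409, 0.840432295),
    "SENTER": (0.9201520913, 0.9546351085, 0.9370764763),
}

for name, (p, r, f) in reported.items():
    print(f"{name}: computed F = {f_score(p, r):.4f}, reported F = {f:.4f}")
```

The scores in the accuracy listing are these same values multiplied by 100 and rounded to two decimals.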