tags:
  - spacy
  - token-classification
language:
  - da
license: apache-2.0
model-index:
  - name: da_dacy_small_DANSK_ner
    results:
      - task:
          name: NER
          type: token-classification
        metrics:
          - name: NER Precision
            type: precision
            value: 0.7718478986
          - name: NER Recall
            type: recall
            value: 0.7728790915
          - name: NER F Score
            type: f_score
            value: 0.7723631509

DaCy_small_DANSK_ner

DaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analyzing Danish pipelines. At the time of publishing this model, DaCy also includes the only models for fine-grained NER trained on the DANSK dataset, a dataset containing 18 annotation types in the same format as OntoNotes. Moreover, DaCy's largest pipeline has achieved state-of-the-art performance on named entity recognition, part-of-speech tagging, and dependency parsing for Danish on the DaNE dataset. Check out the DaCy repository for material on how to use DaCy and reproduce the results. DaCy also contains guides on usage of the package as well as behavioural tests for biases and robustness of Danish NLP pipelines.
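
A minimal sketch of running this pipeline on Danish text. It assumes the package has already been installed so that `spacy.load("da_dacy_small_DANSK_ner")` resolves, and that the installed spaCy falls in the version range listed in the table below. The helper names `load_pipeline` and `extract_entities` are hypothetical and introduced only for illustration; `spacy.load`, `doc.ents`, `ent.text`, and `ent.label_` are standard spaCy API.

```python
def load_pipeline(name="da_dacy_small_DANSK_ner"):
    """Load the installed pipeline by package name (assumed to be installed)."""
    import spacy  # imported lazily so extract_entities works without spaCy

    return spacy.load(name)


def extract_entities(nlp, text):
    """Run the pipeline on `text` and return (entity text, label) pairs."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]
```

With the model installed, `extract_entities(load_pipeline(), "Mette Frederiksen besøgte Aarhus i går.")` returns a list of pairs whose labels come from the scheme listed under Label Scheme.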

| Feature | Description |
| --- | --- |
| Name | da_dacy_small_DANSK_ner |
| Version | 0.1.0 |
| spaCy | >=3.5.0,<3.6.0 |
| Default Pipeline | transformer, ner |
| Components | transformer, ner |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | DANSK - Danish Annotations for NLP Specific TasKs<br>jonfd/electra-small-nordic (Jón Daðason) |
| License | apache-2.0 |
| Author | Centre for Humanities Computing Aarhus |
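
The spaCy requirement above (`>=3.5.0,<3.6.0`) can be checked before loading the pipeline. The sketch below uses only the standard library; the function names are hypothetical, and it assumes plain numeric version strings such as `3.5.4` (pre-release suffixes like `3.6.0a1` are not handled).

```python
def parse_version(version):
    """Turn '3.5.4' into (3, 5, 4); assumes purely numeric components."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_supported_spacy(version):
    """True if `version` satisfies >=3.5.0,<3.6.0, the range this card lists."""
    return (3, 5, 0) <= parse_version(version) < (3, 6, 0)
```

In practice the argument would be `spacy.__version__`, for example `is_supported_spacy(spacy.__version__)`.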

Label Scheme

18 labels for 1 component:

| Component | Labels |
| --- | --- |
| ner | CARDINAL, DATE, EVENT, FACILITY, GPE, LANGUAGE, LAW, LOCATION, MONEY, NORP, ORDINAL, ORGANIZATION, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK OF ART |
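
When post-processing predictions, it can help to keep the label scheme above as a constant and group entities by type. The following is a sketch under that assumption; `DANSK_NER_LABELS` and `group_by_label` are hypothetical names, and the input is assumed to be a list of `(text, label)` pairs.

```python
from collections import defaultdict

# The 18 labels listed for the ner component on this card.
DANSK_NER_LABELS = frozenset({
    "CARDINAL", "DATE", "EVENT", "FACILITY", "GPE", "LANGUAGE",
    "LAW", "LOCATION", "MONEY", "NORP", "ORDINAL", "ORGANIZATION",
    "PERCENT", "PERSON", "PRODUCT", "QUANTITY", "TIME", "WORK OF ART",
})


def group_by_label(entities):
    """Group (text, label) pairs by label, rejecting labels outside the scheme."""
    grouped = defaultdict(list)
    for text, label in entities:
        if label not in DANSK_NER_LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        grouped[label].append(text)
    return dict(grouped)
```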

Accuracy

| Type | Score |
| --- | --- |
| ENTS_F | 77.24 |
| ENTS_P | 77.18 |
| ENTS_R | 77.29 |
| TRANSFORMER_LOSS | 80975.57 |
| NER_LOSS | 90852.49 |
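
The F score reported above is the harmonic mean of precision and recall. As a short worked example, the sketch below recomputes it from the unrounded precision and recall values given in this card's metadata; the function name `f_score` is chosen here for illustration.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded values from this card's metadata.
ner_precision = 0.7718478986
ner_recall = 0.7728790915
ner_f = f_score(ner_precision, ner_recall)

# The accuracy table shows the same number as a percentage with two decimals.
print(f"ENTS_F {ner_f * 100:.2f}")
```

The result agrees with the reported NER F Score of 0.7723631509, which rounds to 77.24.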