tags:
  - spacy
  - dacy
  - danish
  - named entity recognition
  - pos tagging
  - lemmatization
  - dependency parsing
  - coreference resolution
  - named entity linking
  - named entity disambiguation
  - token-classification
language:
  - da
license: apache-2.0
model-index:
  - name: da_dacy_medium_trf
    results:
      - task:
          name: NER
          type: token-classification
        metrics:
          - name: NER Precision
            type: precision
            value: 0.8708487085
          - name: NER Recall
            type: recall
            value: 0.8458781362
          - name: NER F Score
            type: f_score
            value: 0.8581818182
        dataset:
          name: DaNE
          split: test
          type: dane
      - task:
          name: TAG
          type: token-classification
        metrics:
          - name: TAG (XPOS) Accuracy
            type: accuracy
            value: 0.9847290149
        dataset:
          name: UD Danish DDT
          split: test
          type: universal_dependencies
          config: da_ddt
      - task:
          name: POS
          type: token-classification
        metrics:
          - name: POS (UPOS) Accuracy
            type: accuracy
            value: 0.985677928
        dataset:
          name: UD Danish DDT
          split: test
          type: universal_dependencies
          config: da_ddt
      - task:
          name: MORPH
          type: token-classification
        metrics:
          - name: Morph (UFeats) Accuracy
            type: accuracy
            value: 0.9814371257
        dataset:
          name: UD Danish DDT
          split: test
          type: universal_dependencies
          config: da_ddt
      - task:
          name: UNLABELED_DEPENDENCIES
          type: token-classification
        metrics:
          - name: Unlabeled Attachment Score (UAS)
            type: f_score
            value: 0.9083920564
        dataset:
          name: UD Danish DDT
          split: test
          type: universal_dependencies
          config: da_ddt
      - task:
          name: LABELED_DEPENDENCIES
          type: token-classification
        metrics:
          - name: Labeled Attachment Score (LAS)
            type: f_score
            value: 0.883349834
        dataset:
          name: UD Danish DDT
          split: test
          type: universal_dependencies
          config: da_ddt
      - task:
          name: SENTS
          type: token-classification
        metrics:
          - name: Sentences F-Score
            type: f_score
            value: 0.9885462555
        dataset:
          name: UD Danish DDT
          split: test
          type: universal_dependencies
          config: da_ddt
      - task:
          name: coreference-resolution
          type: coreference-resolution
        metrics:
          - name: LEA
            type: f_score
            value: 0.4118366346
        dataset:
          name: DaCoref
          type: alexandrainst/dacoref
          split: custom
      - task:
          name: named-entity-linking
          type: named-entity-linking
        metrics:
          - name: Named entity Linking Precision
            type: precision
            value: 0.9923076923
          - name: Named entity Linking Recall
            type: recall
            value: 0.671875
          - name: Named entity Linking F Score
            type: f_score
            value: 0.801242236
        dataset:
          name: DaNED
          type: named-entity-linking
          split: custom
library_name: spacy
datasets:
  - universal_dependencies
  - alexandrainst/dacoref
  - dane
metrics:
  - accuracy

DaCy medium

DaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines. DaCy's largest pipeline has achieved state-of-the-art performance on part-of-speech tagging and dependency parsing for Danish on the Danish Dependency Treebank, as well as competitive performance on named entity recognition, named entity disambiguation and coreference resolution. To read more, check out the DaCy repository for material on how to use DaCy and reproduce the results. DaCy also contains guides on how to use the package as well as behavioural tests for biases and robustness of Danish NLP pipelines.

| Feature | Description |
| --- | --- |
| Name | da_dacy_medium_trf |
| Version | 0.2.0 |
| spaCy | >=3.5.2,<3.6.0 |
| Default Pipeline | transformer, tagger, morphologizer, trainable_lemmatizer, parser, ner, coref, span_resolver, span_cleaner, entity_linker |
| Components | transformer, tagger, morphologizer, trainable_lemmatizer, parser, ner, coref, span_resolver, span_cleaner, entity_linker |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | UD Danish DDT v2.11 (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />DaNE (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />DaCoref (Buch-Kromann, Matthias)<br />DaNED (Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & Søgaard, A.)<br />vesteinn/DanskBERT (Vésteinn Snæbjarnarson) |
| License | Apache-2.0 License |
| Author | Kenneth Enevoldsen |
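
The spaCy requirement in the table above (`>=3.5.2,<3.6.0`) is a version range, so it can help to check an installed version against it before loading the pipeline. The following is a minimal sketch using only the standard library. The helper names `parse_version` and `satisfies` are hypothetical and not part of spaCy or DaCy, and only plain numeric versions with `>=` and `<` bounds are handled.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain numeric version such as '3.5.2' into (3, 5, 2)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str = ">=3.5.2,<3.6.0") -> bool:
    """Check a version against a comma-separated list of '>=' and '<' bounds.

    The default spec is the spaCy range listed in the feature table.
    """
    current = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            # Inclusive lower bound.
            if current < parse_version(clause[2:]):
                return False
        elif clause.startswith("<") and not clause.startswith("<="):
            # Exclusive upper bound.
            if current >= parse_version(clause[1:]):
                return False
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


for candidate in ("3.4.4", "3.5.2", "3.5.4", "3.6.0"):
    print(candidate, satisfies(candidate))
```

In practice the string passed in would be the installed spaCy version (for example `spacy.__version__`). A full specifier parser such as the one in the `packaging` library covers more cases than this sketch does.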

Label Scheme

View label scheme (211 labels for 4 components)
Component Labels
tagger ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X
morphologizer AdpType=Prep|POS=ADP, Definite=Ind|Gender=Com|Number=Sing|POS=NOUN, Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act, POS=PROPN, Definite=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part, Definite=Def|Gender=Neut|Number=Sing|POS=NOUN, POS=SCONJ, Definite=Def|Gender=Com|Number=Sing|POS=NOUN, Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Act, POS=ADV, Number=Plur|POS=DET|PronType=Dem, Degree=Pos|Number=Plur|POS=ADJ, Definite=Ind|Gender=Com|Number=Plur|POS=NOUN, POS=PUNCT, NumType=Ord|POS=ADJ, POS=CCONJ, Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN, POS=VERB|VerbForm=Inf|Voice=Act, Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs, Degree=Sup|POS=ADV, Degree=Pos|POS=ADV, Gender=Com|Number=Sing|POS=DET|PronType=Ind, Number=Plur|POS=DET|PronType=Ind, POS=ADP, POS=ADV|PartType=Inf, Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs, Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act, Definite=Def|Degree=Pos|Number=Sing|POS=ADJ, Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs, Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act, POS=ADP|PartType=Inf, Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ, NumType=Card|POS=NUM, Degree=Pos|POS=ADJ, Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part, POS=PART|PartType=Inf, Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes, Definite=Def|Gender=Com|Number=Plur|POS=NOUN, Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN, Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs, POS=VERB|Tense=Pres|VerbForm=Part, Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs, Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN, Definite=Def|Degree=Sup|Number=Plur|POS=ADJ, Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs, POS=AUX|VerbForm=Inf|Voice=Act, Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ, Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ, Degree=Cmp|POS=ADJ, POS=PRON|PartType=Inf, Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ, 
Case=Nom|Gender=Com|POS=PRON|PronType=Ind, Number=Plur|POS=PRON|PronType=Ind, POS=INTJ, Gender=Com|Number=Sing|POS=DET|PronType=Dem, Case=Gen|Number=Plur|POS=DET|PronType=Ind, Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass, Definite=Def|Gender=Neut|Number=Plur|POS=NOUN, Degree=Cmp|POS=ADV, Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form, Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs, Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, Case=Gen|POS=PROPN, Gender=Neut|Number=Sing|POS=PRON|PronType=Ind, Number=Plur|POS=VERB|Tense=Past|VerbForm=Part, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs, Definite=Def|Degree=Sup|POS=ADJ, Gender=Neut|Number=Sing|POS=DET|PronType=Ind, Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN, Gender=Neut|Number=Sing|POS=DET|PronType=Dem, Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part, POS=PRON|PronType=Dem, Degree=Pos|Gender=Com|Number=Sing|POS=ADJ, Number=Plur|POS=NUM, POS=VERB|VerbForm=Inf|Voice=Pass, Definite=Def|Degree=Sup|Number=Sing|POS=ADJ, Number=Sing|POS=PRON|PronType=Int,Rel, Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs, POS=PRON, Definite=Ind|Number=Sing|POS=NOUN, Definite=Ind|Number=Sing|POS=NUM, Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN, Foreign=Yes|POS=ADV, POS=NOUN, Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN, Gender=Com|Number=Plur|POS=NOUN, Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel, Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs, Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|POS=PRON|PronType=Ind, Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN, 
Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ, Degree=Sup|POS=ADJ, Degree=Pos|Number=Sing|POS=ADJ, Mood=Imp|POS=VERB, Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs, Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs, POS=X, Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN, Number=Plur|POS=PRON|PronType=Dem, Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs, Number=Plur|POS=PRON|PronType=Int,Rel, Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, Degree=Cmp|Number=Plur|POS=ADJ, Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form, Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs, Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs, Gender=Com|POS=PRON|PronType=Int,Rel, Case=Gen|Degree=Pos|Number=Plur|POS=ADJ, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, POS=VERB|VerbForm=Ger, Gender=Com|Number=Sing|POS=PRON|PronType=Dem, Case=Gen|POS=PRON|PronType=Int,Rel, Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass, Abbr=Yes|POS=X, Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN, Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs, Definite=Ind|Number=Plur|POS=NOUN, Foreign=Yes|POS=X, Number=Plur|POS=PRON|PronType=Rcp, Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs, Case=Gen|Degree=Cmp|POS=ADJ, Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN, Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs, Gender=Neut|Number=Sing|POS=PRON|PronType=Dem, Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form, Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form, Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, 
Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs, Case=Gen|Number=Plur|POS=PRON|PronType=Rcp, POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs, POS=SYM, POS=DET|PronType=Dem, Gender=Com|Number=Sing|POS=NUM, Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs, Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part, Definite=Def|Degree=Abs|POS=ADJ, POS=VERB|Tense=Pres, Definite=Ind|Gender=Neut|Number=Sing|POS=NUM, Degree=Abs|POS=ADV, Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ, Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel, POS=VERB|Tense=Past|VerbForm=Part, Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs, Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs, Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs, Definite=Ind|POS=NOUN, Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind, Definite=Ind|Gender=Com|Number=Sing|POS=NUM, Definite=Def|Number=Plur|POS=NOUN, Case=Gen|POS=NOUN, POS=AUX|Tense=Pres|VerbForm=Part
parser ROOT, acl:relcl, advcl, advmod, advmod:lmod, amod, appos, aux, case, cc, ccomp, compound:prt, conj, cop, dep, det, expl, fixed, flat, iobj, list, mark, nmod, nmod:poss, nsubj, nummod, obj, obl, obl:lmod, obl:tmod, punct, xcomp
ner LOC, MISC, ORG, PER
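
The morphologizer labels above follow the Universal Dependencies feature notation: `Name=Value` pairs joined by `|`, where one feature can carry several comma-separated values (as in `PronType=Int,Rel`). As a rough illustration of how such a label string breaks down, here is a small standard-library sketch. The function name `parse_morph_label` is hypothetical; it only parses the label strings listed here and does not call spaCy.

```python
def parse_morph_label(label: str) -> dict[str, list[str]]:
    """Split a label such as 'Number=Sing|POS=PRON|PronType=Int,Rel'
    into a mapping from feature name to its list of values."""
    features: dict[str, list[str]] = {}
    for pair in label.split("|"):
        # partition keeps bracketed names such as 'Number[psor]' intact.
        name, _, value = pair.partition("=")
        features[name] = value.split(",")
    return features


noun = parse_morph_label("Definite=Ind|Gender=Com|Number=Sing|POS=NOUN")
pron = parse_morph_label("Number=Sing|POS=PRON|PronType=Int,Rel")

print(noun["Gender"])    # values of the Gender feature
print(pron["PronType"])  # a feature with two values
```

Note that the `POS=...` entry is part of each label in this scheme, so the coarse part-of-speech tag is predicted together with the morphological features.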

Performance Metrics

| Type | Score |
| --- | --- |
| TOKEN_ACC | 99.92 |
| TOKEN_P | 99.70 |
| TOKEN_R | 99.77 |
| TOKEN_F | 99.74 |
| SENTS_P | 98.42 |
| SENTS_R | 99.29 |
| SENTS_F | 98.85 |
| TAG_ACC | 98.47 |
| POS_ACC | 98.57 |
| MORPH_ACC | 98.14 |
| MORPH_MICRO_P | 99.10 |
| MORPH_MICRO_R | 98.77 |
| MORPH_MICRO_F | 98.93 |
| DEP_UAS | 90.84 |
| DEP_LAS | 88.33 |
| ENTS_P | 87.08 |
| ENTS_R | 84.59 |
| ENTS_F | 85.82 |
| COREF_LEA_F1 | 41.18 |
| COREF_LEA_PRECISION | 48.89 |
| COREF_LEA_RECALL | 35.58 |
| NEL_SCORE | 80.12 |
| NEL_MICRO_P | 99.23 |
| NEL_MICRO_R | 67.19 |
| NEL_MICRO_F | 80.12 |
| NEL_MACRO_P | 99.39 |
| NEL_MACRO_R | 65.99 |
| NEL_MACRO_F | 78.15 |
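
The F-scores reported for NER (`ENTS_F`) and micro-averaged entity linking (`NEL_MICRO_F`) are the harmonic mean of the corresponding precision and recall. As a sanity check, the sketch below recomputes both from the unrounded precision and recall values given in the model-index metadata of this card. The function name `f_score` is an illustrative helper, not a spaCy API.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded precision/recall values from the model-index metadata.
ner_f = f_score(0.8708487085, 0.8458781362)
nel_f = f_score(0.9923076923, 0.671875)

# Expressed as percentages with two decimals, as in the table above.
print(f"ENTS_F      {ner_f * 100:.2f}")
print(f"NEL_MICRO_F {nel_f * 100:.2f}")
```

Both recomputed values agree with the table, which suggests the table entries are the metadata values multiplied by 100 and rounded to two decimals.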

Training

This model was trained using spaCy and logged to Weights & Biases. You can find all the training logs here.