{ "lang":"da", "name":"dacy_small_trf", "version":"0.1.0", "description":"\n\n\n# DaCy small transformer\n\nDaCy is a Danish language processing framework with state-of-the-art pipelines as well as functionality for analysing Danish pipelines.\nDaCy's largest pipeline has achieved state-of-the-art performance on named entity recognition, part-of-speech tagging and dependency parsing for Danish on the DaNE dataset. Check out the [DaCy repository](https://github.com/centre-for-humanities-computing/DaCy) for material on how to use DaCy and reproduce the results.\nDaCy also contains guides on usage of the package as well as behavioural tests for biases and robustness of Danish NLP pipelines.\n ", "author":"Centre for Humanities Computing Aarhus", "email":"Kenneth.enevoldsen@cas.au.dk", "url":"https://chcaa.io/#/", "license":"Apache-2.0 License", "spacy_version":">=3.1.1,<3.2.0", "spacy_git_version":"ffaead8fe", "vectors":{ "width":0, "vectors":0, "keys":0, "name":null }, "labels":{ "transformer":[ ], "morphologizer":[ "AdpType=Prep|POS=ADP", "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN", "Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act", "POS=PROPN", "Definite=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part", "Definite=Def|Gender=Neut|Number=Sing|POS=NOUN", "POS=SCONJ", "Definite=Def|Gender=Com|Number=Sing|POS=NOUN", "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Act", "POS=ADV", "Number=Plur|POS=DET|PronType=Dem", "Degree=Pos|Number=Plur|POS=ADJ", "Definite=Ind|Gender=Com|Number=Plur|POS=NOUN", "POS=PUNCT", "POS=CCONJ", "Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ", "Degree=Cmp|POS=ADJ", "POS=PRON|PartType=Inf", "Gender=Com|Number=Sing|POS=DET|PronType=Ind", "Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ", "Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs", "Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN", "Definite=Def|Degree=Pos|Number=Sing|POS=ADJ", "Gender=Neut|Number=Sing|POS=DET|PronType=Dem", "Degree=Pos|POS=ADV", 
"Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part", "Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN", "POS=PRON|PronType=Dem", "NumType=Card|POS=NUM", "Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ", "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs", "Degree=Pos|Gender=Com|Number=Sing|POS=ADJ", "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs", "NumType=Ord|POS=ADJ", "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes", "Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act", "POS=VERB|VerbForm=Inf|Voice=Act", "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act", "POS=NOUN", "Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass", "POS=ADP|PartType=Inf", "Degree=Pos|POS=ADJ", "Definite=Def|Gender=Com|Number=Plur|POS=NOUN", "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs", "Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN", "POS=AUX|VerbForm=Inf|Voice=Act", "Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ", "Gender=Com|Number=Sing|POS=DET|PronType=Dem", "Number=Plur|POS=DET|PronType=Ind", "Gender=Com|Number=Sing|POS=PRON|PronType=Ind", "Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes", "POS=PART|PartType=Inf", "Gender=Neut|Number=Sing|POS=DET|PronType=Ind", "Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs", "Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN", "Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs", "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs", "Case=Nom|Gender=Com|POS=PRON|PronType=Ind", "Gender=Neut|Number=Sing|POS=PRON|PronType=Ind", "Mood=Imp|POS=VERB", "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs", "Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part", "POS=X", "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs", "Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN", "POS=VERB|Tense=Pres|VerbForm=Part", "Number=Plur|POS=PRON|PronType=Int,Rel", 
"POS=VERB|VerbForm=Inf|Voice=Pass", "Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN", "Degree=Cmp|POS=ADV", "POS=ADV|PartType=Inf", "Degree=Sup|POS=ADV", "Number=Plur|POS=PRON|PronType=Dem", "Number=Plur|POS=PRON|PronType=Ind", "Definite=Def|Gender=Neut|Number=Plur|POS=NOUN", "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs", "Case=Gen|POS=PROPN", "POS=ADP", "Degree=Cmp|Number=Plur|POS=ADJ", "Definite=Def|Degree=Sup|POS=ADJ", "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs", "Degree=Pos|Number=Sing|POS=ADJ", "Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes", "Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form", "Number=Plur|POS=PRON|PronType=Rcp", "Case=Gen|Degree=Cmp|POS=ADJ", "Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN", "Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs", "POS=INTJ", "Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs", "Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ", "Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form", "Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs", "Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs", "Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN", "Number=Sing|POS=PRON|PronType=Int,Rel", "Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form", "Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel", "Definite=Def|Degree=Sup|Number=Plur|POS=ADJ", "Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs", "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes", "Definite=Ind|Number=Sing|POS=NOUN", "Number=Plur|POS=VERB|Tense=Past|VerbForm=Part", "Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes", "POS=SYM", "Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs", 
"Degree=Sup|POS=ADJ", "Number=Plur|POS=DET|PronType=Ind|Style=Arch", "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Dem", "Foreign=Yes|POS=X", "POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs", "Gender=Neut|Number=Sing|POS=PRON|PronType=Dem", "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs", "Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN", "Case=Gen|POS=PRON|PronType=Int,Rel", "Gender=Com|Number=Sing|POS=PRON|PronType=Dem", "Abbr=Yes|POS=X", "Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN", "Definite=Def|Degree=Abs|POS=ADJ", "Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ", "Definite=Ind|POS=NOUN", "Gender=Com|Number=Plur|POS=NOUN", "Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs", "Gender=Com|POS=PRON|PronType=Int,Rel", "Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs", "Degree=Abs|POS=ADV", "POS=VERB|VerbForm=Ger", "POS=VERB|Tense=Past|VerbForm=Part", "Definite=Def|Degree=Sup|Number=Sing|POS=ADJ", "Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form", "Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ", "Case=Gen|Degree=Pos|Number=Plur|POS=ADJ", "Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs", "Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel", "POS=VERB|Tense=Pres", "Case=Gen|Number=Plur|POS=DET|PronType=Ind", "Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs", "POS=PRON|Person=2|Polite=Form|Poss=Yes|PronType=Prs", "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs", "POS=AUX|Tense=Pres|VerbForm=Part", "Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass", "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes", "Degree=Sup|Number=Plur|POS=ADJ", "Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs", "Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes", "Definite=Ind|Number=Plur|POS=NOUN", 
"Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part", "Mood=Imp|POS=AUX", "Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs", "Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs", "Definite=Def|Gender=Com|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part", "Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs", "Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind", "Case=Gen|POS=NOUN", "Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs", "POS=DET|PronType=Dem", "Definite=Def|Number=Plur|POS=NOUN" ], "parser":[ "ROOT", "acl:relcl", "advcl", "advmod", "amod", "appos", "aux", "case", "cc", "ccomp", "compound:prt", "conj", "cop", "dep", "det", "expl", "fixed", "flat", "iobj", "list", "mark", "nmod", "nmod:poss", "nsubj", "nummod", "obj", "obl", "obl:loc", "obl:tmod", "punct", "xcomp" ], "attribute_ruler":[ ], "lemmatizer":[ ], "ner":[ "LOC", "MISC", "ORG", "PER" ] }, "pipeline":[ "transformer", "morphologizer", "parser", "attribute_ruler", "lemmatizer", "ner" ], "components":[ "transformer", "morphologizer", "parser", "attribute_ruler", "lemmatizer", "ner" ], "disabled":[ ], "_sourced_vectors_hashes":{ }, "performance":{ "pos_acc":0.9583030655, "morph_acc":0.9570439246, "morph_per_feat":{ "Mood":{ "p":0.9950690335, "r":0.9618684461, "f":0.9781871062 }, "Tense":{ "p":0.9859922179, "r":0.9540662651, "f":0.9697665519 }, "VerbForm":{ "p":0.9823343849, "r":0.952876377, "f":0.9673811743 }, "Voice":{ "p":0.9938414165, "r":0.9648729447, "f":0.9791429655 }, "Definite":{ "p":0.9872480461, "r":0.9482418017, "f":0.9673518742 }, "Gender":{ "p":0.9793956044, "r":0.9478231971, "f":0.9633507853 }, "Number":{ "p":0.985179197, "r":0.9535732916, "f":0.9691186216 }, "AdpType":{ "p":1.0, "r":0.9752431477, "f":0.9874664279 }, "PartType":{ "p":1.0, "r":0.9675324675, "f":0.9834983498 }, "Case":{ "p":0.9934640523, "r":0.9605055292, "f":0.9767068273 }, "Person":{ "p":0.9908925319, "r":0.9662522202, "f":0.9784172662 }, "PronType":{ 
"p":0.9941077441, "r":0.9712171053, "f":0.9825291181 }, "NumType":{ "p":0.9791666667, "r":0.9337748344, "f":0.9559322034 }, "Degree":{ "p":0.9726708075, "r":0.943373494, "f":0.9577981651 }, "Reflex":{ "p":1.0, "r":1.0, "f":1.0 }, "Number[psor]":{ "p":1.0, "r":0.988372093, "f":0.9941520468 }, "Poss":{ "p":1.0, "r":0.9772727273, "f":0.9885057471 }, "Foreign":{ "p":0.8888888889, "r":0.8, "f":0.8421052632 }, "Abbr":{ "p":1.0, "r":0.4, "f":0.5714285714 }, "Style":{ "p":1.0, "r":1.0, "f":1.0 }, "Polite":{ "p":0.3333333333, "r":0.25, "f":0.2857142857 } }, "dep_uas":0.8492442546, "dep_las":0.8176199573, "dep_las_per_type":{ "advmod":{ "p":0.7724637681, "r":0.7528248588, "f":0.7625178827 }, "root":{ "p":0.8561403509, "r":0.865248227, "f":0.860670194 }, "nsubj":{ "p":0.8939393939, "r":0.8713080169, "f":0.8824786325 }, "case":{ "p":0.9141414141, "r":0.8942687747, "f":0.9040959041 }, "obl":{ "p":0.7286585366, "r":0.7433903577, "f":0.7359507313 }, "cc":{ "p":0.8486646884, "r":0.8313953488, "f":0.8399412628 }, "conj":{ "p":0.671957672, "r":0.6773333333, "f":0.6746347942 }, "obj":{ "p":0.8560747664, "r":0.8893203883, "f":0.8723809524 }, "aux":{ "p":0.8885542169, "r":0.860058309, "f":0.8740740741 }, "acl:relcl":{ "p":0.6936416185, "r":0.6486486486, "f":0.6703910615 }, "obl:loc":{ "p":0.7222222222, "r":0.7428571429, "f":0.7323943662 }, "det":{ "p":0.9346733668, "r":0.9192751236, "f":0.926910299 }, "amod":{ "p":0.8549488055, "r":0.8549488055, "f":0.8549488055 }, "nmod:poss":{ "p":0.75, "r":0.7128712871, "f":0.730964467 }, "ccomp":{ "p":0.6885245902, "r":0.6774193548, "f":0.6829268293 }, "nummod":{ "p":0.8181818182, "r":0.825, "f":0.8215767635 }, "flat":{ "p":0.8636363636, "r":0.880794702, "f":0.8721311475 }, "compound:prt":{ "p":0.6551724138, "r":0.4634146341, "f":0.5428571429 }, "advcl":{ "p":0.6967213115, "r":0.7327586207, "f":0.7142857143 }, "mark":{ "p":0.9018789144, "r":0.887063655, "f":0.8944099379 }, "cop":{ "p":0.8514285714, "r":0.8514285714, "f":0.8514285714 }, "dep":{ 
"p":0.1960784314, "r":0.3773584906, "f":0.2580645161 }, "nmod":{ "p":0.7197452229, "r":0.662109375, "f":0.6897253306 }, "iobj":{ "p":0.7333333333, "r":0.5, "f":0.5945945946 }, "xcomp":{ "p":0.6315789474, "r":0.406779661, "f":0.4948453608 }, "list":{ "p":0.3636363636, "r":0.2222222222, "f":0.275862069 }, "vocative":{ "p":0.0, "r":0.0, "f":0.0 }, "fixed":{ "p":0.8947368421, "r":0.8095238095, "f":0.85 }, "expl":{ "p":0.9090909091, "r":0.8823529412, "f":0.8955223881 }, "appos":{ "p":0.6097560976, "r":0.7575757576, "f":0.6756756757 }, "obl:tmod":{ "p":0.8, "r":0.2222222222, "f":0.347826087 }, "discourse":{ "p":0.0, "r":0.0, "f":0.0 } }, "sents_p":0.8603839442, "sents_r":0.8741134752, "sents_f":0.8671943712, "lemma_acc":0.8491041162, "ents_f":0.8231644261, "ents_p":0.81724846, "ents_r":0.8291666667, "ents_per_type":{ "PER":{ "p":0.9290322581, "r":0.8674698795, "f":0.8971962617 }, "ORG":{ "p":0.7619047619, "r":0.7111111111, "f":0.7356321839 }, "MISC":{ "p":0.6739130435, "r":0.8230088496, "f":0.7410358566 }, "LOC":{ "p":0.8818181818, "r":0.8738738739, "f":0.8778280543 } }, "transformer_loss":417466.8663170633, "morphologizer_loss":34589.6649030063, "parser_loss":151048.9837691551, "ner_loss":5460.9844742843 }, "sources":[ { "name":"UD Danish DDT v2.5", "url":"https://github.com/UniversalDependencies/UD_Danish-DDT", "license":"CC BY-SA 4.0", "author":"Johannsen, Anders; Mart\u00ednez Alonso, H\u00e9ctor; Plank, Barbara" }, { "name":"DaNE", "url":"https://github.com/alexandrainst/danlp/blob/master/docs/datasets.md#danish-dependency-treebank-dane", "license":"CC BY-SA 4.0", "author":"Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. 
Lidegaard, Anders S\u00f8gaard" }, { "name":"Maltehb/-l-ctra-danish-electra-small-cased", "author":"Malte H\u00f8jmark-Bertelsen", "url":"https://huggingface.co/Maltehb/-l-ctra-danish-electra-small-cased", "license":"CC BY 4.0" } ], "requirements":[ "spacy-transformers>=1.0.3,<1.1.0" ], "notes":"\n## Bias and Robustness\n\nBesides the validation done by spaCy on the DaNE test set, DaCy also provides a series of augmentations to the DaNE test set to see how well the models deal with these types of augmentations.\nThese can be seen as behavioural probes akin to the NLP checklist.\n\n### Deterministic Augmentations\nDeterministic augmentations are augmentations that always yield the same result.\n\n| Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |\n| --- | --- | --- | --- | --- | --- | --- | --- |\n| No augmentation | 0.98 | 0.974 | 0.868 | 0.836 | 0.936 | 0.844 | 0.765 |\n| \u00c6\u00f8\u00e5 Augmentation | 0.955 | 0.948 | 0.823 | 0.783 | 0.922 | 0.754 | 0.718 |\n| Lowercase | 0.974 | 0.97 | 0.862 | 0.828 | 0.905 | 0.848 | 0.681 |\n| No Spacing | 0.229 | 0.229 | 0.004 | 0.003 | 0.824 | 0.225 | 0.048 |\n| Abbreviated first names | 0.979 | 0.973 | 0.864 | 0.832 | 0.94 | 0.845 | 0.699 |\n| Input size augmentation 5 sentences | 0.956 | 0.956 | 0.851 | 0.818 | 0.883 | 0.844 | 0.743 |\n| Input size augmentation 10 sentences | 0.959 | 0.958 | 0.853 | 0.821 | 0.897 | 0.844 | 0.755 |\n\n\n\n### Stochastic Augmentations\nStochastic augmentations are augmentations that are repeated multiple times to estimate their effect. Each value is the mean across repetitions, with the standard deviation in parentheses.\n\n| Augmentation | Part-of-speech tagging (Accuracy) | Morphological tagging (Accuracy) | Dependency Parsing (UAS) | Dependency Parsing (LAS) | Sentence segmentation (F1) | Lemmatization (Accuracy) | Named entity recognition (F1) |\n| --- | --- | --- | --- 
| --- | --- | --- | --- |\n| Keystroke errors 2% | 0.931 (0.003) | 0.929 (0.003) | 0.797 (0.003) | 0.753 (0.003) | 0.884 (0.003) | 0.772 (0.003) | 0.657 (0.003) |\n| Keystroke errors 5% | 0.859 (0.003) | 0.863 (0.003) | 0.699 (0.003) | 0.641 (0.003) | 0.824 (0.003) | 0.681 (0.003) | 0.53 (0.003) |\n| Keystroke errors 15% | 0.633 (0.006) | 0.662 (0.006) | 0.439 (0.006) | 0.358 (0.006) | 0.688 (0.006) | 0.459 (0.006) | 0.293 (0.006) |\n| Danish names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.943 (0.0) | 0.847 (0.0) | 0.748 (0.0) |\n| Muslim names | 0.979 (0.0) | 0.974 (0.0) | 0.865 (0.0) | 0.833 (0.0) | 0.94 (0.0) | 0.847 (0.0) | 0.732 (0.0) |\n| Female names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.946 (0.0) | 0.847 (0.0) | 0.754 (0.0) |\n| Male names | 0.979 (0.0) | 0.974 (0.0) | 0.867 (0.0) | 0.835 (0.0) | 0.943 (0.0) | 0.847 (0.0) | 0.748 (0.0) |\n| Spacing Augmentation 5% | 0.941 (0.002) | 0.936 (0.002) | 0.755 (0.002) | 0.725 (0.002) | 0.907 (0.002) | 0.811 (0.002) | 0.699 (0.002) |\n\n
\n\n### Description of Augmenters\n\n**No augmentation:**\nApplies no augmentation to the DaNE test set.\n\n**\u00c6\u00f8\u00e5 Augmentation:**\nThis augmentation replaces \u00e6, \u00f8, and \u00e5 with their spelling variations ae, oe and aa respectively.\n\n**Lowercase:**\nThis augmentation lowercases all text.\n\n**No Spacing:**\nThis augmentation removes all spacing from the text.\n\n**Abbreviated first names:**\nThis augmentation abbreviates the first names of entities. For instance, 'Kenneth Enevoldsen' would turn into 'K. Enevoldsen'.\n\n**Keystroke errors 2%:**\nThis augmentation simulates keystroke errors by replacing 2% of characters with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Keystroke errors 5%:**\nThis augmentation simulates keystroke errors by replacing 5% of characters with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Keystroke errors 15%:**\nThis augmentation simulates keystroke errors by replacing 15% of characters with a neighbouring key on a Danish QWERTY keyboard. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Danish names:**\nThis augmentation replaces all names with Danish names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Muslim names:**\nThis augmentation replaces all names with Muslim names derived from Meldgaard (2005). 
As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Female names:**\nThis augmentation replaces all names with Danish female names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Male names:**\nThis augmentation replaces all names with Danish male names derived from Danmarks Statistik (2021). As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n\n**Spacing Augmentation 5%:**\nThis augmentation randomly removes 5% of the spaces in the text. As this augmentation is stochastic, it is repeated 20 times to obtain a consistent estimate, and the mean is provided with its standard deviation in parentheses.\n
\n
\n\n\n### Hardware\nThis model was trained and run on a Quadro RTX 8000 GPU." }