metadata
language: da
tags:
  - danish
  - bert
  - masked-lm
  - botxo
license: cc-by-4.0
datasets:
  - common_crawl
  - wikipedia
  - dindebat.dk
  - hestenettet.dk
  - danish OpenSubtitles
pipeline_tag: token-classification
widget:
  - text: Jens er en konge og er født i Danmark.

Danish BERT (version 2, uncased) by BotXO.ai, fine-tuned for Named Entity Recognition on the DaNE dataset (Hvingelby et al., 2020) by Malte Højmark-Bertelsen.

All credit goes to BotXO.ai, who developed Danish BERT. For data and training details, see their GitHub repository or this article.

The model is available in both TensorFlow and PyTorch formats.

The original TensorFlow version can be downloaded using this link.

Here is an example of how to load Danish BERT in PyTorch using the 🤗 Transformers library:

from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("Maltehb/danish-bert-botxo-ner-dane")
model = AutoModelForTokenClassification.from_pretrained("Maltehb/danish-bert-botxo-ner-dane")
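
Once the model loads, the usual next step is to run it on text. The following is a minimal sketch rather than the author's own usage. It assumes a Transformers release that supports the `aggregation_strategy` argument of `pipeline`, and that the model's config carries readable label names. `extract_entities`, `MODEL_NAME`, and `min_score` are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Model identifier taken from the loading example above.
MODEL_NAME = "Maltehb/danish-bert-botxo-ner-dane"


def extract_entities(
    text: str,
    ner: Optional[Callable] = None,
    min_score: float = 0.0,
) -> List[Tuple[str, str]]:
    """Return (entity text, label) pairs found in `text`.

    `ner` is any callable that behaves like a Transformers
    token-classification pipeline. When omitted, one is built from
    MODEL_NAME, which downloads the weights on first use.
    """
    if ner is None:
        # Imported lazily so the helper can be exercised without the library.
        from transformers import pipeline

        # "simple" aggregation merges word pieces into whole entity spans.
        ner = pipeline(
            "token-classification",
            model=MODEL_NAME,
            aggregation_strategy="simple",
        )
    return [
        (entity["word"], entity["entity_group"])
        for entity in ner(text)
        if entity["score"] >= min_score
    ]
```

Calling `extract_entities("Jens er en konge og er født i Danmark.")` would build the pipeline and return whatever spans the model predicts. Because the underlying model is uncased, the returned entity text may come back lowercased.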

References

Danish BERT. (2020). BotXO. https://github.com/botxo/nordic_bert (Original work published 2019)

Hvingelby, R., Pauli, A. B., Barrett, M., Rosted, C., Lidegaard, L. M., & Søgaard, A. (2020). DaNE: A Named Entity Resource for Danish. Proceedings of the 12th Language Resources and Evaluation Conference, 4597–4604. https://www.aclweb.org/anthology/2020.lrec-1.565

Contact

For help or further information, feel free to contact the author, Malte Højmark-Bertelsen, at hjb@kmd.dk or on any of the following platforms:

MalteHB | Twitter
MalteHB | LinkedIn
MalteHB | Instagram