jannikskytt committed
Commit 058370d
Parent: 4767f71

Update README.md

Files changed (1)
  1. README.md +47 -3
README.md CHANGED
@@ -1,3 +1,47 @@
- ---
- license: cc-by-nc-3.0
- ---
+ ---
+ license: cc-by-nc-3.0
+ language:
+ - da
+ pipeline_tag: fill-mask
+ tags:
+ - bert
+ - danish
+ widget:
+ - text: Hvide blodlegemer beskytter kroppen mod [MASK]
+ ---
+ 
+ # Danish medical BERT
+ 
+ MeDa-BERT was initialized with weights from a [pretrained Danish BERT model](https://huggingface.co/Maltehb/danish-bert-botxo) and further pretrained for 48 epochs with the masked language modeling (MLM) objective on a Danish medical corpus of 123M tokens.
+ 
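+ As a rough illustration of such continued MLM pretraining with 🤗Transformers (not the authors' actual training code), a sketch might look like the following; the corpus file name `danish_medical_corpus.txt` and all settings except the epoch count are hypothetical:
+ 
+ ```python
+ from datasets import load_dataset
+ from transformers import (AutoModelForMaskedLM, AutoTokenizer,
+                           DataCollatorForLanguageModeling, Trainer,
+                           TrainingArguments)
+ 
+ # Start from the Danish BERT checkpoint named above.
+ tokenizer = AutoTokenizer.from_pretrained("Maltehb/danish-bert-botxo")
+ model = AutoModelForMaskedLM.from_pretrained("Maltehb/danish-bert-botxo")
+ 
+ # Hypothetical plain-text file standing in for the medical corpus.
+ corpus = load_dataset("text", data_files={"train": "danish_medical_corpus.txt"})
+ tokenized = corpus.map(
+     lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
+     batched=True, remove_columns=["text"])
+ 
+ # The collator applies BERT-style random masking to each batch.
+ collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
+                                            mlm_probability=0.15)
+ 
+ trainer = Trainer(
+     model=model,
+     args=TrainingArguments(output_dir="meda-bert", num_train_epochs=48),
+     train_dataset=tokenized["train"],
+     data_collator=collator)
+ trainer.train()
+ ```
+ 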
+ The development of the corpus and model is described further in [this paper](https://aclanthology.org/2023.nodalida-1.31/).
+ 
+ Here is an example of how to load the model in PyTorch using the [🤗Transformers](https://github.com/huggingface/transformers) library:
+ 
+ ```python
+ from transformers import AutoTokenizer, AutoModelForMaskedLM
+ 
+ tokenizer = AutoTokenizer.from_pretrained("jannikskytt/MeDa-Bert")
+ model = AutoModelForMaskedLM.from_pretrained("jannikskytt/MeDa-Bert")
+ ```
+ 
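+ To query the model directly, the snippet below is a minimal sketch (not part of the original card) that runs the standard 🤗Transformers fill-mask pipeline on the widget sentence above, which translates to "White blood cells protect the body against [MASK]":
+ 
+ ```python
+ from transformers import pipeline
+ 
+ # Minimal sketch: fill-mask inference with MeDa-BERT; the example
+ # sentence is taken from the model card widget.
+ fill_mask = pipeline("fill-mask", model="jannikskytt/MeDa-Bert")
+ for pred in fill_mask("Hvide blodlegemer beskytter kroppen mod [MASK]"):
+     print(f'{pred["token_str"]}: {pred["score"]:.3f}')
+ ```
+ 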
+ ### Citing
+ 
+ ```
+ @inproceedings{pedersen-etal-2023-meda,
+     title = "{M}e{D}a-{BERT}: A medical {D}anish pretrained transformer model",
+     author = "Pedersen, Jannik and
+       Laursen, Martin and
+       Vinholt, Pernille and
+       Savarimuthu, Thiusius Rajeeth",
+     booktitle = "Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)",
+     month = may,
+     year = "2023",
+     address = "T{\'o}rshavn, Faroe Islands",
+     publisher = "University of Tartu Library",
+     url = "https://aclanthology.org/2023.nodalida-1.31",
+     pages = "301--307",
+ }
+ ```