---
language: "en"
tags:
- fill-mask

---

<span style="font-size:larger;">**Clinical-Longformer**</span> is a clinical knowledge-enriched version of Longformer, further pre-trained on MIMIC-III clinical notes.
### Pre-training
We initialized Clinical-Longformer from the pre-trained weights of the base version of Longformer. Pre-training was distributed across six 32GB Tesla V100 GPUs, and FP16 precision was enabled to accelerate training. We pre-trained Clinical-Longformer for 200,000 steps with a batch size of 6×3. The learning rate was 3e-5, and the entire pre-training process took more than 2 weeks.
### Down-stream Tasks
Clinical-Longformer consistently outperforms ClinicalBERT by at least 2 percent across 10 baseline datasets. These datasets broadly cover NER, QA, and text classification tasks. For more details, please refer to:
### Usage
Load the model directly from Transformers:
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("yikuan8/Clinical-Longformer")
model = AutoModel.from_pretrained("yikuan8/Clinical-Longformer")
```
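Since the card is tagged `fill-mask`, the model can also be queried through the Transformers `fill-mask` pipeline. A minimal sketch — the clinical sentence below is an illustrative example, not from the original card; as a Longformer (RoBERTa-style) model, the mask token is `<mask>`:

```python
from transformers import pipeline

# Load the model into a fill-mask pipeline; Longformer's mask token is "<mask>".
unmasker = pipeline("fill-mask", model="yikuan8/Clinical-Longformer")

# Illustrative (hypothetical) clinical sentence.
predictions = unmasker("The patient was admitted with acute renal <mask>.")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```

By default the pipeline returns the top 5 candidate tokens with their scores. Note that Longformer accepts sequences up to 4,096 tokens, which is its main advantage over BERT-style encoders on long clinical notes.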
If you find our implementation helpful, please consider citing this work :)
```bibtex
@inproceedings{li2020comparison,
    title={A comparison of pre-trained vision-and-language models for multimodal representation learning across medical images and reports},
    author={Li, Yikuan and Wang, Hanyin and Luo, Yuan},
    booktitle={2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
    pages={1999--2004},
    year={2020},
    organization={IEEE}
}
```
### Questions
Please email yikuanli2018@u.northwestern.edu