Tsubasaz committed on
Commit
73a95af
1 Parent(s): 4b98b07

Create README.md

  1. README.md +22 -0
README.md ADDED
@@ -0,0 +1,22 @@
+ ---
+ language:
+ - en
+
+ license: mit
+
+ datasets:
+ - MIMIC-III
+
+ widget:
+ - text: "Due to shortness of breath, the patient is diagnosed with [MASK], and other respiratory problems."
+   example_title: "Example 1"
+ ---
+
+ # ClinicalPubMedBERT
+ ## Description
+
+ A BERT model pre-trained on PubMed abstracts and then further pre-trained on clinical notes ([MIMIC-III](https://mimic.physionet.org/)). It combines two domains that overlap little with general-knowledge text corpora: electronic health records (EHRs) and biomedical papers. We hope this model yields better results on clinical downstream tasks such as readmission prediction.
+
+ The model was trained for 120k steps on 500,000 clinical notes randomly sampled from the MIMIC-III dataset. We used whole-word masking to improve the coherence of the language model. All notes were split into chunks of 128 tokens.
+
+ Pre-trained model: https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract
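
The README mentions two preprocessing choices: splitting notes into 128-token chunks and whole-word masking. The following is a minimal sketch of both ideas in plain Python, not the author's training code. It assumes WordPiece-style tokens in which continuation pieces start with `##`. The function names, the 15% masking rate, and the choice to budget masks by word count are illustrative assumptions. It also omits special tokens such as `[CLS]` and `[SEP]` and the usual random/keep replacement rules.

```python
import random
from typing import List, Optional, Sequence

MASK = "[MASK]"


def chunk_tokens(tokens: Sequence[str], max_len: int = 128) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most max_len tokens."""
    return [list(tokens[i:i + max_len]) for i in range(0, len(tokens), max_len)]


def group_whole_words(tokens: Sequence[str]) -> List[List[int]]:
    """Group token indices so that '##' continuation pieces stay with their word."""
    groups: List[List[int]] = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)  # continuation of the previous word
        else:
            groups.append([i])    # start of a new word
    return groups


def whole_word_mask(
    tokens: Sequence[str],
    mask_prob: float = 0.15,  # assumed rate; the README does not state one
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Mask every piece of each randomly selected word (simplified sketch)."""
    rng = rng or random.Random()
    groups = group_whole_words(tokens)
    if not groups:
        return []
    # Simplification: budget by number of words rather than number of tokens.
    n_words = min(len(groups), max(1, round(len(groups) * mask_prob)))
    masked = list(tokens)
    for group in rng.sample(groups, n_words):
        for i in group:
            masked[i] = MASK
    return masked


if __name__ == "__main__":
    # Hypothetical tokenization, for illustration only.
    example = ["patient", "has", "dys", "##p", "##nea", "and", "pneum", "##onia"]
    print(whole_word_mask(example, mask_prob=0.3, rng=random.Random(0)))
    print([len(c) for c in chunk_tokens(example * 40)])
```

With whole-word masking, a word such as `dys ##p ##nea` is either fully masked or left intact, so the model cannot recover a masked piece from its unmasked neighbours within the same word.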