---
license: apache-2.0
---

## RadBERT-RoBERTa-4m

This is a base model of the RadBERT (Radiology BERT) family from UC San Diego and the VA healthcare system. It is initialized from BERT-base-uncased and further pretrained on 2 million deidentified radiology reports from US VA hospitals. The model achieves stronger medical language understanding performance than previous medical-domain models such as BioBERT, Clinical-BERT, BLUE-BERT, and BioMed-RoBERTa.

Performance is evaluated on three tasks:

- (a) abnormal sentence classification: classify sentences in radiology reports as reporting abnormal or normal findings;
- (b) report coding: assign a diagnostic code to a given radiology report under five different coding systems;
- (c) report summarization: given the findings section of a radiology report, extractively select the key sentences that summarize it.
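
Task (a) is a standard binary sentence-classification setup; with the `transformers` library this is typically done by loading the checkpoint into `AutoModelForSequenceClassification` with `num_labels=2` and fine-tuning with cross-entropy. A minimal sketch of such a classification head, using a random tensor as a stand-in for the encoder's pooled output (the labels and batch here are illustrative, not the paper's exact setup):

```python
import torch
import torch.nn as nn

# Stand-in for the pooled encoder output of a batch of 4 sentences;
# 768 is the hidden size of a base-sized BERT/RoBERTa encoder.
pooled = torch.randn(4, 768)
labels = torch.tensor([0, 1, 1, 0])  # 0 = normal, 1 = abnormal (illustrative)

# Binary classification head on top of the encoder features.
head = nn.Linear(768, 2)
logits = head(pooled)                               # shape: (4, 2)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()  # gradients reach the head (and the encoder when fine-tuning)
```

In actual fine-tuning, the encoder and the head are trained jointly on labeled report sentences.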

It also performs well on other radiology NLP tasks that are not reported in the paper.

For details, see the paper:
[RadBERT: Adapting transformer-based language models to radiology](https://pubs.rsna.org/doi/abs/10.1148/ryai.210258)

### How to use

Here is an example of how to use this model to extract the features of a given text in PyTorch:

```python
from transformers import AutoConfig, AutoTokenizer, AutoModel

# Load config, tokenizer, and encoder weights from the Hugging Face Hub.
config = AutoConfig.from_pretrained('zzxslp/RadBERT-RoBERTa-4m')
tokenizer = AutoTokenizer.from_pretrained('zzxslp/RadBERT-RoBERTa-4m')
model = AutoModel.from_pretrained('zzxslp/RadBERT-RoBERTa-4m', config=config)

text = "Replace me by any medical text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)  # output.last_hidden_state holds token features
```
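
The `output` object above exposes `last_hidden_state` of shape `(batch, seq_len, hidden_size)`. One common way to turn these token features into a single report embedding is masked mean pooling over the attention mask; a minimal sketch with dummy tensors standing in for the model's actual outputs:

```python
import torch

# Dummy stand-ins for output.last_hidden_state and encoded_input['attention_mask'];
# real shapes are (batch, seq_len, hidden_size) and (batch, seq_len).
last_hidden_state = torch.randn(1, 8, 768)
attention_mask = torch.tensor([[1, 1, 1, 1, 1, 0, 0, 0]])

# Masked mean pooling: average the token vectors, ignoring padding positions.
mask = attention_mask.unsqueeze(-1).float()                   # (1, 8, 1)
embedding = (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embedding.shape)  # torch.Size([1, 768])
```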

### BibTeX entry and citation info

If you use the model, please cite our paper:

```bibtex
@article{yan2022radbert,
  title={RadBERT: Adapting transformer-based language models to radiology},
  author={Yan, An and McAuley, Julian and Lu, Xing and Du, Jiang and Chang, Eric Y and Gentili, Amilcare and Hsu, Chun-Nan},
  journal={Radiology: Artificial Intelligence},
  volume={4},
  number={4},
  pages={e210258},
  year={2022},
  publisher={Radiological Society of North America}
}
```