---
license: apache-2.0
---

## RadBERT-2m

This is one of the base Radiology-BERT (RadBERT) models from UC San Diego and the VA healthcare system. It is initialized from BERT-base-uncased and further pretrained on 2 million de-identified radiology reports from US VA hospitals. The model achieves stronger medical language understanding performance than previous medical-domain models such as BioBERT, Clinical-BERT, BLUE-BERT, and BioMed-RoBERTa.

Performance is evaluated on three tasks:

(a) abnormal sentence classification: classify each sentence in a radiology report as reporting abnormal or normal findings;

(b) report coding: assign a diagnostic code to a given radiology report under each of five different coding systems;

(c) report summarization: given the findings section of a radiology report, extractively select the key sentences that summarize the findings.
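
To make the extractive formulation in task (c) concrete, here is a minimal sketch of the selection step only. It assumes some model has already assigned a relevance score to each sentence of the findings section. The function name `select_key_sentences`, the parameter `k`, and the example sentences and scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def select_key_sentences(sentences: Sequence[str],
                         scores: Sequence[float],
                         k: int = 2) -> List[str]:
    """Return the k highest-scoring sentences in their original order."""
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    # Rank sentence indices by score, highest first, and keep the top k.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    # Restore document order so the summary reads like the source report.
    return [sentences[i] for i in sorted(top)]


# Hypothetical findings sentences and made-up scores, for illustration only.
findings = [
    "The heart size is normal.",
    "There is a small right pleural effusion.",
    "No pneumothorax is seen.",
    "Patchy opacity is present in the right lower lobe.",
]
scores = [0.10, 0.85, 0.20, 0.90]
summary = select_key_sentences(findings, scores, k=2)
print(summary)
```

Because the summary is built only from sentences that already appear in the report, no new text is generated; the model's job reduces to scoring sentences.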

It also shows superior performance on other radiology NLP tasks that are not reported in the paper.

For details, see the paper:

[RadBERT: Adapting transformer-based language models to radiology](https://pubs.rsna.org/doi/abs/10.1148/ryai.210258)

### How to use

Here is an example of how to use this model to extract the features of a given text in PyTorch:

```python
from transformers import AutoConfig, AutoTokenizer, AutoModel

# Load the configuration, tokenizer, and weights for this model card's checkpoint.
config = AutoConfig.from_pretrained('zzxslp/RadBERT-2m')
tokenizer = AutoTokenizer.from_pretrained('zzxslp/RadBERT-2m')
model = AutoModel.from_pretrained('zzxslp/RadBERT-2m', config=config)

text = "Replace me by any medical text you'd like."
# Tokenize the text and return PyTorch tensors.
encoded_input = tokenizer(text, return_tensors='pt')
# output.last_hidden_state holds one feature vector per input token.
output = model(**encoded_input)
```
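
The `output` above contains one feature vector per token in `output.last_hidden_state`. If a single vector per text is needed, one common option (an assumption here, not something the paper prescribes) is to average the token vectors while ignoring padding. The sketch below uses random tensors as stand-ins for `output.last_hidden_state` and `encoded_input['attention_mask']` so that it runs without downloading the model. The helper name `mean_pool` is hypothetical.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor,
              attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token features over the sequence, skipping padded positions."""
    # (batch, seq_len) -> (batch, seq_len, 1) so the mask broadcasts over features.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Guard against division by zero for an all-padding row.
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


# Stand-in tensors: batch of 2 texts, 5 token positions, 768 features
# (768 is the hidden size of BERT-base).
hidden = torch.randn(2, 5, 768)
mask = torch.tensor([[1, 1, 1, 1, 1],
                     [1, 1, 1, 0, 0]])
pooled = mean_pool(hidden, mask)
print(pooled.shape)  # torch.Size([2, 768])
```

With the real model, the call would be `mean_pool(output.last_hidden_state, encoded_input['attention_mask'])`. Taking the first-token vector, `output.last_hidden_state[:, 0]`, is another common choice.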

### BibTeX entry and citation info

If you use the model, please cite our paper:

```bibtex
@article{yan2022radbert,
  title={RadBERT: Adapting transformer-based language models to radiology},
  author={Yan, An and McAuley, Julian and Lu, Xing and Du, Jiang and Chang, Eric Y and Gentili, Amilcare and Hsu, Chun-Nan},
  journal={Radiology: Artificial Intelligence},
  volume={4},
  number={4},
  pages={e210258},
  year={2022},
  publisher={Radiological Society of North America}
}
```