---
language: en
tags:
- veterinary
- pets
- vetbert
- BERT
widget:
- text: >-
    Hx: 7 yo canine with history of vomiting intermittently since yesterday.
    No other concerns. Still eating and drinking [MASK]. cPL negative.
  example_title: normally
---
# VetBERT Disease Syndrome Classifier
This is a finetuned version of the VetBERT model, designed to classify the disease syndrome within a veterinary clinical note.
This pretrained model is designed for performing NLP tasks related to veterinary clinical notes. The paper [Domain Adaptation and Instance Selection for Disease Syndrome Classification over Veterinary Clinical Notes](https://aclanthology.org/2020.bionlp-1.17) (Hur et al., BioNLP 2020) introduced the VetBERT model: a BERT model initialized from ClinicalBERT (Bio+Clinical BERT) and further pretrained on the [VetCompass Australia](https://www.vetcompass.com.au/) corpus for tasks specific to veterinary medicine.
## Pretraining Data
The VetBERT model was initialized from the [Bio_ClinicalBERT model](https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT), which was itself initialized from BERT. The VetBERT model was then trained on over 15 million veterinary clinical records and 1.3 billion tokens.
## Pretraining Hyperparameters
During the pretraining phase for VetBERT, we used a batch size of 32, a maximum sequence length of 512, and a learning rate of 5 × 10⁻⁵. The dup factor for duplicating input data with different masks was set to 5. All other default parameters were used (specifically, masked language model probability = 0.15 and max predictions per sequence = 20).
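To make the masking settings concrete, the sketch below works through the arithmetic they imply. It assumes pretraining followed the rule used by the reference BERT data-creation script, where the number of masked positions per sequence is the masking probability times the sequence length, capped at the max-predictions limit. The function names are hypothetical and only illustrate the calculation.

```python
# Values quoted in the hyperparameter description above.
MAX_SEQ_LENGTH = 512
MASKED_LM_PROB = 0.15
MAX_PREDICTIONS_PER_SEQ = 20
DUP_FACTOR = 5


def num_masked_tokens(seq_len,
                      masked_lm_prob=MASKED_LM_PROB,
                      max_predictions=MAX_PREDICTIONS_PER_SEQ):
    """Positions selected for prediction in one sequence.

    Assumed rule (reference BERT behaviour): at least one position,
    at most `max_predictions`, otherwise round(seq_len * masked_lm_prob).
    """
    return min(max_predictions, max(1, int(round(seq_len * masked_lm_prob))))


def masked_copies(num_documents, dup_factor=DUP_FACTOR):
    """Each input document is emitted `dup_factor` times with different masks."""
    return num_documents * dup_factor


# A full-length sequence would select round(512 * 0.15) = 77 positions,
# but the cap of 20 applies.
full_length_masks = num_masked_tokens(MAX_SEQ_LENGTH)  # 20
```

Under this assumption, any sequence longer than roughly 133 tokens hits the cap of 20, so long notes have a lower effective masking rate than the nominal 15%.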
## VetBERT Finetuning
VetBERT was further finetuned on a set of 5002 annotated clinical notes to classify the disease syndrome associated with each note, as outlined in the paper [Domain Adaptation and Instance Selection for Disease Syndrome Classification over Veterinary Clinical Notes](https://aclanthology.org/2020.bionlp-1.17).
## How to use the model
Load the model via the transformers library:
```
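from transformers import AutoTokenizer, AutoModel

# Download the tokenizer and weights from the Hugging Face Hub.
# AutoModel returns the bare encoder (hidden states only).
tokenizer = AutoTokenizer.from_pretrained("havocy28/VetBERTDx")
model = AutoModel.from_pretrained("havocy28/VetBERTDx")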
```
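Since the model is described as a disease syndrome classifier, a prediction can probably be obtained by loading it with a sequence-classification head. The following is a minimal sketch, not a confirmed recipe: it assumes the published checkpoint includes a classification head and a populated `id2label` mapping in its config. If the head is missing, `transformers` initializes one randomly and warns, and the predictions are then meaningless. The helper names `softmax`, `top_label`, and `classify` are illustrative.

```python
import math

MODEL_ID = "havocy28/VetBERTDx"


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(note):
    """Classify one clinical note (downloads the model on first use)."""
    # Imported lazily so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    # Truncate to the 512-token maximum sequence length used in pretraining.
    inputs = tokenizer(note, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Assumption: config.id2label maps class indices to syndrome names.
    return top_label(logits, model.config.id2label)
```

Calling `classify("Hx: 7 yo canine with history of vomiting intermittently since yesterday.")` would return a `(label, probability)` pair. The label set depends on what the checkpoint's config defines.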
## Citation
Please cite this article: Brian Hur, Timothy Baldwin, Karin Verspoor, Laura Hardefeldt, and James Gilkerson. 2020. [Domain Adaptation and Instance Selection for Disease Syndrome Classification over Veterinary Clinical Notes](https://aclanthology.org/2020.bionlp-1.17). In Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing, pages 156–166, Online. Association for Computational Linguistics.