starmpcc's picture
Initialize README
f1522a9 verified
---
license: cc-by-nc-4.0
datasets:
- starmpcc/Asclepius-Synthetic-Clinical-Notes
language:
- en
pipeline_tag: text-generation
tags:
- medical
---
# Model Card for Model ID
<!-- Provide a quick summary of what the model is/does. -->
This is an pre-trained Llama2-13B model, which was trained using causal language modeling on [Asclepius-Synthetic-Clinical-Notes](https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes).
The [Asclepius-Llama2-13B](https://huggingface.co/starmpcc/Asclepius-Llama2-13B) model was developed from this checkpoint by applying instruction fine-tuning.
## UPDATE
### 2024.01.10
- Asclepius-R, the variant of Asclepius that trained on MIMIC-III discharge summaries, is now available on [Physionet](https://physionet.org/content/asclepius-r/1.0.0/)!
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Model type:** Clinical LLM (Large Language Model)
- **Language(s) (NLP):** English
- **License:** CC-BY-NC-SA 4.0
- **Finetuned from model:** Llama2-13B
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** https://github.com/starmpcc/Asclepius
- **Paper:** https://arxiv.org/abs/2309.00237
- **Data:** https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
This model is trained with causal launguage modeling, using [Asclepius-Synthetic-Clinical-Notes](https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes).
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
ONLY USE THIS MODEL FOR RESEARCH PURPOSE!!
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("starmpcc/Asclepius-Llama2-13B-Pretraining-Only", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("starmpcc/Asclepius-Llama2-13B-Pretraining-Only")
model_input = "YOUR INPUT"
input_ids = tokenizer(model_input, return_tensors="pt").input_ids
output = model.generate(input_ids)
print(tokenizer.decode(output[0]))
```
## Training Details
### Training Data
<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
- Causal language modeling on synthetic clinical notes.
#### Training Hyperparameters
- We followed config used in [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca)
-
#### Speeds, Sizes, Times
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
- Pre-Training (1 epoch): 1h 58m with 8x A100 80G
## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
```
@misc{kweon2023publicly,
title={Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes},
author={Sunjun Kweon and Junu Kim and Jiyoun Kim and Sujeong Im and Eunbyeol Cho and Seongsu Bae and Jungwoo Oh and Gyubok Lee and Jong Hak Moon and Seng Chan You and Seungjin Baek and Chang Hoon Han and Yoon Bin Jung and Yohan Jo and Edward Choi},
year={2023},
eprint={2309.00237},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```