---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- medical
- pharmacovigilance
- vaccines
datasets:
- chrisvoncsefalvay/vaers-outcomes
metrics:
- accuracy
- f1
- precision
- recall
dataset: chrisvoncsefalvay/vaers-outcomes
pipeline_tag: text-classification
widget:
- text: >-
Patient is a 90 y.o. male with a PMH of IPF, HFpEF, AFib (Eliquis),
Metastatic Prostate Cancer who presented to Hospital 10/28/2023 following
an unwitnessed fall at his assisted living. He was found to have an AKI,
pericardial effusion, hypoxia, AMS, and COVID-19. His hospital course was
complicated by delirium and aspiration, leading to acute hypoxic
respiratory failure requiring BiPAP and transfer to the ICU. Palliative
Care had been following, and after goals of care conversations on
11/10/2023 the patient was transitioned to DNR-CC. Patient expired at 0107
11/12/23.
example_title: VAERS 2727645 (hospitalisation, death)
- text: >-
hospitalized for paralytic ileus a week after the vaccination; This
serious case was reported by a physician via call center representative
and described the occurrence of ileus paralytic in a patient who received
Rota (Rotarix liquid formulation) for prophylaxis. On an unknown date, the
patient received the 1st dose of Rotarix liquid formulation. On an unknown
date, less than 2 weeks after receiving Rotarix liquid formulation, the
patient experienced ileus paralytic (Verbatim: hospitalized for paralytic
ileus a week after the vaccination) (serious criteria hospitalization and
GSK medically significant). The outcome of the ileus paralytic was not
reported. It was unknown if the reporter considered the ileus paralytic to
be related to Rotarix liquid formulation. It was unknown if the company
considered the ileus paralytic to be related to Rotarix liquid
formulation. Additional Information: GSK Receipt Date: 27-DEC-2023 Age at
vaccination and lot number were not reported. The patient of unknown age
and gender was hospitalized for paralytic ileus a week after the
vaccination. The reporting physician was in charge of the patient.
example_title: VAERS 2728408 (hospitalisation)
- text: >-
Patient received Pfizer vaccine 7 days beyond BUD. According to Pfizer
manufacturer research data, vaccine is stable and effective up to 2 days
after BUD. Waiting for more stability data from PFIZER to determine if
revaccination is necessary.
example_title: VAERS 2728394 (no event)
- text: >-
Fever of 106F rectally beginning 1 hr after immunizations and lasting <24
hrs. Seen at ER treated w/tylenol & cool baths.
example_title: VAERS 25042 (ER attendance)
- text: >-
I had the MMR shot last week, and I felt a little dizzy afterwards, but it
passed after a few minutes and I'm doing fine now.
example_title: 'Non-sample example: simulated informal patient narrative (no event)'
- text: >-
My niece had the COVID vaccine. A few weeks later, she was T-boned by a
drunk driver. She called me from the ER. She's fully recovered now,
though.
example_title: >-
Non-sample example: simulated informal patient narrative (ER attendance,
albeit unconnected)
model-index:
- name: daedra
results:
- task:
type: text-classification
dataset:
type: vaers-outcomes
name: vaers-outcomes
metrics:
- name: Accuracy, microaveraged
type: accuracy_microaverage
value: 0.885
verified: false
- name: F1 score, microaveraged
type: f1_microaverage
value: 0.885
verified: false
- name: Precision, macroaveraged
type: precision_macroaverage
value: 0.769
verified: false
- name: Recall, macroaveraged
type: recall_macroaverage
value: 0.688
verified: false
---
# DAEDRA: Determining Adverse Event Disposition for Regulatory Affairs
This model is a fine-tuned version of [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) trained on the [VAERS adversome outcomes data set](https://huggingface.co/datasets/chrisvoncsefalvay/vaers-outcomes).
# Table of Contents
- [Model Details](#model-details)
- [Uses](#uses)
- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
- [Training Details](#training-details)
- [Evaluation](#evaluation)
- [Environmental Impact](#environmental-impact)
- [Citation](#citation)
# Model Details
## Model Description
DAEDRA is a model for the identification of adverse event dispositions (outcomes) from passive pharmacovigilance data.
The model is trained on a real-world adversomics data set spanning over three decades (1990-2023) and comprising over 1.8m records for a total corpus of 173,093,850 words constructed from a subset of reports submitted to VAERS.
It is intended to identify, based on the narrative, whether any, or any combination, of three serious outcomes -- death, hospitalisation and ER attendance -- have occurred.
- **Developed by:** Chris von Csefalvay
- **Model type:** Language model
- **Language(s) (NLP):** en
- **License:** apache-2.0
- **Parent Model:** [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2)
- **Resources for more information:**
- [GitHub Repo](https://github.com/chrisvoncsefalvay/daedra)
# Uses
This model was designed to facilitate the coding of passive adverse event reports into severity outcome categories.
## Direct Use
Load the model via the `transformers` library:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("chrisvoncsefalvay/daedra")
model = AutoModelForSequenceClassification.from_pretrained("chrisvoncsefalvay/daedra")
```
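Since any combination of the three outcomes may co-occur in a single report, a multi-label reading applies a sigmoid to each output logit independently and thresholds the resulting probabilities. The sketch below assumes hypothetical label names and a multi-label head; the model's actual `id2label` mapping lives in its `config.json` and should be consulted instead:

```python
import math

# Hypothetical label order for the three serious outcomes; the real
# mapping is in the model's config.json, not guaranteed to match this.
LABELS = ["DIED", "ER_VISIT", "HOSPITAL"]

def outcomes_from_logits(logits, threshold=0.5):
    """Map raw per-label logits to a set of predicted outcomes.

    Applies a sigmoid per label (multi-label setting: death,
    hospitalisation and ER attendance may co-occur) and keeps every
    label whose probability clears the threshold.
    """
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]

# Example: strongly positive logit for hospitalisation only.
print(outcomes_from_logits([-3.2, -1.5, 2.8]))  # → ['HOSPITAL']
```

A report describing both a hospitalisation and a subsequent death would simply yield two labels above threshold rather than forcing a single-class choice.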
## Out-of-Scope Use
This model is not intended for the diagnosis or treatment of any disease.
# Bias, Risks, and Limitations
Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
# Training Details
## Training Data
The model was trained on the [VAERS adversome outcomes data set](https://huggingface.co/datasets/chrisvoncsefalvay/vaers-outcomes), which comprises 1,814,920 reports from the FDA's Vaccine Adverse Events Reporting System (VAERS). After age and gender matching, reports were split into a 70% training set, a 15% test set, and a 15% validation set.
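A plain random 70/15/15 partition can be sketched as below. This is a simplification: the actual split was performed after age and gender matching, which a naive shuffle does not reproduce.

```python
import random

def split_70_15_15(records, seed=42):
    """Shuffle and partition records into train/test/validation (70/15/15).

    Simplified sketch only: the published data set was split after age
    and gender matching, which this plain random split omits.
    """
    rng = random.Random(seed)
    idx = list(range(len(records)))
    rng.shuffle(idx)
    n = len(records)
    n_train = int(n * 0.70)
    n_test = int(n * 0.15)
    train = [records[i] for i in idx[:n_train]]
    test = [records[i] for i in idx[n_train:n_train + n_test]]
    val = [records[i] for i in idx[n_train + n_test:]]
    return train, test, val
```

Applied to the full 1,814,920 reports, this yields roughly 1.27m training, 272k test, and 272k validation records.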
## Training Procedure
Training was conducted on an Azure `Standard_NC24s_v3` instance in `us-east`, with 4x Tesla V100-PCIE-16GB GPUs and 24x Intel Xeon E5-2690 v4 CPUs at 2.60GHz.
### Speeds, Sizes, Times
Training took 15 hours and 10 minutes.
## Testing Data, Factors & Metrics
### Testing Data
The model was tested on the `test` partition of the [VAERS adversome outcomes data set](https://huggingface.co/datasets/chrisvoncsefalvay/vaers-outcomes).
## Results
On the test set, the model achieved the following results:
* `accuracy`, microaveraged: 0.885
* `f1`, microaveraged: 0.885
* `precision`, macroaveraged: 0.769
* `recall`, macroaveraged: 0.688
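The gap between the microaveraged and macroaveraged figures is typical of imbalanced label sets: micro-averaging pools true/false positive counts across labels before computing the metric, so frequent labels dominate, while macro-averaging takes an unweighted mean of per-label metrics, so rarer outcomes (such as death) weigh as much as common ones. A minimal sketch with illustrative counts (not the model's actual confusion counts):

```python
def precision_recall(tp, fp, fn):
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

def micro_macro(per_class):
    """per_class: list of (tp, fp, fn) tuples, one per label.

    Micro-averaging pools counts across labels before computing the
    metric; macro-averaging averages per-label metrics, giving rare
    labels equal weight.
    """
    tp = sum(c[0] for c in per_class)
    fp = sum(c[1] for c in per_class)
    fn = sum(c[2] for c in per_class)
    micro = precision_recall(tp, fp, fn)
    per = [precision_recall(*c) for c in per_class]
    macro = (sum(p for p, _ in per) / len(per),
             sum(r for _, r in per) / len(per))
    return micro, macro
```

With a common label scored `(tp=90, fp=10, fn=10)` and a rare one scored `(tp=5, fp=1, fn=15)`, micro recall is 95/120 ≈ 0.79 while macro recall drops to (0.9 + 0.25)/2 ≈ 0.58, mirroring how a poorly recalled rare class pulls the macro figure below the micro one.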
# Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** 4 x Tesla V100-PCIE-16GB
- **Hours used:** 15.166
- **Cloud Provider:** Azure
- **Compute Region:** us-east
- **Carbon Emitted:** 6.72 kg CO2eq (offset by provider)
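The 6.72 kg figure is consistent with the Lacoste et al. (2019) methodology: energy is GPU board power times hours, multiplied by the grid's carbon intensity. The back-of-envelope check below assumes the V100-PCIE-16GB's 250 W board power and a carbon intensity of roughly 0.44 kg CO2eq/kWh for the region; neither constant is stated in this card.

```python
# Back-of-envelope reproduction of the carbon estimate.
GPUS = 4
TDP_KW = 0.250            # Tesla V100-PCIE-16GB board power, in kW
HOURS = 15.166            # training wall time from this card
CARBON_INTENSITY = 0.443  # kg CO2eq per kWh for us-east (assumed)

energy_kwh = GPUS * TDP_KW * HOURS
emissions_kg = energy_kwh * CARBON_INTENSITY
print(f"{energy_kwh:.2f} kWh -> {emissions_kg:.2f} kg CO2eq")
```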
# Citation
**BibTeX:**
Forthcoming -- watch this space.
# Model Card Authors
Chris von Csefalvay
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
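The `linear` scheduler decays the learning rate from its peak to zero over the run. A sketch of that schedule under stated assumptions: zero warmup steps (no warmup value is given above) and treating 64 as the effective batch size, even though the card does not say whether it is per device across the four GPUs:

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Linear schedule: optional ramp-up over warmup_steps, then linear
    decay to zero by the final step. Warmup of 0 is an assumption."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Approximate step count: 70% of 1,814,920 reports at batch size 64,
# over 3 epochs (integer arithmetic to avoid float rounding).
steps_per_epoch = (1_814_920 * 70 // 100) // 64
total_steps = steps_per_epoch * 3
```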
### Framework versions
- Transformers 4.37.2
- Pytorch 2.1.2+cu121
- Datasets 2.3.2
- Tokenizers 0.15.1