chrisvoncsefalvay committed
Commit e037014
1 Parent(s): 979bef4

Model save

Files changed (7)
  1. README.md +41 -59
  2. config.json +17 -16
  3. model.safetensors +2 -2
  4. tokenizer.json +0 -0
  5. tokenizer_config.json +4 -2
  6. training_args.bin +3 -0
  7. vocab.txt +0 -0
README.md CHANGED
@@ -1,65 +1,47 @@
 ---
-language:
-- en
-license: apache-2.0
-library_name: transformers
+base_model: dmis-lab/biobert-base-cased-v1.2
 tags:
-- medical
-- pharmacovigilance
-- vaccines
-datasets:
-- chrisvoncsefalvay/vaers-outcomes
-metrics:
-- accuracy
-- f1
-- precision
-- recall
-pipeline_tag: text-classification
-widget:
-- text: Patient is a 90 y.o. male with a PMH of IPF, HFpEF, AFib (Eliquis), Metastatic
-    Prostate Cancer who presented to Hospital 10/28/2023 following an unwitnessed
-    fall at his assisted living. He was found to have an AKI, pericardial effusion,
-    hypoxia, AMS, and COVID-19. His hospital course was complicated by delirium and
-    aspiration, leading to acute hypoxic respiratory failure requiring BiPAP and transfer
-    to the ICU. Palliative Care had been following, and after goals of care conversations
-    on 11/10/2023 the patient was transitioned to DNR-CC. Patient expired at 0107
-    11/12/23.
-  example_title: VAERS 2727645 (hospitalisation, death)
-- text: 'hospitalized for paralytic ileus a week after the vaccination; This serious
-    case was reported by a physician via call center representative and described
-    the occurrence of ileus paralytic in a patient who received Rota (Rotarix liquid
-    formulation) for prophylaxis. On an unknown date, the patient received the 1st
-    dose of Rotarix liquid formulation. On an unknown date, less than 2 weeks after
-    receiving Rotarix liquid formulation, the patient experienced ileus paralytic
-    (Verbatim: hospitalized for paralytic ileus a week after the vaccination) (serious
-    criteria hospitalization and GSK medically significant). The outcome of the ileus
-    paralytic was not reported. It was unknown if the reporter considered the ileus
-    paralytic to be related to Rotarix liquid formulation. It was unknown if the company
-    considered the ileus paralytic to be related to Rotarix liquid formulation. Additional
-    Information: GSK Receipt Date: 27-DEC-2023 Age at vaccination and lot number were
-    not reported. The patient of unknown age and gender was hospitalized for paralytic
-    ileus a week after the vaccination. The reporting physician was in charge of the
-    patient.'
-  example_title: VAERS 2728408 (hospitalisation)
-- text: Patient received Pfizer vaccine 7 days beyond BUD. According to Pfizer manufacturer
-    research data, vaccine is stable and effective up to 2 days after BUD. Waiting
-    for more stability data from PFIZER to determine if revaccination is necessary.
-  example_title: VAERS 2728394 (no event)
-- text: Fever of 106F rectally beginning 1 hr after immunizations and lasting <24
-    hrs. Seen at ER treated w/tylenol & cool baths.
-  example_title: VAERS 25042 (ER attendance)
-- text: I had the MMR shot last week, and I felt a little dizzy afterwards, but it
-    passed after a few minutes and I'm doing fine now.
-  example_title: 'Non-sample example: simulated informal patient narrative (no event)'
-- text: My niece had the COVID vaccine. A few weeks later, she was T-boned by a drunk
-    driver. She called me from the ER. She's fully recovered now, though.
-  example_title: 'Non-sample example: simulated informal patient narrative (ER attendance,
-    albeit unconnected)'
+- generated_from_trainer
+model-index:
+- name: daedra
+  results: []
 ---
 
-DAEDRA (Detecting Adverse Event Dispositions for Regulatory Affairs) is a pharmacovigilance language model intended to facilitate the rapid identification and extraction of high-consequence outcomes from passive pharmacovigilance reporting. It was trained on the VAERS data set, and focuses on three main outcomes:
-
-* mortality (VAERS `DIED` flag);
-* emergency room attendance (`ER_VISIT`); and
-* hospitalisation (`HOSPITAL`).
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# daedra
+
+This model is a fine-tuned version of [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) on an unknown dataset.
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 3
+
+### Framework versions
+
+- Transformers 4.37.2
+- Pytorch 2.1.2+cu121
+- Datasets 2.3.2
+- Tokenizers 0.15.1
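The removed card declared `pipeline_tag: text-classification`, and both the old and new configs keep the same eight-way outcome label map, so the checkpoint can be exercised through the standard `pipeline` API. A minimal sketch, assuming the commit lands in a `chrisvoncsefalvay/daedra` repository (the repo id is inferred from the committer and model name, not stated in the diff):

```python
# Minimal inference sketch; the repo id below is an assumption, not part of the diff.
from transformers import pipeline

classifier = pipeline("text-classification", model="chrisvoncsefalvay/daedra")

report = (
    "Fever of 106F rectally beginning 1 hr after immunizations and lasting <24 hrs. "
    "Seen at ER treated w/tylenol & cool baths."
)
print(classifier(report))
# -> [{'label': ..., 'score': ...}], with the label drawn from the eight
#    outcome classes defined in config.json (e.g. "ER_VISIT").
```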
config.json CHANGED
@@ -1,13 +1,13 @@
 {
-  "_name_or_path": "distilbert-base-uncased",
-  "activation": "gelu",
+  "_name_or_path": "dmis-lab/biobert-base-cased-v1.2",
   "architectures": [
-    "DistilBertForSequenceClassification"
+    "BertForSequenceClassification"
   ],
-  "attention_dropout": 0.1,
-  "dim": 768,
-  "dropout": 0.1,
-  "hidden_dim": 3072,
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
   "id2label": {
     "0": "No event",
     "1": "ER_VISIT",
@@ -19,6 +19,7 @@
     "7": "ER_VISIT, DIED"
   },
   "initializer_range": 0.02,
+  "intermediate_size": 3072,
   "label2id": {
     "DIED": 3,
     "ER_VISIT": 1,
@@ -29,17 +30,17 @@
     "HOSPITAL, DIED": 6,
     "No event": 0
   },
+  "layer_norm_eps": 1e-12,
   "max_position_embeddings": 512,
-  "model_type": "distilbert",
-  "n_heads": 12,
-  "n_layers": 6,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
   "pad_token_id": 0,
+  "position_embedding_type": "absolute",
   "problem_type": "single_label_classification",
-  "qa_dropout": 0.1,
-  "seq_classif_dropout": 0.2,
-  "sinusoidal_pos_embds": false,
-  "tie_weights_": true,
   "torch_dtype": "float32",
-  "transformers_version": "4.37.1",
-  "vocab_size": 30522
+  "transformers_version": "4.37.2",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 28996
 }
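Note that the config folds the three binary VAERS flags into a single eight-class, single-label problem: every combination of `ER_VISIT`, `HOSPITAL` and `DIED`, plus `No event`. For anyone bypassing `pipeline`, a sketch of resolving the head's logits through `id2label` (same assumed repo id as above; the input text is illustrative):

```python
# Sketch: mapping the eight-way classification head back to outcome labels.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "chrisvoncsefalvay/daedra"  # assumed, as above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "Patient was admitted following an anaphylactic reaction.",  # illustrative
    return_tensors="pt",
    truncation=True,
    max_length=512,  # matches max_position_embeddings
)
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 8): one logit per outcome combination

print(model.config.id2label[logits.argmax(dim=-1).item()])  # e.g. "HOSPITAL"
```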
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8cde70f8a4c453a3d63e0941158445d9795528724cd0f1daece7e02bcb3633c4
-size 267851024
+oid sha256:2c678a157697f94bad7925f694c152bc9817bb8309f75ecee49f5f72c8292b8e
+size 433289224
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json CHANGED
@@ -43,13 +43,15 @@
   },
   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
   "do_lower_case": true,
   "mask_token": "[MASK]",
-  "model_max_length": 512,
+  "model_max_length": 1000000000000000019884624838656,
+  "never_split": null,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "strip_accents": null,
   "tokenize_chinese_chars": true,
-  "tokenizer_class": "DistilBertTokenizer",
+  "tokenizer_class": "BertTokenizer",
   "unk_token": "[UNK]"
 }
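One side effect of the tokenizer swap: the explicit `"model_max_length": 512` is gone, and the huge replacement value is the Transformers sentinel for "no length limit recorded". The BERT encoder still only has 512 position embeddings, so it is prudent to pin the limit when loading; a small sketch (same assumed repo id):

```python
from transformers import AutoTokenizer

# The sentinel model_max_length means the tokenizer will not truncate on its own;
# cap it to the encoder's 512-token position budget explicitly.
tokenizer = AutoTokenizer.from_pretrained("chrisvoncsefalvay/daedra", model_max_length=512)
assert tokenizer.model_max_length == 512
```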
training_args.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:69d1d6788d827ca923a1d0cdbf90d22765c77a85e86fb761181c602b888bbcea
+size 4728
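`training_args.bin` is the serialized `TrainingArguments` the `Trainer` ran with. The hyperparameters listed in the regenerated README correspond roughly to the following reconstruction; `output_dir`, the per-device reading of the batch sizes, and anything not listed in the card are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="daedra",             # assumed; not recoverable from the diff
    learning_rate=2e-5,
    per_device_train_batch_size=64,  # assumes single-device training
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 is the Trainer default optimizer.
)
```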
vocab.txt CHANGED
The diff for this file is too large to render. See raw diff