Commit 406c924 (parent: ab1490b) by anicolson

Update README.md

Files changed (1): README.md (+99, -1)

This is the model and data pipeline for the CXRMate-ED model from: https://arxiv

The abstract from the paper:

"This study investigates the integration of diverse patient data sources into multimodal language models for automated chest X-ray (CXR) report generation. Traditionally, CXR report generation relies solely on CXR images and limited radiology data, overlooking valuable information from patient health records, particularly from emergency departments. Utilising the MIMIC-CXR and MIMIC-IV-ED datasets, we incorporate detailed patient information such as aperiodic vital signs, medications, and clinical history to enhance diagnostic accuracy. We introduce a novel approach to transform these heterogeneous data sources into embeddings that prompt a multimodal language model, significantly enhancing the diagnostic accuracy of generated radiology reports. Our comprehensive evaluation demonstrates the benefits of using a broader set of patient data, underscoring the potential for enhanced diagnostic capabilities and better patient outcomes through the integration of multimodal data in CXR report generation."
## Example
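
The following example downloads the model and tokenizer from the Hugging Face Hub, prepares the MIMIC-CXR and MIMIC-IV-ED data, and generates the findings and impression sections of a report for a test-set study. Note that MIMIC-CXR-JPG and MIMIC-IV-ED are credentialed datasets that must first be downloaded from PhysioNet, and that the paths below are specific to the authors' environment.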

```python
import os
import pprint

import matplotlib.pyplot as plt
import torch
import transformers
from timm.data.constants import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
from torch.utils.data import DataLoader
from torchvision.transforms import v2
from torchvision.utils import make_grid

# Device and paths (the paths below are examples; point them at your local copies):
device = 'cuda'
physionet_dir = '/datasets/work/hb-mlaifsp-mm/work/archive/physionet.org/files'
dataset_dir = '/scratch3/nic261/datasets'
database_path = '/scratch3/nic261/database/cxrmated_test.db'
mimic_cxr_jpg_dir = '/scratch3/nic261/datasets/physionet.org/files/mimic-cxr-jpg/2.0.0/files'

# Download the model checkpoint and set it to evaluation mode:
model = transformers.AutoModel.from_pretrained('aehrc/cxrmate-ed', trust_remote_code=True).to(device=device)
model.eval()

# Download tokenizer:
tokenizer = transformers.PreTrainedTokenizerFast.from_pretrained('aehrc/cxrmate-ed')
os.environ['TOKENIZERS_PARALLELISM'] = 'false'

# Image transforms:
image_size = 384
test_transforms = v2.Compose(
    [
        v2.Grayscale(num_output_channels=3),
        v2.Resize(
            size=image_size,
            antialias=True,
            interpolation=v2.InterpolationMode.BICUBIC,
        ),
        v2.CenterCrop(size=[image_size, image_size]),
        v2.ToDtype(torch.float32, scale=True),
        v2.Normalize(mean=IMAGENET_DEFAULT_MEAN, std=IMAGENET_DEFAULT_STD),
    ]
)

# Prepare the MIMIC-CXR & MIMIC-IV-ED dataset:
model.prepare_data(
    physionet_dir=physionet_dir,
    dataset_dir=dataset_dir,
    database_path=database_path,
)

# Get the test set dataset & dataloader:
test_set = model.get_dataset('test', test_transforms, database_path, mimic_cxr_jpg_dir)
test_dataloader = DataLoader(
    test_set,
    batch_size=1,
    num_workers=5,
    shuffle=True,
    collate_fn=model.collate_fn,
    pin_memory=True,
)

# Get an example:
batch = next(iter(test_dataloader))

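# Optionally inspect the batch; the exact keys are determined by model.collate_fn:
pprint.pprint({key: type(value).__name__ for key, value in batch.items()})
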
# Move the tensors in the batch to the device:
for key, value in batch.items():
    if isinstance(value, torch.Tensor):
        batch[key] = value.to(device)

# Convert the patient data in the batch into embeddings that prompt the language model:
inputs_embeds, attention_mask, token_type_ids, position_ids, bos_token_ids = model.prepare_inputs(tokenizer=tokenizer, **batch)

# Generate reports:
output_ids = model.generate(
    input_ids=bos_token_ids,
    decoder_inputs_embeds=inputs_embeds,
    decoder_token_type_ids=token_type_ids,
    prompt_attention_mask=attention_mask,
    prompt_position_ids=position_ids,
    special_token_ids=[tokenizer.sep_token_id],
    token_type_id_sections=model.decoder.config.section_ids,
    max_length=256,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    num_beams=4,
    return_dict_in_generate=True,
    use_cache=True,
)['sequences']

# Split the output into the findings and impression sections and decode them:
findings, impression = model.split_and_decode_sections(output_ids, [tokenizer.sep_token_id, tokenizer.eos_token_id], tokenizer)
for findings_section, impression_section in zip(findings, impression):
    print(f'Findings:\t{findings_section}\nImpression:\t{impression_section}\n\n')
```
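
The imports of `matplotlib.pyplot` and `make_grid` come in handy for checking the images that conditioned the generated report. Below is a minimal sketch, assuming the collated batch stores the transformed images of the study as a float tensor under an `'images'` key with shape `[batch_size, num_images, channels, height, width]` (the key name and layout are assumptions, not confirmed by the model card):

```python
# A sketch of how the images of the first study in the batch could be viewed.
# Assumption: batch['images'] is a float tensor of shape
# [batch_size, num_images, 3, image_size, image_size].
images = batch['images'][0].cpu()

# Undo the ImageNet normalisation for display:
mean = torch.tensor(IMAGENET_DEFAULT_MEAN).view(3, 1, 1)
std = torch.tensor(IMAGENET_DEFAULT_STD).view(3, 1, 1)
images = (images * std + mean).clamp(0.0, 1.0)

# Arrange the images of the study in a grid and plot them:
grid = make_grid(images, nrow=images.shape[0])
plt.imshow(grid.permute(1, 2, 0).numpy())
plt.axis('off')
plt.show()
```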