anicolson committed on
Commit
6da7bd7
1 Parent(s): 9034591

Update README.md

Files changed (1)
  1. README.md +17 -1
README.md CHANGED
@@ -60,7 +60,23 @@ wget -r -N -c -np --user <username> --ask-password https://physionet.org/files/m
  Note that you must be a credentialised user to access this dataset.
 
  ### Prepare the dataset:
- Run the [prepare_dataset.ipynb](https://github.com/aehrc/anon/blob/main/prepare_dataset.ipynb) notebook from https://github.com/aehrc/anon and change the paths accordingly. It should take roughly an hour. The most time-consuming tasks are extracting sections from the radiology reports and matching CXR studies to ED stays.
+ ```python
+ import transformers
+
+ # Paths:
+ physionet_dir = '/.../physionet.org/files' # Where MIMIC-CXR, MIMIC-CXR-JPG, and MIMIC-IV-ED are stored.
+ dataset_dir = '/.../datasets' # Some outputs of prepare_data() will be stored here, e.g, the report sections.
+ database_path = '/.../database/cxrmate_ed.db' # The DuckDB database used to manage the tables of the dataset will be saved here.
+
+ # Prepare the MIMIC-CXR & MIMIC-IV-ED dataset:
+ model = transformers.AutoModel.from_pretrained('aehrc/cxrmate-ed', trust_remote_code=True)
+ model.prepare_data(
+     physionet_dir=physionet_dir,
+     dataset_dir=dataset_dir,
+     database_path=database_path,
+ )
+ ```
+
 
  ## Example
 
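
The snippet added in this commit leaves its three paths as placeholders (`'/.../...'`) that must be filled in before `prepare_data()` is called. A minimal sketch of a pre-flight check follows. The helper `check_paths` is hypothetical and not part of `aehrc/cxrmate-ed`. It assumes only that the PhysioNet directory must already exist, and that creating the output locations up front is harmless.

```python
from pathlib import Path


def check_paths(physionet_dir, dataset_dir, database_path):
    """Hypothetical pre-flight check; not part of the cxrmate-ed API."""
    physionet_dir = Path(physionet_dir)
    dataset_dir = Path(dataset_dir)
    database_path = Path(database_path)

    # The PhysioNet download must already be present; nothing here fetches it.
    if not physionet_dir.is_dir():
        raise FileNotFoundError(f'PhysioNet directory not found: {physionet_dir}')

    # Assumption: creating the output directories ahead of time is acceptable.
    dataset_dir.mkdir(parents=True, exist_ok=True)
    database_path.parent.mkdir(parents=True, exist_ok=True)

    return physionet_dir, dataset_dir, database_path
```

The returned `Path` objects can be converted with `str()` and passed as `physionet_dir`, `dataset_dir`, and `database_path` in the call shown in the diff.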