Reduction
This folder contains scripts for combining, reducing, filling and scaling processed EHR data for modelling.
The scripts must be run in the order listed below:
1. combine.py - combine datasets and perform any post-processing
2. post_prod_reduction.py - combine columns to reduce 0 values
3. remove_ids.py - remove receiver, scale up and test IDs
4. clean_and_scale_train.py - impute nulls and min-max scale training data
5. clean_and_scale_test.py - impute nulls and min-max scale testing data
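
The run order above could be automated with a small wrapper. This is a minimal sketch, assuming each script runs without command-line arguments from this folder; `run_pipeline` and its `runner` parameter are hypothetical names and not part of the repository.

```python
import subprocess
import sys

# Script names and order as listed in this README.
SCRIPTS = [
    "combine.py",
    "post_prod_reduction.py",
    "remove_ids.py",
    "clean_and_scale_train.py",
    "clean_and_scale_test.py",
]


def run_pipeline(scripts=SCRIPTS, runner=subprocess.run):
    """Run each script in order, stopping at the first failure."""
    for script in scripts:
        print(f"Running {script}")
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps never run on incomplete output.
        runner([sys.executable, script], check=True)
```

Calling `run_pipeline()` from this folder would then execute the five steps in sequence. The `runner` argument exists only so the order can be checked without executing the real scripts.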
NB: The data_type in clean_and_scale_test.py can be set to rec, sup, val or test.
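
For illustration, the impute-then-scale step might look like the sketch below. It assumes nulls are filled with the training column mean and that the other data types reuse the minimum and maximum learned from the training data; the actual scripts may impute or scale differently. `fit_column` and `transform_column` are hypothetical helper names.

```python
from statistics import mean


def fit_column(train_values):
    """Learn the fill value and min/max from one training column (None = null)."""
    present = [v for v in train_values if v is not None]
    return {"fill": mean(present), "min": min(present), "max": max(present)}


def transform_column(values, params):
    """Impute nulls, then min-max scale using the training parameters."""
    span = params["max"] - params["min"]
    scaled = []
    for v in values:
        v = params["fill"] if v is None else v
        # A constant training column has span 0; map it to 0.0 to avoid dividing by zero.
        scaled.append((v - params["min"]) / span if span else 0.0)
    return scaled


train = [2.0, None, 4.0, 6.0]  # made-up example values
test = [None, 8.0]

params = fit_column(train)
train_scaled = transform_column(train, params)
test_scaled = transform_column(test, params)
```

Under this assumption, values in the non-training sets can fall outside the 0 to 1 range when they lie beyond the training minimum or maximum.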