riverjiang committed
Commit 128c0be
1 Parent(s): e8192f7

Create README.md

---
license: gpl-3.0
language:
- en
---
# AI Long QT ECG analysis
Deep Neural Networks in Evaluation of Patients with Congenital Long QT Syndrome from the Surface 12-Lead Electrocardiogram

## Step 0: Install pip packages

Install the required Python packages:

`python -m pip install -r requirements.txt`

## Step 1: Obtain MUSE ECGs

The ECG files should be in MUSE XML format and begin like this:

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE RestingECG SYSTEM "restecg.dtd">
```
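
As a quick sanity check, you can verify that each file starts with the expected XML declaration. This is an illustrative sketch, not part of `lqtnet`; the `ecgs/xml/` directory name is a placeholder:

```python
from pathlib import Path

MUSE_XML_HEADER = '<?xml version="1.0" encoding="ISO-8859-1"?>'

def looks_like_muse_xml(path):
    """Return True if the file starts with the expected MUSE XML declaration."""
    with open(path, encoding="ISO-8859-1") as f:
        return f.readline().strip().startswith(MUSE_XML_HEADER)

# Report any files that do not look like MUSE exports
# ("ecgs/xml/" is a placeholder directory name)
for xml_path in sorted(Path("ecgs/xml/").glob("*.xml")):
    if not looks_like_muse_xml(xml_path):
        print(f"unexpected header: {xml_path}")
```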

## Step 2: Convert ECGs into CSV files

Run `python lqtnet/extract_ecg_xml.py`, which converts a folder of XML ECG files into CSV format, normalizes the voltage data, and resamples each file to 2500 samples (250 Hz over a 10-second recording).
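
The resampling step can be sketched as below. This is illustrative only; the exact resampling and normalization performed by `extract_ecg_xml.py` may differ:

```python
import numpy as np
from scipy.signal import resample

def resample_lead(voltages, target_samples=2500):
    """Resample one ECG lead to a fixed length (250 Hz x 10 s = 2500 samples)."""
    return resample(np.asarray(voltages, dtype=float), target_samples)

# Example: a lead recorded at 500 Hz for 10 s (5000 samples)
lead = np.sin(np.linspace(0, 20 * np.pi, 5000))
fixed = resample_lead(lead)
print(fixed.shape)  # (2500,)
```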

## Step 3: Create metadata file

Create a `metadata` folder, and in it create a CSV file with the following columns:

```csv
file,patient_id,ecg_id,id_site,set,lqts_type,dob,age,sex,ethnicity,date,hr,qt,qt_confirmed,qt_prolonged,qc,qc_reason
```

Descriptions of the columns:
- `file`: CSV file name (without the '.csv' extension)
- `patient_id`: unique patient ID (HiRO ID)
- `ecg_id`: unique ID for the ECG file
- `id_site`: HiRO site ID
- `set`: data split, one of `Derivation`, `Internal validation`, or `External validation`
- `lqts_type`: `Control`, `Type 1`, or `Type 2`, based on genetic diagnosis
- `dob`: date of birth, yyyy-mm-dd
- `age`: age in years
- `sex`: `Female` or `Male`
- `ethnicity`: used for baseline characteristics and subsequent analysis
- `date`: date of the ECG, yyyy-mm-dd
- `hr`: heart rate, used for baseline characteristics and subsequent analysis
- `qt`: manually corrected QT interval (in milliseconds)
- `qt_confirmed`: `True` or `False`, whether the QT interval was manually confirmed
- `qt_prolonged`: `True` or `False`, whether the QT interval is prolonged
- `qc`: `True` or `False`, whether the ECG passed manual quality control
- `qc_reason` (optional): description of any QC issue with the ECG

Use `lqtnet.import_metadata.convert_dtypes()` to convert the column dtypes for more efficient storage. We also suggest saving the metadata file in `pickle` or `parquet` format after importing it as a pandas `DataFrame`.
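
The dtype conversion can be approximated in plain pandas as below. This is a sketch, not the `lqtnet.import_metadata.convert_dtypes()` implementation, and the two-row table is placeholder data:

```python
import pandas as pd

# A tiny example of the metadata table (values are placeholders)
metadata = pd.DataFrame({
    "file": ["ecg_0001", "ecg_0002"],
    "set": ["Derivation", "External validation"],
    "lqts_type": ["Control", "Type 1"],
    "sex": ["Female", "Male"],
    "qt": [400, 480],
    "qt_confirmed": [True, True],
    "qt_prolonged": [False, True],
    "qc": [True, True],
})

# Categorical and nullable-boolean dtypes store far more compactly
# than generic object columns
for col in ["set", "lqts_type", "sex"]:
    metadata[col] = metadata[col].astype("category")
for col in ["qt_confirmed", "qt_prolonged", "qc"]:
    metadata[col] = metadata[col].astype("boolean")

# Persist for fast reloading (parquet requires pyarrow or fastparquet)
# metadata.to_parquet("metadata/example.parquet")
```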

## Step 4: Quality control

Some files have missing segments in one or more leads, excessive noise, a wandering baseline, or are corrupted and contain no ECG data at all. Record these findings in the `qc` and `qc_reason` columns of the metadata file.
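
Failing ECGs can then be excluded before training or inference, for example (a sketch using the metadata columns above, with placeholder rows):

```python
import pandas as pd

# Placeholder metadata with one ECG that failed quality control
metadata = pd.DataFrame({
    "file": ["ecg_0001", "ecg_0002", "ecg_0003"],
    "qc": [True, False, True],
    "qc_reason": [None, "lead V3 missing", None],
})

# Keep only ECGs that passed manual quality control
clean = metadata[metadata["qc"]]
print(len(clean))  # 2
```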

## Step 5: Run model inference

The example code below runs inference on the `External validation` dataset:

```python
import pandas as pd

import lqtnet

# directory containing normalized CSV files
ECG_SOURCE_DIR = 'ecgs/csv_normalized_2500/'
MODEL_PATH = 'models/XYZ/'

metadata = pd.read_parquet('metadata/example_YYYYmmdd.parquet')
ext_df = metadata.query('set == "External validation" and qc == True')

x_ext = lqtnet.import_ecgs.df_import_csv_to_numpy(ext_df, from_dir=ECG_SOURCE_DIR)
y_ext = lqtnet.import_ecgs.df_to_np_labels(ext_df)

model = lqtnet.train._load_model(MODEL_PATH)

# make predictions - save this output for further analysis
y_ext_pred = model.predict(x_ext)
```