# FAQ

## Where can I find the segmentation metrics of my experiments?
**Results for the validation sets of each fold** are stored in the respective output folder after training is completed, for example in
`${RESULTS_FOLDER}/nnUNet/3d_fullres/Task003_Liver/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0`. After training there will
be a `validation_raw` subfolder and a `validation_raw_postprocessed` subfolder. Each of these folders contains a
`summary.json` file with the segmentation metrics: one entry per individual validation case, plus a mean across all
cases at the bottom.
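
If you want to read these numbers programmatically, here is a minimal sketch (the `results`/`mean` layout shown below matches nnU-Net v1's `summary.json`; verify against your own file):

```python
# Minimal sketch: print the per-class mean Dice from a summary.json.
# The "results" -> "mean" layout matches nnU-Net v1; verify against your file.
import json

with open('validation_raw/summary.json') as f:
    summary = json.load(f)

for label, metrics in summary['results']['mean'].items():
    print(f"class {label}: Dice = {metrics['Dice']:.4f}")
```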
 
**Cross-validation metrics** can only be computed after all five folds have been run. You first need to run
`nnUNet_determine_postprocessing` (see `nnUNet_determine_postprocessing -h` for help). This will collect the
predictions from the validation sets of the five folds, compute metrics on them and then determine the postprocessing.
Once this is done, there will be new folders in the output directory (for example
`${RESULTS_FOLDER}/nnUNet/3d_fullres/Task003_Liver/nnUNetTrainerV2__nnUNetPlansv2.1/`): `cv_niftis_raw` (raw predictions
from the cross-validation) and `cv_niftis_postprocessed` (postprocessed predictions). Each of these folders contains a
`summary.json` file with the metrics (see above).

Note that the postprocessing determined on each individual fold is completely ignored by nnU-Net because it needs to 
find a single postprocessing configuration for the whole cross-validation. The postprocessed results in each fold are 
just for development purposes!

**Test set results**: see [here](#evaluating-test-set-results).

**Ensemble performance** will be accessible in `${RESULTS_FOLDER}/nnUNet/ensembles/TASKNAME` after you have run
`nnUNet_find_best_configuration`. There is a `summary.csv` for a quick overview, and detailed results in the form of
`summary.json` files in the respective subfolders.

## What postprocessing is selected?
After you run `nnUNet_determine_postprocessing` (see `nnUNet_determine_postprocessing -h` for help), there will be a 
`postprocessing.json` file located in the output directory of your training (for example 
`${RESULTS_FOLDER}/nnUNet/3d_fullres/Task003_Liver/nnUNetTrainerV2__nnUNetPlansv2.1/`). If you open it with a text 
editor, you will find a key `"for_which_classes"` followed by a list. For LiTS (classes 0: background, 1: liver, 
2: tumor) this can for example be:
```json
    "for_which_classes": [
        [
            1,
            2
        ],
        1
    ]
```
This means that nnU-Net will first remove all but the largest connected component of the merged object consisting of 
classes 1 and 2 (essentially the liver including the tumors) and then, in a second step, also remove all but the 
largest connected component of the liver class.
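
In generic terms, "remove all but the largest connected component" amounts to the following (this is an illustrative scipy sketch, not nnU-Net's own implementation):

```python
# Illustrative sketch using scipy, not nnU-Net's own implementation:
# keep only the largest connected component of the merged liver+tumor object.
import numpy as np
from scipy import ndimage

def keep_largest_component(mask):
    """Boolean mask containing only the largest connected component of `mask`."""
    labeled, num = ndimage.label(mask)
    if num == 0:
        return mask.astype(bool)
    sizes = ndimage.sum(mask, labeled, index=range(1, num + 1))
    return labeled == (int(np.argmax(sizes)) + 1)

# toy 1D example: two separate liver+tumor blobs, the smaller one is removed
seg = np.array([0, 1, 2, 1, 0, 0, 1, 0])
merged = np.isin(seg, [1, 2])
seg[np.isin(seg, [1, 2]) & ~keep_largest_component(merged)] = 0
print(seg)  # [0 1 2 1 0 0 0 0]
```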

Note that you do not have to run `nnUNet_determine_postprocessing` if you use `nnUNet_find_best_configuration`. 
`nnUNet_find_best_configuration` will do that for you.

Ensemble results and postprocessing will be stored in `${RESULTS_FOLDER}/nnUNet/ensembles` 
(this will all be generated by `nnUNet_find_best_configuration`).

## Evaluating test set results
This feature was only added recently. Please run `pip install --upgrade nnunet` or reinstall nnunet from the master branch.

You can now use `nnUNet_evaluate_folder` to compute metrics on predicted test cases. For example:

```
nnUNet_evaluate_folder -ref FOLDER_WITH_GT -pred FOLDER_WITH_PREDICTIONS -l 1 2 3 4
```

This example is for a dataset with 4 foreground classes (labels 1, 2, 3, 4). `FOLDER_WITH_GT` and 
`FOLDER_WITH_PREDICTIONS` must contain files with the same names, containing the reference and predicted segmentations 
of each case, respectively. The files must be NIfTI files (ending with `.nii.gz`).

## Creating and managing data splits

At the start of each training, nnU-Net checks whether the `splits_final.pkl` file is present in the directory where 
the preprocessed data of the requested dataset is located. If the file is not present, nnU-Net creates its own 
split: a five-fold cross-validation using all available training cases. nnU-Net needs this five-fold 
cross-validation to be able to determine the postprocessing and to run model/ensemble selection.

There are, however, situations in which you may want to create your own split, for example:
- in datasets like ACDC, where several training cases are connected (there are two time steps for each patient), you 
may need to create splits manually to ensure proper stratification
- when cases are annotated by multiple annotators and you would like to use the annotations as separate training examples
- if you are running domain transfer experiments, where you may want to train only on cases from domain A and 
validate on domain B
- ...

Creating your own data split is simple: the `splits_final.pkl` file contains the following data structure (assume there are five training cases A, B, C, D and E):
```python
splits = [
    {'train': ['A', 'B', 'C', 'D'], 'val': ['E']},
    {'train': ['A', 'B', 'C', 'E'], 'val': ['D']},
    {'train': ['A', 'B', 'D', 'E'], 'val': ['C']},
    {'train': ['A', 'C', 'D', 'E'], 'val': ['B']},
    {'train': ['B', 'C', 'D', 'E'], 'val': ['A']}
]
```

Use `load_pickle` and `save_pickle` from `batchgenerators.utilities.file_and_folder_operations` to load and store the splits.

`splits` is a list of length NUMBER_OF_FOLDS. Each entry in the list is a dict with 'train' and 'val' as keys, and lists 
of the corresponding case names (without the `_0000` suffix etc.!) as values.
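
For example, a minimal sketch of writing a custom `splits_final.pkl` (the path below is illustrative; the file belongs in the preprocessed data directory of your dataset):

```python
# Minimal sketch: write a custom splits_final.pkl. The path below is
# illustrative; place the file in the preprocessed folder of your dataset.
from batchgenerators.utilities.file_and_folder_operations import load_pickle, save_pickle

splits = [
    {'train': ['A', 'B', 'C', 'D'], 'val': ['E']},
    {'train': ['A', 'B'], 'val': ['C', 'D', 'E']},  # any split you like
]
save_pickle(splits, '/path/to/nnUNet_preprocessed/Task003_Liver/splits_final.pkl')

# sanity check: read it back
print(load_pickle('/path/to/nnUNet_preprocessed/Task003_Liver/splits_final.pkl'))
```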

nnU-Net's five-fold cross-validation will always create a list with len(splits)=5, but you can use as many or as few 
folds as you like. Note that if you define only 4 splits (folds 0-3) and then set fold=4 for training (which would be 
the fifth split), nnU-Net will print a warning and fall back to a random 80:20 data split.

## How can I swap component XXX (for example the loss) of nnU-Net?

All changes in nnU-Net are handled the same way:

1) Create a new nnU-Net trainer class. Place the file somewhere in the `nnunet.training.network_training` folder 
(any subfolder will do; if you create a new subfolder, make sure to include an empty `__init__.py` file!)

2) Make your new trainer class derive from the trainer you would like to change (most likely this is going to be nnUNetTrainerV2)

3) Identify the function you need to overwrite. You may have to go up the inheritance hierarchy to find it!

4) Overwrite that function in your custom trainer and make sure whatever you do stays compatible with the rest of nnU-Net

What these changes need to look like specifically is hard to say without knowing exactly what you are trying to do. 
Before you open a new issue on GitHub, please have a look around the `nnunet.training.network_training` folder first! 
There are tons of examples modifying various parts of the pipeline.
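
As an illustration, here is a minimal sketch of a trainer that swaps the loss. The class name is made up, and the imports assume nnU-Net v1's module layout; in v1, `nnUNetTrainerV2.initialize` later wraps `self.loss` for deep supervision, so replacing it in `__init__` is sufficient:

```python
# Minimal sketch (not part of nnU-Net): replace the default Dice+CE loss
# with a pure soft Dice loss. Imports assume nnU-Net v1's module layout.
from nnunet.training.network_training.nnUNetTrainerV2 import nnUNetTrainerV2
from nnunet.training.loss_functions.dice_loss import SoftDiceLoss
from nnunet.utilities.nd_softmax import softmax_helper


class nnUNetTrainerV2_DiceOnly(nnUNetTrainerV2):  # hypothetical name
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # overwrite the loss set by the parent class; initialize() will later
        # wrap it for deep supervision, so this is all that is needed
        self.loss = SoftDiceLoss(apply_nonlin=softmax_helper,
                                 batch_dice=self.batch_dice,
                                 do_bg=False, smooth=1e-5)
```

You could then train it with `nnUNet_train 3d_fullres nnUNetTrainerV2_DiceOnly TASK_NAME_OR_ID FOLD`.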

Also see [here](extending_nnunet.md).

## How does nnU-Net handle multi-modal images?

Multi-modal images are treated as color channels. BraTS, which comes with T1, T1c, T2 and FLAIR images for each 
training case, will thus for example have 4 input channels.
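
On disk, the channels of a case are provided as separate files with `_0000`, `_0001`, ... suffixes, and the mapping from channel index to modality is defined in the dataset's `dataset.json`. An illustrative layout (case and task names are made up):

```
Task501_BraTSExample/imagesTr/
├── case_001_0000.nii.gz  # channel 0, e.g. T1
├── case_001_0001.nii.gz  # channel 1, e.g. T1c
├── case_001_0002.nii.gz  # channel 2, e.g. T2
└── case_001_0003.nii.gz  # channel 3, e.g. FLAIR
```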

## Why does nnU-Net not use all my GPU memory?

nnU-Net and all its parameters are optimized for a training setting that uses about 8 GB of VRAM per network training. 
Using more VRAM will not speed up training. Using more VRAM has also not (yet) been beneficial for model 
performance consistently enough to make it the default. If you really want to train with more VRAM, you can do one of the following:

1) Manually edit the plans files to increase the batch size (see the sketch after this list). A larger batch size gives better (less noisy) gradients 
and may improve your model performance if the dataset is large. Note that nnU-Net always runs for 1000 epochs with 250 
iterations each (250,000 iterations in total). The training time thus scales approximately linearly with the batch size 
(batch size 4 will take twice as long to train as batch size 2!)

2) Manually edit the plans files to increase the patch size. This one is tricky and should only be attempted if you 
know what you are doing! Again, training times will increase if you do this! Option 3 is a better way of increasing the 
patch size.

3) Run `nnUNet_plan_and_preprocess` with a larger GPU memory budget. This will make nnU-Net plan for larger patch sizes 
during experiment planning. Doing this can change the patch size, the network topology, the batch size and the 
presence of the U-Net cascade. To run with a different memory budget, you need to specify a different experiment planner, for example 
`nnUNet_plan_and_preprocess -t TASK_ID -pl2d None -pl3d ExperimentPlanner3D_v21_32GB` (note that `-pl2d None` will 
disable the 2D U-Net configuration; there is currently no planner for larger 2D U-Nets). We have planners for 8 GB (default), 
11 GB and 32 GB available. If you need a planner for a different GPU size, you should be able to quickly hack together 
your own using the code of the 11 GB or 32 GB planner (the same goes for a 2D planner). Note that we have experimented with 
these planners and did not find an increase in segmentation performance as a result of using them. Training times are, 
again, longer than with the default.
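
For option 1, here is a minimal sketch of editing a plans file (the file name, path and dict layout follow nnU-Net v1 conventions; verify against your own plans file, which lives in the preprocessed data directory of the task):

```python
# Minimal sketch: double the batch size in a 3D plans file.
# File name, path and dict layout follow nnU-Net v1; verify against your setup.
from batchgenerators.utilities.file_and_folder_operations import load_pickle, save_pickle

plans_file = '/path/to/nnUNet_preprocessed/Task003_Liver/nnUNetPlansv2.1_plans_3D.pkl'
plans = load_pickle(plans_file)

stage = max(plans['plans_per_stage'].keys())  # highest-resolution stage (3d_fullres)
plans['plans_per_stage'][stage]['batch_size'] *= 2

save_pickle(plans, plans_file)
```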

## Do I need to always run all U-Net configurations?
The model training pipeline above is for challenge participations. Depending on your task, you may not want to train all 
U-Net models, and you may also not want to run a cross-validation every time.
Here are some recommendations about which U-Net model to train:
- It is safe to say that on average, the 3D U-Net (3d_fullres) is the most robust model. If you just want to use nnU-Net 
because you need segmentations, I recommend you start with this.
- If you are not happy with the results from the 3D U-Net, you can try the following:
  - if your cases are very large, so that the patch size of the 3D U-Net covers only a very small fraction of an image, 
  the 3D U-Net may not be able to capture sufficient contextual information to be effective. If this 
  is the case, you should consider running the 3D U-Net cascade (3d_lowres followed by 3d_cascade_fullres)
  - if your data is very anisotropic, a 2D U-Net may actually be the better choice (Promise12, ACDC and Task05_Prostate 
  from the decathlon are examples of anisotropic data)

You do not have to run the five-fold cross-validation every time. If you want to test single-model performance, use 
*all* for `FOLD` instead of a number. Note that this will not give you an estimate of your performance on the 
training set. You will also not be able to automatically identify which ensembling should be used, and nnU-Net will 
not be able to configure a postprocessing.
 
CAREFUL: DO NOT use fold=all when you intend to run the cascade! You must run the cross-validation in 3d_lowres so 
that you get proper (= not overfitted) low-resolution predictions.
 
## Sharing Models
You can share trained models by simply sending the corresponding output folder from `RESULTS_FOLDER/nnUNet` to 
whomever you want to share them with. The recipient can then use nnU-Net for inference with this model.

You can now also use `nnUNet_export_model_to_zip` to export a trained model (or models) to a zip file. The recipient 
can then use `nnUNet_install_pretrained_model_from_zip` to install the model from this zip file (see the `-h` output 
of both commands for usage).

## Can I run nnU-Net on smaller GPUs?
nnU-Net is guaranteed to run on GPUs with 11 GB of memory. Many configurations may also run on 8 GB.
If you have an 11 GB GPU and still get an `Out of Memory` error, please read 'nnU-Net training: RuntimeError: CUDA out of memory' [here](common_problems_and_solutions.md).
 
If you wish to configure nnU-Net to use a different amount of GPU memory, simply adapt the reference value for the GPU memory estimation 
accordingly (with some slack, because the whole thing is not an exact science!). For example, in 
[experiment_planner_baseline_3DUNet_v21_11GB.py](nnunet/experiment_planning/experiment_planner_baseline_3DUNet_v21_11GB.py) 
we provide an example that attempts to maximise the usage of GPU memory on 11 GB cards (as opposed to the default, which leaves 
much more headroom). This is achieved by this line:

```python
ref = Generic_UNet.use_this_for_batch_size_computation_3D * 11 / 8
```

with 8 being (approximately) what is currently used and 11 being the target. Should you get CUDA out of memory 
issues, simply reduce the reference value. You should make this adaptation in a separate ExperimentPlanner class. 
Please read the instructions [here](extending_nnunet.md).


## Why is no 3d_lowres model created?
3d_lowres is only created if the patch size of the 3d_fullres configuration covers less than 1/8 of the voxels of the 
median shape of the data (for example, Liver is about 512x512x512 and the patch size is 128x128x128, so the patch 
covers 1/64 of the image and 3d_lowres is created). You can enforce the creation of 3d_lowres models for smaller 
datasets by changing the value of `HOW_MUCH_OF_A_PATIENT_MUST_THE_NETWORK_SEE_AT_STAGE0` (located in 
`experiment_planning.configuration`).
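
As a worked check of this heuristic (illustrative code, not nnU-Net's actual implementation):

```python
# Worked check of the 1/8 rule described above (illustrative only).
import numpy as np

median_shape = np.array([512, 512, 512])  # e.g. Liver
patch_size = np.array([128, 128, 128])    # 3d_fullres patch

fraction = np.prod(patch_size) / np.prod(median_shape)
print(fraction)          # 0.015625, i.e. 1/64
print(fraction < 1 / 8)  # True -> 3d_lowres is created
```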