katielink committed
Commit 460afce
1 Parent(s): 50252f2

add note for multi-gpu training with example dataset

Files changed (3)
  1. README.md +2 -0
  2. configs/metadata.json +2 -1
  3. docs/README.md +2 -0
README.md CHANGED
@@ -99,6 +99,8 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training
 
 Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove `--standalone`, modify `--nnodes`, or do some other necessary changes according to the machine used. For more details, please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
 
+In addition, if using the 20-sample example dataset, note that the preprocessing script divides it into 16 training samples, 2 validation samples, and 2 test samples. However, PyTorch multi-GPU training requires the number of samples in each dataloader to be at least the number of GPUs. Therefore, please use no more than 2 GPUs to run this bundle when using the 20-sample example dataset.
+
 #### Override the `train` config to execute evaluation with the trained model:
 
 ```
configs/metadata.json CHANGED
@@ -1,7 +1,8 @@
 {
     "schema": "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json",
-    "version": "0.3.3",
+    "version": "0.3.4",
     "changelog": {
+        "0.3.4": "add note for multi-gpu training with example dataset",
         "0.3.3": "enhance data preprocess script and readme file",
         "0.3.2": "restructure readme to match updated template",
         "0.3.1": "add workflow, train loss and validation accuracy figures",
docs/README.md CHANGED
@@ -92,6 +92,8 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training
 
 Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove `--standalone`, modify `--nnodes`, or do some other necessary changes according to the machine used. For more details, please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
 
+In addition, if using the 20-sample example dataset, note that the preprocessing script divides it into 16 training samples, 2 validation samples, and 2 test samples. However, PyTorch multi-GPU training requires the number of samples in each dataloader to be at least the number of GPUs. Therefore, please use no more than 2 GPUs to run this bundle when using the 20-sample example dataset.
+
 #### Override the `train` config to execute evaluation with the trained model:
 
 ```
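
The GPU cap in the added note follows directly from the data split: after preprocessing, only 2 validation samples remain, so they cannot be spread over more than 2 ranks without duplication. The sketch below is illustrative only and assumes the bundle's multi-GPU run shards data with PyTorch's `DistributedSampler` (an assumption; this diff does not show the bundle's dataloader configuration). It prints which of the 2 validation samples each rank would receive as the GPU count grows.

```python
# Illustrative sketch (assumption: data is sharded with DistributedSampler).
# Shows which of the 2 validation samples each rank receives for various GPU counts.
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

# The 20-sample example dataset leaves 2 validation samples after the 16/2/2 split.
val_ds = TensorDataset(torch.zeros(2, 1))

for world_size in (1, 2, 3, 4):
    per_rank = [
        list(DistributedSampler(val_ds, num_replicas=world_size, rank=r, shuffle=False))
        for r in range(world_size)
    ]
    print(f"GPUs={world_size} -> sample indices per rank: {per_rank}")

# Expected output:
# GPUs=1 -> sample indices per rank: [[0, 1]]
# GPUs=2 -> sample indices per rank: [[0], [1]]            each GPU gets a distinct sample
# GPUs=3 -> sample indices per rank: [[0], [1], [0]]       sample 0 is repeated to feed rank 2
# GPUs=4 -> sample indices per rank: [[0], [1], [0], [1]]  every extra rank re-reads an existing sample
```

With 3 or 4 ranks nothing fails in this toy case, but samples are repeated across ranks, so validation would run on duplicated data; keeping the run at 2 GPUs or fewer, as the added note recommends, avoids that.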