katielink committed
Commit 460afce
1 Parent(s): 50252f2

add note for multi-gpu training with example dataset

Files changed (3)
  1. README.md +2 -0
  2. configs/metadata.json +2 -1
  3. docs/README.md +2 -0
README.md CHANGED
@@ -99,6 +99,8 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training
 
 Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove `--standalone`, modify `--nnodes`, or do some other necessary changes according to the machine used. For more details, please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
 
+In addition, if using the 20-sample example dataset, note that the preprocessing script divides it into 16 training samples, 2 validation samples, and 2 test samples. However, PyTorch multi-GPU training requires the number of samples in each dataloader to be at least the number of GPUs. Therefore, please use no more than 2 GPUs to run this bundle when using the 20-sample example dataset.
+
 #### Override the `train` config to execute evaluation with the trained model:
 
 ```
configs/metadata.json CHANGED
@@ -1,7 +1,8 @@
 {
     "schema": "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/meta_schema_20220324.json",
-    "version": "0.3.3",
+    "version": "0.3.4",
     "changelog": {
+        "0.3.4": "add note for multi-gpu training with example dataset",
         "0.3.3": "enhance data preprocess script and readme file",
         "0.3.2": "restructure readme to match updated template",
         "0.3.1": "add workflow, train loss and validation accuracy figures",
docs/README.md CHANGED
@@ -92,6 +92,8 @@ torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training
 
 Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove `--standalone`, modify `--nnodes`, or do some other necessary changes according to the machine used. For more details, please refer to [pytorch's official tutorial](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
 
+In addition, if using the 20-sample example dataset, note that the preprocessing script divides it into 16 training samples, 2 validation samples, and 2 test samples. However, PyTorch multi-GPU training requires the number of samples in each dataloader to be at least the number of GPUs. Therefore, please use no more than 2 GPUs to run this bundle when using the 20-sample example dataset.
+
 #### Override the `train` config to execute evaluation with the trained model:
 
 ```
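
The GPU cap in the added note follows directly from the data split: after preprocessing, only 2 validation samples remain, so they cannot be spread over more than 2 ranks without duplication. The sketch below is illustrative only and assumes the bundle's multi-GPU run shards data with PyTorch's `DistributedSampler` (an assumption; this diff does not show the bundle's dataloader configuration). It prints which of the 2 validation samples each rank would receive as the GPU count grows.

```python
# Illustrative sketch (assumption: data is sharded with DistributedSampler).
# Shows which of the 2 validation samples each rank receives for various GPU counts.
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

# The 20-sample example dataset leaves 2 validation samples after the 16/2/2 split.
val_ds = TensorDataset(torch.zeros(2, 1))

for world_size in (1, 2, 3, 4):
    per_rank = [
        list(DistributedSampler(val_ds, num_replicas=world_size, rank=r, shuffle=False))
        for r in range(world_size)
    ]
    print(f"GPUs={world_size} -> sample indices per rank: {per_rank}")

# Expected output:
# GPUs=1 -> sample indices per rank: [[0, 1]]
# GPUs=2 -> sample indices per rank: [[0], [1]]            each GPU gets a distinct sample
# GPUs=3 -> sample indices per rank: [[0], [1], [0]]       sample 0 is repeated to feed rank 2
# GPUs=4 -> sample indices per rank: [[0], [1], [0], [1]]  every extra rank re-reads an existing sample
```

With 3 or 4 ranks nothing fails in this toy case, but samples are repeated across ranks, so validation would run on duplicated data; keeping the run at 2 GPUs or fewer, as the added note recommends, avoids that.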