# Getting Started :flight_departure:
## Create environment
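Since setup specifics are not listed here, below is a minimal sketch assuming a conda environment and a `requirements.txt` in the repository root (the environment name and Python version are assumptions; prefer the environment file shipped with the repository if one exists):

```bash
# Environment name and Python version are placeholders.
conda create -n inspyrenet python=3.8 -y
conda activate inspyrenet

# Assumes the repository provides a requirements.txt.
pip install -r requirements.txt
```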
## Preparation
- For training, you may need training datasets and ImageNet pre-trained checkpoints for the backbone. For testing (inference), you may need test datasets (sample images).
- Training datasets are expected to be located under `Train.Dataset.root`. Likewise, testing datasets should be under `Test.Dataset.root`.
- Each dataset folder should contain an `images` folder and a `masks` folder for images and ground truth masks, respectively (see the sketch after this list).
- You may use multiple training datasets by listing dataset folders for `Train.Dataset.sets`, such as `[DUTS-TR]` --> `[DUTS-TR, HRSOD-TR, UHRSD-TR]`.
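For illustration, the layout described above can be created as follows (a sketch only; the dataset names are examples, and the roots should match `Train.Dataset.root` and `Test.Dataset.root` in your config):

```bash
# Backbone checkpoints go here (see the tables below for downloads).
mkdir -p data/backbone_ckpt

# Each dataset folder holds an images/ and a masks/ subfolder.
mkdir -p data/Train_Dataset/DUTS-TR/{images,masks}
mkdir -p data/Test_Dataset/DUTS-TE/{images,masks}
```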
### Backbone Checkpoints

| Item | Destination Folder | OneDrive | GDrive |
| :--- | :--- | :--- | :--- |
| Res2Net50 checkpoint | `data/backbone_ckpt/*.pth` | Link | Link |
| SwinB checkpoint | `data/backbone_ckpt/*.pth` | Link | Link |
- We changed the Res2Net50 checkpoint to resolve an error when training with DDP. Please refer to issue #9.
### Train Datasets

| Item | Destination Folder | OneDrive | GDrive |
| :--- | :--- | :--- | :--- |
| DUTS-TR | `data/Train_Dataset/...` | Link | Link |
### Extra Train Datasets (High Resolution, Optional)

| Item | Destination Folder | OneDrive | GDrive |
| :--- | :--- | :--- | :--- |
| HRSOD-TR | `data/Train_Dataset/...` | Link | N/A |
| UHRSD-TR | `data/Train_Dataset/...` | Link | N/A |
| DIS-TR | `data/Train_Dataset/...` | Link | N/A |
### Test Datasets

| Item | Destination Folder | OneDrive | GDrive |
| :--- | :--- | :--- | :--- |
| DUTS-TE | `data/Test_Dataset/...` | Link | Link |
| DUT-OMRON | `data/Test_Dataset/...` | Link | Link |
| ECSSD | `data/Test_Dataset/...` | Link | Link |
| HKU-IS | `data/Test_Dataset/...` | Link | Link |
| PASCAL-S | `data/Test_Dataset/...` | Link | Link |
| DAVIS-S | `data/Test_Dataset/...` | Link | Link |
| HRSOD-TE | `data/Test_Dataset/...` | Link | Link |
| UHRSD-TE | `data/Test_Dataset/...` | Link | Link |
### Extra Test Datasets (Optional)

| Item | Destination Folder | OneDrive | GDrive |
| :--- | :--- | :--- | :--- |
| FSS-1000 | `data/Test_Dataset/...` | Link | N/A |
| MSRA-10K | `data/Test_Dataset/...` | Link | N/A |
| DIS-VD | `data/Test_Dataset/...` | Link | Link |
| DIS-TE1 | `data/Test_Dataset/...` | Link | Link |
| DIS-TE2 | `data/Test_Dataset/...` | Link | Link |
| DIS-TE3 | `data/Test_Dataset/...` | Link | Link |
| DIS-TE4 | `data/Test_Dataset/...` | Link | Link |
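After downloading, extract each archive into the destination folder listed in the tables above, for example (archive name and format are placeholders):

```bash
# Placeholder archive name; after extraction, the dataset folder should
# contain its images/ and masks/ subfolders.
unzip DUTS-TR.zip -d data/Train_Dataset/
```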
## Train & Evaluate
```bash
# Single GPU
python run/Train.py --config configs/InSPyReNet_SwinB.yaml --verbose

# Multi GPUs with DDP (e.g., 4 GPUs)
torchrun --standalone --nproc_per_node=4 run/Train.py --config configs/InSPyReNet_SwinB.yaml --verbose

# Multi GPUs with DDP with designated devices (e.g., 2 GPUs - 0 and 1)
CUDA_VISIBLE_DEVICES=0,1 torchrun --standalone --nproc_per_node=2 run/Train.py --config configs/InSPyReNet_SwinB.yaml --verbose
```
- `--config [CONFIG_FILE]`, `-c [CONFIG_FILE]`: config file path for training.
- `--resume`, `-r`: use this argument to resume from the last checkpoint.
- `--verbose`, `-v`: use this argument to output progress info.
- `--debug`, `-d`: use this argument to save debug images every epoch.
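For example, to resume an interrupted run with progress output and per-epoch debug images (combining the flags listed above):

```bash
python run/Train.py --config configs/InSPyReNet_SwinB.yaml --resume --verbose --debug
```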
Training with extra datasets can be done by changing `Train.Dataset.sets` in the YAML config file, which simply means adding more directories (e.g., HRSOD-TR, UHRSD-TR, ...):
```yaml
Train:
  Dataset:
    type: "RGB_Dataset"
    root: "data/RGB_Dataset/Train_Dataset"
    sets: ['DUTS-TR'] --> ['DUTS-TR', 'HRSOD-TR', 'UHRSD-TR']
```
- Inference for test benchmarks:

```bash
python run/Test.py --config configs/InSPyReNet_SwinB.yaml --verbose
```

- Evaluation:

```bash
python run/Eval.py --config configs/InSPyReNet_SwinB.yaml --verbose
```
- All-in-one command (train, test, and eval in a single command):

```bash
# Single GPU
python Expr.py --config configs/InSPyReNet_SwinB.yaml --verbose

# Multi GPUs with DDP (e.g., 4 GPUs)
torchrun --standalone --nproc_per_node=4 Expr.py --config configs/InSPyReNet_SwinB.yaml --verbose

# Multi GPUs with DDP with designated devices (e.g., 2 GPUs - 0 and 1)
CUDA_VISIBLE_DEVICES=0,1 torchrun --standalone --nproc_per_node=2 Expr.py --config configs/InSPyReNet_SwinB.yaml --verbose
```
## Inference on your own data
- You can run inference on your own data: a single image or multiple images (`.jpg`, `.jpeg`, and `.png` are supported), a single video or multiple videos (`.mp4`, `.mov`, and `.avi` are supported), and webcam input (Ubuntu and macOS are tested so far).
```bash
python run/Inference.py --config configs/InSPyReNet_SwinB.yaml --source [SOURCE] --dest [DEST] --type [TYPE] --gpu --jit --verbose
```
- `--source [SOURCE]`: specify your data in this argument.
  - Single image - `image.png`
  - Folder containing images - `path/to/img/folder`
  - Single video - `video.mp4`
  - Folder containing videos - `path/to/vid/folder`
  - Webcam input: `0` (may vary depending on your device)
- `--dest [DEST]` (optional): specify your destination folder. If not specified, results are saved in the `results` folder.
- `--type [TYPE]`: choose between `map`, `green`, `rgba`, `blur`, `overlay`, and another image file.
  - `map` will output the saliency map only.
  - `green` will replace the background with a green screen.
  - `rgba` will generate RGBA output, using the saliency score as an alpha map. Note that this will not work for video and webcam input.
  - `blur` will blur the background.
  - `overlay` will cover the salient object with a translucent green color and highlight its edges.
  - Another image file (e.g., `background.png`) will be used as the background, and the object will be overlaid on it.
- `--gpu`: use this argument if you want to use a GPU.
- `--jit`: slightly improves inference speed when used.
- `--verbose`: use this argument when you want to visualize progress.
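Putting the arguments together, a few example invocations (file and folder names are placeholders):

```bash
# Saliency map for a single image, on GPU
python run/Inference.py --config configs/InSPyReNet_SwinB.yaml --source image.png --type map --gpu --verbose

# Green-screen background replacement for a folder of videos
python run/Inference.py --config configs/InSPyReNet_SwinB.yaml --source path/to/vid/folder --dest outputs --type green --gpu --jit --verbose

# RGBA cutout for a folder of images (not supported for video/webcam input)
python run/Inference.py --config configs/InSPyReNet_SwinB.yaml --source path/to/img/folder --type rgba --gpu
```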