# Peekaboo

**Concordia University**

Hasib Zunair, A. Ben Hamza

[[`Paper`](https://arxiv.org/abs/2407.17628)] [[`Project`](https://hasibzunair.github.io/peekaboo/)] [[`Demo`](#4-demo)] [[`BibTeX`](#5-citation)]

This is the official code for our **BMVC 2024 paper**:
[PEEKABOO: Hiding Parts of an Image for Unsupervised Object Localization](https://arxiv.org/abs/2407.17628)

![MSL Design](./media/figure.jpg)

We explicitly model contextual relationships among pixels through image masking for unsupervised object localization. In a self-supervised procedure (i.e., a pretext task) and without any additional training (i.e., no downstream task), context-based representation learning is done at the pixel level, by making predictions on masked images, and at the shape level, by matching the predictions for the masked input to those for the unmasked one.
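To make this concrete, here is a minimal PyTorch-style sketch of the two training signals. It is a conceptual illustration under our own simplifying assumptions, not the losses implemented in this repository: `predictor` is any per-pixel saliency network, and `target` stands in for a self-derived foreground target.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def masked_consistency_losses(predictor, image, mask, target):
    """Sketch of the two signals described above (not the repo's exact losses).

    predictor: network mapping a (B, 3, H, W) image to (B, 1, H, W) logits.
    mask:      (B, 1, H, W) binary mask, 1 where pixels are hidden.
    target:    (B, 1, H, W) self-derived soft foreground target (no labels).
    """
    pred_masked = predictor(image * (1 - mask))   # predict from the masked image
    pred_full = predictor(image)                  # predict from the full image

    # Pixel-level: learn to predict foreground even where pixels are hidden.
    pixel_loss = F.binary_cross_entropy_with_logits(pred_masked, target)

    # Shape-level: the masked prediction should match the unmasked one.
    # Detaching the unmasked branch pulls only the masked prediction toward it.
    shape_loss = F.mse_loss(torch.sigmoid(pred_masked),
                            torch.sigmoid(pred_full).detach())
    return pixel_loss, shape_loss

# Dummy usage with a toy 1x1-conv predictor
predictor = nn.Conv2d(3, 1, kernel_size=1)
image = torch.rand(2, 3, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.5).float()
target = torch.rand(2, 1, 64, 64)
print(masked_consistency_losses(predictor, image, mask, target))
```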
## 1. Specification of dependencies

This code requires Python 3.8 and CUDA 11.2. Clone the project repository, then create and activate the following conda environment.

```bash
# clone repo
git clone https://github.com/hasibzunair/peekaboo
cd peekaboo

# create env
conda update conda
conda env create -f environment.yml
conda activate peekaboo
```

Alternatively, you can create a fresh environment and install the project requirements inside it:

```bash
# clone repo
git clone https://github.com/hasibzunair/peekaboo
cd peekaboo

# create fresh env
conda create -n peekaboo python=3.8
conda activate peekaboo

# example of pytorch installation
pip install torch==1.8.1 torchvision==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install pycocotools

# install dependencies
pip install -r requirements.txt
```

Then, install [DINO](https://arxiv.org/pdf/2104.14294.pdf) using the following commands:

```bash
git clone https://github.com/facebookresearch/dino.git
cd dino
touch __init__.py
echo -e "import sys\nfrom os.path import dirname, join\nsys.path.insert(0, join(dirname(__file__), '.'))" >> __init__.py
cd ../
```
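Peekaboo builds on frozen self-supervised DINO ViT features (ViT-S/8 in the results below). As a quick sanity check that the DINO install works, independent of this repository's own model-loading code, the backbone can be fetched via torch hub:

```python
import torch

# Load the self-supervised DINO ViT-S/8 backbone from torch hub
# (this mirrors, but is not, the repository's own loading path).
backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits8")
backbone.eval()

# ViT-S/8 on a 224x224 image yields a 28x28 patch grid (+1 CLS token), dim 384.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    tokens = backbone.get_intermediate_layers(x, n=1)[0]
print(tokens.shape)  # torch.Size([1, 785, 384])
```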
## 2a. Training code

### Dataset details

Since Peekaboo is self-supervised, we train it only on the images of the [DUTS-TR](http://saliencydetection.net/duts/) dataset, without any labels. Download it, then create a directory inside the project folder named `datasets_local` and put the dataset there.

We evaluate on two tasks: unsupervised saliency detection and single object discovery. Since our method operates in an unsupervised setting, it does not require training or fine-tuning on the datasets we evaluate on.

#### Unsupervised Saliency Detection

We use the following datasets:

- [DUT-OMRON](http://saliencydetection.net/dut-omron/)
- [DUTS-TEST](http://saliencydetection.net/duts/)
- [ECSSD](https://www.cse.cuhk.edu.hk/leojia/projects/hsaliency/dataset.html)

Download the datasets and keep them in `datasets_local`.

#### Single Object Discovery

For single object discovery, we follow the framework used in [LOST](https://github.com/valeoai/LOST). Download the datasets and put them in the `datasets_local` folder.

- [VOC07](http://host.robots.ox.ac.uk/pascal/VOC/)
- [VOC12](http://host.robots.ox.ac.uk/pascal/VOC/)
- [COCO20k](https://cocodataset.org/#home)

Finally, download the masks of random streaks and holes of arbitrary shapes from [SCRIBBLES.zip](https://github.com/hasibzunair/masksup-segmentation/releases/download/v1.0/SCRIBBLES.zip) and put it inside the `datasets` folder.
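These scribble masks supply the hidden regions for the masking pretext task described above. For intuition, here is a minimal sketch of applying one to a training image; the file paths are hypothetical, and the actual data pipeline lives in the repository's training code.

```python
import numpy as np
from PIL import Image

# Hypothetical paths; point these at an extracted SCRIBBLES mask and any
# training image. We assume white scribble pixels mark the region to hide.
image = Image.open("datasets_local/DUTS-TR/DUTS-TR-Image/example.jpg").convert("RGB")
scribble = Image.open("datasets/SCRIBBLES/0001.png").convert("L").resize(image.size)

img = np.asarray(image, dtype=np.float32)
hide = np.asarray(scribble, dtype=np.float32) / 255.0 > 0.5  # True where hidden

masked = img * ~hide[..., None]        # zero out the scribbled pixels
Image.fromarray(masked.astype(np.uint8)).save("masked_example.png")
```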
### DUTS-TR training

```bash
# root directory of the training and evaluation datasets
export DATASET_DIR=datasets_local
python train.py --exp-name peekaboo --dataset-dir $DATASET_DIR
```

View TensorBoard logs by running: `tensorboard --logdir=outputs`.

## 2b. Evaluation code

After training, the model checkpoint and logs are available in `peekaboo-DUTS-TR-vit_small8` inside the `outputs` folder. Set the model path for evaluation:

```bash
export MODEL="outputs/peekaboo-DUTS-TR-vit_small8/decoder_weights_niter500.pt"
```

### Unsupervised saliency detection eval

```bash
# run evaluation
source evaluate_saliency.sh $MODEL $DATASET_DIR single
source evaluate_saliency.sh $MODEL $DATASET_DIR multi
```

### Single object discovery eval

```bash
# run evaluation
source evaluate_uod.sh $MODEL $DATASET_DIR
```

All experiments are conducted on a single NVIDIA 3080Ti GPU. For additional implementation details and results, please refer to the supplementary materials section of the paper.

## 3. Pre-trained models

We provide pretrained models in [./data/weights/](./data/weights/) for reproducibility. Below are the main results of Peekaboo on the single object discovery task. For results on the unsupervised saliency detection task, we refer readers to our paper!

| Dataset | Backbone | CorLoc (%) | Download |
| ------- | -------- | ---------- | -------- |
| VOC07   | ViT-S/8  | 72.7 | [download](./data/weights/peekaboo_decoder_weights_niter500.pt) |
| VOC12   | ViT-S/8  | 75.9 | [download](./data/weights/peekaboo_decoder_weights_niter500.pt) |
| COCO20K | ViT-S/8  | 64.0 | [download](./data/weights/peekaboo_decoder_weights_niter500.pt) |

## 4. Demo

We provide prediction demos of our models. The following command applies and visualizes our method on a single image:

```bash
# infer on one image
python demo.py
```
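If you want to inspect a prediction yourself, here is a minimal sketch of overlaying a predicted saliency mask on its input image; the file names are hypothetical placeholders, not paths produced by `demo.py`.

```python
import numpy as np
from PIL import Image

# Hypothetical paths: substitute your own image and predicted mask.
image = np.asarray(Image.open("input.jpg").convert("RGB"), dtype=np.float32)
mask = np.asarray(Image.open("pred_mask.png").convert("L").resize(
    (image.shape[1], image.shape[0])), dtype=np.float32) / 255.0

# Tint predicted foreground pixels red for a quick visual check.
overlay = image.copy()
fg = mask > 0.5
overlay[fg, 0] = 0.5 * image[fg, 0] + 0.5 * 255.0

Image.fromarray(overlay.astype(np.uint8)).save("overlay.png")
```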
## 5. Citation

```bibtex
@inproceedings{zunair2024peekaboo,
  title={PEEKABOO: Hiding Parts of an Image for Unsupervised Object Localization},
  author={Zunair, Hasib and Hamza, A Ben},
  booktitle={Proc. British Machine Vision Conference},
  year={2024}
}
```

## Project Notes

**[Mar 18, 2024]** Infer on image folders.

```bash
# infer on a folder of images
python visualize_outputs.py --model-weights outputs/msl_a1.5_b1_g1_reg4-MSL-DUTS-TR-vit_small8/decoder_weights_niter500.pt --img-folder ./datasets_local/DUTS-TR/DUTS-TR-Image/ --output-dir outputs/visualizations/msl_masks
```

**[Nov 10, 2023]** Reproduced FOUND results.

**[Nov 10, 2023]** Added project notes section.
## Acknowledgements

This repository was built on top of [FOUND](https://github.com/valeoai/FOUND), [SelfMask](https://github.com/NoelShin/selfmask), [TokenCut](https://github.com/YangtaoWANG95/TokenCut), and [LOST](https://github.com/valeoai/LOST). Consider acknowledging these projects.