# Peekaboo

**Concordia University**

Hasib Zunair, A. Ben Hamza

[[`Paper`](https://arxiv.org/abs/2407.17628)] [[`Project`](https://hasibzunair.github.io/peekaboo/)] [[`Demo`](#4-demo)] [[`BibTeX`](#5-citation)]

This is the official code for our **BMVC 2024 paper**:<br>
[PEEKABOO: Hiding Parts of an Image for Unsupervised Object Localization](https://arxiv.org/abs/2407.17628)
<br>

![Peekaboo design](./media/figure.jpg)

We aim to explicitly model the contextual relationships among pixels through image masking for unsupervised object localization. In a self-supervised procedure (i.e., a pretext task) without any additional training (i.e., no downstream task), context-based representation learning is done at the pixel level, by making predictions on masked images, and at the shape level, by matching the predictions for the masked input to those for the unmasked one.
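
To make the two training signals concrete, here is a minimal illustrative sketch. The names `encoder`, `decoder`, the mask convention, and the single consistency loss are placeholders, not the repository's actual API; see `train.py` for the real objectives and weighting.

```python
import torch
import torch.nn.functional as F

def peekaboo_losses(encoder, decoder, image, mask):
    """Illustrative sketch only: `encoder`, `decoder`, and the binary
    `mask` (1 = hidden pixel) are placeholders for the actual code."""
    masked_image = image * (1 - mask)             # hide parts of the image
    pred_masked = decoder(encoder(masked_image))  # pixel-level: predict from the masked input
    with torch.no_grad():
        pred_full = decoder(encoder(image))       # reference prediction on the unmasked input
    # shape-level: match the masked prediction to the unmasked one
    consistency = F.binary_cross_entropy_with_logits(pred_masked, pred_full.sigmoid())
    return consistency
```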

## 1. Specification of dependencies

This code requires Python 3.8 and CUDA 11.2. Clone the project repository, then create and activate the following conda environment.

```bash
# clone repo
git clone https://github.com/hasibzunair/peekaboo
cd peekaboo
# create env
conda update conda
conda env create -f environment.yml
conda activate peekaboo
```

Alternatively, you can create a fresh environment and install the project requirements inside it:

```bash
# clone repo
git clone https://github.com/hasibzunair/peekaboo
cd peekaboo
# create fresh env
conda create -n peekaboo python=3.8     
conda activate peekaboo
# example of pytorch installation
pip install torch==1.8.1 torchvision==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install pycocotools
# install dependencies
pip install -r requirements.txt
```
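
After installing, you can quickly confirm that the versions pinned above (1.8.1 / 0.9.1) were picked up and that CUDA is visible:

```python
# sanity check: versions pinned above, plus CUDA visibility
import torch
import torchvision

print(torch.__version__)        # expect 1.8.1
print(torchvision.__version__)  # expect 0.9.1
print(torch.cuda.is_available())
```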

And then, install [DINO](https://arxiv.org/pdf/2104.14294.pdf) using the following commands:

```bash
git clone https://github.com/facebookresearch/dino.git
cd dino
touch __init__.py
echo -e "import sys\nfrom os.path import dirname, join\nsys.path.insert(0, join(dirname(__file__), '.'))" >> __init__.py
cd ..
```
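
If the setup worked, the DINO backbone should be importable from the project root. A quick sanity check (`vit_small` with `patch_size=8` matches the ViT-S/8 backbone referenced elsewhere in this README):

```python
# sanity check: the dino package should now be importable from the project root
import dino.vision_transformer as vits

backbone = vits.vit_small(patch_size=8)  # ViT-S/8
print(sum(p.numel() for p in backbone.parameters()), "parameters")
```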

## 2a. Training code

### Dataset details

Since Peekaboo is self-supervised, we train it only on the images of the [DUTS-TR](http://saliencydetection.net/duts/) dataset, without any labels. Download the dataset, create a directory named `datasets_local` inside the project folder, and put the dataset there.

We evaluate on two tasks: unsupervised saliency detection and single object discovery. Since our method is unsupervised, it requires no training or fine-tuning on the datasets we evaluate on.

#### Unsupervised Saliency Detection

We use the following datasets:

- [DUT-OMRON](http://saliencydetection.net/dut-omron/)
- [DUTS-TEST](http://saliencydetection.net/duts/)
- [ECSSD](https://www.cse.cuhk.edu.hk/leojia/projects/hsaliency/dataset.html)

Download the datasets and keep them in `datasets_local`.

#### Single Object Discovery

For single object discovery, we follow the evaluation framework used in [LOST](https://github.com/valeoai/LOST). Download the following datasets and put them in the `datasets_local` folder:

- [VOC07](http://host.robots.ox.ac.uk/pascal/VOC/)
- [VOC12](http://host.robots.ox.ac.uk/pascal/VOC/)
- [COCO20k](https://cocodataset.org/#home)

Finally, download the masks of random streaks and holes of arbitrary shapes from [SCRIBBLES.zip](https://github.com/hasibzunair/masksup-segmentation/releases/download/v1.0/SCRIBBLES.zip) and put it inside the `datasets` folder.
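
Before training or evaluating, it can help to verify that everything landed where the scripts expect. A small check, where the directory names are assumptions based on the defaults above; adjust them to match your extracted archives:

```python
# verify the expected dataset layout; names are assumptions, adjust to your archives
from pathlib import Path

root = Path("datasets_local")
for name in ["DUTS-TR", "DUTS-TE", "DUT-OMRON", "ECSSD", "VOC07", "VOC12", "COCO20k"]:
    status = "ok" if (root / name).exists() else "missing"
    print(f"{name}: {status}")
```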

### DUTS-TR training

```bash
export DATASET_DIR=datasets_local # root directory of the training and evaluation datasets

python train.py --exp-name peekaboo --dataset-dir $DATASET_DIR
```

View TensorBoard logs by running: `tensorboard --logdir=outputs`.

## 2b. Evaluation code

After training, the model checkpoint and logs are available under `outputs/peekaboo-DUTS-TR-vit_small8`. Set the model path for evaluation:

```bash
export MODEL="outputs/peekaboo-DUTS-TR-vit_small8/decoder_weights_niter500.pt"
```

### Unsupervised saliency detection eval

```bash
# run evaluation
source evaluate_saliency.sh $MODEL $DATASET_DIR single
source evaluate_saliency.sh $MODEL $DATASET_DIR multi
```

### Single object discovery eval

```bash
# run evaluation
source evaluate_uod.sh $MODEL $DATASET_DIR
```

All experiments were conducted on a single NVIDIA RTX 3080 Ti GPU. For additional implementation details and results, please refer to the supplementary material of the paper.

## 3. Pre-trained models

We provide pretrained models in [./data/weights/](./data/weights/) for reproducibility. Here are the main results of Peekaboo on the single object discovery task; for results on the unsupervised saliency detection task, we refer readers to our paper!

| Dataset | Backbone | CorLoc (%) | Download |
| ------- | -------- | ---------- | -------- |
| VOC07   | ViT-S/8  | 72.7 | [download](./data/weights/peekaboo_decoder_weights_niter500.pt) |
| VOC12   | ViT-S/8  | 75.9 | [download](./data/weights/peekaboo_decoder_weights_niter500.pt) |
| COCO20K | ViT-S/8  | 64.0 | [download](./data/weights/peekaboo_decoder_weights_niter500.pt) |
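
As a quick sanity check that a released checkpoint loads (the path comes from the table above; the exact structure of the saved object depends on the training code):

```python
# load the released decoder checkpoint on CPU and inspect it
import torch

state = torch.load("data/weights/peekaboo_decoder_weights_niter500.pt", map_location="cpu")
print(type(state))  # typically a state dict of decoder parameters
```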

## 4. Demo

We provide prediction demos of our models. The following command applies our method to a single image and visualizes the result.

```bash
# infer on one image
python demo.py
```

## 5. Citation

```bibtex
@inproceedings{zunair2024peekaboo,
  title={PEEKABOO: Hiding Parts of an Image for Unsupervised Object Localization},
  author={Zunair, Hasib and Hamza, A Ben},
  booktitle={Proc. British Machine Vision Conference},
  year={2024}
}
```

## Project Notes

<details><summary>Click to view</summary>
<br>

**[Mar 18, 2024]** Infer on image folders.

```bash
# infer on folder of images
python visualize_outputs.py --model-weights outputs/msl_a1.5_b1_g1_reg4-MSL-DUTS-TR-vit_small8/decoder_weights_niter500.pt --img-folder ./datasets_local/DUTS-TR/DUTS-TR-Image/ --output-dir outputs/visualizations/msl_masks
```

**[Nov 10, 2023]** Reproduced FOUND results.

**[Nov 10, 2023]** Added project notes section.

</details>

## Acknowledgements

This repository was built on top of [FOUND](https://github.com/valeoai/FOUND), [SelfMask](https://github.com/NoelShin/selfmask), [TokenCut](https://github.com/YangtaoWANG95/TokenCut), and [LOST](https://github.com/valeoai/LOST). Consider acknowledging these projects if you use this code.