yucornetto committed on
Commit
b6396ac
1 Parent(s): 821b298

init for demo

Change-Id: Iedfddfc377edab70464dd68ba4618336f41d2a2a

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. GETTING_STARTED.md +65 -0
  2. INSTALL.md +48 -0
  3. LICENSE +19 -0
  4. app.py +232 -0
  5. cog.yaml +28 -0
  6. configs/coco/panoptic-segmentation/Base-COCO-PanopticSegmentation.yaml +47 -0
  7. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_a847.yaml +10 -0
  8. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_ade20k.yaml +28 -0
  9. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_cityscapes.yaml +8 -0
  10. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_coco.yaml +3 -0
  11. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_mapillary_vistas.yaml +12 -0
  12. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pas20.yaml +10 -0
  13. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pas21.yaml +10 -0
  14. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pc459.yaml +10 -0
  15. configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pc59.yaml +10 -0
  16. configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml +45 -0
  17. datasets/README.md +135 -0
  18. datasets/ade20k_instance_catid_mapping.txt +104 -0
  19. datasets/ade20k_instance_imgCatIds.json +0 -0
  20. datasets/prepare_ade20k_ins_seg.py +112 -0
  21. datasets/prepare_ade20k_pan_seg.py +500 -0
  22. datasets/prepare_ade20k_sem_seg.py +27 -0
  23. datasets/prepare_coco_semantic_annos_from_panoptic_annos.py +84 -0
  24. datasets/prepare_pascal_ctx_full_sem_seg.py +48 -0
  25. datasets/prepare_pascal_ctx_sem_seg.py +84 -0
  26. datasets/prepare_pascal_voc_sem_seg.py +65 -0
  27. demo/__init__.py +0 -0
  28. demo/demo.py +195 -0
  29. demo/examples/ade.jpg +0 -0
  30. demo/examples/coco.jpg +0 -0
  31. demo/examples/ego4d.jpg +0 -0
  32. demo/predictor.py +275 -0
  33. fcclip/.DS_Store +0 -0
  34. fcclip/__init__.py +26 -0
  35. fcclip/config.py +124 -0
  36. fcclip/data/.DS_Store +0 -0
  37. fcclip/data/__init__.py +2 -0
  38. fcclip/data/dataset_mappers/__init__.py +1 -0
  39. fcclip/data/dataset_mappers/coco_instance_new_baseline_dataset_mapper.py +189 -0
  40. fcclip/data/dataset_mappers/coco_panoptic_new_baseline_dataset_mapper.py +165 -0
  41. fcclip/data/dataset_mappers/mask_former_instance_dataset_mapper.py +180 -0
  42. fcclip/data/dataset_mappers/mask_former_panoptic_dataset_mapper.py +165 -0
  43. fcclip/data/dataset_mappers/mask_former_semantic_dataset_mapper.py +184 -0
  44. fcclip/data/datasets/__init__.py +15 -0
  45. fcclip/data/datasets/ade20k_150_with_prompt_eng.txt +151 -0
  46. fcclip/data/datasets/ade20k_847_with_prompt_eng.txt +848 -0
  47. fcclip/data/datasets/cityscapes_with_prompt_eng.txt +19 -0
  48. fcclip/data/datasets/coco_panoptic_with_prompt_eng.txt +201 -0
  49. fcclip/data/datasets/coco_stuff_with_prompt_eng.txt +183 -0
  50. fcclip/data/datasets/lvis_1203_with_prompt_eng.txt +1203 -0
GETTING_STARTED.md ADDED
@@ -0,0 +1,65 @@
+ ## Getting Started with Mask2Former
+
+ This document gives a brief introduction to using Mask2Former.
+
+ Please see [Getting Started with Detectron2](https://github.com/facebookresearch/detectron2/blob/master/GETTING_STARTED.md) for full usage.
+
+
+ ### Inference Demo with Pre-trained Models
+
+ 1. Pick a model and its config file from
+   [model zoo](MODEL_ZOO.md),
+   for example, `configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml`.
+ 2. We provide `demo.py`, which can run the built-in configs. Run it with:
+ ```
+ cd demo/
+ python demo.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
+   --input input1.jpg input2.jpg \
+   [--other-options] \
+   --opts MODEL.WEIGHTS /path/to/checkpoint_file
+ ```
+ The configs are made for training, so for evaluation `MODEL.WEIGHTS` must point to a model from the model zoo.
+ This command will run the inference and show visualizations in an OpenCV window.
+
+ For details of the command line arguments, see `demo.py -h` or look at its source code
+ to understand its behavior. Some common arguments are:
+ * To run __on your webcam__, replace `--input files` with `--webcam`.
+ * To run __on a video__, replace `--input files` with `--video-input video.mp4`.
+ * To run __on cpu__, add `MODEL.DEVICE cpu` after `--opts`.
+ * To save outputs to a directory (for images) or a file (for webcam or video), use `--output`.
+
+
+ ### Training & Evaluation in Command Line
+
+ We provide a script, `train_net.py`, that can train all the configs provided in Mask2Former.
+
+ To train a model with `train_net.py`, first
+ set up the corresponding datasets following
+ [datasets/README.md](./datasets/README.md),
+ then run:
+ ```
+ python train_net.py --num-gpus 8 \
+   --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml
+ ```
+
+ The configs are made for 8-GPU training.
+ Since we use the AdamW optimizer, it is not clear how to scale the learning rate with the batch size.
+ To train on 1 GPU, you need to choose the learning rate and batch size yourself:
+ ```
+ python train_net.py \
+   --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
+   --num-gpus 1 SOLVER.IMS_PER_BATCH SET_TO_SOME_REASONABLE_VALUE SOLVER.BASE_LR SET_TO_SOME_REASONABLE_VALUE
+ ```
+
+ To evaluate a model's performance, use
+ ```
+ python train_net.py \
+   --config-file configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml \
+   --eval-only MODEL.WEIGHTS /path/to/checkpoint_file
+ ```
+ For more options, see `python train_net.py -h`.
+
+
+ ### Video instance segmentation
+ Please use `demo_video/demo.py` for the video instance segmentation demo and `train_net_video.py` to train
+ and evaluate video instance segmentation models.
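Editorial sketch for the 1-GPU training note in GETTING_STARTED.md: the file leaves `SOLVER.IMS_PER_BATCH` and `SOLVER.BASE_LR` to the reader and says the right scaling for AdamW is unclear. The snippet below only lists starting candidates under two widely used heuristics (linear and square-root scaling). Neither rule comes from this repository; the reference values 16 and 0.0001 are the ones in `Base-COCO-PanopticSegmentation.yaml`, and `candidate_lrs` is a hypothetical helper name.

```python
import math


def candidate_lrs(new_batch, ref_batch=16, ref_lr=1e-4):
    """Return candidate learning rates for a new total batch size.

    Both rules are heuristics to seed a search, not recommendations
    made by the repository (assumption).
    """
    ratio = new_batch / ref_batch
    return {
        "linear": ref_lr * ratio,           # LR proportional to batch size
        "sqrt": ref_lr * math.sqrt(ratio),  # LR proportional to sqrt(batch size)
    }


if __name__ == "__main__":
    batch = 2  # hypothetical batch size that fits on one GPU
    for rule, lr in candidate_lrs(batch).items():
        print(f"{rule}: SOLVER.IMS_PER_BATCH {batch} SOLVER.BASE_LR {lr:.2e}")
```

Either value is only a first guess; validation metrics on a short run are the real check.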
INSTALL.md ADDED
@@ -0,0 +1,48 @@
+ ## Installation
+
+ ### Requirements
+ - Linux or macOS with Python ≥ 3.6
+ - PyTorch ≥ 1.9 and [torchvision](https://github.com/pytorch/vision/) that matches the PyTorch installation.
+   Install them together at [pytorch.org](https://pytorch.org) to make sure of this. Note: please check that the
+   PyTorch version matches the one required by Detectron2.
+ - Detectron2: follow [Detectron2 installation instructions](https://detectron2.readthedocs.io/tutorials/install.html).
+ - OpenCV is optional but needed by the demo and visualization
+ - `pip install -r requirements.txt`
+
+ ### CUDA kernel for MSDeformAttn
+ After preparing the required environment, run the following command to compile the CUDA kernel for MSDeformAttn:
+
+ `CUDA_HOME` must be defined and point to the directory of the installed CUDA toolkit.
+
+ ```bash
+ cd mask2former/modeling/pixel_decoder/ops
+ sh make.sh
+ ```
+
+ #### Building on another system
+ To build on a system that does not have a GPU device but provides the drivers:
+ ```bash
+ TORCH_CUDA_ARCH_LIST='8.0' FORCE_CUDA=1 python setup.py build install
+ ```
+
+ ### Example conda environment setup
+ ```bash
+ conda create --name mask2former python=3.8 -y
+ conda activate mask2former
+ conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.1 -c pytorch -c nvidia
+ pip install -U opencv-python
+
+ # under your working directory
+ git clone git@github.com:facebookresearch/detectron2.git
+ cd detectron2
+ pip install -e .
+ pip install git+https://github.com/cocodataset/panopticapi.git
+ pip install git+https://github.com/mcordts/cityscapesScripts.git
+
+ cd ..
+ git clone git@github.com:facebookresearch/Mask2Former.git
+ cd Mask2Former
+ pip install -r requirements.txt
+ cd mask2former/modeling/pixel_decoder/ops
+ sh make.sh
+ ```
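Editorial sketch for the `CUDA_HOME` requirement in INSTALL.md: before running `make.sh`, it can help to confirm that the variable is set and points at an existing directory. The helper below is a hypothetical pre-flight check, not part of the repository, and it does not verify that the directory actually contains a CUDA toolkit.

```python
import os
from pathlib import Path


def check_cuda_home(env=None):
    """Return (ok, message) describing whether CUDA_HOME looks usable.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return False, "CUDA_HOME is not set"
    if not Path(cuda_home).is_dir():
        return False, f"CUDA_HOME={cuda_home} is not a directory"
    return True, f"CUDA_HOME={cuda_home}"


if __name__ == "__main__":
    ok, message = check_cuda_home()
    print(("OK: " if ok else "Problem: ") + message)
```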
LICENSE ADDED
@@ -0,0 +1,19 @@
+ Copyright (c) 2022 Meta, Inc.
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
app.py ADDED
@@ -0,0 +1,232 @@
+ import os
+ import sys
+
+ os.system("pip install gdown")
+
+ os.system("pip install imutils")
+
+ os.system('pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.9/index.html')
+
+ os.system("pip install git+https://github.com/cocodataset/panopticapi.git")
+
+ import gradio as gr
+ # check pytorch installation:
+ import detectron2
+ from detectron2.utils.logger import setup_logger
+ from contextlib import ExitStack
+ # import some common libraries
+ import numpy as np
+ import cv2
+ import torch
+ import itertools
+ # import some common detectron2 utilities
+ from detectron2 import model_zoo
+ from detectron2.config import get_cfg
+ from detectron2.utils.visualizer import Visualizer, ColorMode, random_color
+ from detectron2.data import MetadataCatalog
+ from detectron2.projects.deeplab import add_deeplab_config
+
+
+ coco_metadata = MetadataCatalog.get("coco_2017_val_panoptic")
+
+ # import FCCLIP project
+ from fcclip import add_maskformer2_config, add_fcclip_config
+ from demo.predictor import DefaultPredictor, OpenVocabVisualizer
+ from PIL import Image
+ import imutils
+ import json
+
+ setup_logger()
+ logger = setup_logger(name="fcclip")
+
+ cfg = get_cfg()
+ cfg.MODEL.DEVICE='cpu'
+ add_deeplab_config(cfg)
+ add_maskformer2_config(cfg)
+ add_fcclip_config(cfg)
+ cfg.merge_from_file("configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_ade20k.yaml")
+ os.system("gdown 1-91PIns86vyNaL3CzMmDD39zKGnPMtvj")
+ cfg.MODEL.WEIGHTS = './fcclip_cocopan.pth'
+ cfg.MODEL.KMAX_DEEPLAB.TEST.SEMANTIC_ON = False
+ cfg.MODEL.KMAX_DEEPLAB.TEST.INSTANCE_ON = False
+ cfg.MODEL.KMAX_DEEPLAB.TEST.PANOPTIC_ON = True
+ predictor = DefaultPredictor(cfg)
+
+ # def inference(img):
+ #     im = cv2.imread(img)
+ #     #im = imutils.resize(im, width=512)
+ #     outputs = predictor(im)
+ #     v = OpenVocabVisualizer(im[:, :, ::-1], coco_metadata, scale=1.2, instance_mode=ColorMode.IMAGE_BW)
+ #     panoptic_result = v.draw_panoptic_seg(outputs["panoptic_seg"][0].to("cpu"), outputs["panoptic_seg"][1]).get_image()
+ #     return Image.fromarray(np.uint8(panoptic_result)).convert('RGB')
+
+
+ title = "FC-CLIP"
+ description = """Gradio demo for FC-CLIP. To use it, simply upload your image, or click one of the examples to load them. FC-CLIP can perform open-vocabulary segmentation; you may input more classes (synonyms separated by commas, classes by semicolons).
+ The expected format is 'a1,a2;b1,b2', where a1,a2 are synonyms for the first class.
+ The first word will be displayed as the class name. Read more at the links below."""
+
+ article = "<p style='text-align: center'><a href='https://arxiv.org/abs/2207.04044' target='_blank'>kMaX-DeepLab</a> | <a href='https://github.com/google-research/deeplab2' target='_blank'>Github Repo</a></p>"
+
+ examples = [
+     [
+         "demo/examples/coco.jpg",
+         "black pickup truck, pickup truck; blue sky, sky",
+         ["COCO (133 categories)", "ADE (150 categories)", "LVIS (1203 categories)"],
+     ],
+     [
+         "demo/examples/ade.jpg",
+         "luggage, suitcase, baggage;handbag",
+         ["ADE (150 categories)"],
+     ],
+     [
+         "demo/examples/ego4d.jpg",
+         "faucet, tap; kitchen paper, paper towels",
+         ["COCO (133 categories)"],
+     ],
+ ]
+
+
+ coco_metadata = MetadataCatalog.get("openvocab_coco_2017_val_panoptic_with_sem_seg")
+ ade20k_metadata = MetadataCatalog.get("openvocab_ade20k_panoptic_val")
+ lvis_classes = open("./fcclip/data/datasets/lvis_1203_with_prompt_eng.txt", 'r').read().splitlines()
+ lvis_classes = [x[x.find(':')+1:] for x in lvis_classes]
+ lvis_colors = list(
+     itertools.islice(itertools.cycle(coco_metadata.stuff_colors), len(lvis_classes))
+ )
+ # rearrange to thing_classes, stuff_classes
+ coco_thing_classes = coco_metadata.thing_classes
+ coco_stuff_classes = [x for x in coco_metadata.stuff_classes if x not in coco_thing_classes]
+ coco_thing_colors = coco_metadata.thing_colors
+ coco_stuff_colors = [x for x in coco_metadata.stuff_colors if x not in coco_thing_colors]
+ ade20k_thing_classes = ade20k_metadata.thing_classes
+ ade20k_stuff_classes = [x for x in ade20k_metadata.stuff_classes if x not in ade20k_thing_classes]
+ ade20k_thing_colors = ade20k_metadata.thing_colors
+ ade20k_stuff_colors = [x for x in ade20k_metadata.stuff_colors if x not in ade20k_thing_colors]
+
+ def build_demo_classes_and_metadata(vocab, label_list):
+     extra_classes = []
+
+     if vocab:
+         for words in vocab.split(";"):
+             extra_classes.append([word.strip() for word in words.split(",")])
+     extra_colors = [random_color(rgb=True, maximum=1) for _ in range(len(extra_classes))]
+
+     demo_thing_classes = extra_classes
+     demo_stuff_classes = []
+     demo_thing_colors = extra_colors
+     demo_stuff_colors = []
+
+     if any("COCO" in label for label in label_list):
+         demo_thing_classes += coco_thing_classes
+         demo_stuff_classes += coco_stuff_classes
+         demo_thing_colors += coco_thing_colors
+         demo_stuff_colors += coco_stuff_colors
+     if any("ADE" in label for label in label_list):
+         demo_thing_classes += ade20k_thing_classes
+         demo_stuff_classes += ade20k_stuff_classes
+         demo_thing_colors += ade20k_thing_colors
+         demo_stuff_colors += ade20k_stuff_colors
+     if any("LVIS" in label for label in label_list):
+         demo_thing_classes += lvis_classes
+         demo_thing_colors += lvis_colors
+
+     MetadataCatalog.pop("fcclip_demo_metadata", None)
+     demo_metadata = MetadataCatalog.get("fcclip_demo_metadata")
+     demo_metadata.thing_classes = [c[0] for c in demo_thing_classes]
+     demo_metadata.stuff_classes = [
+         *demo_metadata.thing_classes,
+         *[c[0] for c in demo_stuff_classes],
+     ]
+     demo_metadata.thing_colors = demo_thing_colors
+     demo_metadata.stuff_colors = demo_thing_colors + demo_stuff_colors
+     demo_metadata.stuff_dataset_id_to_contiguous_id = {
+         idx: idx for idx in range(len(demo_metadata.stuff_classes))
+     }
+     demo_metadata.thing_dataset_id_to_contiguous_id = {
+         idx: idx for idx in range(len(demo_metadata.thing_classes))
+     }
+
+     demo_classes = demo_thing_classes + demo_stuff_classes
+
+     return demo_classes, demo_metadata
+
+
+ def inference(image_path, vocab, label_list):
+
+     logger.info("building class names")
+     demo_classes, demo_metadata = build_demo_classes_and_metadata(vocab, label_list)
+     predictor.set_metadata(demo_metadata)
+
+     im = cv2.imread(image_path)
+     outputs = predictor(im)
+     v = OpenVocabVisualizer(im[:, :, ::-1], demo_metadata, scale=1.2, instance_mode=ColorMode.IMAGE_BW)
+     panoptic_result = v.draw_panoptic_seg(outputs["panoptic_seg"][0].to("cpu"), outputs["panoptic_seg"][1]).get_image()
+     return Image.fromarray(np.uint8(panoptic_result)).convert('RGB')
+
+
+ with gr.Blocks(title=title) as demo:
+     gr.Markdown("<h1 style='text-align: center; margin-bottom: 1rem'>" + title + "</h1>")
+     gr.Markdown(description)
+     input_components = []
+     output_components = []
+
+     with gr.Row():
+         output_image_gr = gr.outputs.Image(label="Panoptic Segmentation", type="pil")
+         output_components.append(output_image_gr)
+
+     with gr.Row().style(equal_height=True, mobile_collapse=True):
+         with gr.Column(scale=3, variant="panel") as input_component_column:
+             input_image_gr = gr.inputs.Image(type="filepath")
+             extra_vocab_gr = gr.inputs.Textbox(default="", label="Extra Vocabulary")
+             category_list_gr = gr.inputs.CheckboxGroup(
+                 choices=["COCO (133 categories)", "ADE (150 categories)", "LVIS (1203 categories)"],
+                 default=["COCO (133 categories)", "ADE (150 categories)", "LVIS (1203 categories)"],
+                 label="Category to use",
+             )
+             input_components.extend([input_image_gr, extra_vocab_gr, category_list_gr])
+
+         with gr.Column(scale=2):
+             examples_handler = gr.Examples(
+                 examples=examples,
+                 inputs=[c for c in input_components if not isinstance(c, gr.State)],
+                 outputs=[c for c in output_components if not isinstance(c, gr.State)],
+                 fn=inference,
+                 cache_examples=torch.cuda.is_available(),
+                 examples_per_page=5,
+             )
+     with gr.Row():
+         clear_btn = gr.Button("Clear")
+         submit_btn = gr.Button("Submit", variant="primary")
+
+     gr.Markdown(article)
+
+     submit_btn.click(
+         inference,
+         input_components,
+         output_components,
+         api_name="predict",
+         scroll_to_output=True,
+     )
+
+     clear_btn.click(
+         None,
+         [],
+         (input_components + output_components + [input_component_column]),
+         _js=f"""() => {json.dumps(
+             [component.cleared_value if hasattr(component, "cleared_value") else None
+              for component in input_components + output_components] + (
+                 [gr.Column.update(visible=True)]
+             )
+             + ([gr.Column.update(visible=False)])
+         )}
+         """,
+     )
+
+ demo.launch()
+
+
+ # gr.Interface(inference, inputs=gr.inputs.Image(type="filepath"), outputs=gr.outputs.Image(label="Panoptic segmentation",type="pil"), title=title,
+ #     description=description,
+ #     article=article,
+ #     examples=examples).launch(enable_queue=True)
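Editorial sketch for the extra-vocabulary format used by `app.py`: the demo description says input follows `'a1,a2;b1,b2'`, with semicolons between classes, commas between synonyms, and the first synonym shown as the class name. The standalone functions below mirror that parsing without detectron2 or Gradio. The names `parse_extra_vocab` and `display_names` are hypothetical, and unlike `build_demo_classes_and_metadata` this version drops empty entries (an assumption about desirable behavior, not something the commit does).

```python
def parse_extra_vocab(vocab):
    """Split 'a1,a2;b1,b2' into one list of synonyms per class."""
    classes = []
    for group in (vocab or "").split(";"):
        # Strip whitespace around each synonym and skip empty pieces.
        synonyms = [word.strip() for word in group.split(",") if word.strip()]
        if synonyms:
            classes.append(synonyms)
    return classes


def display_names(classes):
    """The first synonym of each class is the name shown to the user."""
    return [synonyms[0] for synonyms in classes]


if __name__ == "__main__":
    parsed = parse_extra_vocab("black pickup truck, pickup truck; blue sky, sky")
    print(parsed)
    print(display_names(parsed))
```

The example string is the one attached to `demo/examples/coco.jpg` in the `examples` list.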
cog.yaml ADDED
@@ -0,0 +1,28 @@
+ build:
+   gpu: true
+   cuda: "10.1"
+   python_version: "3.8"
+   system_packages:
+     - "libgl1-mesa-glx"
+     - "libglib2.0-0"
+   python_packages:
+     - "ipython==7.30.1"
+     - "numpy==1.21.4"
+     - "torch==1.8.1"
+     - "torchvision==0.9.1"
+     - "opencv-python==4.5.5.62"
+     - "Shapely==1.8.0"
+     - "h5py==3.6.0"
+     - "scipy==1.7.3"
+     - "submitit==1.4.1"
+     - "scikit-image==0.19.1"
+     - "Cython==0.29.27"
+     - "timm==0.4.12"
+   run:
+     - pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.8/index.html
+     - pip install git+https://github.com/cocodataset/panopticapi.git
+     - pip install git+https://github.com/mcordts/cityscapesScripts.git
+     - git clone https://github.com/facebookresearch/Mask2Former
+     - TORCH_CUDA_ARCH_LIST='7.5' FORCE_CUDA=1 python Mask2Former/mask2former/modeling/pixel_decoder/ops/setup.py build install
+
+ predict: "predict.py:Predictor"
configs/coco/panoptic-segmentation/Base-COCO-PanopticSegmentation.yaml ADDED
@@ -0,0 +1,47 @@
+ MODEL:
+   BACKBONE:
+     FREEZE_AT: 0
+     NAME: "build_resnet_backbone"
+   WEIGHTS: "detectron2://ImageNetPretrained/torchvision/R-50.pkl"
+   PIXEL_MEAN: [123.675, 116.280, 103.530]
+   PIXEL_STD: [58.395, 57.120, 57.375]
+   RESNETS:
+     DEPTH: 50
+     STEM_TYPE: "basic"  # not used
+     STEM_OUT_CHANNELS: 64
+     STRIDE_IN_1X1: False
+     OUT_FEATURES: ["res2", "res3", "res4", "res5"]
+     # NORM: "SyncBN"
+     RES5_MULTI_GRID: [1, 1, 1]  # not used
+ DATASETS:
+   TRAIN: ("coco_2017_train_panoptic",)
+   TEST: ("coco_2017_val_panoptic_with_sem_seg",)  # to evaluate instance and semantic performance as well
+ SOLVER:
+   IMS_PER_BATCH: 16
+   BASE_LR: 0.0001
+   STEPS: (327778, 355092)
+   MAX_ITER: 368750
+   WARMUP_FACTOR: 1.0
+   WARMUP_ITERS: 10
+   WEIGHT_DECAY: 0.05
+   OPTIMIZER: "ADAMW"
+   BACKBONE_MULTIPLIER: 0.1
+   CLIP_GRADIENTS:
+     ENABLED: True
+     CLIP_TYPE: "full_model"
+     CLIP_VALUE: 0.01
+     NORM_TYPE: 2.0
+   AMP:
+     ENABLED: True
+ INPUT:
+   IMAGE_SIZE: 1024
+   MIN_SCALE: 0.1
+   MAX_SCALE: 2.0
+   FORMAT: "RGB"
+   DATASET_MAPPER_NAME: "coco_panoptic_lsj"
+ TEST:
+   EVAL_PERIOD: 5000
+ DATALOADER:
+   FILTER_EMPTY_ANNOTATIONS: True
+   NUM_WORKERS: 4
+ VERSION: 2
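Editorial sketch relating the `SOLVER` values in `Base-COCO-PanopticSegmentation.yaml` to the "50ep" in the derived config name: with `IMS_PER_BATCH: 16` and `MAX_ITER: 368750`, the schedule covers just under 50 passes over COCO train2017, and the two `STEPS` fall at roughly 8/9 and 26/27 of training. The dataset size of 118,287 images is an assumption brought in from outside the config, and the function names are hypothetical.

```python
MAX_ITER = 368750
IMS_PER_BATCH = 16
STEPS = (327778, 355092)
COCO_TRAIN2017_IMAGES = 118287  # assumption: not stated anywhere in the config


def epochs(max_iter, ims_per_batch, dataset_size):
    """Number of passes over the dataset implied by the iteration budget."""
    return max_iter * ims_per_batch / dataset_size


def step_fractions(steps, max_iter):
    """Position of each learning-rate step as a fraction of training."""
    return [step / max_iter for step in steps]


if __name__ == "__main__":
    print(f"epochs: {epochs(MAX_ITER, IMS_PER_BATCH, COCO_TRAIN2017_IMAGES):.2f}")
    print("step fractions:", [f"{f:.4f}" for f in step_fractions(STEPS, MAX_ITER)])
```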
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_a847.yaml ADDED
@@ -0,0 +1,10 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+
+ MODEL:
+   MASK_FORMER:
+     TEST:
+       PANOPTIC_ON: False
+       INSTANCE_ON: False
+
+ DATASETS:
+   TEST: ("openvocab_ade20k_full_sem_seg_val",)
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_ade20k.yaml ADDED
@@ -0,0 +1,28 @@
+ _BASE_: ../maskformer2_R50_bs16_50ep.yaml
+ MODEL:
+   META_ARCHITECTURE: "FCCLIP"
+   SEM_SEG_HEAD:
+     NAME: "FCCLIPHead"
+   # backbone part.
+   BACKBONE:
+     NAME: "CLIP"
+   WEIGHTS: ""
+   PIXEL_MEAN: [122.7709383, 116.7460125, 104.09373615]
+   PIXEL_STD: [68.5005327, 66.6321579, 70.32316305]
+   FC_CLIP:
+     CLIP_MODEL_NAME: "convnext_large_d_320"
+     CLIP_PRETRAINED_WEIGHTS: "laion2b_s29b_b131k_ft_soup"
+     EMBED_DIM: 768
+     GEOMETRIC_ENSEMBLE_ALPHA: 0.4
+     GEOMETRIC_ENSEMBLE_BETA: 0.8
+   MASK_FORMER:
+     NUM_OBJECT_QUERIES: 250
+     TEST:
+       SEMANTIC_ON: True
+       INSTANCE_ON: True
+       PANOPTIC_ON: True
+       OBJECT_MASK_THRESHOLD: 0.0
+
+ DATASETS:
+   TRAIN: ("openvocab_coco_2017_train_panoptic_with_sem_seg",)
+   TEST: ("openvocab_ade20k_panoptic_val",)
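Editorial sketch of how the `_BASE_` chain in these configs can be read: each evaluation config names a parent file and overrides only a few keys, so `fcclip_convnext_large_eval_a847.yaml` switches off panoptic and instance evaluation while inheriting everything else from `fcclip_convnext_large_eval_ade20k.yaml`. The code below models this as a recursive dictionary overlay on hand-copied excerpts of the two files. It is a simplified illustration under that assumption, not detectron2's actual config loader, and `merge_config` is a hypothetical name.

```python
import copy


def merge_config(base, override):
    """Overlay `override` on `base`: nested dicts merge, other values replace."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Excerpt of fcclip_convnext_large_eval_ade20k.yaml (the parent).
ade20k = {
    "MODEL": {"MASK_FORMER": {
        "NUM_OBJECT_QUERIES": 250,
        "TEST": {"SEMANTIC_ON": True, "INSTANCE_ON": True, "PANOPTIC_ON": True},
    }},
    "DATASETS": {"TEST": ("openvocab_ade20k_panoptic_val",)},
}

# Overrides from fcclip_convnext_large_eval_a847.yaml (the child).
a847 = {
    "MODEL": {"MASK_FORMER": {"TEST": {"PANOPTIC_ON": False, "INSTANCE_ON": False}}},
    "DATASETS": {"TEST": ("openvocab_ade20k_full_sem_seg_val",)},
}

if __name__ == "__main__":
    print(merge_config(ade20k, a847))
```

Under this reading, the A-847 run keeps `SEMANTIC_ON: True` and 250 object queries from the parent and evaluates semantic segmentation only.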
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_cityscapes.yaml ADDED
@@ -0,0 +1,8 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+
+ INPUT:
+   MIN_SIZE_TEST: 1024
+   MAX_SIZE_TEST: 2560
+
+ DATASETS:
+   TEST: ("openvocab_cityscapes_fine_panoptic_val",)
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_coco.yaml ADDED
@@ -0,0 +1,3 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+ DATASETS:
+   TEST: ("openvocab_coco_2017_val_panoptic_with_sem_seg",)
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_mapillary_vistas.yaml ADDED
@@ -0,0 +1,12 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+
+ MODEL:
+   MASK_FORMER:
+     TEST:
+       INSTANCE_ON: False
+ INPUT:
+   MIN_SIZE_TEST: 1024
+   MAX_SIZE_TEST: 2560
+
+ DATASETS:
+   TEST: ("openvocab_mapillary_vistas_panoptic_val",)
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pas20.yaml ADDED
@@ -0,0 +1,10 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+
+ MODEL:
+   MASK_FORMER:
+     TEST:
+       PANOPTIC_ON: False
+       INSTANCE_ON: False
+
+ DATASETS:
+   TEST: ("openvocab_pascal20_sem_seg_val",)
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pas21.yaml ADDED
@@ -0,0 +1,10 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+
+ MODEL:
+   MASK_FORMER:
+     TEST:
+       PANOPTIC_ON: False
+       INSTANCE_ON: False
+
+ DATASETS:
+   TEST: ("openvocab_pascal21_sem_seg_val",)
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pc459.yaml ADDED
@@ -0,0 +1,10 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+
+ MODEL:
+   MASK_FORMER:
+     TEST:
+       PANOPTIC_ON: False
+       INSTANCE_ON: False
+
+ DATASETS:
+   TEST: ("openvocab_pascal_ctx459_sem_seg_val",)
configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_pc59.yaml ADDED
@@ -0,0 +1,10 @@
+ _BASE_: ./fcclip_convnext_large_eval_ade20k.yaml
+
+ MODEL:
+   MASK_FORMER:
+     TEST:
+       PANOPTIC_ON: False
+       INSTANCE_ON: False
+
+ DATASETS:
+   TEST: ("openvocab_pascal_ctx59_sem_seg_val",)
configs/coco/panoptic-segmentation/maskformer2_R50_bs16_50ep.yaml ADDED
@@ -0,0 +1,45 @@
+ _BASE_: Base-COCO-PanopticSegmentation.yaml
+ MODEL:
+   META_ARCHITECTURE: "MaskFormer"
+   SEM_SEG_HEAD:
+     NAME: "MaskFormerHead"
+     IN_FEATURES: ["res2", "res3", "res4", "res5"]
+     IGNORE_VALUE: 255
+     NUM_CLASSES: 133
+     LOSS_WEIGHT: 1.0
+     CONVS_DIM: 256
+     MASK_DIM: 256
+     NORM: "GN"
+     # pixel decoder
+     PIXEL_DECODER_NAME: "MSDeformAttnPixelDecoder"
+     IN_FEATURES: ["res2", "res3", "res4", "res5"]
+     DEFORMABLE_TRANSFORMER_ENCODER_IN_FEATURES: ["res3", "res4", "res5"]
+     COMMON_STRIDE: 4
+     TRANSFORMER_ENC_LAYERS: 6
+   MASK_FORMER:
+     TRANSFORMER_DECODER_NAME: "MultiScaleMaskedTransformerDecoder"
+     TRANSFORMER_IN_FEATURE: "multi_scale_pixel_decoder"
+     DEEP_SUPERVISION: True
+     NO_OBJECT_WEIGHT: 0.1
+     CLASS_WEIGHT: 2.0
+     MASK_WEIGHT: 5.0
+     DICE_WEIGHT: 5.0
+     HIDDEN_DIM: 256
+     NUM_OBJECT_QUERIES: 100
+     NHEADS: 8
+     DROPOUT: 0.0
+     DIM_FEEDFORWARD: 2048
+     ENC_LAYERS: 0
+     PRE_NORM: False
+     ENFORCE_INPUT_PROJ: False
+     SIZE_DIVISIBILITY: 32
+     DEC_LAYERS: 10  # 9 decoder layers, add one for the loss on learnable query
+     TRAIN_NUM_POINTS: 12544
+     OVERSAMPLE_RATIO: 3.0
+     IMPORTANCE_SAMPLE_RATIO: 0.75
+     TEST:
+       SEMANTIC_ON: True
+       INSTANCE_ON: True
+       PANOPTIC_ON: True
+       OVERLAP_THRESHOLD: 0.8
+       OBJECT_MASK_THRESHOLD: 0.8
datasets/README.md ADDED
@@ -0,0 +1,135 @@
+ # Prepare Datasets for FCCLIP
+
+ A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
+ for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc.).
+ This document explains how to set up the builtin datasets so they can be used by the above APIs.
+ [Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
+ and how to add new datasets to them.
+
+ FCCLIP has builtin support for a few datasets.
+ The datasets are assumed to exist in a directory specified by the environment variable
+ `DETECTRON2_DATASETS`.
+ Under this directory, detectron2 will look for datasets in the structure described below, if needed.
+ ```
+ $DETECTRON2_DATASETS/
+   ADEChallengeData2016/
+   coco/
+   cityscapes/
+   mapillary_vistas/
+ ```
+
+ You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
+ If left unset, the default is `./datasets` relative to your current working directory.
+
+
+ ## Expected dataset structure for [COCO](https://cocodataset.org/#download):
+
+ ```
+ coco/
+   annotations/
+     instances_{train,val}2017.json
+     panoptic_{train,val}2017.json
+   {train,val}2017/
+     # image files that are mentioned in the corresponding json
+   panoptic_{train,val}2017/  # png annotations
+   panoptic_semseg_{train,val}2017/  # generated by the script mentioned below
+ ```
+
+ Install panopticapi by:
+ ```
+ pip install git+https://github.com/cocodataset/panopticapi.git
+ ```
+ Then, run `python datasets/prepare_coco_semantic_annos_from_panoptic_annos.py` to extract semantic annotations from panoptic annotations (only used for evaluation).
+
+
+ ## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):
+ ```
+ cityscapes/
+   gtFine/
+     train/
+       aachen/
+         color.png, instanceIds.png, labelIds.png, polygons.json,
+         labelTrainIds.png
+       ...
+     val/
+     test/
+     # below are generated Cityscapes panoptic annotations
+     cityscapes_panoptic_train.json
+     cityscapes_panoptic_train/
+     cityscapes_panoptic_val.json
+     cityscapes_panoptic_val/
+     cityscapes_panoptic_test.json
+     cityscapes_panoptic_test/
+   leftImg8bit/
+     train/
+     val/
+     test/
+ ```
+ Install cityscapes scripts by:
+ ```
+ pip install git+https://github.com/mcordts/cityscapesScripts.git
+ ```
+
+ Note: to create labelTrainIds.png, first prepare the above structure, then run cityscapesScripts with:
+ ```
+ CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
+ ```
+ These files are not needed for instance segmentation.
+
+ Note: to generate the Cityscapes panoptic dataset, run cityscapesScripts with:
+ ```
+ CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
+ ```
+ These files are not needed for semantic and instance segmentation.
+
+
+ ## Expected dataset structure for [ADE20k](http://sceneparsing.csail.mit.edu/):
+ ```
+ ADEChallengeData2016/
+   images/
+   annotations/
+   objectInfo150.txt
+   # download instance annotation
+   annotations_instance/
+   # generated by prepare_ade20k_sem_seg.py
+   annotations_detectron2/
+   # below are generated by prepare_ade20k_pan_seg.py
+   ade20k_panoptic_{train,val}.json
+   ade20k_panoptic_{train,val}/
+   # below are generated by prepare_ade20k_ins_seg.py
+   ade20k_instance_{train,val}.json
+ ```
+
+ The directory `annotations_detectron2` is generated by running `python datasets/prepare_ade20k_sem_seg.py`.
+
+ Install panopticapi by:
+ ```bash
+ pip install git+https://github.com/cocodataset/panopticapi.git
+ ```
+
+ Download the instance annotation from http://sceneparsing.csail.mit.edu/:
+ ```bash
+ wget http://sceneparsing.csail.mit.edu/data/ChallengeData2017/annotations_instance.tar
+ ```
+
+ Then, run `python datasets/prepare_ade20k_pan_seg.py` to combine semantic and instance annotations into panoptic annotations.
+
+ Then run `python datasets/prepare_ade20k_ins_seg.py` to extract instance annotations in COCO format.
+
+
120
+ ## Expected dataset structure for [Mapillary Vistas](https://www.mapillary.com/dataset/vistas):
121
+ ```
122
+ mapillary_vistas/
123
+   training/
124
+     images/
125
+     instances/
126
+     labels/
127
+     panoptic/
128
+   validation/
129
+     images/
130
+     instances/
131
+     labels/
132
+     panoptic/
133
+ ```
134
+
135
+ No preprocessing is needed for Mapillary Vistas on semantic and panoptic segmentation.
datasets/ade20k_instance_catid_mapping.txt ADDED
@@ -0,0 +1,104 @@
1
+ Instance100 SceneParse150 FullADE20K
2
+ 1 8 165
3
+ 2 9 3055
4
+ 3 11 350
5
+ 4 13 1831
6
+ 5 15 774
7
+ 5 15 783
8
+ 6 16 2684
9
+ 7 19 687
10
+ 8 20 471
11
+ 9 21 401
12
+ 10 23 1735
13
+ 11 24 2473
14
+ 12 25 2329
15
+ 13 28 1564
16
+ 14 31 57
17
+ 15 32 2272
18
+ 16 33 907
19
+ 17 34 724
20
+ 18 36 2985
21
+ 18 36 533
22
+ 19 37 1395
23
+ 20 38 155
24
+ 21 39 2053
25
+ 22 40 689
26
+ 23 42 266
27
+ 24 43 581
28
+ 25 44 2380
29
+ 26 45 491
30
+ 27 46 627
31
+ 28 48 2388
32
+ 29 50 943
33
+ 30 51 2096
34
+ 31 54 2530
35
+ 32 56 420
36
+ 33 57 1948
37
+ 34 58 1869
38
+ 35 59 2251
39
+ 36 63 239
40
+ 37 65 571
41
+ 38 66 2793
42
+ 39 67 978
43
+ 40 68 236
44
+ 41 70 181
45
+ 42 71 629
46
+ 43 72 2598
47
+ 44 73 1744
48
+ 45 74 1374
49
+ 46 75 591
50
+ 47 76 2679
51
+ 48 77 223
52
+ 49 79 47
53
+ 50 81 327
54
+ 51 82 2821
55
+ 52 83 1451
56
+ 53 84 2880
57
+ 54 86 480
58
+ 55 87 77
59
+ 56 88 2616
60
+ 57 89 246
61
+ 57 89 247
62
+ 58 90 2733
63
+ 59 91 14
64
+ 60 93 38
65
+ 61 94 1936
66
+ 62 96 120
67
+ 63 98 1702
68
+ 64 99 249
69
+ 65 103 2928
70
+ 66 104 2337
71
+ 67 105 1023
72
+ 68 108 2989
73
+ 69 109 1930
74
+ 70 111 2586
75
+ 71 112 131
76
+ 72 113 146
77
+ 73 116 95
78
+ 74 117 1563
79
+ 75 119 1708
80
+ 76 120 103
81
+ 77 121 1002
82
+ 78 122 2569
83
+ 79 124 2833
84
+ 80 125 1551
85
+ 81 126 1981
86
+ 82 127 29
87
+ 83 128 187
88
+ 84 130 747
89
+ 85 131 2254
90
+ 86 133 2262
91
+ 87 134 1260
92
+ 88 135 2243
93
+ 89 136 2932
94
+ 90 137 2836
95
+ 91 138 2850
96
+ 92 139 64
97
+ 93 140 894
98
+ 94 143 1919
99
+ 95 144 1583
100
+ 96 145 318
101
+ 97 147 2046
102
+ 98 148 1098
103
+ 99 149 530
104
+ 100 150 954
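The mapping table above is read by the ADE20K preparation scripts, which skip the header row and shift the SceneParse150 id down by one. As a rough sketch of that parsing step (the function name `parse_catid_mapping` and the sample text are illustrative assumptions; the keying follows the instance-segmentation script, which keeps the instance id unshifted):

```python
def parse_catid_mapping(text):
    """Map each instance category id to its zero-based SceneParse150 id."""
    mapping = {}
    for line_no, line in enumerate(text.splitlines()):
        if line_no == 0 or not line.strip():
            continue  # skip the header row and any blank lines
        ins_id, sem_id, _full_id = line.split()
        # SceneParse150 ids are 1-based in the file; shift them to start at 0.
        mapping[int(ins_id)] = int(sem_id) - 1
    return mapping


# A few rows copied from the table above, including a repeated instance id.
sample = """Instance100 SceneParse150 FullADE20K
1 8 165
5 15 774
5 15 783
100 150 954
"""
mapping = parse_catid_mapping(sample)
```

Rows that repeat an instance id (such as the two rows for id 5) carry the same SceneParse150 id, so the later row overwrites the earlier one with an identical value.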
datasets/ade20k_instance_imgCatIds.json ADDED
The diff for this file is too large to render. See raw diff
 
datasets/prepare_ade20k_ins_seg.py ADDED
@@ -0,0 +1,112 @@
1
+ #!/usr/bin/env python3
2
+ # -*- coding: utf-8 -*-
3
+ # Copyright (c) Facebook, Inc. and its affiliates.
4
+ import glob
5
+ import json
6
+ import os
7
+ from collections import Counter
8
+
9
+ import numpy as np
10
+ import tqdm
11
+ from panopticapi.utils import IdGenerator, save_json
12
+ from PIL import Image
13
+ import pycocotools.mask as mask_util
14
+
15
+
16
+ if __name__ == "__main__":
17
+ dataset_dir = os.getenv("DETECTRON2_DATASETS", "datasets")
18
+
19
+ for name, dirname in [("train", "training"), ("val", "validation")]:
20
+ image_dir = os.path.join(dataset_dir, f"ADEChallengeData2016/images/{dirname}/")
21
+ instance_dir = os.path.join(
22
+ dataset_dir, f"ADEChallengeData2016/annotations_instance/{dirname}/"
23
+ )
24
+
25
+ # img_id = 0
26
+ ann_id = 1
27
+
28
+ # json
29
+ out_file = os.path.join(dataset_dir, f"ADEChallengeData2016/ade20k_instance_{name}.json")
30
+
31
+ # json config
32
+ instance_config_file = "datasets/ade20k_instance_imgCatIds.json"
33
+ with open(instance_config_file) as f:
34
+ category_dict = json.load(f)["categories"]
35
+
36
+ # load catid mapping
37
+ # it is important to share category id for both instance and panoptic annotations
38
+ mapping_file = "datasets/ade20k_instance_catid_mapping.txt"
39
+ with open(mapping_file) as f:
40
+ map_id = {}
41
+ for i, line in enumerate(f.readlines()):
42
+ if i == 0:
43
+ continue
44
+ ins_id, sem_id, _ = line.strip().split()
45
+ # shift id by 1 because we want it to start from 0!
46
+ # ignore_label becomes 255
47
+ map_id[int(ins_id)] = int(sem_id) - 1
48
+
49
+ for cat in category_dict:
50
+ cat["id"] = map_id[cat["id"]]
51
+
52
+ filenames = sorted(glob.glob(os.path.join(image_dir, "*.jpg")))
53
+
54
+ ann_dict = {}
55
+ images = []
56
+ annotations = []
57
+
58
+ for idx, filename in enumerate(tqdm.tqdm(filenames)):
59
+ image = {}
60
+ image_id = os.path.basename(filename).split(".")[0]
61
+
62
+ image["id"] = image_id
63
+ image["file_name"] = os.path.basename(filename)
64
+
65
+ original_format = np.array(Image.open(filename))
66
+ image["width"] = original_format.shape[1]
67
+ image["height"] = original_format.shape[0]
68
+
69
+ images.append(image)
70
+
71
+ filename_instance = os.path.join(instance_dir, image_id + ".png")
72
+ ins_seg = np.asarray(Image.open(filename_instance))
73
+ assert ins_seg.dtype == np.uint8
74
+
75
+ instance_cat_ids = ins_seg[..., 0]
76
+ # instance id starts from 1!
77
+ # because 0 is reserved as VOID label
78
+ instance_ins_ids = ins_seg[..., 1]
79
+
80
+ # process things
81
+ for thing_id in np.unique(instance_ins_ids):
82
+ if thing_id == 0:
83
+ continue
84
+ mask = instance_ins_ids == thing_id
85
+ instance_cat_id = np.unique(instance_cat_ids[mask])
86
+ assert len(instance_cat_id) == 1
87
+
88
+ anno = {}
89
+ anno['id'] = ann_id
90
+ ann_id += 1
91
+ anno['image_id'] = image['id']
92
+ anno["iscrowd"] = int(0)
93
+ anno["category_id"] = int(map_id[instance_cat_id[0]])
94
+
95
+ inds = np.nonzero(mask)
96
+ ymin, ymax = inds[0].min(), inds[0].max()
97
+ xmin, xmax = inds[1].min(), inds[1].max()
98
+ anno["bbox"] = [int(xmin), int(ymin), int(xmax - xmin + 1), int(ymax - ymin + 1)]
99
+ # if xmax <= xmin or ymax <= ymin:
100
+ # continue
101
+ rle = mask_util.encode(np.array(mask[:, :, None], order="F", dtype="uint8"))[0]
102
+ rle["counts"] = rle["counts"].decode("utf-8")
103
+ anno["segmentation"] = rle
104
+ anno["area"] = int(mask_util.area(rle))
105
+ annotations.append(anno)
106
+
107
+ # save this
108
+ ann_dict['images'] = images
109
+ ann_dict['categories'] = category_dict
110
+ ann_dict['annotations'] = annotations
111
+
112
+ save_json(ann_dict, out_file)
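The script above derives each COCO-style `bbox` as `[x, y, width, height]` from the extreme row and column indices of the instance mask. A minimal pure-Python sketch of the same arithmetic, assuming the mask is a rectangular list of rows (the helper name `mask_to_xywh` is hypothetical; the script itself uses NumPy):

```python
def mask_to_xywh(mask):
    """Return [x, y, width, height] of the tight box around truthy cells."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask is empty")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    x_min, x_max = cols[0], cols[-1]
    y_min, y_max = rows[0], rows[-1]
    # The +1 makes the extent inclusive of both boundary pixels.
    return [x_min, y_min, x_max - x_min + 1, y_max - y_min + 1]


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
box = mask_to_xywh(mask)
```

The inclusive `+ 1` matters for single-pixel instances, which would otherwise get a zero-sized box.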
datasets/prepare_ade20k_pan_seg.py ADDED
@@ -0,0 +1,500 @@
1
+ #!/usr/bin/env python3
2
+ # -*- coding: utf-8 -*-
3
+ # Copyright (c) Facebook, Inc. and its affiliates.
4
+ import glob
5
+ import json
6
+ import os
7
+ from collections import Counter
8
+
9
+ import numpy as np
10
+ import tqdm
11
+ from panopticapi.utils import IdGenerator, save_json
12
+ from PIL import Image
13
+
14
+ ADE20K_SEM_SEG_CATEGORIES = [
15
+ "wall",
16
+ "building",
17
+ "sky",
18
+ "floor",
19
+ "tree",
20
+ "ceiling",
21
+ "road, route",
22
+ "bed",
23
+ "window ",
24
+ "grass",
25
+ "cabinet",
26
+ "sidewalk, pavement",
27
+ "person",
28
+ "earth, ground",
29
+ "door",
30
+ "table",
31
+ "mountain, mount",
32
+ "plant",
33
+ "curtain",
34
+ "chair",
35
+ "car",
36
+ "water",
37
+ "painting, picture",
38
+ "sofa",
39
+ "shelf",
40
+ "house",
41
+ "sea",
42
+ "mirror",
43
+ "rug",
44
+ "field",
45
+ "armchair",
46
+ "seat",
47
+ "fence",
48
+ "desk",
49
+ "rock, stone",
50
+ "wardrobe, closet, press",
51
+ "lamp",
52
+ "tub",
53
+ "rail",
54
+ "cushion",
55
+ "base, pedestal, stand",
56
+ "box",
57
+ "column, pillar",
58
+ "signboard, sign",
59
+ "chest of drawers, chest, bureau, dresser",
60
+ "counter",
61
+ "sand",
62
+ "sink",
63
+ "skyscraper",
64
+ "fireplace",
65
+ "refrigerator, icebox",
66
+ "grandstand, covered stand",
67
+ "path",
68
+ "stairs",
69
+ "runway",
70
+ "case, display case, showcase, vitrine",
71
+ "pool table, billiard table, snooker table",
72
+ "pillow",
73
+ "screen door, screen",
74
+ "stairway, staircase",
75
+ "river",
76
+ "bridge, span",
77
+ "bookcase",
78
+ "blind, screen",
79
+ "coffee table",
80
+ "toilet, can, commode, crapper, pot, potty, stool, throne",
81
+ "flower",
82
+ "book",
83
+ "hill",
84
+ "bench",
85
+ "countertop",
86
+ "stove",
87
+ "palm, palm tree",
88
+ "kitchen island",
89
+ "computer",
90
+ "swivel chair",
91
+ "boat",
92
+ "bar",
93
+ "arcade machine",
94
+ "hovel, hut, hutch, shack, shanty",
95
+ "bus",
96
+ "towel",
97
+ "light",
98
+ "truck",
99
+ "tower",
100
+ "chandelier",
101
+ "awning, sunshade, sunblind",
102
+ "street lamp",
103
+ "booth",
104
+ "tv",
105
+ "plane",
106
+ "dirt track",
107
+ "clothes",
108
+ "pole",
109
+ "land, ground, soil",
110
+ "bannister, banister, balustrade, balusters, handrail",
111
+ "escalator, moving staircase, moving stairway",
112
+ "ottoman, pouf, pouffe, puff, hassock",
113
+ "bottle",
114
+ "buffet, counter, sideboard",
115
+ "poster, posting, placard, notice, bill, card",
116
+ "stage",
117
+ "van",
118
+ "ship",
119
+ "fountain",
120
+ "conveyer belt, conveyor belt, conveyer, conveyor, transporter",
121
+ "canopy",
122
+ "washer, automatic washer, washing machine",
123
+ "plaything, toy",
124
+ "pool",
125
+ "stool",
126
+ "barrel, cask",
127
+ "basket, handbasket",
128
+ "falls",
129
+ "tent",
130
+ "bag",
131
+ "minibike, motorbike",
132
+ "cradle",
133
+ "oven",
134
+ "ball",
135
+ "food, solid food",
136
+ "step, stair",
137
+ "tank, storage tank",
138
+ "trade name",
139
+ "microwave",
140
+ "pot",
141
+ "animal",
142
+ "bicycle",
143
+ "lake",
144
+ "dishwasher",
145
+ "screen",
146
+ "blanket, cover",
147
+ "sculpture",
148
+ "hood, exhaust hood",
149
+ "sconce",
150
+ "vase",
151
+ "traffic light",
152
+ "tray",
153
+ "trash can",
154
+ "fan",
155
+ "pier",
156
+ "crt screen",
157
+ "plate",
158
+ "monitor",
159
+ "bulletin board",
160
+ "shower",
161
+ "radiator",
162
+ "glass, drinking glass",
163
+ "clock",
164
+ "flag", # noqa
165
+ ]
166
+
167
+ PALETTE = [
168
+ [120, 120, 120],
169
+ [180, 120, 120],
170
+ [6, 230, 230],
171
+ [80, 50, 50],
172
+ [4, 200, 3],
173
+ [120, 120, 80],
174
+ [140, 140, 140],
175
+ [204, 5, 255],
176
+ [230, 230, 230],
177
+ [4, 250, 7],
178
+ [224, 5, 255],
179
+ [235, 255, 7],
180
+ [150, 5, 61],
181
+ [120, 120, 70],
182
+ [8, 255, 51],
183
+ [255, 6, 82],
184
+ [143, 255, 140],
185
+ [204, 255, 4],
186
+ [255, 51, 7],
187
+ [204, 70, 3],
188
+ [0, 102, 200],
189
+ [61, 230, 250],
190
+ [255, 6, 51],
191
+ [11, 102, 255],
192
+ [255, 7, 71],
193
+ [255, 9, 224],
194
+ [9, 7, 230],
195
+ [220, 220, 220],
196
+ [255, 9, 92],
197
+ [112, 9, 255],
198
+ [8, 255, 214],
199
+ [7, 255, 224],
200
+ [255, 184, 6],
201
+ [10, 255, 71],
202
+ [255, 41, 10],
203
+ [7, 255, 255],
204
+ [224, 255, 8],
205
+ [102, 8, 255],
206
+ [255, 61, 6],
207
+ [255, 194, 7],
208
+ [255, 122, 8],
209
+ [0, 255, 20],
210
+ [255, 8, 41],
211
+ [255, 5, 153],
212
+ [6, 51, 255],
213
+ [235, 12, 255],
214
+ [160, 150, 20],
215
+ [0, 163, 255],
216
+ [140, 140, 200],
217
+ [250, 10, 15],
218
+ [20, 255, 0],
219
+ [31, 255, 0],
220
+ [255, 31, 0],
221
+ [255, 224, 0],
222
+ [153, 255, 0],
223
+ [0, 0, 255],
224
+ [255, 71, 0],
225
+ [0, 235, 255],
226
+ [0, 173, 255],
227
+ [31, 0, 255],
228
+ [11, 200, 200],
229
+ [255, 82, 0],
230
+ [0, 255, 245],
231
+ [0, 61, 255],
232
+ [0, 255, 112],
233
+ [0, 255, 133],
234
+ [255, 0, 0],
235
+ [255, 163, 0],
236
+ [255, 102, 0],
237
+ [194, 255, 0],
238
+ [0, 143, 255],
239
+ [51, 255, 0],
240
+ [0, 82, 255],
241
+ [0, 255, 41],
242
+ [0, 255, 173],
243
+ [10, 0, 255],
244
+ [173, 255, 0],
245
+ [0, 255, 153],
246
+ [255, 92, 0],
247
+ [255, 0, 255],
248
+ [255, 0, 245],
249
+ [255, 0, 102],
250
+ [255, 173, 0],
251
+ [255, 0, 20],
252
+ [255, 184, 184],
253
+ [0, 31, 255],
254
+ [0, 255, 61],
255
+ [0, 71, 255],
256
+ [255, 0, 204],
257
+ [0, 255, 194],
258
+ [0, 255, 82],
259
+ [0, 10, 255],
260
+ [0, 112, 255],
261
+ [51, 0, 255],
262
+ [0, 194, 255],
263
+ [0, 122, 255],
264
+ [0, 255, 163],
265
+ [255, 153, 0],
266
+ [0, 255, 10],
267
+ [255, 112, 0],
268
+ [143, 255, 0],
269
+ [82, 0, 255],
270
+ [163, 255, 0],
271
+ [255, 235, 0],
272
+ [8, 184, 170],
273
+ [133, 0, 255],
274
+ [0, 255, 92],
275
+ [184, 0, 255],
276
+ [255, 0, 31],
277
+ [0, 184, 255],
278
+ [0, 214, 255],
279
+ [255, 0, 112],
280
+ [92, 255, 0],
281
+ [0, 224, 255],
282
+ [112, 224, 255],
283
+ [70, 184, 160],
284
+ [163, 0, 255],
285
+ [153, 0, 255],
286
+ [71, 255, 0],
287
+ [255, 0, 163],
288
+ [255, 204, 0],
289
+ [255, 0, 143],
290
+ [0, 255, 235],
291
+ [133, 255, 0],
292
+ [255, 0, 235],
293
+ [245, 0, 255],
294
+ [255, 0, 122],
295
+ [255, 245, 0],
296
+ [10, 190, 212],
297
+ [214, 255, 0],
298
+ [0, 204, 255],
299
+ [20, 0, 255],
300
+ [255, 255, 0],
301
+ [0, 153, 255],
302
+ [0, 41, 255],
303
+ [0, 255, 204],
304
+ [41, 0, 255],
305
+ [41, 255, 0],
306
+ [173, 0, 255],
307
+ [0, 245, 255],
308
+ [71, 0, 255],
309
+ [122, 0, 255],
310
+ [0, 255, 184],
311
+ [0, 92, 255],
312
+ [184, 255, 0],
313
+ [0, 133, 255],
314
+ [255, 214, 0],
315
+ [25, 194, 194],
316
+ [102, 255, 0],
317
+ [92, 0, 255],
318
+ ]
319
+
320
+
321
+ if __name__ == "__main__":
322
+ dataset_dir = os.getenv("DETECTRON2_DATASETS", "datasets")
323
+
324
+ for name, dirname in [("train", "training"), ("val", "validation")]:
325
+ image_dir = os.path.join(dataset_dir, f"ADEChallengeData2016/images/{dirname}/")
326
+ semantic_dir = os.path.join(dataset_dir, f"ADEChallengeData2016/annotations/{dirname}/")
327
+ instance_dir = os.path.join(
328
+ dataset_dir, f"ADEChallengeData2016/annotations_instance/{dirname}/"
329
+ )
330
+
331
+ # folder to store panoptic PNGs
332
+ out_folder = os.path.join(dataset_dir, f"ADEChallengeData2016/ade20k_panoptic_{name}/")
333
+ # json with segmentations information
334
+ out_file = os.path.join(dataset_dir, f"ADEChallengeData2016/ade20k_panoptic_{name}.json")
335
+
336
+ if not os.path.isdir(out_folder):
337
+ print("Creating folder {} for panoptic segmentation PNGs".format(out_folder))
338
+ os.mkdir(out_folder)
339
+
340
+ # json config
341
+ config_file = "datasets/ade20k_instance_imgCatIds.json"
342
+ with open(config_file) as f:
343
+ config = json.load(f)
344
+
345
+ # load catid mapping
346
+ mapping_file = "datasets/ade20k_instance_catid_mapping.txt"
347
+ with open(mapping_file) as f:
348
+ map_id = {}
349
+ for i, line in enumerate(f.readlines()):
350
+ if i == 0:
351
+ continue
352
+ ins_id, sem_id, _ = line.strip().split()
353
+ # shift id by 1 because we want it to start from 0!
354
+ # ignore_label becomes 255
355
+ map_id[int(ins_id) - 1] = int(sem_id) - 1
356
+
357
+ ADE20K_150_CATEGORIES = []
358
+ for cat_id, cat_name in enumerate(ADE20K_SEM_SEG_CATEGORIES):
359
+ ADE20K_150_CATEGORIES.append(
360
+ {
361
+ "name": cat_name,
362
+ "id": cat_id,
363
+ "isthing": int(cat_id in map_id.values()),
364
+ "color": PALETTE[cat_id],
365
+ }
366
+ )
367
+ categories_dict = {cat["id"]: cat for cat in ADE20K_150_CATEGORIES}
368
+
369
+ panoptic_json_categories = ADE20K_150_CATEGORIES[:]
370
+ panoptic_json_images = []
371
+ panoptic_json_annotations = []
372
+
373
+ filenames = sorted(glob.glob(os.path.join(image_dir, "*.jpg")))
374
+ for idx, filename in enumerate(tqdm.tqdm(filenames)):
375
+ panoptic_json_image = {}
376
+ panoptic_json_annotation = {}
377
+
378
+ image_id = os.path.basename(filename).split(".")[0]
379
+
380
+ panoptic_json_image["id"] = image_id
381
+ panoptic_json_image["file_name"] = os.path.basename(filename)
382
+
383
+ original_format = np.array(Image.open(filename))
384
+ panoptic_json_image["width"] = original_format.shape[1]
385
+ panoptic_json_image["height"] = original_format.shape[0]
386
+
387
+ pan_seg = np.zeros(
388
+ (original_format.shape[0], original_format.shape[1], 3), dtype=np.uint8
389
+ )
390
+ id_generator = IdGenerator(categories_dict)
391
+
392
+ filename_semantic = os.path.join(semantic_dir, image_id + ".png")
393
+ filename_instance = os.path.join(instance_dir, image_id + ".png")
394
+
395
+ sem_seg = np.asarray(Image.open(filename_semantic))
396
+ ins_seg = np.asarray(Image.open(filename_instance))
397
+
398
+ assert sem_seg.dtype == np.uint8
399
+ assert ins_seg.dtype == np.uint8
400
+
401
+ semantic_cat_ids = sem_seg - 1
402
+ instance_cat_ids = ins_seg[..., 0] - 1
403
+ # instance id starts from 1!
404
+ # because 0 is reserved as VOID label
405
+ instance_ins_ids = ins_seg[..., 1]
406
+
407
+ segm_info = []
408
+
409
+ # NOTE: there is some overlap between semantic and instance annotation
410
+ # thus we paste stuffs first
411
+
412
+ # process stuffs
413
+ for semantic_cat_id in np.unique(semantic_cat_ids):
414
+ if semantic_cat_id == 255:
415
+ continue
416
+ if categories_dict[semantic_cat_id]["isthing"]:
417
+ continue
418
+ mask = semantic_cat_ids == semantic_cat_id
419
+ # should not have any overlap
420
+ assert pan_seg[mask].sum() == 0
421
+
422
+ segment_id, color = id_generator.get_id_and_color(semantic_cat_id)
423
+ pan_seg[mask] = color
424
+
425
+ area = np.sum(mask) # segment area computation
426
+ # bbox computation for a segment
427
+ hor = np.sum(mask, axis=0)
428
+ hor_idx = np.nonzero(hor)[0]
429
+ x = hor_idx[0]
430
+ width = hor_idx[-1] - x + 1
431
+ vert = np.sum(mask, axis=1)
432
+ vert_idx = np.nonzero(vert)[0]
433
+ y = vert_idx[0]
434
+ height = vert_idx[-1] - y + 1
435
+ bbox = [int(x), int(y), int(width), int(height)]
436
+
437
+ segm_info.append(
438
+ {
439
+ "id": int(segment_id),
440
+ "category_id": int(semantic_cat_id),
441
+ "area": int(area),
442
+ "bbox": bbox,
443
+ "iscrowd": 0,
444
+ }
445
+ )
446
+
447
+ # process things
448
+ for thing_id in np.unique(instance_ins_ids):
449
+ if thing_id == 0:
450
+ continue
451
+ mask = instance_ins_ids == thing_id
452
+ instance_cat_id = np.unique(instance_cat_ids[mask])
453
+ assert len(instance_cat_id) == 1
454
+
455
+ semantic_cat_id = map_id[instance_cat_id[0]]
456
+
457
+ segment_id, color = id_generator.get_id_and_color(semantic_cat_id)
458
+ pan_seg[mask] = color
459
+
460
+ area = np.sum(mask) # segment area computation
461
+ # bbox computation for a segment
462
+ hor = np.sum(mask, axis=0)
463
+ hor_idx = np.nonzero(hor)[0]
464
+ x = hor_idx[0]
465
+ width = hor_idx[-1] - x + 1
466
+ vert = np.sum(mask, axis=1)
467
+ vert_idx = np.nonzero(vert)[0]
468
+ y = vert_idx[0]
469
+ height = vert_idx[-1] - y + 1
470
+ bbox = [int(x), int(y), int(width), int(height)]
471
+
472
+ segm_info.append(
473
+ {
474
+ "id": int(segment_id),
475
+ "category_id": int(semantic_cat_id),
476
+ "area": int(area),
477
+ "bbox": bbox,
478
+ "iscrowd": 0,
479
+ }
480
+ )
481
+
482
+ panoptic_json_annotation = {
483
+ "image_id": image_id,
484
+ "file_name": image_id + ".png",
485
+ "segments_info": segm_info,
486
+ }
487
+
488
+ Image.fromarray(pan_seg).save(os.path.join(out_folder, image_id + ".png"))
489
+
490
+ panoptic_json_images.append(panoptic_json_image)
491
+ panoptic_json_annotations.append(panoptic_json_annotation)
492
+
493
+ # save this
494
+ d = {
495
+ "images": panoptic_json_images,
496
+ "annotations": panoptic_json_annotations,
497
+ "categories": panoptic_json_categories,
498
+ }
499
+
500
+ save_json(d, out_file)
datasets/prepare_ade20k_sem_seg.py ADDED
@@ -0,0 +1,27 @@
1
+ #!/usr/bin/env python3
2
+ # -*- coding: utf-8 -*-
3
+ # Copyright (c) Facebook, Inc. and its affiliates.
4
+ import os
5
+ from pathlib import Path
6
+
7
+ import numpy as np
8
+ import tqdm
9
+ from PIL import Image
10
+
11
+
12
+ def convert(input, output):
13
+     img = np.asarray(Image.open(input))
14
+     assert img.dtype == np.uint8
15
+     img = img - 1  # 0 (ignore) becomes 255. others are shifted by 1
16
+     Image.fromarray(img).save(output)
17
+
18
+
19
+ if __name__ == "__main__":
20
+     dataset_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "ADEChallengeData2016"
21
+     for name in ["training", "validation"]:
22
+         annotation_dir = dataset_dir / "annotations" / name
23
+         output_dir = dataset_dir / "annotations_detectron2" / name
24
+         output_dir.mkdir(parents=True, exist_ok=True)
25
+         for file in tqdm.tqdm(list(annotation_dir.iterdir())):
26
+             output_file = output_dir / file.name
27
+             convert(file, output_file)
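The `img - 1` step in `convert` relies on unsigned 8-bit wraparound: label 0 (ignore) becomes 255 while every other label drops by one. A small sketch of the per-pixel effect, written with plain integers instead of NumPy arrays (the function name `shift_ade20k_label` is an illustrative assumption):

```python
def shift_ade20k_label(value):
    """Emulate `img - 1` on one uint8 pixel: 0 wraps to 255, others drop by one."""
    if not 0 <= value <= 255:
        raise ValueError("expected a uint8 value")
    return (value - 1) % 256
```

This is why the script asserts the `uint8` dtype first: with a wider integer type, label 0 would not land on 255.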
datasets/prepare_coco_semantic_annos_from_panoptic_annos.py ADDED
@@ -0,0 +1,84 @@
1
+ #!/usr/bin/env python3
2
+ # -*- coding: utf-8 -*-
3
+ # Copyright (c) Facebook, Inc. and its affiliates.
4
+
5
+ import functools
6
+ import json
7
+ import multiprocessing as mp
8
+ import numpy as np
9
+ import os
10
+ import time
11
+ from fvcore.common.download import download
12
+ from panopticapi.utils import rgb2id
13
+ from PIL import Image
14
+
15
+ from detectron2.data.datasets.builtin_meta import COCO_CATEGORIES
16
+
17
+
18
+ def _process_panoptic_to_semantic(input_panoptic, output_semantic, segments, id_map):
19
+ panoptic = np.asarray(Image.open(input_panoptic), dtype=np.uint32)
20
+ panoptic = rgb2id(panoptic)
21
+ output = np.zeros_like(panoptic, dtype=np.uint8) + 255
22
+ for seg in segments:
23
+ cat_id = seg["category_id"]
24
+ new_cat_id = id_map[cat_id]
25
+ output[panoptic == seg["id"]] = new_cat_id
26
+ Image.fromarray(output).save(output_semantic)
27
+
28
+
29
+ def separate_coco_semantic_from_panoptic(panoptic_json, panoptic_root, sem_seg_root, categories):
30
+ """
31
+ Create semantic segmentation annotations from panoptic segmentation
32
+ annotations, to be used by PanopticFPN.
33
+ It maps every category (thing or stuff) to a contiguous id starting from 0, in the order given by `categories`,
34
+ and maps all unlabeled pixels to class 255.
35
+ Args:
36
+ panoptic_json (str): path to the panoptic json file, in COCO's format.
37
+ panoptic_root (str): a directory with panoptic annotation files, in COCO's format.
38
+ sem_seg_root (str): a directory to output semantic annotation files
39
+ categories (list[dict]): category metadata. Each dict needs to have:
40
+ "id": corresponds to the "category_id" in the json annotations
41
+ "isthing": 0 or 1
42
+ """
43
+ os.makedirs(sem_seg_root, exist_ok=True)
44
+
45
+ id_map = {} # map from category id to id in the output semantic annotation
46
+ assert len(categories) <= 254
47
+ for i, k in enumerate(categories):
48
+ id_map[k["id"]] = i
49
+ # what is id = 0?
50
+ # id_map[0] = 255
51
+ print(id_map)
52
+
53
+ with open(panoptic_json) as f:
54
+ obj = json.load(f)
55
+
56
+ pool = mp.Pool(processes=max(mp.cpu_count() // 2, 4))
57
+
58
+ def iter_annotations():
59
+ for anno in obj["annotations"]:
60
+ file_name = anno["file_name"]
61
+ segments = anno["segments_info"]
62
+ input = os.path.join(panoptic_root, file_name)
63
+ output = os.path.join(sem_seg_root, file_name)
64
+ yield input, output, segments
65
+
66
+ print("Start writing to {} ...".format(sem_seg_root))
67
+ start = time.time()
68
+ pool.starmap(
69
+ functools.partial(_process_panoptic_to_semantic, id_map=id_map),
70
+ iter_annotations(),
71
+ chunksize=100,
72
+ )
73
+ print("Finished. time: {:.2f}s".format(time.time() - start))
74
+
75
+
76
+ if __name__ == "__main__":
77
+ dataset_dir = os.path.join(os.getenv("DETECTRON2_DATASETS", "datasets"), "coco")
78
+ for s in ["val2017", "train2017"]:
79
+ separate_coco_semantic_from_panoptic(
80
+ os.path.join(dataset_dir, "annotations/panoptic_{}.json".format(s)),
81
+ os.path.join(dataset_dir, "panoptic_{}".format(s)),
82
+ os.path.join(dataset_dir, "panoptic_semseg_{}".format(s)),
83
+ COCO_CATEGORIES,
84
+ )
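The conversion above depends on `rgb2id` from panopticapi, which decodes a COCO panoptic PNG pixel into a segment id by treating the three channels as base-256 digits. A scalar sketch of that encoding and its inverse (the function names here are hypothetical stand-ins; the library versions also accept arrays):

```python
def rgb_to_segment_id(rgb):
    """Decode one [R, G, B] pixel into a panoptic segment id."""
    r, g, b = rgb
    return r + 256 * g + 256 * 256 * b


def segment_id_to_rgb(segment_id):
    """Encode a segment id back into its [R, G, B] pixel."""
    return [segment_id % 256, (segment_id // 256) % 256, (segment_id // 65536) % 256]
```

Because the id can exceed 255, the script loads the PNG as `uint32` before decoding; decoding in `uint8` would overflow.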
datasets/prepare_pascal_ctx_full_sem_seg.py ADDED
@@ -0,0 +1,48 @@
1
+ # ------------------------------------------------------------------------------
2
+ # Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3
+ #
4
+ # This work is made available under the Nvidia Source Code License.
5
+ # To view a copy of this license, visit
6
+ # https://github.com/NVlabs/ODISE/blob/main/LICENSE
7
+ #
8
+ # Written by Jiarui Xu
9
+ # ------------------------------------------------------------------------------
10
+
11
+ import os
12
+ import numpy as np
13
+ from pathlib import Path
14
+ from PIL import Image
15
+ import scipy.io as sio
16
+
17
+ import tqdm
18
+
19
+
20
+ def generate_labels(mat_file, out_dir):
21
+
22
+ mat = sio.loadmat(mat_file)
23
+ label_map = mat["LabelMap"]
24
+ assert label_map.dtype == np.uint16
25
+ label_map[label_map == 0] = 65535
26
+ label_map = label_map - 1
27
+ label_map[label_map == 65534] = 65535
28
+
29
+ out_file = out_dir / Path(mat_file.name).with_suffix(".tif")
30
+ Image.fromarray(label_map).save(out_file)
31
+
32
+
33
+ if __name__ == "__main__":
34
+ dataset_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "pascal_ctx_d2"
35
+ voc_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "VOCdevkit/VOC2010"
36
+ mat_dir = voc_dir / "trainval"
37
+ for split in ["training", "validation"]:
38
+ file_names = list((dataset_dir / "images" / split).glob("*.jpg"))
39
+ output_img_dir = dataset_dir / "images" / split
40
+ output_ann_dir = dataset_dir / "annotations_ctx459" / split
41
+
42
+ output_img_dir.mkdir(parents=True, exist_ok=True)
43
+ output_ann_dir.mkdir(parents=True, exist_ok=True)
44
+
45
+ for file_name in tqdm.tqdm(file_names):
46
+ mat_file_path = mat_dir / f"{file_name.stem}.mat"
47
+
48
+ generate_labels(mat_file_path, output_ann_dir)
datasets/prepare_pascal_ctx_sem_seg.py ADDED
@@ -0,0 +1,84 @@
1
+ # ------------------------------------------------------------------------------
2
+ # Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3
+ #
4
+ # This work is made available under the Nvidia Source Code License.
5
+ # To view a copy of this license, visit
6
+ # https://github.com/NVlabs/ODISE/blob/main/LICENSE
7
+ #
8
+ # Written by Jiarui Xu
9
+ # ------------------------------------------------------------------------------
10
+
11
+ import os
12
+ from pathlib import Path
13
+ import shutil
14
+
15
+ import numpy as np
16
+ import tqdm
17
+ from PIL import Image
18
+ import multiprocessing as mp
19
+ import functools
20
+ from detail import Detail
21
+
22
+ # fmt: off
23
+ _mapping = np.sort(
24
+ np.array([
25
+ 0, 2, 259, 260, 415, 324, 9, 258, 144, 18, 19, 22, 23, 397, 25, 284,
26
+ 158, 159, 416, 33, 162, 420, 454, 295, 296, 427, 44, 45, 46, 308, 59,
27
+ 440, 445, 31, 232, 65, 354, 424, 68, 326, 72, 458, 34, 207, 80, 355,
28
+ 85, 347, 220, 349, 360, 98, 187, 104, 105, 366, 189, 368, 113, 115
29
+ ]))
30
+ # fmt: on
31
+ _key = np.array(range(len(_mapping))).astype("uint8")
32
+
33
+
34
+ def generate_labels(img_info, detail_api, out_dir):
35
+ def _class_to_index(mask, _mapping, _key):
36
+ # assert the values
37
+ values = np.unique(mask)
38
+ for i in range(len(values)):
39
+ assert values[i] in _mapping
40
+ index = np.digitize(mask.ravel(), _mapping, right=True)
41
+ return _key[index].reshape(mask.shape)
42
+
43
+ sem_seg = _class_to_index(detail_api.getMask(img_info), _mapping=_mapping, _key=_key)
44
+ sem_seg = sem_seg - 1 # 0 (ignore) becomes 255. others are shifted by 1
45
+ filename = img_info["file_name"]
46
+
47
+ Image.fromarray(sem_seg).save(out_dir / filename.replace("jpg", "png"))
48
+
49
+
50
+ def copy_images(img_info, img_dir, out_dir):
51
+ filename = img_info["file_name"]
52
+ shutil.copy2(img_dir / filename, out_dir / filename)
53
+
54
+
55
+ if __name__ == "__main__":
56
+ dataset_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "pascal_ctx_d2"
57
+ voc_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "VOCdevkit/VOC2010"
58
+ for split in ["training", "validation"]:
59
+ img_dir = voc_dir / "JPEGImages"
60
+ if split == "training":
61
+ detail_api = Detail(voc_dir / "trainval_merged.json", img_dir, "train")
62
+ else:
63
+ detail_api = Detail(voc_dir / "trainval_merged.json", img_dir, "val")
64
+ img_infos = detail_api.getImgs()
65
+
66
+ output_img_dir = dataset_dir / "images" / split
67
+ output_ann_dir = dataset_dir / "annotations_ctx59" / split
68
+
69
+ output_img_dir.mkdir(parents=True, exist_ok=True)
70
+ output_ann_dir.mkdir(parents=True, exist_ok=True)
71
+
72
+ pool = mp.Pool(processes=max(mp.cpu_count() // 2, 4))
73
+
74
+ pool.map(
75
+ functools.partial(copy_images, img_dir=img_dir, out_dir=output_img_dir),
76
+ tqdm.tqdm(img_infos, desc=f"Writing {split} images to {output_img_dir} ..."),
77
+ chunksize=100,
78
+ )
79
+
80
+ pool.map(
81
+ functools.partial(generate_labels, detail_api=detail_api, out_dir=output_ann_dir),
82
+ tqdm.tqdm(img_infos, desc=f"Writing {split} annotations to {output_ann_dir} ..."),
83
+ chunksize=100,
84
+ )
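`_class_to_index` above converts raw Pascal Context ids into positions within the sorted `_mapping` array, and the following `- 1` turns position 0 (background) into the ignore label 255. A sketch of the same lookup using the standard-library `bisect` module in place of `np.digitize(..., right=True)`; the id list is a short illustrative subset of `_mapping`, and the function names are assumptions:

```python
from bisect import bisect_left

# First few raw ids from `_mapping`, sorted; an illustrative subset only.
SORTED_RAW_IDS = [0, 2, 259, 260, 415]


def raw_to_index(raw_id, sorted_ids):
    """Return the position of raw_id in sorted_ids, or raise if it is absent."""
    pos = bisect_left(sorted_ids, raw_id)
    if pos == len(sorted_ids) or sorted_ids[pos] != raw_id:
        raise ValueError(f"unexpected raw id {raw_id}")
    return pos


def raw_to_train_label(raw_id, sorted_ids):
    """Shift the index down by one with uint8 wraparound (index 0 becomes 255)."""
    return (raw_to_index(raw_id, sorted_ids) - 1) % 256
```

For sorted, increasing bins, `bisect_left` returns the same index as `np.digitize` with `right=True`, which is why the explicit membership check stands in for the script's assertion.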
datasets/prepare_pascal_voc_sem_seg.py ADDED
@@ -0,0 +1,65 @@
1
+ # ------------------------------------------------------------------------------
2
+ # Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3
+ #
4
+ # This work is made available under the Nvidia Source Code License.
5
+ # To view a copy of this license, visit
6
+ # https://github.com/NVlabs/ODISE/blob/main/LICENSE
7
+ #
8
+ # Written by Jiarui Xu
9
+ # ------------------------------------------------------------------------------
10
+
11
+ import os
12
+ from pathlib import Path
13
+ import shutil
14
+
15
+ import numpy as np
16
+ import tqdm
17
+ from PIL import Image
18
+
19
+
20
+ def convert_pas21(input, output):
21
+     img = np.asarray(Image.open(input))
22
+     assert img.dtype == np.uint8
23
+     # labels are saved unchanged
24
+     Image.fromarray(img).save(output)
25
+
26
+ def convert_pas20(input, output):
27
+     img = np.array(Image.open(input))
28
+     img[img == 0] = 255
29
+     img = img - 1
30
+     img[img == 254] = 255
31
+     assert img.dtype == np.uint8
32
+     # background (0) and ignore (255) are now 255; classes 1..20 are now 0..19
33
+     Image.fromarray(img).save(output)
34
+
35
+
36
+ if __name__ == "__main__":
37
+ dataset_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "pascal_voc_d2"
38
+ voc_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "VOCdevkit/VOC2012"
39
+ for split in ["training", "validation"]:
40
+ if split == "training":
41
+ img_name_path = voc_dir / "ImageSets/Segmentation/train.txt"
42
+ else:
43
+ img_name_path = voc_dir / "ImageSets/Segmentation/val.txt"
44
+ img_dir = voc_dir / "JPEGImages"
45
+ ann_dir = voc_dir / "SegmentationClass"
46
+
47
+ output_img_dir = dataset_dir / "images" / split
48
+ output_ann_dir_21 = dataset_dir / "annotations_pascal21" / split
49
+ output_ann_dir_20 = dataset_dir / "annotations_pascal20" / split
50
+
51
+ output_img_dir.mkdir(parents=True, exist_ok=True)
52
+ output_ann_dir_21.mkdir(parents=True, exist_ok=True)
53
+ output_ann_dir_20.mkdir(parents=True, exist_ok=True)
54
+
55
+ with open(img_name_path) as f:
56
+ for line in tqdm.tqdm(f.readlines()):
57
+ img_name = line.strip()
58
+ img_path = img_dir / f"{img_name}.jpg"
59
+ ann_path = ann_dir / f"{img_name}.png"
60
+
61
+ # print(f'copy2 {output_img_dir}')
62
+ shutil.copy2(img_path, output_img_dir)
63
+ # print(f"convert {ann_dir} to {output_ann_dir / f'{img_name}.png'}")
64
+ convert_pas21(ann_path, output_ann_dir_21 / f"{img_name}.png")
65
+ convert_pas20(ann_path, output_ann_dir_20 / f"{img_name}.png")
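The label remapping in `convert_pas20` can be checked without NumPy. The sketch below restates it for a single pixel value in plain Python; `remap_pas20_label` and `IGNORE` are illustrative names that do not appear in this commit, and the sketch assumes the input labels are 0 (background), 1..20 (classes) and 255 (void).

```python
IGNORE = 255  # label value treated as "ignore" in the 20-class annotations


def remap_pas20_label(value):
    """Map one 21-class VOC label to the 20-class scheme.

    Background (0) and void (255) both end up as IGNORE; classes 1..20
    are shifted down to 0..19, matching the array operations in
    convert_pas20.
    """
    if value == 0 or value == IGNORE:
        return IGNORE
    return value - 1


remapped = [remap_pas20_label(v) for v in [0, 1, 15, 20, 255]]
print(remapped)
```

Writing 255 into the background pixels before subtracting 1 is what keeps the `uint8` array from wrapping around at 0 in the original function.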
demo/__init__.py ADDED
File without changes
demo/demo.py ADDED
@@ -0,0 +1,195 @@
1
+ # Copyright (c) Facebook, Inc. and its affiliates.
2
+ # Modified by Bowen Cheng from: https://github.com/facebookresearch/detectron2/blob/master/demo/demo.py
3
+ import argparse
4
+ import glob
5
+ import multiprocessing as mp
6
+ import os
7
+
8
+ # fmt: off
9
+ import sys
10
+ sys.path.insert(1, os.path.join(sys.path[0], '..'))
11
+ # fmt: on
12
+
13
+ import tempfile
14
+ import time
15
+ import warnings
16
+
17
+ import cv2
18
+ import numpy as np
19
+ import tqdm
20
+
21
+ from detectron2.config import get_cfg
22
+ from detectron2.data.detection_utils import read_image
23
+ from detectron2.projects.deeplab import add_deeplab_config
24
+ from detectron2.utils.logger import setup_logger
25
+
26
+ from fcclip import add_maskformer2_config, add_fcclip_config
27
+ from predictor import VisualizationDemo
28
+
29
+
30
+ # constants
31
+ WINDOW_NAME = "fc-clip demo"
32
+
33
+
34
+ def setup_cfg(args):
35
+ # load config from file and command-line arguments
36
+ cfg = get_cfg()
37
+ add_deeplab_config(cfg)
38
+ add_maskformer2_config(cfg)
39
+ add_fcclip_config(cfg)
40
+ cfg.merge_from_file(args.config_file)
41
+ cfg.merge_from_list(args.opts)
42
+ cfg.freeze()
43
+ return cfg
44
+
45
+
46
+ def get_parser():
47
+ parser = argparse.ArgumentParser(description="fcclip demo for builtin configs")
48
+ parser.add_argument(
49
+ "--config-file",
50
+ default="configs/coco/panoptic-segmentation/fcclip/fcclip_convnext_large_eval_ade20k.yaml",
51
+ metavar="FILE",
52
+ help="path to config file",
53
+ )
54
+ parser.add_argument("--webcam", action="store_true", help="Take inputs from webcam.")
55
+ parser.add_argument("--video-input", help="Path to video file.")
56
+ parser.add_argument(
57
+ "--input",
58
+ nargs="+",
59
+ help="A list of space separated input images; "
60
+ "or a single glob pattern such as 'directory/*.jpg'",
61
+ )
62
+ parser.add_argument(
63
+ "--output",
64
+ help="A file or directory to save output visualizations. "
65
+ "If not given, will show output in an OpenCV window.",
66
+ )
67
+
68
+ parser.add_argument(
69
+ "--confidence-threshold",
70
+ type=float,
71
+ default=0.5,
72
+ help="Minimum score for instance predictions to be shown",
73
+ )
74
+ parser.add_argument(
75
+ "--opts",
76
+ help="Modify config options using the command-line 'KEY VALUE' pairs",
77
+ default=[],
78
+ nargs=argparse.REMAINDER,
79
+ )
80
+ return parser
81
+
82
+
83
+ def test_opencv_video_format(codec, file_ext):
84
+ with tempfile.TemporaryDirectory(prefix="video_format_test") as dir:
85
+ filename = os.path.join(dir, "test_file" + file_ext)
86
+ writer = cv2.VideoWriter(
87
+ filename=filename,
88
+ fourcc=cv2.VideoWriter_fourcc(*codec),
89
+ fps=float(30),
90
+ frameSize=(10, 10),
91
+ isColor=True,
92
+ )
93
+ [writer.write(np.zeros((10, 10, 3), np.uint8)) for _ in range(30)]
94
+ writer.release()
95
+ if os.path.isfile(filename):
96
+ return True
97
+ return False
98
+
99
+
100
+ if __name__ == "__main__":
101
+ mp.set_start_method("spawn", force=True)
102
+ args = get_parser().parse_args()
103
+ setup_logger(name="fvcore")
104
+ logger = setup_logger()
105
+ logger.info("Arguments: " + str(args))
106
+
107
+ cfg = setup_cfg(args)
108
+
109
+ demo = VisualizationDemo(cfg)
110
+
111
+ if args.input:
112
+ if len(args.input) == 1:
113
+ args.input = glob.glob(os.path.expanduser(args.input[0]))
114
+ assert args.input, "The input path(s) was not found"
115
+ for path in tqdm.tqdm(args.input, disable=not args.output):
116
+ # use PIL, to be consistent with evaluation
117
+ img = read_image(path, format="BGR")
118
+ start_time = time.time()
119
+ predictions, visualized_output = demo.run_on_image(img)
120
+ logger.info(
121
+ "{}: {} in {:.2f}s".format(
122
+ path,
123
+ "detected {} instances".format(len(predictions["instances"]))
124
+ if "instances" in predictions
125
+ else "finished",
126
+ time.time() - start_time,
127
+ )
128
+ )
129
+
130
+ if args.output:
131
+ if os.path.isdir(args.output):
132
+ assert os.path.isdir(args.output), args.output
133
+ out_filename = os.path.join(args.output, os.path.basename(path))
134
+ else:
135
+ assert len(args.input) == 1, "Please specify a directory with args.output"
136
+ out_filename = args.output
137
+ visualized_output.save(out_filename)
138
+ else:
139
+ cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
140
+ cv2.imshow(WINDOW_NAME, visualized_output.get_image()[:, :, ::-1])
141
+ if cv2.waitKey(0) == 27:
142
+ break # esc to quit
143
+ elif args.webcam:
144
+ assert args.input is None, "Cannot have both --input and --webcam!"
145
+ assert args.output is None, "output not yet supported with --webcam!"
146
+ cam = cv2.VideoCapture(0)
147
+ for vis in tqdm.tqdm(demo.run_on_video(cam)):
148
+ cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
149
+ cv2.imshow(WINDOW_NAME, vis)
150
+ if cv2.waitKey(1) == 27:
151
+ break # esc to quit
152
+ cam.release()
153
+ cv2.destroyAllWindows()
154
+ elif args.video_input:
155
+ video = cv2.VideoCapture(args.video_input)
156
+ width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
157
+ height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
158
+ frames_per_second = video.get(cv2.CAP_PROP_FPS)
159
+ num_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
160
+ basename = os.path.basename(args.video_input)
161
+ codec, file_ext = (
162
+ ("x264", ".mkv") if test_opencv_video_format("x264", ".mkv") else ("mp4v", ".mp4")
163
+ )
164
+ if codec == "mp4v":
165
+ warnings.warn("x264 codec not available, switching to mp4v")
166
+ if args.output:
167
+ if os.path.isdir(args.output):
168
+ output_fname = os.path.join(args.output, basename)
169
+ output_fname = os.path.splitext(output_fname)[0] + file_ext
170
+ else:
171
+ output_fname = args.output
172
+ assert not os.path.isfile(output_fname), output_fname
173
+ output_file = cv2.VideoWriter(
174
+ filename=output_fname,
175
+ # some installation of opencv may not support x264 (due to its license),
176
+ # you can try other format (e.g. MPEG)
177
+ fourcc=cv2.VideoWriter_fourcc(*codec),
178
+ fps=float(frames_per_second),
179
+ frameSize=(width, height),
180
+ isColor=True,
181
+ )
182
+ assert os.path.isfile(args.video_input)
183
+ for vis_frame in tqdm.tqdm(demo.run_on_video(video), total=num_frames):
184
+ if args.output:
185
+ output_file.write(vis_frame)
186
+ else:
187
+ cv2.namedWindow(basename, cv2.WINDOW_NORMAL)
188
+ cv2.imshow(basename, vis_frame)
189
+ if cv2.waitKey(1) == 27:
190
+ break # esc to quit
191
+ video.release()
192
+ if args.output:
193
+ output_file.release()
194
+ else:
195
+ cv2.destroyAllWindows()
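The `--opts` flag above collects trailing `KEY VALUE` pairs that `setup_cfg` hands to `cfg.merge_from_list`. As a rough illustration of that idea only, the sketch below applies such pairs to a nested plain dict. `merge_key_value_pairs` and the sample dict are hypothetical stand-ins, not the detectron2 `CfgNode` implementation, which performs additional checks.

```python
import ast


def merge_key_value_pairs(config, opts):
    """Apply ['A.B', 'value', ...] overrides to a nested dict in place."""
    if len(opts) % 2 != 0:
        raise ValueError("opts must contain KEY VALUE pairs")
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node[name]  # KeyError if the parent section is unknown
        if leaf not in node:
            raise KeyError(key)  # only existing options may be overridden
        try:
            value = ast.literal_eval(raw)  # "512" -> 512, "True" -> True
        except (ValueError, SyntaxError):
            value = raw  # keep plain strings such as file paths as-is
        node[leaf] = value
    return config


# Hypothetical stand-in for a config tree.
cfg = {"MODEL": {"WEIGHTS": "", "DEVICE": "cuda"}, "INPUT": {"IMAGE_SIZE": 1024}}
merge_key_value_pairs(cfg, ["MODEL.WEIGHTS", "model.pth", "INPUT.IMAGE_SIZE", "512"])
print(cfg)
```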
demo/examples/ade.jpg ADDED
demo/examples/coco.jpg ADDED
demo/examples/ego4d.jpg ADDED
demo/predictor.py ADDED
@@ -0,0 +1,275 @@
1
+ # Copyright (c) Facebook, Inc. and its affiliates.
2
+ # Copied from: https://github.com/facebookresearch/detectron2/blob/master/demo/predictor.py
3
+ import atexit
4
+ import bisect
5
+ import multiprocessing as mp
6
+ from collections import deque
7
+
8
+ import cv2
9
+ import torch
10
+ import itertools
11
+
12
+
13
+ from detectron2.data import DatasetCatalog, MetadataCatalog
14
+ from detectron2.engine.defaults import DefaultPredictor as d2_defaultPredictor
15
+ from detectron2.utils.video_visualizer import VideoVisualizer
16
+ from detectron2.utils.visualizer import ColorMode, Visualizer, random_color
17
+ import detectron2.utils.visualizer as d2_visualizer
18
+
19
+
20
+ class DefaultPredictor(d2_defaultPredictor):
21
+
22
+ def set_metadata(self, metadata):
23
+ self.model.set_metadata(metadata)
24
+
25
+
26
+ class OpenVocabVisualizer(Visualizer):
27
+ def draw_panoptic_seg(self, panoptic_seg, segments_info, area_threshold=None, alpha=0.7):
28
+ """
29
+ Draw panoptic prediction annotations or results.
30
+
31
+ Args:
32
+ panoptic_seg (Tensor): of shape (height, width) where the values are ids for each
33
+ segment.
34
+ segments_info (list[dict] or None): Describe each segment in `panoptic_seg`.
35
+ If it is a ``list[dict]``, each dict contains keys "id", "category_id".
36
+ If None, category id of each pixel is computed by
37
+ ``pixel // metadata.label_divisor``.
38
+ area_threshold (int): stuff segments with area smaller than `area_threshold` are not drawn.
39
+
40
+ Returns:
41
+ output (VisImage): image object with visualizations.
42
+ """
43
+ pred = d2_visualizer._PanopticPrediction(panoptic_seg, segments_info, self.metadata)
44
+
45
+ if self._instance_mode == ColorMode.IMAGE_BW:
46
+ self.output.reset_image(self._create_grayscale_image(pred.non_empty_mask()))
47
+ # draw mask for all semantic segments first i.e. "stuff"
48
+ for mask, sinfo in pred.semantic_masks():
49
+ category_idx = sinfo["category_id"]
50
+ try:
51
+ mask_color = [x / 255 for x in self.metadata.stuff_colors[category_idx]]
52
+ except AttributeError:
53
+ mask_color = None
54
+
55
+ text = self.metadata.stuff_classes[category_idx].split(',')[0]
56
+ self.draw_binary_mask(
57
+ mask,
58
+ color=mask_color,
59
+ edge_color=d2_visualizer._OFF_WHITE,
60
+ text=text,
61
+ alpha=alpha,
62
+ area_threshold=area_threshold,
63
+ )
64
+ # draw mask for all instances second
65
+ all_instances = list(pred.instance_masks())
66
+ if len(all_instances) == 0:
67
+ return self.output
68
+ masks, sinfo = list(zip(*all_instances))
69
+ category_ids = [x["category_id"] for x in sinfo]
70
+
71
+ try:
72
+ scores = [x["score"] for x in sinfo]
73
+ except KeyError:
74
+ scores = None
75
+ stuff_classes = self.metadata.stuff_classes
76
+ stuff_classes = [x.split(',')[0] for x in stuff_classes]
77
+ labels = d2_visualizer._create_text_labels(
78
+ category_ids, scores, stuff_classes, [x.get("iscrowd", 0) for x in sinfo]
79
+ )
80
+
81
+ try:
82
+ colors = [
83
+ self._jitter([x / 255 for x in self.metadata.stuff_colors[c]]) for c in category_ids
84
+ ]
85
+ except AttributeError:
86
+ colors = None
87
+ self.overlay_instances(masks=masks, labels=labels, assigned_colors=colors, alpha=alpha)
88
+
89
+ return self.output
90
+
91
+
92
+ class VisualizationDemo(object):
93
+ def __init__(self, cfg, instance_mode=ColorMode.IMAGE, parallel=False):
94
+ """
95
+ Args:
96
+ cfg (CfgNode):
97
+ instance_mode (ColorMode):
98
+ parallel (bool): whether to run the model in different processes from visualization.
99
+ Useful since the visualization logic can be slow.
100
+ """
101
+
102
+ coco_metadata = MetadataCatalog.get("openvocab_coco_2017_val_panoptic_with_sem_seg")
103
+ ade20k_metadata = MetadataCatalog.get("openvocab_ade20k_panoptic_val")
104
+ lvis_classes = open("./fcclip/data/datasets/lvis_1203_with_prompt_eng.txt", 'r').read().splitlines()
105
+ lvis_classes = [x[x.find(':')+1:] for x in lvis_classes]
106
+ lvis_colors = list(
107
+ itertools.islice(itertools.cycle(coco_metadata.stuff_colors), len(lvis_classes))
108
+ )
109
+ # rearrange to thing_classes, stuff_classes
110
+ coco_thing_classes = coco_metadata.thing_classes
111
+ coco_stuff_classes = [x for x in coco_metadata.stuff_classes if x not in coco_thing_classes]
112
+ coco_thing_colors = coco_metadata.thing_colors
113
+ coco_stuff_colors = [x for x in coco_metadata.stuff_colors if x not in coco_thing_colors]
114
+ ade20k_thing_classes = ade20k_metadata.thing_classes
115
+ ade20k_stuff_classes = [x for x in ade20k_metadata.stuff_classes if x not in ade20k_thing_classes]
116
+ ade20k_thing_colors = ade20k_metadata.thing_colors
117
+ ade20k_stuff_colors = [x for x in ade20k_metadata.stuff_colors if x not in ade20k_thing_colors]
118
+
119
+ user_classes = []
120
+ user_colors = [random_color(rgb=True, maximum=1) for _ in range(len(user_classes))]
121
+
122
+ stuff_classes = coco_stuff_classes + ade20k_stuff_classes
123
+ stuff_colors = coco_stuff_colors + ade20k_stuff_colors
124
+ thing_classes = user_classes + coco_thing_classes + ade20k_thing_classes + lvis_classes
125
+ thing_colors = user_colors + coco_thing_colors + ade20k_thing_colors + lvis_colors
126
+
127
+ thing_dataset_id_to_contiguous_id = {x: x for x in range(len(thing_classes))}
128
+ DatasetCatalog.register(
129
+ "openvocab_dataset", lambda x: []
130
+ )
131
+ self.metadata = MetadataCatalog.get("openvocab_dataset").set(
132
+ stuff_classes=thing_classes+stuff_classes,
133
+ stuff_colors=thing_colors+stuff_colors,
134
+ thing_dataset_id_to_contiguous_id=thing_dataset_id_to_contiguous_id,
135
+ )
136
+ #print("self.metadata:", self.metadata)
137
+ self.cpu_device = torch.device("cpu")
138
+ self.instance_mode = instance_mode
139
+
140
+ self.parallel = parallel
141
+ if parallel:
142
+ num_gpu = torch.cuda.device_count()
143
+ self.predictor = AsyncPredictor(cfg, num_gpus=num_gpu)
144
+ else:
145
+ self.predictor = DefaultPredictor(cfg)
146
+ self.predictor.set_metadata(self.metadata)
147
+
148
+ def run_on_image(self, image):
149
+ """
150
+ Args:
151
+ image (np.ndarray): an image of shape (H, W, C) (in BGR order).
152
+ This is the format used by OpenCV.
153
+ Returns:
154
+ predictions (dict): the output of the model.
155
+ vis_output (VisImage): the visualized image output.
156
+ """
157
+ vis_output = None
158
+ predictions = self.predictor(image)
159
+ # Convert image from OpenCV BGR format to Matplotlib RGB format.
160
+ image = image[:, :, ::-1]
161
+ visualizer = OpenVocabVisualizer(image, self.metadata, instance_mode=self.instance_mode)
162
+ if "panoptic_seg" in predictions:
163
+ panoptic_seg, segments_info = predictions["panoptic_seg"]
164
+ vis_output = visualizer.draw_panoptic_seg(
165
+ panoptic_seg.to(self.cpu_device), segments_info
166
+ )
167
+ else:
168
+ if "sem_seg" in predictions:
169
+ vis_output = visualizer.draw_sem_seg(
170
+ predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
171
+ )
172
+ if "instances" in predictions:
173
+ instances = predictions["instances"].to(self.cpu_device)
174
+ vis_output = visualizer.draw_instance_predictions(predictions=instances)
175
+
176
+ return predictions, vis_output
177
+
178
+ def _frame_from_video(self, video):
179
+ while video.isOpened():
180
+ success, frame = video.read()
181
+ if success:
182
+ yield frame
183
+ else:
184
+ break
185
+
186
+
187
+ class AsyncPredictor:
188
+ """
189
+ A predictor that runs the model asynchronously, possibly on >1 GPUs.
190
+ Because rendering the visualization takes a considerable amount of time,
191
+ this helps improve throughput a little bit when rendering videos.
192
+ """
193
+
194
+ class _StopToken:
195
+ pass
196
+
197
+ class _PredictWorker(mp.Process):
198
+ def __init__(self, cfg, task_queue, result_queue):
199
+ self.cfg = cfg
200
+ self.task_queue = task_queue
201
+ self.result_queue = result_queue
202
+ super().__init__()
203
+
204
+ def run(self):
205
+ predictor = DefaultPredictor(self.cfg)
206
+
207
+ while True:
208
+ task = self.task_queue.get()
209
+ if isinstance(task, AsyncPredictor._StopToken):
210
+ break
211
+ idx, data = task
212
+ result = predictor(data)
213
+ self.result_queue.put((idx, result))
214
+
215
+ def __init__(self, cfg, num_gpus: int = 1):
216
+ """
217
+ Args:
218
+ cfg (CfgNode):
219
+ num_gpus (int): if 0, will run on CPU
220
+ """
221
+ num_workers = max(num_gpus, 1)
222
+ self.task_queue = mp.Queue(maxsize=num_workers * 3)
223
+ self.result_queue = mp.Queue(maxsize=num_workers * 3)
224
+ self.procs = []
225
+ for gpuid in range(max(num_gpus, 1)):
226
+ cfg = cfg.clone()
227
+ cfg.defrost()
228
+ cfg.MODEL.DEVICE = "cuda:{}".format(gpuid) if num_gpus > 0 else "cpu"
229
+ self.procs.append(
230
+ AsyncPredictor._PredictWorker(cfg, self.task_queue, self.result_queue)
231
+ )
232
+
233
+ self.put_idx = 0
234
+ self.get_idx = 0
235
+ self.result_rank = []
236
+ self.result_data = []
237
+
238
+ for p in self.procs:
239
+ p.start()
240
+ atexit.register(self.shutdown)
241
+
242
+ def put(self, image):
243
+ self.put_idx += 1
244
+ self.task_queue.put((self.put_idx, image))
245
+
246
+ def get(self):
247
+ self.get_idx += 1 # the index needed for this request
248
+ if len(self.result_rank) and self.result_rank[0] == self.get_idx:
249
+ res = self.result_data[0]
250
+ del self.result_data[0], self.result_rank[0]
251
+ return res
252
+
253
+ while True:
254
+ # make sure the results are returned in the correct order
255
+ idx, res = self.result_queue.get()
256
+ if idx == self.get_idx:
257
+ return res
258
+ insert = bisect.bisect(self.result_rank, idx)
259
+ self.result_rank.insert(insert, idx)
260
+ self.result_data.insert(insert, res)
261
+
262
+ def __len__(self):
263
+ return self.put_idx - self.get_idx
264
+
265
+ def __call__(self, image):
266
+ self.put(image)
267
+ return self.get()
268
+
269
+ def shutdown(self):
270
+ for _ in self.procs:
271
+ self.task_queue.put(AsyncPredictor._StopToken())
272
+
273
+ @property
274
+ def default_buffer_size(self):
275
+ return len(self.procs) * 5
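`AsyncPredictor.get` returns results in submission order even when workers finish out of order, by parking early arrivals in two sorted lists. The sketch below isolates that buffering logic with a plain iterator standing in for the result queue; `OrderedResults` and `take` are illustrative names, not part of this commit.

```python
import bisect


class OrderedResults:
    """Hand out results by ascending index, buffering early arrivals."""

    def __init__(self):
        self.next_idx = 0  # index of the most recently requested result
        self.ranks = []    # sorted indices of buffered results
        self.data = []     # buffered results, aligned with self.ranks

    def take(self, arrivals):
        """Return the next in-order result, reading `arrivals` as needed.

        `arrivals` must be an iterator of (index, result) pairs so that
        consumed items are not seen again on the next call.
        """
        self.next_idx += 1
        if self.ranks and self.ranks[0] == self.next_idx:
            self.ranks.pop(0)
            return self.data.pop(0)
        for idx, res in arrivals:
            if idx == self.next_idx:
                return res
            # Park the early result at its sorted position.
            pos = bisect.bisect(self.ranks, idx)
            self.ranks.insert(pos, idx)
            self.data.insert(pos, res)
        raise LookupError("result {} never arrived".format(self.next_idx))


arrivals = iter([(2, "b"), (3, "c"), (1, "a")])
buffer = OrderedResults()
ordered = [buffer.take(arrivals) for _ in range(3)]
print(ordered)
```

Keeping the indices sorted means only the head of the buffer has to be inspected on each request.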
fcclip/.DS_Store ADDED
Binary file (6.15 kB).
fcclip/__init__.py ADDED
@@ -0,0 +1,26 @@
+ # Copyright (c) Facebook, Inc. and its affiliates.
+ from . import data  # register all new datasets
+ from . import modeling
+
+ # config
+ from .config import add_maskformer2_config, add_fcclip_config
+
+ # dataset loading
+ from .data.dataset_mappers.coco_instance_new_baseline_dataset_mapper import COCOInstanceNewBaselineDatasetMapper
+ from .data.dataset_mappers.coco_panoptic_new_baseline_dataset_mapper import COCOPanopticNewBaselineDatasetMapper
+ from .data.dataset_mappers.mask_former_instance_dataset_mapper import (
+     MaskFormerInstanceDatasetMapper,
+ )
+ from .data.dataset_mappers.mask_former_panoptic_dataset_mapper import (
+     MaskFormerPanopticDatasetMapper,
+ )
+ from .data.dataset_mappers.mask_former_semantic_dataset_mapper import (
+     MaskFormerSemanticDatasetMapper,
+ )
+
+ # models
+ from .fcclip import FCCLIP
+ from .test_time_augmentation import SemanticSegmentorWithTTA
+
+ # evaluation
+ from .evaluation.instance_evaluation import InstanceSegEvaluator
fcclip/config.py ADDED
@@ -0,0 +1,124 @@
+ # -*- coding: utf-8 -*-
+ # Copyright (c) Facebook, Inc. and its affiliates.
+ from detectron2.config import CfgNode as CN
+
+
+ def add_maskformer2_config(cfg):
+     """
+     Add config for MASK_FORMER.
+     """
+     # NOTE: configs from original maskformer
+     # data config
+     # select the dataset mapper
+     cfg.INPUT.DATASET_MAPPER_NAME = "mask_former_semantic"
+     # Color augmentation
+     cfg.INPUT.COLOR_AUG_SSD = False
+     # We retry random cropping until no single category in semantic segmentation GT occupies more
+     # than `SINGLE_CATEGORY_MAX_AREA` part of the crop.
+     cfg.INPUT.CROP.SINGLE_CATEGORY_MAX_AREA = 1.0
+     # Pad image and segmentation GT in dataset mapper.
+     cfg.INPUT.SIZE_DIVISIBILITY = -1
+
+     # solver config
+     # weight decay on embedding
+     cfg.SOLVER.WEIGHT_DECAY_EMBED = 0.0
+     # optimizer
+     cfg.SOLVER.OPTIMIZER = "ADAMW"
+     cfg.SOLVER.BACKBONE_MULTIPLIER = 0.1
+
+     # mask_former model config
+     cfg.MODEL.MASK_FORMER = CN()
+
+     # loss
+     cfg.MODEL.MASK_FORMER.DEEP_SUPERVISION = True
+     cfg.MODEL.MASK_FORMER.NO_OBJECT_WEIGHT = 0.1
+     cfg.MODEL.MASK_FORMER.CLASS_WEIGHT = 1.0
+     cfg.MODEL.MASK_FORMER.DICE_WEIGHT = 1.0
+     cfg.MODEL.MASK_FORMER.MASK_WEIGHT = 20.0
+
+     # transformer config
+     cfg.MODEL.MASK_FORMER.NHEADS = 8
+     cfg.MODEL.MASK_FORMER.DROPOUT = 0.1
+     cfg.MODEL.MASK_FORMER.DIM_FEEDFORWARD = 2048
+     cfg.MODEL.MASK_FORMER.ENC_LAYERS = 0
+     cfg.MODEL.MASK_FORMER.DEC_LAYERS = 6
+     cfg.MODEL.MASK_FORMER.PRE_NORM = False
+
+     cfg.MODEL.MASK_FORMER.HIDDEN_DIM = 256
+     cfg.MODEL.MASK_FORMER.NUM_OBJECT_QUERIES = 100
+
+     cfg.MODEL.MASK_FORMER.TRANSFORMER_IN_FEATURE = "res5"
+     cfg.MODEL.MASK_FORMER.ENFORCE_INPUT_PROJ = False
+
+     # mask_former inference config
+     cfg.MODEL.MASK_FORMER.TEST = CN()
+     cfg.MODEL.MASK_FORMER.TEST.SEMANTIC_ON = True
+     cfg.MODEL.MASK_FORMER.TEST.INSTANCE_ON = False
+     cfg.MODEL.MASK_FORMER.TEST.PANOPTIC_ON = False
+     cfg.MODEL.MASK_FORMER.TEST.OBJECT_MASK_THRESHOLD = 0.0
+     cfg.MODEL.MASK_FORMER.TEST.OVERLAP_THRESHOLD = 0.0
+     cfg.MODEL.MASK_FORMER.TEST.SEM_SEG_POSTPROCESSING_BEFORE_INFERENCE = False
+
+     # Sometimes `backbone.size_divisibility` is set to 0 for some backbones (e.g. ResNet)
+     # you can use this config to override
+     cfg.MODEL.MASK_FORMER.SIZE_DIVISIBILITY = 32
+
+     # pixel decoder config
+     cfg.MODEL.SEM_SEG_HEAD.MASK_DIM = 256
+     # adding transformer in pixel decoder
+     cfg.MODEL.SEM_SEG_HEAD.TRANSFORMER_ENC_LAYERS = 0
+     # pixel decoder
+     cfg.MODEL.SEM_SEG_HEAD.PIXEL_DECODER_NAME = "BasePixelDecoder"
+
+     # swin transformer backbone
+     cfg.MODEL.SWIN = CN()
+     cfg.MODEL.SWIN.PRETRAIN_IMG_SIZE = 224
+     cfg.MODEL.SWIN.PATCH_SIZE = 4
+     cfg.MODEL.SWIN.EMBED_DIM = 96
+     cfg.MODEL.SWIN.DEPTHS = [2, 2, 6, 2]
+     cfg.MODEL.SWIN.NUM_HEADS = [3, 6, 12, 24]
+     cfg.MODEL.SWIN.WINDOW_SIZE = 7
+     cfg.MODEL.SWIN.MLP_RATIO = 4.0
+     cfg.MODEL.SWIN.QKV_BIAS = True
+     cfg.MODEL.SWIN.QK_SCALE = None
+     cfg.MODEL.SWIN.DROP_RATE = 0.0
+     cfg.MODEL.SWIN.ATTN_DROP_RATE = 0.0
+     cfg.MODEL.SWIN.DROP_PATH_RATE = 0.3
+     cfg.MODEL.SWIN.APE = False
+     cfg.MODEL.SWIN.PATCH_NORM = True
+     cfg.MODEL.SWIN.OUT_FEATURES = ["res2", "res3", "res4", "res5"]
+     cfg.MODEL.SWIN.USE_CHECKPOINT = False
+
+     # NOTE: maskformer2 extra configs
+     # transformer module
+     cfg.MODEL.MASK_FORMER.TRANSFORMER_DECODER_NAME = "MultiScaleMaskedTransformerDecoder"
+
+     # LSJ aug
+     cfg.INPUT.IMAGE_SIZE = 1024
+     cfg.INPUT.MIN_SCALE = 0.1
+     cfg.INPUT.MAX_SCALE = 2.0
+
+     # MSDeformAttn encoder configs
+     cfg.MODEL.SEM_SEG_HEAD.DEFORMABLE_TRANSFORMER_ENCODER_IN_FEATURES = ["res3", "res4", "res5"]
+     cfg.MODEL.SEM_SEG_HEAD.DEFORMABLE_TRANSFORMER_ENCODER_N_POINTS = 4
+     cfg.MODEL.SEM_SEG_HEAD.DEFORMABLE_TRANSFORMER_ENCODER_N_HEADS = 8
+
+     # point loss configs
+     # Number of points sampled during training for a mask point head.
+     cfg.MODEL.MASK_FORMER.TRAIN_NUM_POINTS = 112 * 112
+     # Oversampling parameter for PointRend point sampling during training. Parameter `k` in the
+     # original paper.
+     cfg.MODEL.MASK_FORMER.OVERSAMPLE_RATIO = 3.0
+     # Importance sampling parameter for PointRend point sampling during training. Parameter `beta` in
+     # the original paper.
+     cfg.MODEL.MASK_FORMER.IMPORTANCE_SAMPLE_RATIO = 0.75
+
+
+ def add_fcclip_config(cfg):
+     # FC-CLIP model config
+     cfg.MODEL.FC_CLIP = CN()
+     cfg.MODEL.FC_CLIP.CLIP_MODEL_NAME = "convnext_large_d_320"
+     cfg.MODEL.FC_CLIP.CLIP_PRETRAINED_WEIGHTS = "laion2b_s29b_b131k_ft_soup"
+     cfg.MODEL.FC_CLIP.EMBED_DIM = 768
+     cfg.MODEL.FC_CLIP.GEOMETRIC_ENSEMBLE_ALPHA = 0.4
+     cfg.MODEL.FC_CLIP.GEOMETRIC_ENSEMBLE_BETA = 0.8
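The three point-loss settings in `add_maskformer2_config` (`TRAIN_NUM_POINTS`, `OVERSAMPLE_RATIO`, `IMPORTANCE_SAMPLE_RATIO`) jointly fix how many points are drawn per mask. As a worked example, assuming the usual PointRend-style split (oversample candidates by `k`, keep the `beta` fraction with highest uncertainty, fill the rest uniformly), the defaults give the counts below. `point_sampling_budget` is an illustrative helper, not a function from this commit.

```python
def point_sampling_budget(num_points, oversample_ratio, importance_sample_ratio):
    """Return (candidates, uncertain, random) point counts for one mask."""
    # Candidates drawn uniformly before ranking by uncertainty.
    num_candidates = int(num_points * oversample_ratio)
    # Points kept because they are the most uncertain candidates.
    num_uncertain = int(importance_sample_ratio * num_points)
    # Remaining points are sampled uniformly at random.
    num_random = num_points - num_uncertain
    return num_candidates, num_uncertain, num_random


budget = point_sampling_budget(112 * 112, 3.0, 0.75)
print(budget)  # (37632, 9408, 3136)
```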
fcclip/data/.DS_Store ADDED
Binary file (6.15 kB).
fcclip/data/__init__.py ADDED
@@ -0,0 +1,2 @@
+ # Copyright (c) Facebook, Inc. and its affiliates.
+ from . import datasets
fcclip/data/dataset_mappers/__init__.py ADDED
@@ -0,0 +1 @@
+ # Copyright (c) Facebook, Inc. and its affiliates.
fcclip/data/dataset_mappers/coco_instance_new_baseline_dataset_mapper.py ADDED
@@ -0,0 +1,189 @@
1
+ # Copyright (c) Facebook, Inc. and its affiliates.
2
+ # Modified by Bowen Cheng from https://github.com/facebookresearch/detr/blob/master/d2/detr/dataset_mapper.py
3
+ import copy
4
+ import logging
5
+
6
+ import numpy as np
7
+ import torch
8
+
9
+ from detectron2.config import configurable
10
+ from detectron2.data import detection_utils as utils
11
+ from detectron2.data import transforms as T
12
+ from detectron2.data.transforms import TransformGen
13
+ from detectron2.structures import BitMasks, Instances
14
+
15
+ from pycocotools import mask as coco_mask
16
+
17
+ __all__ = ["COCOInstanceNewBaselineDatasetMapper"]
18
+
19
+
20
+ def convert_coco_poly_to_mask(segmentations, height, width):
21
+ masks = []
22
+ for polygons in segmentations:
23
+ rles = coco_mask.frPyObjects(polygons, height, width)
24
+ mask = coco_mask.decode(rles)
25
+ if len(mask.shape) < 3:
26
+ mask = mask[..., None]
27
+ mask = torch.as_tensor(mask, dtype=torch.uint8)
28
+ mask = mask.any(dim=2)
29
+ masks.append(mask)
30
+ if masks:
31
+ masks = torch.stack(masks, dim=0)
32
+ else:
33
+ masks = torch.zeros((0, height, width), dtype=torch.uint8)
34
+ return masks
35
+
36
+
37
+ def build_transform_gen(cfg, is_train):
38
+ """
39
+ Create a list of default :class:`Augmentation` from config.
40
+ Now it includes resizing and flipping.
41
+ Returns:
42
+ list[Augmentation]
43
+ """
44
+ assert is_train, "Only support training augmentation"
45
+ image_size = cfg.INPUT.IMAGE_SIZE
46
+ min_scale = cfg.INPUT.MIN_SCALE
47
+ max_scale = cfg.INPUT.MAX_SCALE
48
+
49
+ augmentation = []
50
+
51
+ if cfg.INPUT.RANDOM_FLIP != "none":
52
+ augmentation.append(
53
+ T.RandomFlip(
54
+ horizontal=cfg.INPUT.RANDOM_FLIP == "horizontal",
55
+ vertical=cfg.INPUT.RANDOM_FLIP == "vertical",
56
+ )
57
+ )
58
+
59
+ augmentation.extend([
60
+ T.ResizeScale(
61
+ min_scale=min_scale, max_scale=max_scale, target_height=image_size, target_width=image_size
62
+ ),
63
+ T.FixedSizeCrop(crop_size=(image_size, image_size)),
64
+ ])
65
+
66
+ return augmentation
67
+
68
+
69
+ # This is specifically designed for the COCO dataset.
70
+ class COCOInstanceNewBaselineDatasetMapper:
71
+ """
72
+ A callable which takes a dataset dict in Detectron2 Dataset format,
73
+ and maps it into a format used by MaskFormer.
74
+
75
+ This dataset mapper applies the same transformation as DETR for COCO panoptic segmentation.
76
+
77
+ The callable currently does the following:
78
+
79
+ 1. Read the image from "file_name"
80
+ 2. Apply geometric transforms to the image and annotation
81
+ 3. Find and apply suitable cropping to the image and annotation
82
+ 4. Prepare image and annotation to Tensors
83
+ """
84
+
85
+ @configurable
86
+ def __init__(
87
+ self,
88
+ is_train=True,
89
+ *,
90
+ tfm_gens,
91
+ image_format,
92
+ ):
93
+ """
94
+ NOTE: this interface is experimental.
95
+ Args:
96
+ is_train: for training or inference
97
+ augmentations: a list of augmentations or deterministic transforms to apply
98
+ tfm_gens: data augmentation
99
+ image_format: an image format supported by :func:`detection_utils.read_image`.
100
+ """
101
+ self.tfm_gens = tfm_gens
102
+ logging.getLogger(__name__).info(
103
+ "[COCOInstanceNewBaselineDatasetMapper] Full TransformGens used in training: {}".format(str(self.tfm_gens))
104
+ )
105
+
106
+ self.img_format = image_format
107
+ self.is_train = is_train
108
+
109
+ @classmethod
110
+ def from_config(cls, cfg, is_train=True):
111
+ # Build augmentation
112
+ tfm_gens = build_transform_gen(cfg, is_train)
113
+
114
+ ret = {
115
+ "is_train": is_train,
116
+ "tfm_gens": tfm_gens,
117
+ "image_format": cfg.INPUT.FORMAT,
118
+ }
119
+ return ret
120
+
121
+ def __call__(self, dataset_dict):
122
+ """
123
+ Args:
124
+ dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format.
125
+
126
+ Returns:
127
+ dict: a format that builtin models in detectron2 accept
128
+ """
129
+ dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below
130
+ image = utils.read_image(dataset_dict["file_name"], format=self.img_format)
131
+ utils.check_image_size(dataset_dict, image)
132
+
133
+ # TODO: get padding mask
134
+ # by feeding a "segmentation mask" to the same transforms
135
+ padding_mask = np.ones(image.shape[:2])
136
+
137
+ image, transforms = T.apply_transform_gens(self.tfm_gens, image)
138
+ # the crop transformation has default padding value 0 for segmentation
139
+ padding_mask = transforms.apply_segmentation(padding_mask)
140
+ padding_mask = ~ padding_mask.astype(bool)
141
+
142
+ image_shape = image.shape[:2] # h, w
143
+
144
+ # PyTorch's dataloader is efficient on torch.Tensor due to shared-memory,
145
+ # but not efficient on large generic data structures due to the use of pickle & mp.Queue.
146
+ # Therefore it's important to use torch.Tensor.
147
+ dataset_dict["image"] = torch.as_tensor(np.ascontiguousarray(image.transpose(2, 0, 1)))
148
+ dataset_dict["padding_mask"] = torch.as_tensor(np.ascontiguousarray(padding_mask))
149
+
150
+ if not self.is_train:
151
+ # USER: Modify this if you want to keep them for some reason.
152
+ dataset_dict.pop("annotations", None)
153
+ return dataset_dict
154
+
155
+ if "annotations" in dataset_dict:
156
+ # USER: Modify this if you want to keep them for some reason.
157
+ for anno in dataset_dict["annotations"]:
158
+ # Let's always keep mask
159
+ # if not self.mask_on:
160
+ # anno.pop("segmentation", None)
161
+ anno.pop("keypoints", None)
162
+
163
+ # USER: Implement additional transformations if you have other types of data
164
+ annos = [
165
+ utils.transform_instance_annotations(obj, transforms, image_shape)
166
+ for obj in dataset_dict.pop("annotations")
167
+ if obj.get("iscrowd", 0) == 0
168
+ ]
169
+ # NOTE: does not support BitMask due to augmentation
170
+ # Current BitMask cannot handle empty objects
171
+ instances = utils.annotations_to_instances(annos, image_shape)
172
+ # After transforms such as cropping are applied, the bounding box may no longer
173
+ # tightly bound the object. As an example, imagine a triangle object
174
+ # [(0,0), (2,0), (0,2)] cropped by a box [(1,0),(2,2)] (XYXY format). The tight
175
+ # bounding box of the cropped triangle should be [(1,0),(2,1)], which is not equal to
176
+ # the intersection of original bounding box and the cropping box.
177
+ instances.gt_boxes = instances.gt_masks.get_bounding_boxes()
178
+ # Need to filter empty instances first (due to augmentation)
179
+ instances = utils.filter_empty_instances(instances)
180
+ # Generate masks from polygon
181
+ h, w = instances.image_size
182
+ # image_size_xyxy = torch.as_tensor([w, h, w, h], dtype=torch.float)
183
+ if hasattr(instances, 'gt_masks'):
184
+ gt_masks = instances.gt_masks
185
+ gt_masks = convert_coco_poly_to_mask(gt_masks.polygons, h, w)
186
+ instances.gt_masks = gt_masks
187
+ dataset_dict["instances"] = instances
188
+
189
+ return dataset_dict
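The comment in the mapper above about bounding boxes after cropping can be checked by hand. The following is a minimal sketch in plain Python, not code from this commit: `clip_polygon` and `tight_box` are hypothetical helper names, and the clipping uses the Sutherland-Hodgman algorithm instead of the mask-based `get_bounding_boxes()` call the mapper relies on.

```python
def clip_polygon(points, xmin, ymin, xmax, ymax):
    """Clip a polygon against an axis-aligned box (Sutherland-Hodgman)."""

    def clip_edge(pts, inside, intersect):
        # Keep the part of the polygon on the "inside" of one box edge.
        out = []
        for i, cur in enumerate(pts):
            prev = pts[i - 1]
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))
        return out

    def x_cross(x):
        # Intersection of segment p-q with the vertical line at x.
        def cross(p, q):
            t = (x - p[0]) / (q[0] - p[0])
            return (x, p[1] + t * (q[1] - p[1]))
        return cross

    def y_cross(y):
        # Intersection of segment p-q with the horizontal line at y.
        def cross(p, q):
            t = (y - p[1]) / (q[1] - p[1])
            return (p[0] + t * (q[0] - p[0]), y)
        return cross

    pts = list(points)
    pts = clip_edge(pts, lambda p: p[0] >= xmin, x_cross(xmin))
    pts = clip_edge(pts, lambda p: p[0] <= xmax, x_cross(xmax))
    pts = clip_edge(pts, lambda p: p[1] >= ymin, y_cross(ymin))
    pts = clip_edge(pts, lambda p: p[1] <= ymax, y_cross(ymax))
    return pts


def tight_box(points):
    """Tight XYXY bounding box of a non-empty list of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# The triangle and crop box from the comment in the mapper.
triangle = [(0, 0), (2, 0), (0, 2)]
cropped = clip_polygon(triangle, 1, 0, 2, 2)
print(tight_box(cropped))  # numerically (1, 0, 2, 1)
```

Intersecting the original box `(0, 0, 2, 2)` with the crop box would instead give `(1, 0, 2, 2)`, which is why the mapper recomputes `gt_boxes` from the transformed masks.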
fcclip/data/dataset_mappers/coco_panoptic_new_baseline_dataset_mapper.py ADDED
@@ -0,0 +1,165 @@
1
+ # Copyright (c) Facebook, Inc. and its affiliates.
2
+ # Modified by Bowen Cheng from https://github.com/facebookresearch/detr/blob/master/d2/detr/dataset_mapper.py
3
+ import copy
4
+ import logging
5
+
6
+ import numpy as np
7
+ import torch
8
+
9
+ from detectron2.config import configurable
10
+ from detectron2.data import detection_utils as utils
11
+ from detectron2.data import transforms as T
12
+ from detectron2.data.transforms import TransformGen
13
+ from detectron2.structures import BitMasks, Boxes, Instances
14
+
15
+ __all__ = ["COCOPanopticNewBaselineDatasetMapper"]
16
+
17
+
18
+ def build_transform_gen(cfg, is_train):
19
+ """
20
+ Create a list of default :class:`Augmentation` from config.
21
+ Now it includes resizing and flipping.
22
+ Returns:
23
+ list[Augmentation]
24
+ """
25
+ assert is_train, "Only support training augmentation"
26
+ image_size = cfg.INPUT.IMAGE_SIZE
27
+ min_scale = cfg.INPUT.MIN_SCALE
28
+ max_scale = cfg.INPUT.MAX_SCALE
29
+
30
+ augmentation = []
31
+
32
+ if cfg.INPUT.RANDOM_FLIP != "none":
33
+ augmentation.append(
34
+ T.RandomFlip(
35
+ horizontal=cfg.INPUT.RANDOM_FLIP == "horizontal",
36
+ vertical=cfg.INPUT.RANDOM_FLIP == "vertical",
37
+ )
38
+ )
39
+
40
+ augmentation.extend([
41
+ T.ResizeScale(
42
+ min_scale=min_scale, max_scale=max_scale, target_height=image_size, target_width=image_size
43
+ ),
44
+ T.FixedSizeCrop(crop_size=(image_size, image_size)),
45
+ ])
46
+
47
+ return augmentation
48
+
49
+
50
+ # This is specifically designed for the COCO dataset.
51
+ class COCOPanopticNewBaselineDatasetMapper:
52
+ """
53
+ A callable which takes a dataset dict in Detectron2 Dataset format,
54
+ and maps it into a format used by MaskFormer.
55
+
56
+ This dataset mapper applies the same transformation as DETR for COCO panoptic segmentation.
57
+
58
+ The callable currently does the following:
59
+
60
+ 1. Reads the image from "file_name"
61
+ 2. Applies geometric transforms to the image and annotations
62
+ 3. Finds and applies a suitable crop to the image and annotations
63
+ 4. Converts the image and annotations to tensors
64
+ """
65
+
66
+ @configurable
67
+ def __init__(
68
+ self,
69
+ is_train=True,
70
+ *,
71
+ tfm_gens,
72
+ image_format,
73
+ ):
74
+ """
75
+ NOTE: this interface is experimental.
76
+ Args:
77
+ is_train: for training or inference
78
+ tfm_gens: a list of augmentations or deterministic transforms to apply
79
+ (built by build_transform_gen: optional random flip, ResizeScale,
80
+ and FixedSizeCrop)
81
+ image_format: an image format supported by :func:`detection_utils.read_image`.
82
+ """
83
+ self.tfm_gens = tfm_gens
84
+ logging.getLogger(__name__).info(
85
+ "[COCOPanopticNewBaselineDatasetMapper] Full TransformGens used in training: {}".format(
86
+ str(self.tfm_gens)
87
+ )
88
+ )
89
+
90
+ self.img_format = image_format
91
+ self.is_train = is_train
92
+
93
+ @classmethod
94
+ def from_config(cls, cfg, is_train=True):
95
+ # Build augmentation
96
+ tfm_gens = build_transform_gen(cfg, is_train)
97
+
98
+ ret = {
99
+ "is_train": is_train,
100
+ "tfm_gens": tfm_gens,
101
+ "image_format": cfg.INPUT.FORMAT,
102
+ }
103
+ return ret
104
+
105
+ def __call__(self, dataset_dict):
106
+ """
107
+ Args:
108
+ dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format.
109
+
110
+ Returns:
111
+ dict: a format that builtin models in detectron2 accept
112
+ """
113
+ dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below
114
+ image = utils.read_image(dataset_dict["file_name"], format=self.img_format)
115
+ utils.check_image_size(dataset_dict, image)
116
+
117
+ image, transforms = T.apply_transform_gens(self.tfm_gens, image)
118
+ image_shape = image.shape[:2] # h, w
119
+
120
+ # Pytorch's dataloader is efficient on torch.Tensor due to shared-memory,
121
+ # but not efficient on large generic data structures due to the use of pickle & mp.Queue.
122
+ # Therefore it's important to use torch.Tensor.
123
+ dataset_dict["image"] = torch.as_tensor(np.ascontiguousarray(image.transpose(2, 0, 1)))
124
+
125
+ if not self.is_train:
126
+ # USER: Modify this if you want to keep them for some reason.
127
+ dataset_dict.pop("annotations", None)
128
+ return dataset_dict
129
+
130
+ if "pan_seg_file_name" in dataset_dict:
131
+ pan_seg_gt = utils.read_image(dataset_dict.pop("pan_seg_file_name"), "RGB")
132
+ segments_info = dataset_dict["segments_info"]
133
+
134
+ # apply the same transformation to panoptic segmentation
135
+ pan_seg_gt = transforms.apply_segmentation(pan_seg_gt)
136
+
137
+ from panopticapi.utils import rgb2id
138
+
139
+ pan_seg_gt = rgb2id(pan_seg_gt)
140
+
141
+ instances = Instances(image_shape)
142
+ classes = []
143
+ masks = []
144
+ for segment_info in segments_info:
145
+ class_id = segment_info["category_id"]
146
+ if not segment_info["iscrowd"]:
147
+ classes.append(class_id)
148
+ masks.append(pan_seg_gt == segment_info["id"])
149
+
150
+ classes = np.array(classes)
151
+ instances.gt_classes = torch.tensor(classes, dtype=torch.int64)
152
+ if len(masks) == 0:
153
+ # Some images have no annotations (all are ignored)
154
+ instances.gt_masks = torch.zeros((0, pan_seg_gt.shape[-2], pan_seg_gt.shape[-1]))
155
+ instances.gt_boxes = Boxes(torch.zeros((0, 4)))
156
+ else:
157
+ masks = BitMasks(
158
+ torch.stack([torch.from_numpy(np.ascontiguousarray(x.copy())) for x in masks])
159
+ )
160
+ instances.gt_masks = masks.tensor
161
+ instances.gt_boxes = masks.get_bounding_boxes()
162
+
163
+ dataset_dict["instances"] = instances
164
+
165
+ return dataset_dict
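The panoptic branch of the mapper above decodes the RGB panoptic PNG into segment ids and then builds one binary mask per non-crowd segment. A minimal sketch of that logic in plain Python follows. It assumes the COCO panoptic encoding `id = R + 256 * G + 256**2 * B`, which is what `panopticapi.utils.rgb2id` computes; `rgb_to_segment_id` and `segment_masks` are hypothetical names, and nested lists stand in for the NumPy arrays the real code uses.

```python
def rgb_to_segment_id(rgb):
    """Decode one panoptic PNG pixel (R, G, B) into its segment id."""
    r, g, b = rgb
    return r + 256 * g + 256 * 256 * b


def segment_masks(pan_png, segments_info):
    """Return (classes, masks) for all non-crowd segments.

    pan_png: H x W nested list of (R, G, B) tuples.
    segments_info: list of dicts with "id", "category_id", "iscrowd".
    """
    ids = [[rgb_to_segment_id(px) for px in row] for row in pan_png]
    classes, masks = [], []
    for seg in segments_info:
        if seg["iscrowd"]:
            continue  # crowd regions are skipped, as in the mapper
        classes.append(seg["category_id"])
        masks.append([[value == seg["id"] for value in row] for row in ids])
    return classes, masks


# A 1 x 3 toy image: two pixels of segment 1, one pixel of segment 256.
pan_png = [[(1, 0, 0), (1, 0, 0), (0, 1, 0)]]
segments_info = [
    {"id": 1, "category_id": 17, "iscrowd": 0},
    {"id": 256, "category_id": 3, "iscrowd": 1},
]
classes, masks = segment_masks(pan_png, segments_info)
print(classes, masks)
```

When every segment is a crowd region, both lists come back empty, which corresponds to the zero-sized `gt_masks` tensor the mapper creates in that case.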
fcclip/data/dataset_mappers/mask_former_instance_dataset_mapper.py ADDED
@@ -0,0 +1,180 @@
1
+ # Copyright (c) Facebook, Inc. and its affiliates.
2
+ import copy
3
+ import logging
4
+
5
+ import numpy as np
6
+ import pycocotools.mask as mask_util
7
+ import torch
8
+ from torch.nn import functional as F
9
+
10
+ from detectron2.config import configurable
11
+ from detectron2.data import detection_utils as utils
12
+ from detectron2.data import transforms as T
13
+ from detectron2.projects.point_rend import ColorAugSSDTransform
14
+ from detectron2.structures import BitMasks, Instances, polygons_to_bitmask
15
+
16
+ __all__ = ["MaskFormerInstanceDatasetMapper"]
17
+
18
+
19
+ class MaskFormerInstanceDatasetMapper:
20
+ """
21
+ A callable which takes a dataset dict in Detectron2 Dataset format,
22
+ and maps it into a format used by MaskFormer for instance segmentation.
23
+
24
+ The callable currently does the following:
25
+
26
+ 1. Reads the image from "file_name"
27
+ 2. Applies geometric transforms to the image and annotations
28
+ 3. Finds and applies a suitable crop to the image and annotations
29
+ 4. Converts the image and annotations to tensors
30
+ """
31
+
32
+ @configurable
33
+ def __init__(
34
+ self,
35
+ is_train=True,
36
+ *,
37
+ augmentations,
38
+ image_format,
39
+ size_divisibility,
40
+ ):
41
+ """
42
+ NOTE: this interface is experimental.
43
+ Args:
44
+ is_train: for training or inference
45
+ augmentations: a list of augmentations or deterministic transforms to apply
46
+ image_format: an image format supported by :func:`detection_utils.read_image`.
47
+ size_divisibility: pad image size to be divisible by this value
48
+ """
49
+ self.is_train = is_train
50
+ self.tfm_gens = augmentations
51
+ self.img_format = image_format
52
+ self.size_divisibility = size_divisibility
53
+
54
+ logger = logging.getLogger(__name__)
55
+ mode = "training" if is_train else "inference"
56
+ logger.info(f"[{self.__class__.__name__}] Augmentations used in {mode}: {augmentations}")
57
+
58
+ @classmethod
59
+ def from_config(cls, cfg, is_train=True):
60
+ # Build augmentation
61
+ augs = [
62
+ T.ResizeShortestEdge(
63
+ cfg.INPUT.MIN_SIZE_TRAIN,
64
+ cfg.INPUT.MAX_SIZE_TRAIN,
65
+ cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING,
66
+ )
67
+ ]
68
+ if cfg.INPUT.CROP.ENABLED:
69
+ augs.append(
70
+ T.RandomCrop(
71
+ cfg.INPUT.CROP.TYPE,
72
+ cfg.INPUT.CROP.SIZE,
73
+ )
74
+ )
75
+ if cfg.INPUT.COLOR_AUG_SSD:
76
+ augs.append(ColorAugSSDTransform(img_format=cfg.INPUT.FORMAT))
77
+ augs.append(T.RandomFlip())
78
+
79
+ ret = {
80
+ "is_train": is_train,
81
+ "augmentations": augs,
82
+ "image_format": cfg.INPUT.FORMAT,
83
+ "size_divisibility": cfg.INPUT.SIZE_DIVISIBILITY,
84
+ }
85
+ return ret
86
+
87
+ def __call__(self, dataset_dict):
88
+ """
89
+ Args:
90
+ dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format.
91
+
92
+ Returns:
93
+ dict: a format that builtin models in detectron2 accept
94
+ """
95
+ assert self.is_train, "MaskFormerInstanceDatasetMapper should only be used for training!"
96
+
97
+ dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below
98
+ image = utils.read_image(dataset_dict["file_name"], format=self.img_format)
99
+ utils.check_image_size(dataset_dict, image)
100
+
101
+ aug_input = T.AugInput(image)
102
+ aug_input, transforms = T.apply_transform_gens(self.tfm_gens, aug_input)
103
+ image = aug_input.image
104
+
105
+ # transform instance masks
106
+ assert "annotations" in dataset_dict
107
+ for anno in dataset_dict["annotations"]:
108
+ anno.pop("keypoints", None)
109
+
110
+ annos = [
111
+ utils.transform_instance_annotations(obj, transforms, image.shape[:2])
112
+ for obj in dataset_dict.pop("annotations")
113
+ if obj.get("iscrowd", 0) == 0
114
+ ]
115
+
116
+ if len(annos):
117
+ assert "segmentation" in annos[0]
118
+ segms = [obj["segmentation"] for obj in annos]
119
+ masks = []
120
+ for segm in segms:
121
+ if isinstance(segm, list):
122
+ # polygon
123
+ masks.append(polygons_to_bitmask(segm, *image.shape[:2]))
124
+ elif isinstance(segm, dict):
125
+ # COCO RLE
126
+ masks.append(mask_util.decode(segm))
127
+ elif isinstance(segm, np.ndarray):
128
+ assert segm.ndim == 2, "Expect segmentation of 2 dimensions, got {}.".format(
129
+ segm.ndim
130
+ )
131
+ # mask array
132
+ masks.append(segm)
133
+ else:
134
+ raise ValueError(
135
+ "Cannot convert segmentation of type '{}' to BitMasks! "
136
+ "Supported types are: polygons as list[list[float] or ndarray],"
137
+ " COCO-style RLE as a dict, or a binary segmentation mask"
138
+ " in a 2D numpy array of shape HxW.".format(type(segm))
139
+ )
140
+
141
+ # Pad image and segmentation label here!
142
+ image = torch.as_tensor(np.ascontiguousarray(image.transpose(2, 0, 1)))
143
+ masks = [torch.from_numpy(np.ascontiguousarray(x)) for x in masks]
144
+
145
+ classes = [int(obj["category_id"]) for obj in annos]
146
+ classes = torch.tensor(classes, dtype=torch.int64)
147
+
148
+ if self.size_divisibility > 0:
149
+ image_size = (image.shape[-2], image.shape[-1])
150
+ padding_size = [
151
+ 0,
152
+ self.size_divisibility - image_size[1],
153
+ 0,
154
+ self.size_divisibility - image_size[0],
155
+ ]
156
+ # pad image
157
+ image = F.pad(image, padding_size, value=128).contiguous()
158
+ # pad mask
159
+ masks = [F.pad(x, padding_size, value=0).contiguous() for x in masks]
160
+
161
+ image_shape = (image.shape[-2], image.shape[-1]) # h, w
162
+
163
+ # Pytorch's dataloader is efficient on torch.Tensor due to shared-memory,
164
+ # but not efficient on large generic data structures due to the use of pickle & mp.Queue.
165
+ # Therefore it's important to use torch.Tensor.
166
+ dataset_dict["image"] = image
167
+
168
+ # Prepare per-instance binary masks
169
+ instances = Instances(image_shape)
170
+ instances.gt_classes = classes
171
+ if len(masks) == 0:
172
+ # Some images have no annotations (all are ignored)
173
+ instances.gt_masks = torch.zeros((0, image.shape[-2], image.shape[-1]))
174
+ else:
175
+ masks = BitMasks(torch.stack(masks))
176
+ instances.gt_masks = masks.tensor
177
+
178
+ dataset_dict["instances"] = instances
179
+
180
+ return dataset_dict
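The padding step in the mapper above builds `padding_size` as `[0, S - W, 0, S - H]`, where `S` is `size_divisibility` and `(H, W)` is the image size. `torch.nn.functional.pad` reads a four-element list as (left, right, top, bottom), so, as written, the code appears to pad the image up to exactly `S x S` instead of up to the next multiple of `S`. The sketch below reproduces only that arithmetic in plain Python; `padding_for` and `padded_hw` are hypothetical names, not part of this commit.

```python
def padding_for(image_hw, size_divisibility):
    """Padding list in F.pad order (left, right, top, bottom)."""
    h, w = image_hw
    if size_divisibility <= 0:
        return [0, 0, 0, 0]  # the mapper skips padding in this case
    return [0, size_divisibility - w, 0, size_divisibility - h]


def padded_hw(image_hw, padding):
    """Height and width after applying the padding list."""
    h, w = image_hw
    left, right, top, bottom = padding
    return (h + top + bottom, w + left + right)


padding = padding_for((480, 512), 640)
print(padding, padded_hw((480, 512), padding))
```

Under this reading, the config value should be at least as large as the crop size. The image is padded with the value 128 and the masks with 0, so the padded border never counts as foreground.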
fcclip/data/dataset_mappers/mask_former_panoptic_dataset_mapper.py ADDED
@@ -0,0 +1,165 @@
1
+ # Copyright (c) Facebook, Inc. and its affiliates.
2
+ import copy
3
+ import logging
4
+
5
+ import numpy as np
6
+ import torch
7
+ from torch.nn import functional as F
8
+
9
+ from detectron2.config import configurable
10
+ from detectron2.data import detection_utils as utils
11
+ from detectron2.data import transforms as T
12
+ from detectron2.structures import BitMasks, Instances
13
+
14
+ from .mask_former_semantic_dataset_mapper import MaskFormerSemanticDatasetMapper
15
+
16
+ __all__ = ["MaskFormerPanopticDatasetMapper"]
17
+
18
+
19
+ class MaskFormerPanopticDatasetMapper(MaskFormerSemanticDatasetMapper):
20
+ """
21
+ A callable which takes a dataset dict in Detectron2 Dataset format,
22
+ and maps it into a format used by MaskFormer for panoptic segmentation.
23
+
24
+ The callable currently does the following:
25
+
26
+ 1. Reads the image from "file_name"
27
+ 2. Applies geometric transforms to the image and annotations
28
+ 3. Finds and applies a suitable crop to the image and annotations
29
+ 4. Converts the image and annotations to tensors
30
+ """
31
+
32
+ @configurable
33
+ def __init__(
34
+ self,
35
+ is_train=True,
36
+ *,
37
+ augmentations,
38
+ image_format,
39
+ ignore_label,
40
+ size_divisibility,
41
+ ):
42
+ """
43
+ NOTE: this interface is experimental.
44
+ Args:
45
+ is_train: for training or inference
46
+ augmentations: a list of augmentations or deterministic transforms to apply
47
+ image_format: an image format supported by :func:`detection_utils.read_image`.
48
+ ignore_label: the label that is ignored in evaluation
49
+ size_divisibility: pad image size to be divisible by this value
50
+ """
51
+ super().__init__(
52
+ is_train,
53
+ augmentations=augmentations,
54
+ image_format=image_format,
55
+ ignore_label=ignore_label,
56
+ size_divisibility=size_divisibility,
57
+ )
58
+
59
+ def __call__(self, dataset_dict):
60
+ """
61
+ Args:
62
+ dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format.
63
+
64
+ Returns:
65
+ dict: a format that builtin models in detectron2 accept
66
+ """
67
+ assert self.is_train, "MaskFormerPanopticDatasetMapper should only be used for training!"
68
+
69
+ dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below
70
+ image = utils.read_image(dataset_dict["file_name"], format=self.img_format)
71
+ utils.check_image_size(dataset_dict, image)
72
+
73
+ # semantic segmentation
74
+ if "sem_seg_file_name" in dataset_dict:
75
+ # PyTorch transformation not implemented for uint16, so converting it to double first
76
+ sem_seg_gt = utils.read_image(dataset_dict.pop("sem_seg_file_name")).astype("double")
77
+ else:
78
+ sem_seg_gt = None
79
+
80
+ # panoptic segmentation
81
+ if "pan_seg_file_name" in dataset_dict:
82
+ pan_seg_gt = utils.read_image(dataset_dict.pop("pan_seg_file_name"), "RGB")
83
+ segments_info = dataset_dict["segments_info"]
84
+ else:
85
+ pan_seg_gt = None
86
+ segments_info = None
87
+
88
+ if pan_seg_gt is None:
89
+ raise ValueError(
90
+ "Cannot find 'pan_seg_file_name' for panoptic segmentation dataset {}.".format(
91
+ dataset_dict["file_name"]
92
+ )
93
+ )
94
+
95
+ aug_input = T.AugInput(image, sem_seg=sem_seg_gt)
96
+ aug_input, transforms = T.apply_transform_gens(self.tfm_gens, aug_input)
97
+ image = aug_input.image
98
+ if sem_seg_gt is not None:
99
+ sem_seg_gt = aug_input.sem_seg
100
+
101
+ # apply the same transformation to panoptic segmentation
102
+ pan_seg_gt = transforms.apply_segmentation(pan_seg_gt)
103
+
104
+ from panopticapi.utils import rgb2id
105
+
106
+ pan_seg_gt = rgb2id(pan_seg_gt)
107
+
108
+ # Pad image and segmentation label here!
109
+ image = torch.as_tensor(np.ascontiguousarray(image.transpose(2, 0, 1)))
110
+ if sem_seg_gt is not None:
111
+ sem_seg_gt = torch.as_tensor(sem_seg_gt.astype("long"))
112
+ pan_seg_gt = torch.as_tensor(pan_seg_gt.astype("long"))
113
+
114
+ if self.size_divisibility > 0:
115
+ image_size = (image.shape[-2], image.shape[-1])
116
+ padding_size = [
117
+ 0,
118
+ self.size_divisibility - image_size[1],
119
+ 0,
120
+ self.size_divisibility - image_size[0],
121
+ ]
122
+ image = F.pad(image, padding_size, value=128).contiguous()
123
+ if sem_seg_gt is not None:
124
+ sem_seg_gt = F.pad(sem_seg_gt, padding_size, value=self.ignore_label).contiguous()
125
+ pan_seg_gt = F.pad(
126
+ pan_seg_gt, padding_size, value=0
127
+ ).contiguous() # 0 is the VOID panoptic label
128
+
129
+ image_shape = (image.shape[-2], image.shape[-1]) # h, w
130
+
131
+ # Pytorch's dataloader is efficient on torch.Tensor due to shared-memory,
132
+ # but not efficient on large generic data structures due to the use of pickle & mp.Queue.
133
+ # Therefore it's important to use torch.Tensor.
134
+ dataset_dict["image"] = image
135
+ if sem_seg_gt is not None:
136
+ dataset_dict["sem_seg"] = sem_seg_gt.long()
137
+
138
+ if "annotations" in dataset_dict:
139
+ raise ValueError("Panoptic segmentation dataset should not have 'annotations'.")
140
+
141
+ # Prepare per-category binary masks
142
+ pan_seg_gt = pan_seg_gt.numpy()
143
+ instances = Instances(image_shape)
144
+ classes = []
145
+ masks = []
146
+ for segment_info in segments_info:
147
+ class_id = segment_info["category_id"]
148
+ if not segment_info["iscrowd"]:
149
+ classes.append(class_id)
150
+ masks.append(pan_seg_gt == segment_info["id"])
151
+
152
+ classes = np.array(classes)
153
+ instances.gt_classes = torch.tensor(classes, dtype=torch.int64)
154
+ if len(masks) == 0:
155
+ # Some images have no annotations (all are ignored)
156
+ instances.gt_masks = torch.zeros((0, pan_seg_gt.shape[-2], pan_seg_gt.shape[-1]))
157
+ else:
158
+ masks = BitMasks(
159
+ torch.stack([torch.from_numpy(np.ascontiguousarray(x.copy())) for x in masks])
160
+ )
161
+ instances.gt_masks = masks.tensor
162
+
163
+ dataset_dict["instances"] = instances
164
+
165
+ return dataset_dict
fcclip/data/dataset_mappers/mask_former_semantic_dataset_mapper.py ADDED
@@ -0,0 +1,184 @@
1
+ # Copyright (c) Facebook, Inc. and its affiliates.
2
+ import copy
3
+ import logging
4
+
5
+ import numpy as np
6
+ import torch
7
+ from torch.nn import functional as F
8
+
9
+ from detectron2.config import configurable
10
+ from detectron2.data import MetadataCatalog
11
+ from detectron2.data import detection_utils as utils
12
+ from detectron2.data import transforms as T
13
+ from detectron2.projects.point_rend import ColorAugSSDTransform
14
+ from detectron2.structures import BitMasks, Instances
15
+
16
+ __all__ = ["MaskFormerSemanticDatasetMapper"]
17
+
18
+
19
+ class MaskFormerSemanticDatasetMapper:
20
+ """
21
+ A callable which takes a dataset dict in Detectron2 Dataset format,
22
+ and maps it into a format used by MaskFormer for semantic segmentation.
23
+
24
+ The callable currently does the following:
25
+
26
+ 1. Reads the image from "file_name"
27
+ 2. Applies geometric transforms to the image and annotations
28
+ 3. Finds and applies a suitable crop to the image and annotations
29
+ 4. Converts the image and annotations to tensors
30
+ """
31
+
32
+ @configurable
33
+ def __init__(
34
+ self,
35
+ is_train=True,
36
+ *,
37
+ augmentations,
38
+ image_format,
39
+ ignore_label,
40
+ size_divisibility,
41
+ ):
42
+ """
43
+ NOTE: this interface is experimental.
44
+ Args:
45
+ is_train: for training or inference
46
+ augmentations: a list of augmentations or deterministic transforms to apply
47
+ image_format: an image format supported by :func:`detection_utils.read_image`.
48
+ ignore_label: the label that is ignored in evaluation
49
+ size_divisibility: pad image size to be divisible by this value
50
+ """
51
+ self.is_train = is_train
52
+ self.tfm_gens = augmentations
53
+ self.img_format = image_format
54
+ self.ignore_label = ignore_label
55
+ self.size_divisibility = size_divisibility
56
+
57
+ logger = logging.getLogger(__name__)
58
+ mode = "training" if is_train else "inference"
59
+ logger.info(f"[{self.__class__.__name__}] Augmentations used in {mode}: {augmentations}")
60
+
61
+ @classmethod
62
+ def from_config(cls, cfg, is_train=True):
63
+ # Build augmentation
64
+ augs = [
65
+ T.ResizeShortestEdge(
66
+ cfg.INPUT.MIN_SIZE_TRAIN,
67
+ cfg.INPUT.MAX_SIZE_TRAIN,
68
+ cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING,
69
+ )
70
+ ]
71
+ if cfg.INPUT.CROP.ENABLED:
72
+ augs.append(
73
+ T.RandomCrop_CategoryAreaConstraint(
74
+ cfg.INPUT.CROP.TYPE,
75
+ cfg.INPUT.CROP.SIZE,
76
+ cfg.INPUT.CROP.SINGLE_CATEGORY_MAX_AREA,
77
+ cfg.MODEL.SEM_SEG_HEAD.IGNORE_VALUE,
78
+ )
79
+ )
80
+ if cfg.INPUT.COLOR_AUG_SSD:
81
+ augs.append(ColorAugSSDTransform(img_format=cfg.INPUT.FORMAT))
82
+ augs.append(T.RandomFlip())
83
+
84
+ # Assume always applies to the training set.
85
+ dataset_names = cfg.DATASETS.TRAIN
86
+ meta = MetadataCatalog.get(dataset_names[0])
87
+ ignore_label = meta.ignore_label
88
+
89
+ ret = {
90
+ "is_train": is_train,
91
+ "augmentations": augs,
92
+ "image_format": cfg.INPUT.FORMAT,
93
+ "ignore_label": ignore_label,
94
+ "size_divisibility": cfg.INPUT.SIZE_DIVISIBILITY,
95
+ }
96
+ return ret
97
+
98
+ def __call__(self, dataset_dict):
99
+ """
100
+ Args:
101
+ dataset_dict (dict): Metadata of one image, in Detectron2 Dataset format.
102
+
103
+ Returns:
104
+ dict: a format that builtin models in detectron2 accept
105
+ """
106
+ assert self.is_train, "MaskFormerSemanticDatasetMapper should only be used for training!"
107
+
108
+ dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below
109
+ image = utils.read_image(dataset_dict["file_name"], format=self.img_format)
110
+ utils.check_image_size(dataset_dict, image)
111
+
112
+ if "sem_seg_file_name" in dataset_dict:
113
+ # PyTorch transformation not implemented for uint16, so converting it to double first
114
+ sem_seg_gt = utils.read_image(dataset_dict.pop("sem_seg_file_name")).astype("double")
115
+ else:
116
+ sem_seg_gt = None
117
+
118
+ if sem_seg_gt is None:
119
+ raise ValueError(
120
+ "Cannot find 'sem_seg_file_name' for semantic segmentation dataset {}.".format(
121
+ dataset_dict["file_name"]
122
+ )
123
+ )
124
+
125
+ aug_input = T.AugInput(image, sem_seg=sem_seg_gt)
126
+ aug_input, transforms = T.apply_transform_gens(self.tfm_gens, aug_input)
127
+ image = aug_input.image
128
+ sem_seg_gt = aug_input.sem_seg
129
+
130
+ # Pad image and segmentation label here!
131
+ image = torch.as_tensor(np.ascontiguousarray(image.transpose(2, 0, 1)))
132
+ if sem_seg_gt is not None:
133
+ sem_seg_gt = torch.as_tensor(sem_seg_gt.astype("long"))
134
+
135
+ if self.size_divisibility > 0:
136
+ image_size = (image.shape[-2], image.shape[-1])
137
+ padding_size = [
138
+ 0,
139
+ self.size_divisibility - image_size[1],
140
+ 0,
141
+ self.size_divisibility - image_size[0],
142
+ ]
143
+ image = F.pad(image, padding_size, value=128).contiguous()
144
+ if sem_seg_gt is not None:
145
+ sem_seg_gt = F.pad(sem_seg_gt, padding_size, value=self.ignore_label).contiguous()
146
+
147
+ image_shape = (image.shape[-2], image.shape[-1]) # h, w
148
+
149
+ # Pytorch's dataloader is efficient on torch.Tensor due to shared-memory,
150
+ # but not efficient on large generic data structures due to the use of pickle & mp.Queue.
151
+ # Therefore it's important to use torch.Tensor.
152
+ dataset_dict["image"] = image
153
+
154
+ if sem_seg_gt is not None:
155
+ dataset_dict["sem_seg"] = sem_seg_gt.long()
156
+
157
+ if "annotations" in dataset_dict:
158
+ raise ValueError("Semantic segmentation dataset should not have 'annotations'.")
159
+
160
+ # Prepare per-category binary masks
161
+ if sem_seg_gt is not None:
162
+ sem_seg_gt = sem_seg_gt.numpy()
163
+ instances = Instances(image_shape)
164
+ classes = np.unique(sem_seg_gt)
165
+ # remove ignored region
166
+ classes = classes[classes != self.ignore_label]
167
+ instances.gt_classes = torch.tensor(classes, dtype=torch.int64)
168
+
169
+ masks = []
170
+ for class_id in classes:
171
+ masks.append(sem_seg_gt == class_id)
172
+
173
+ if len(masks) == 0:
174
+ # Some images have no annotations (all are ignored)
175
+ instances.gt_masks = torch.zeros((0, sem_seg_gt.shape[-2], sem_seg_gt.shape[-1]))
176
+ else:
177
+ masks = BitMasks(
178
+ torch.stack([torch.from_numpy(np.ascontiguousarray(x.copy())) for x in masks])
179
+ )
180
+ instances.gt_masks = masks.tensor
181
+
182
+ dataset_dict["instances"] = instances
183
+
184
+ return dataset_dict
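The last step of the semantic mapper above turns a single label map into one binary mask per class that is present, dropping the ignore label. A minimal sketch of that conversion in plain Python follows; `semantic_to_binary_masks` is a hypothetical name, and nested lists stand in for the NumPy array and `np.unique` call used by the real code.

```python
def semantic_to_binary_masks(sem_seg, ignore_label):
    """Split an H x W label map into per-class binary masks.

    Returns (classes, masks), where classes is sorted ascending
    (as np.unique would return it) and excludes ignore_label.
    """
    classes = sorted({v for row in sem_seg for v in row if v != ignore_label})
    masks = [[[v == c for v in row] for row in sem_seg] for c in classes]
    return classes, masks


# A 2 x 3 toy label map with classes 0 and 2, and 255 as the ignore label.
sem_seg = [
    [0, 0, 255],
    [2, 0, 255],
]
classes, masks = semantic_to_binary_masks(sem_seg, ignore_label=255)
print(classes)
print(masks)
```

If every pixel carries the ignore label, both results are empty lists, matching the branch in the mapper that creates a zero-sized `gt_masks` tensor.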
fcclip/data/datasets/__init__.py ADDED
@@ -0,0 +1,15 @@
1
+ from . import (
2
+ register_coco_panoptic_annos_semseg,
3
+ register_ade20k_panoptic,
4
+ register_cityscapes_panoptic,
5
+ register_mapillary_vistas_panoptic,
6
+ register_ade20k_full,
7
+ register_pascal_voc_20_semantic,
8
+ register_pascal_voc_21_semantic,
9
+ register_pascal_ctx_59_sem_seg,
10
+ register_pascal_ctx_459_sem_seg,
11
+ register_coco_instance,
12
+ register_ade20k_instance,
13
+ register_coco_stuff_164k,
14
+ openseg_classes
15
+ )
fcclip/data/datasets/ade20k_150_with_prompt_eng.txt ADDED
@@ -0,0 +1,151 @@
1
+ 0:invalid_class_id
2
+ 1:wall,walls,brick wall,stone wall,interior wall
3
+ 2:building,buildings,edifice,edifices
4
+ 3:sky,clouds
5
+ 4:floor,flooring
6
+ 5:tree,trees
7
+ 6:ceiling
8
+ 7:road,route,street,roads,streets,routes
9
+ 8:bed,beds
10
+ 9:windowpane,window,windows
11
+ 10:grass,grass field
12
+ 11:cabinet,cabinets,wall mounted cabinet
13
+ 12:sidewalk,pavement
14
+ 13:person,child,girl,boy,woman,man,people,children,girls,boys,women,men
15
+ 14:earth,ground
16
+ 15:door,double door,doors
17
+ 16:table,tables,tablecloth
18
+ 17:mountain,mount,mountains
19
+ 18:plant,flora,plant life,plants,bushes
20
+ 19:curtain,drape,drapery,mantle,pall
21
+ 20:chair,chairs
22
+ 21:car,automobile,cars
23
+ 22:water
24
+ 23:painting,picture,paintings,pictures,wallart,framed canvas
25
+ 24:sofa,couch,sofas,couches
26
+ 25:shelf,shelves
27
+ 26:house exterior
28
+ 27:sea,ocean
29
+ 28:mirror,mirrors
30
+ 29:rug,carpet,carpeting
31
+ 30:field
32
+ 31:armchair,armchairs
33
+ 32:seat,seats
34
+ 33:fence,fencing
35
+ 34:desk,desks
36
+ 35:rock,stone,rocks,stones
37
+ 36:wardrobe,closet,press,wardrobes,closets
38
+ 37:lamp,lamps
39
+ 38:bathtub,bathing tub,bath,tub
40
+ 39:railing,rail
41
+ 40:cushion,cushions
42
+ 41:pedestal
43
+ 42:box,boxes
44
+ 43:column,pillar
45
+ 44:signboard,sign,signboards,signs
46
+ 45:chest of drawers,chest,bureau,dresser
47
+ 46:counter
48
+ 47:sand
49
+ 48:sink
50
+ 49:skyscraper,skyscrapers
51
+ 50:fireplace,hearth,open fireplace
52
+ 51:refrigerator,icebox
53
+ 52:grandstand,covered stand
54
+ 53:path
55
+ 54:stairs,steps
56
+ 55:runway
57
+ 56:case,display case,showcase,vitrine
58
+ 57:pool table,billiard table,snooker table
59
+ 58:pillow,pillows
60
+ 59:screen door,shower door
61
+ 60:stairway,staircase
62
+ 61:river
63
+ 62:bridge,span
64
+ 63:bookcase
65
+ 64:window screen,door screen
66
+ 65:coffee table,cocktail table
67
+ 66:toilet,commode,crapper,potty
68
+ 67:flower,flowers
69
+ 68:book,books
70
+ 69:hill
71
+ 70:bench,benches
72
+ 71:countertop,counter top,worktop
73
+ 72:stove,kitchen stove,kitchen range,cooking stove
74
+ 73:palm tree,palm trees
75
+ 74:kitchen island
76
+ 75:computer,computing machine,computing device,data processor,electronic computer,information processing system
77
+ 76:swivel chair
78
+ 77:boat
79
+ 78:bar
80
+ 79:arcade machine,arcade machines
81
+ 80:hovel,hut,hutch,shack,shanty
82
+ 81:bus,autobus,double-decker,jitney,motorbus,motorcoach,omnibus,passenger vehicle
83
+ 82:towel
84
+ 83:light bulb,lightbulb,bulb,incandescent lamp,electric light,electric-light bulb
85
+ 84:truck,motortruck
86
+ 85:tower,towers
87
+ 86:chandelier,pendant,pendent
88
+ 87:awning,sunshade,sunblind
89
+ 88:streetlight,street lamp
90
+ 89:booth,cubicle,stall,kiosk
91
+ 90:television receiver,television,television set,tv,tv set
92
+ 91:airplane,aeroplane,airplanes,aeroplanes
93
+ 92:dirt track
94
+ 93:apparel,wearing apparel,dress,clothes
95
+ 94:pole
96
+ 95:land,soil
97
+ 96:bannister,banister,balustrade,balusters,handrail
98
+ 97:escalator,moving staircase,moving stairway
99
+ 98:ottoman,pouf,pouffe,puff,hassock
100
+ 99:bottle,bottles,water bottle
101
+ 100:buffet,sideboard
102
+ 101:poster,posting,placard,notice,bill,card
103
+ 102:stage
104
+ 103:van
105
+ 104:ship
106
+ 105:fountain
107
+ 106:conveyer belt,conveyor belt,conveyer,conveyor,transporter
108
+ 107:canopy
109
+ 108:washer,automatic washer,washing machine
110
+ 109:plaything,toy,toys
111
+ 110:swimming pool,swimming bath
112
+ 111:stool,stools
113
+ 112:barrel,cask,barrels,casks
114
+ 113:basket,handbasket
115
+ 114:waterfall,falls
116
+ 115:tent,collapsible shelter
117
+ 116:bag,bags,gift bag,paper bag
118
+ 117:minibike,motorbike
119
+ 118:cradle
120
+ 119:oven
121
+ 120:ball,balls
122
+ 121:food,solid food
123
+ 122:step,stair
124
+ 123:tank,storage tank
125
+ 124:trade name,brand name,brand,marque
126
+ 125:microwave,microwave oven
127
+ 126:plant pots,plant pot,flower pot,flowerpot,planter
128
+ 127:animal,animate being,dog,cat,horse,cow,sheep,zebra,giraffe,bird
129
+ 128:bicycle,bike
130
+ 129:lake
131
+ 130:dishwasher,dish washer,dishwashing machine
132
+ 131:projection screen
133
+ 132:blanket,cover
134
+ 133:sculpture,sculptures
135
+ 134:exhaust hood
136
+ 135:sconce,sconce lamp,sconce light
137
+ 136:vase,vases
138
+ 137:traffic light,traffic signal,traffic lights
139
+ 138:tray,trays
140
+ 139:ashcan,trash can,garbage can,wastebin,ash bin,ash-bin,ashbin,dustbin,trash barrel,trash bin
141
+ 140:ceiling fan,floor fan
142
+ 141:pier,wharf,wharfage,dock
143
+ 142:crt screen
144
+ 143:plate,plates
145
+ 144:monitor,monitoring device,monitors
146
+ 145:bulletin board,notice board
147
+ 146:shower
148
+ 147:radiator
149
+ 148:cup,cups,drinking glass,drinking glasses
150
+ 149:clock
151
+ 150:flag,flags
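Each line of the vocabulary file above has the form `<class id>:<name>,<synonym>,...`, with id 0 reserved for `invalid_class_id`. The sketch below shows one way such a file could be read into a dictionary. It is an illustration only: `parse_prompt_vocabulary` is a hypothetical helper, and the loader this repository actually uses is not shown in this part of the diff.

```python
def parse_prompt_vocabulary(text):
    """Map each class id to its list of prompt names.

    text: file contents with one "<id>:<name>,<synonym>,..." entry per line.
    """
    vocabulary = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        class_id, names = line.split(":", 1)
        vocabulary[int(class_id)] = [n.strip() for n in names.split(",") if n.strip()]
    return vocabulary


sample = "\n".join([
    "0:invalid_class_id",
    "3:sky,clouds",
    "57:pool table,billiard table,snooker table",
])
vocabulary = parse_prompt_vocabulary(sample)
print(vocabulary[57])
```

Splitting on the first colon only keeps the parser safe if a class name ever contains a colon, and stripping each name tolerates stray spaces around the commas.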
fcclip/data/datasets/ade20k_847_with_prompt_eng.txt ADDED
@@ -0,0 +1,848 @@
1
+ 0:invalid_class_id
2
+ 1:wall,walls,interior wall,brick wall,stone wall
3
+ 2:building,buildings,edifice,edifices
4
+ 3:sky,clouds
5
+ 4:tree,trees
6
+ 5:road,route,street,roads,streets,routes
7
+ 6:floor,flooring
8
+ 7:ceiling
9
+ 8:bed,beds
10
+ 9:sidewalk,pavement
11
+ 10:earth,ground
12
+ 11:cabinet,cabinets,wall mounted cabinet
13
+ 12:person,child,girl,boy,woman,man,people,children,girls,boys,women,men
14
+ 13:grass,grass field
15
+ 14:windowpane,window,windows
16
+ 15:car,automobile,cars
17
+ 16:mountain,mount,mountains
18
+ 17:plant,flora,plant life,plants,bushes
19
+ 18:table,tables,tablecloth
20
+ 19:chair,chairs
21
+ 20:curtain,drape,drapery,mantle,pall
22
+ 21:door,double door,doors
23
+ 22:sofa,couch,sofas,couches
24
+ 23:sea,ocean
25
+ 24:painting,picture,paintings,pictures,wallart,framed canvas
26
+ 25:water
27
+ 26:mirror,mirrors
28
+ 27:house exterior
29
+ 28:rug,carpet,carpeting
30
+ 29:shelf,shelves
31
+ 30:armchair,armchairs
32
+ 31:fence,fencing
33
+ 32:field
34
+ 33:lamp,lamps
35
+ 34:rock,stone,rocks,stones
36
+ 35:seat,seats
37
+ 36:river
38
+ 37:desk,desks
39
+ 38:bathtub,bathing tub,bath,tub
40
+ 39:railing,rail
41
+ 40:signboard,sign,signboards,signs
42
+ 41:cushion,cushions
43
+ 42:path
44
+ 43:work surface
45
+ 44:stairs,steps
46
+ 45:column,pillar
47
+ 46:sink
48
+ 47:wardrobe,closet,press,wardrobes,closets
49
+ 48:snow
50
+ 49:refrigerator,icebox
51
+ 50:pedestal
52
+ 51:bridge,span
53
+ 52:blind
54
+ 53:runway
55
+ 54:cliff,drop,drop-off
56
+ 55:sand
57
+ 56:fireplace,hearth,open fireplace
58
+ 57:pillow,pillows
59
+ 58:screen door,shower door
60
+ 59:toilet,commode,crapper,potty
61
+ 60:skyscraper,skyscrapers
62
+ 61:grandstand,covered stand
63
+ 62:box,boxes
64
+ 63:pool table,billiard table,snooker table
65
+ 64:palm tree,palm trees
66
+ 65:double door
67
+ 66:coffee table,cocktail table
68
+ 67:counter
69
+ 68:countertop,counter top,worktop
70
+ 69:chest of drawers,chest,bureau,dresser
71
+ 70:kitchen island
72
+ 71:boat
73
+ 72:waterfall,falls
74
+ 73:stove,kitchen stove,kitchen range,cooking stove
75
+ 74:flower,flowers
76
+ 75:bookcase
77
+ 76:controls
78
+ 77:book,books
79
+ 78:stairway,staircase
80
+ 79:streetlight,street lamp
81
+ 80:computer,computing machine,computing device,data processor,electronic computer,information processing system
82
+ 81:bus,autobus,double-decker,jitney,motorbus,motorcoach,omnibus,passenger vehicle
83
+ 82:swivel chair
84
+ 83:light,light source
85
+ 84:bench,benches
86
+ 85:case,display case,showcase,vitrine
87
+ 86:towel
88
+ 87:fountain
89
+ 88:embankment
90
+ 89:television receiver,television,television set,tv,tv set
91
+ 90:van
92
+ 91:hill
93
+ 92:awning,sunshade,sunblind
94
+ 93:poster,posting,placard,notice,bill,card
95
+ 94:truck,motortruck
96
+ 95:airplane,aeroplane,airplanes,aeroplanes
97
+ 96:pole
98
+ 97:tower,towers
99
+ 98:court
100
+ 99:ball,balls
101
+ 100:aircraft carrier,carrier,flattop,attack aircraft carrier
102
+ 101:buffet,sideboard
103
+ 102:hovel,hut,hutch,shack,shanty
104
+ 103:apparel,wearing apparel,dress,clothes
105
+ 104:minibike,motorbike
106
+ 105:animal,animate being,dog,cat,horse,cow,sheep,zebra,giraffe,bird
107
+ 106:chandelier,pendant,pendent
108
+ 107:step,stair
109
+ 108:booth,cubicle,stall,kiosk
110
+ 109:bicycle,bike
111
+ 110:doorframe,doorcase
112
+ 111:sconce,sconce lamp,sconce light
113
+ 112:pond
114
+ 113:trade name,brand name
115
+ 114:bannister,banister,balustrade,balusters,handrail
116
+ 115:bag,bags,gift bag,paper bag
117
+ 116:traffic light,traffic signal,traffic lights
118
+ 117:gazebo
119
+ 118:escalator,moving staircase,moving stairway
120
+ 119:land,soil
121
+ 120:board,plank
122
+ 121:arcade machine,arcade machines
123
+ 122:eiderdown,duvet,continental quilt
124
+ 123:bar
125
+ 124:stall,stand,sales booth
126
+ 125:playground
127
+ 126:ship
128
+ 127:ottoman,pouf,pouffe,puff,hassock
129
+ 128:ashcan,trash can,garbage can,wastebin,ash bin,ash-bin,ashbin,dustbin,trash barrel,trash bin
130
+ 129:bottle,bottles,water bottle
131
+ 130:cradle
132
+ 131:pot,flowerpot
133
+ 132:conveyer belt,conveyor belt,conveyer,conveyor,transporter
134
+ 133:train,railroad train
135
+ 134:stool,stools
136
+ 135:lake
137
+ 136:tank,storage tank
138
+ 137:ice,water ice
139
+ 138:basket,handbasket
140
+ 139:manhole
141
+ 140:tent,collapsible shelter
142
+ 141:canopy
143
+ 142:microwave,microwave oven
144
+ 143:barrel,cask,barrels,casks
145
+ 144:dirt track
146
+ 145:beam
147
+ 146:dishwasher,dish washer,dishwashing machine
148
+ 147:plate,plates
149
+ 148:crt screen
150
+ 149:ruins
151
+ 150:washer,automatic washer,washing machine
152
+ 151:blanket,cover
153
+ 152:plaything,toy,toys
154
+ 153:food,solid food
155
+ 154:projection screen
156
+ 155:oven
157
+ 156:stage
158
+ 157:beacon,lighthouse,beacon light,pharos
159
+ 158:umbrella
160
+ 159:sculpture,sculptures
161
+ 160:aqueduct
162
+ 161:container
163
+ 162:scaffolding,staging
164
+ 163:exhaust hood
165
+ 164:curb,curbing,kerb
166
+ 165:roller coaster
167
+ 166:horse,equus caballus
168
+ 167:catwalk
169
+ 168:glass,drinking glass
170
+ 169:vase,vases
171
+ 170:central reservation
172
+ 171:carousel
173
+ 172:radiator
174
+ 173:closet
175
+ 174:machine
176
+ 175:pier,wharf,wharfage,dock
177
+ 176:ceiling fan,floor fan
178
+ 177:inflatable bounce game
179
+ 178:pitch
180
+ 179:paper
181
+ 180:arcade,colonnade
182
+ 181:hot tub
183
+ 182:helicopter
184
+ 183:tray,trays
185
+ 184:partition,divider
186
+ 185:vineyard
187
+ 186:bowl
188
+ 187:bullring
189
+ 188:flag,flags
190
+ 189:pot
191
+ 190:footbridge,overcrossing,pedestrian bridge
192
+ 191:shower
193
+ 192:bag,traveling bag,travelling bag,grip,suitcase
194
+ 193:bulletin board,notice board
195
+ 194:confessional booth
196
+ 195:trunk,tree trunk,bole
197
+ 196:forest
198
+ 197:elevator door
199
+ 198:laptop,laptop computer
200
+ 199:instrument panel
201
+ 200:bucket,pail
202
+ 201:tapestry,tapis
203
+ 202:platform
204
+ 203:jacket
205
+ 204:gate
206
+ 205:monitor,monitoring device,monitors
207
+ 206:telephone booth,phone booth,call box,telephone box,telephone kiosk
208
+ 207:spotlight,spot
209
+ 208:ring
210
+ 209:control panel
211
+ 210:blackboard,chalkboard
212
+ 211:air conditioner,air conditioning
213
+ 212:chest
214
+ 213:clock
215
+ 214:sand dune
216
+ 215:pipe,pipage,piping
217
+ 216:vault
218
+ 217:table football
219
+ 218:cannon
220
+ 219:swimming pool,swimming bath
221
+ 220:fluorescent,fluorescent fixture
222
+ 221:statue
223
+ 222:loudspeaker,speaker,speaker unit,loudspeaker system,speaker system
224
+ 223:exhibitor
225
+ 224:ladder
226
+ 225:carport
227
+ 226:dam
228
+ 227:pulpit
229
+ 228:skylight,fanlight
230
+ 229:water tower
231
+ 230:grill,grille,grillwork
232
+ 231:display board
233
+ 232:pane,pane of glass,window glass
234
+ 233:rubbish,trash,scrap
235
+ 234:ice rink
236
+ 235:fruit
237
+ 236:patio
238
+ 237:vending machine
239
+ 238:telephone,phone,telephone set
240
+ 239:net
241
+ 240:backpack,back pack,knapsack,packsack,rucksack,haversack
242
+ 241:jar
243
+ 242:track
244
+ 243:magazine
245
+ 244:shutter
246
+ 245:roof
247
+ 246:banner,streamer
248
+ 247:landfill
249
+ 248:post
250
+ 249:altarpiece,reredos
251
+ 250:hat,chapeau,lid
252
+ 251:arch,archway
253
+ 252:table game
254
+ 253:bag,handbag,pocketbook,purse
255
+ 254:document,written document,papers
256
+ 255:dome
257
+ 256:pier
258
+ 257:shanties
259
+ 258:forecourt
260
+ 259:crane
261
+ 260:dog,domestic dog,canis familiaris
262
+ 261:piano,pianoforte,forte-piano
263
+ 262:drawing
264
+ 263:cabin
265
+ 264:ad,advertisement,advertizement,advertising,advertizing,advert
266
+ 265:amphitheater,amphitheatre,coliseum
267
+ 266:monument
268
+ 267:henhouse
269
+ 268:cockpit
270
+ 269:heater,warmer
271
+ 270:windmill,aerogenerator,wind generator
272
+ 271:pool
273
+ 272:elevator,lift
274
+ 273:decoration,ornament,ornamentation
275
+ 274:labyrinth
276
+ 275:text,textual matter
277
+ 276:printer
278
+ 277:mezzanine,first balcony
279
+ 278:mattress
280
+ 279:straw
281
+ 280:stalls
282
+ 281:patio,terrace
283
+ 282:billboard,hoarding
284
+ 283:bus stop
285
+ 284:trouser,pant
286
+ 285:console table,console
287
+ 286:rack
288
+ 287:notebook
289
+ 288:shrine
290
+ 289:pantry
291
+ 290:cart
292
+ 291:steam shovel
293
+ 292:porch
294
+ 293:postbox,mailbox,letter box
295
+ 294:figurine,statuette
296
+ 295:recycling bin
297
+ 296:folding screen
298
+ 297:telescope
299
+ 298:deck chair,beach chair
300
+ 299:kennel
301
+ 300:coffee maker
302
+ 301:altar,communion table,lord's table
303
+ 302:fish
304
+ 303:easel
305
+ 304:artificial golf green
306
+ 305:iceberg
307
+ 306:candlestick,candle holder
308
+ 307:shower stall,shower bath
309
+ 308:television stand
310
+ 309:wall socket,wall plug,electric outlet,electrical outlet,outlet,electric receptacle
311
+ 310:skeleton
312
+ 311:grand piano,grand
313
+ 312:candy,confect
314
+ 313:grille door
315
+ 314:pedestal,plinth,footstall
316
+ 315:jersey,t-shirt,tee shirt
317
+ 316:shoe
318
+ 317:gravestone,headstone,tombstone
319
+ 318:shanty
320
+ 319:structure
321
+ 320:rocking chair,rocker
322
+ 321:bird
323
+ 322:place mat
324
+ 323:tomb
325
+ 324:big top
326
+ 325:gas pump,gasoline pump,petrol pump,island dispenser
327
+ 326:lockers
328
+ 327:cage
329
+ 328:finger
330
+ 329:bleachers
331
+ 330:ferris wheel
332
+ 331:hairdresser chair
333
+ 332:mat
334
+ 333:stands
335
+ 334:aquarium,fish tank,marine museum
336
+ 335:streetcar,tram,tramcar,trolley,trolley car
337
+ 336:napkin,table napkin,serviette
338
+ 337:dummy
339
+ 338:booklet,brochure,folder,leaflet,pamphlet
340
+ 339:sand trap
341
+ 340:shop,store
342
+ 341:table cloth
343
+ 342:service station
344
+ 343:coffin
345
+ 344:drawer
346
+ 345:cages
347
+ 346:slot machine,coin machine
348
+ 347:balcony
349
+ 348:volleyball court
350
+ 349:table tennis
351
+ 350:control table
352
+ 351:shirt
353
+ 352:merchandise,ware,product
354
+ 353:railway
355
+ 354:parterre
356
+ 355:chimney
357
+ 356:can,tin,tin can
358
+ 357:tanks
359
+ 358:fabric,cloth,material,textile
360
+ 359:alga,algae
361
+ 360:system
362
+ 361:map
363
+ 362:greenhouse
364
+ 363:mug
365
+ 364:barbecue
366
+ 365:trailer
367
+ 366:toilet tissue,toilet paper,bathroom tissue
368
+ 367:organ
369
+ 368:dishrag,dishcloth
370
+ 369:island
371
+ 370:keyboard
372
+ 371:trench
373
+ 372:basket,basketball hoop,hoop
374
+ 373:steering wheel,wheel
375
+ 374:pitcher,ewer
376
+ 375:goal
377
+ 376:bread,breadstuff,staff of life
378
+ 377:beds
379
+ 378:wood
380
+ 379:file cabinet
381
+ 380:newspaper,paper
382
+ 381:motorboat
383
+ 382:rope
384
+ 383:guitar
385
+ 384:rubble
386
+ 385:scarf
387
+ 386:barrels
388
+ 387:cap
389
+ 388:leaves
390
+ 389:control tower
391
+ 390:dashboard
392
+ 391:bandstand
393
+ 392:lectern
394
+ 393:switch,electric switch,electrical switch
395
+ 394:baseboard,mopboard,skirting board
396
+ 395:shower room
397
+ 396:smoke
398
+ 397:faucet,spigot
399
+ 398:bulldozer
400
+ 399:saucepan
401
+ 400:shops
402
+ 401:meter
403
+ 402:crevasse
404
+ 403:gear
405
+ 404:candelabrum,candelabra
406
+ 405:sofa bed
407
+ 406:tunnel
408
+ 407:pallet
409
+ 408:wire,conducting wire
410
+ 409:kettle,boiler
411
+ 410:bidet
412
+ 411:baby buggy,baby carriage,carriage,perambulator,pram,stroller,go-cart,pushchair,pusher
413
+ 412:music stand
414
+ 413:pipe,tube
415
+ 414:cup,cups,drinking glass,drinking glasses
416
+ 415:parking meter
417
+ 416:ice hockey rink
418
+ 417:shelter
419
+ 418:weeds
420
+ 419:temple
421
+ 420:patty,cake
422
+ 421:ski slope
423
+ 422:panel
424
+ 423:wallet
425
+ 424:wheel
426
+ 425:towel rack,towel horse
427
+ 426:roundabout
428
+ 427:canister,cannister,tin
429
+ 428:rod
430
+ 429:soap dispenser
431
+ 430:bell
432
+ 431:canvas
433
+ 432:box office,ticket office,ticket booth
434
+ 433:teacup
435
+ 434:trellis
436
+ 435:workbench
437
+ 436:valley,vale
438
+ 437:toaster
439
+ 438:knife
440
+ 439:podium
441
+ 440:ramp
442
+ 441:tumble dryer
443
+ 442:fireplug,fire hydrant,plug
444
+ 443:gym shoe,sneaker,tennis shoe
445
+ 444:lab bench
446
+ 445:equipment
447
+ 446:rocky formation
448
+ 447:plastic
449
+ 448:calendar
450
+ 449:caravan
451
+ 450:check-in-desk
452
+ 451:ticket counter
453
+ 452:brush
454
+ 453:mill
455
+ 454:covered bridge
456
+ 455:bowling alley
457
+ 456:hanger
458
+ 457:excavator
459
+ 458:trestle
460
+ 459:revolving door
461
+ 460:blast furnace
462
+ 461:scale,weighing machine
463
+ 462:projector
464
+ 463:soap
465
+ 464:locker
466
+ 465:tractor
467
+ 466:stretcher
468
+ 467:frame
469
+ 468:grating
470
+ 469:alembic
471
+ 470:candle,taper,wax light
472
+ 471:barrier
473
+ 472:cardboard
474
+ 473:cave
475
+ 474:puddle
476
+ 475:tarp
477
+ 476:price tag
478
+ 477:watchtower
479
+ 478:meters
480
+ 479:light bulb,bulb,bulbs
481
+ 480:tracks
482
+ 481:hair dryer
483
+ 482:skirt
484
+ 483:viaduct
485
+ 484:paper towel
486
+ 485:coat
487
+ 486:sheet
488
+ 487:fire extinguisher,extinguisher,asphyxiator
489
+ 488:water wheel
490
+ 489:pottery,clayware
491
+ 490:magazine rack
492
+ 491:teapot
493
+ 492:microphone,mike
494
+ 493:support
495
+ 494:forklift
496
+ 495:canyon
497
+ 496:cash register,register
498
+ 497:leaf,leafage,foliage
499
+ 498:remote control,remote
500
+ 499:soap dish
501
+ 500:windshield,windscreen
502
+ 501:cat
503
+ 502:cue,cue stick,pool cue,pool stick
504
+ 503:vent,venthole,vent-hole,blowhole
505
+ 504:videos
506
+ 505:shovel
507
+ 506:eaves
508
+ 507:antenna,aerial,transmitting aerial
509
+ 508:shipyard
510
+ 509:hen,biddy
511
+ 510:traffic cone
512
+ 511:washing machines
513
+ 512:truck crane
514
+ 513:cds
515
+ 514:niche
516
+ 515:scoreboard
517
+ 516:briefcase
518
+ 517:boot
519
+ 518:sweater,jumper
520
+ 519:hay
521
+ 520:pack
522
+ 521:bottle rack
523
+ 522:glacier
524
+ 523:pergola
525
+ 524:building materials
526
+ 525:television camera
527
+ 526:first floor
528
+ 527:rifle
529
+ 528:tennis table
530
+ 529:stadium
531
+ 530:safety belt
532
+ 531:cover
533
+ 532:dish rack
534
+ 533:synthesizer
535
+ 534:pumpkin
536
+ 535:gutter
537
+ 536:fruit stand
538
+ 537:ice floe,floe
539
+ 538:handle,grip,handgrip,hold
540
+ 539:wheelchair
541
+ 540:mousepad,mouse mat
542
+ 541:diploma
543
+ 542:fairground ride
544
+ 543:radio
545
+ 544:hotplate
546
+ 545:junk
547
+ 546:wheelbarrow
548
+ 547:stream
549
+ 548:toll plaza
550
+ 549:punching bag
551
+ 550:trough
552
+ 551:throne
553
+ 552:chair desk
554
+ 553:weighbridge
555
+ 554:extractor fan
556
+ 555:hanging clothes
557
+ 556:dish,dish aerial,dish antenna,saucer
558
+ 557:alarm clock,alarm
559
+ 558:ski lift
560
+ 559:chain
561
+ 560:garage
562
+ 561:mechanical shovel
563
+ 562:wine rack
564
+ 563:tramway
565
+ 564:treadmill
566
+ 565:menu
567
+ 566:block
568
+ 567:well
569
+ 568:witness stand
570
+ 569:branch
571
+ 570:duck
572
+ 571:casserole
573
+ 572:frying pan
574
+ 573:desk organizer
575
+ 574:mast
576
+ 575:spectacles,specs,eyeglasses,glasses
577
+ 576:service elevator
578
+ 577:dollhouse
579
+ 578:hammock
580
+ 579:clothes hanging
581
+ 580:photocopier
582
+ 581:notepad
583
+ 582:golf cart
584
+ 583:footpath
585
+ 584:cross
586
+ 585:baptismal font
587
+ 586:boiler
588
+ 587:skip
589
+ 588:rotisserie
590
+ 589:tables
591
+ 590:water mill
592
+ 591:helmet
593
+ 592:cover curtain
594
+ 593:brick
595
+ 594:table runner
596
+ 595:ashtray
597
+ 596:street box
598
+ 597:stick
599
+ 598:hangers
600
+ 599:cells
601
+ 600:urinal
602
+ 601:centerpiece
603
+ 602:portable fridge
604
+ 603:dvds
605
+ 604:golf club
606
+ 605:skirting board
607
+ 606:water cooler
608
+ 607:clipboard
609
+ 608:camera,photographic camera
610
+ 609:pigeonhole
611
+ 610:chips
612
+ 611:food processor
613
+ 612:post box
614
+ 613:lid
615
+ 614:drum
616
+ 615:blender
617
+ 616:cave entrance
618
+ 617:dental chair
619
+ 618:obelisk
620
+ 619:canoe
621
+ 620:mobile
622
+ 621:monitors
623
+ 622:pool ball
624
+ 623:cue rack
625
+ 624:baggage carts
626
+ 625:shore
627
+ 626:fork
628
+ 627:paper filer
629
+ 628:bicycle rack
630
+ 629:coat rack
631
+ 630:garland
632
+ 631:sports bag
633
+ 632:fish tank
634
+ 633:towel dispenser
635
+ 634:carriage
636
+ 635:brochure
637
+ 636:plaque
638
+ 637:stringer
639
+ 638:iron
640
+ 639:spoon
641
+ 640:flag pole
642
+ 641:toilet brush
643
+ 642:book stand
644
+ 643:water faucet,water tap,tap,hydrant
645
+ 644:ticket office
646
+ 645:broom
647
+ 646:dvd
648
+ 647:ice bucket
649
+ 648:carapace,shell,cuticle,shield
650
+ 649:tureen
651
+ 650:folders
652
+ 651:chess
653
+ 652:root
654
+ 653:sewing machine
655
+ 654:model
656
+ 655:pen
657
+ 656:violin
658
+ 657:sweatshirt
659
+ 658:recycling materials
660
+ 659:mitten
661
+ 660:chopping board,cutting board
662
+ 661:mask
663
+ 662:log
664
+ 663:mouse,computer mouse
665
+ 664:grill
666
+ 665:hole
667
+ 666:target
668
+ 667:trash bag
669
+ 668:chalk
670
+ 669:sticks
671
+ 670:balloon
672
+ 671:score
673
+ 672:hair spray
674
+ 673:roll
675
+ 674:runner
676
+ 675:engine
677
+ 676:inflatable glove
678
+ 677:games
679
+ 678:pallets
680
+ 679:baskets
681
+ 680:coop
682
+ 681:dvd player
683
+ 682:rocking horse
684
+ 683:buckets
685
+ 684:bread rolls
686
+ 685:shawl
687
+ 686:watering can
688
+ 687:spotlights
689
+ 688:post-it
690
+ 689:bowls
691
+ 690:security camera
692
+ 691:runner cloth
693
+ 692:lock
694
+ 693:alarm,warning device,alarm system
695
+ 694:side
696
+ 695:roulette
697
+ 696:bone
698
+ 697:cutlery
699
+ 698:pool balls
700
+ 699:wheels
701
+ 700:spice rack
702
+ 701:plant pots,plant pot,flower pot,flowerpot,planter
703
+ 702:towel ring
704
+ 703:bread box
705
+ 704:video
706
+ 705:funfair
707
+ 706:breads
708
+ 707:tripod
709
+ 708:ironing board
710
+ 709:skimmer
711
+ 710:hollow
712
+ 711:scratching post
713
+ 712:tricycle
714
+ 713:file box
715
+ 714:mountain pass
716
+ 715:tombstones
717
+ 716:cooker
718
+ 717:card game,cards
719
+ 718:golf bag
720
+ 719:towel paper
721
+ 720:chaise lounge
722
+ 721:sun
723
+ 722:toilet paper holder
724
+ 723:rake
725
+ 724:key
726
+ 725:umbrella stand
727
+ 726:dartboard
728
+ 727:transformer
729
+ 728:fireplace utensils
730
+ 729:sweatshirts
731
+ 730:cellular telephone,cellular phone,cellphone,cell,mobile phone
732
+ 731:tallboy
733
+ 732:stapler
734
+ 733:sauna
735
+ 734:test tube
736
+ 735:palette
737
+ 736:shopping carts
738
+ 737:tools
739
+ 738:push button,push,button
740
+ 739:star
741
+ 740:roof rack
742
+ 741:barbed wire
743
+ 742:spray
744
+ 743:ear
745
+ 744:sponge
746
+ 745:racket
747
+ 746:tins
748
+ 747:eyeglasses
749
+ 748:file
750
+ 749:scarfs
751
+ 750:sugar bowl
752
+ 751:flip flop
753
+ 752:headstones
754
+ 753:laptop bag
755
+ 754:leash
756
+ 755:climbing frame
757
+ 756:suit hanger
758
+ 757:floor spotlight
759
+ 758:plate rack
760
+ 759:sewer
761
+ 760:hard drive
762
+ 761:sprinkler
763
+ 762:tools box
764
+ 763:necklace
765
+ 764:bulbs
766
+ 765:steel industry
767
+ 766:club
768
+ 767:jack
769
+ 768:door bars
770
+ 769:control panel,instrument panel,control board,board,panel
771
+ 770:hairbrush
772
+ 771:napkin holder
773
+ 772:office
774
+ 773:smoke detector
775
+ 774:utensils
776
+ 775:apron
777
+ 776:scissors
778
+ 777:terminal
779
+ 778:grinder
780
+ 779:entry phone
781
+ 780:newspaper stand
782
+ 781:pepper shaker
783
+ 782:onions
784
+ 783:central processing unit,cpu,central processor,processor,mainframe
785
+ 784:tape
786
+ 785:bat
787
+ 786:coaster
788
+ 787:calculator
789
+ 788:potatoes
790
+ 789:luggage rack
791
+ 790:salt
792
+ 791:street number
793
+ 792:viewpoint
794
+ 793:sword
795
+ 794:cd
796
+ 795:rowing machine
797
+ 796:plug
798
+ 797:andiron,firedog,dog,dog-iron
799
+ 798:pepper
800
+ 799:tongs
801
+ 800:bonfire
802
+ 801:dog dish
803
+ 802:belt
804
+ 803:dumbbells
805
+ 804:videocassette recorder,vcr
806
+ 805:hook
807
+ 806:envelopes
808
+ 807:shower faucet
809
+ 808:watch
810
+ 809:padlock
811
+ 810:swimming pool ladder
812
+ 811:spanners
813
+ 812:gravy boat
814
+ 813:notice board
815
+ 814:trash bags
816
+ 815:fire alarm
817
+ 816:ladle
818
+ 817:stethoscope
819
+ 818:rocket
820
+ 819:funnel
821
+ 820:bowling pins
822
+ 821:valve
823
+ 822:thermometer
824
+ 823:cups
825
+ 824:spice jar
826
+ 825:night light
827
+ 826:soaps
828
+ 827:games table
829
+ 828:slotted spoon
830
+ 829:reel
831
+ 830:scourer
832
+ 831:sleeping robe
833
+ 832:desk mat
834
+ 833:dumbbell
835
+ 834:hammer
836
+ 835:tie
837
+ 836:typewriter
838
+ 837:shaker
839
+ 838:cheese dish
840
+ 839:sea star
841
+ 840:racquet
842
+ 841:butane gas cylinder
843
+ 842:paper weight
844
+ 843:shaving brush
845
+ 844:sunglasses
846
+ 845:gear shift
847
+ 846:towel rail
848
+ 847:adding machine,totalizer,totaliser
fcclip/data/datasets/cityscapes_with_prompt_eng.txt ADDED
@@ -0,0 +1,19 @@
1
+ 0:road,railroad
2
+ 1:sidewalk,pavement
3
+ 2:building,buildings,edifice,edifices,house,ceiling
4
+ 3:wall,walls,brick wall,stone wall,tile wall,wood wall
5
+ 4:fence,fences
6
+ 5:pole,poles
7
+ 6:traffic light,traffic lights
8
+ 7:traffic sign,stop sign
9
+ 8:vegetation,tree,trees,palm tree,bushes
10
+ 9:terrain,river,sand,sea,snow,water,mountain,grass,dirt,rock
11
+ 10:sky,clouds
12
+ 11:person
13
+ 12:rider
14
+ 13:car,cars
15
+ 14:truck,trucks
16
+ 15:bus,buses
17
+ 16:train,trains,locomotive,locomotives,freight train
18
+ 17:motorcycle,motorcycles
19
+ 18:bicycle,bicycles,bike,bikes
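Each line in these `*_with_prompt_eng.txt` files appears to follow the pattern `id:name1,name2,...`, where the comma-separated names are synonyms for one class. A minimal sketch of reading that format is shown below. The function name `parse_prompt_file` and the inline sample are hypothetical, written for illustration only; this is not the repository's own loader.

```python
from typing import Dict, List


def parse_prompt_file(text: str) -> Dict[int, List[str]]:
    """Parse lines of the form `id:name1,name2,...` into {id: [names]}.

    Hypothetical helper: assumes one class per line and a single ':'
    separating the integer id from the comma-separated synonyms.
    """
    classes: Dict[int, List[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        class_id, _, names = line.partition(":")
        # Strip each synonym so stray spaces after commas are tolerated.
        classes[int(class_id)] = [n.strip() for n in names.split(",") if n.strip()]
    return classes


# Sample lines copied from the Cityscapes list above.
sample = "0:road,railroad\n1:sidewalk,pavement\n11:person\n"
parsed = parse_prompt_file(sample)
```

Stripping each synonym keeps the parser tolerant of entries that carry a space after a comma.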
fcclip/data/datasets/coco_panoptic_with_prompt_eng.txt ADDED
@@ -0,0 +1,201 @@
1
+ 0:invalid_class_id
2
+ 1:person,child,girl,boy,woman,man,people,children,girls,boys,women,men,lady,guy,ladies,guys,clothes
3
+ 2:bicycle,bicycles,bike,bikes
4
+ 3:car,cars
5
+ 4:motorcycle,motorcycles
6
+ 5:airplane,airplanes
7
+ 6:bus,buses
8
+ 7:train,trains,locomotive,locomotives,freight train
9
+ 8:truck,trucks
10
+ 9:boat,boats
11
+ 10:traffic light
12
+ 11:fire hydrant
13
+ 12:invalid_class_id
14
+ 13:stop sign
15
+ 14:parking meter
16
+ 15:bench,benches
17
+ 16:bird,birds
18
+ 17:cat,cats,kitties,kitty
19
+ 18:dog,dogs,puppy,puppies
20
+ 19:horse,horses,foal
21
+ 20:sheep
22
+ 21:cow,cows,calf
23
+ 22:elephant,elephants
24
+ 23:bear,bears
25
+ 24:zebra,zebras
26
+ 25:giraffe,giraffes
27
+ 26:invalid_class_id
28
+ 27:backpack,backpacks
29
+ 28:umbrella,umbrellas
30
+ 29:invalid_class_id
31
+ 30:invalid_class_id
32
+ 31:handbag,handbags
33
+ 32:tie
34
+ 33:suitcase,suitcases
35
+ 34:frisbee
36
+ 35:skis
37
+ 36:snowboard
38
+ 37:sports ball
39
+ 38:kite,kites
40
+ 39:baseball bat
41
+ 40:baseball glove
42
+ 41:skateboard
43
+ 42:surfboard
44
+ 43:tennis racket
45
+ 44:bottle,bottles,water bottle
46
+ 45:invalid_class_id
47
+ 46:wine glass,wine glasses,wineglass
48
+ 47:cup,cups,water cup,water glass
49
+ 48:fork,forks
50
+ 49:knife,knives
51
+ 50:spoon,spoons
52
+ 51:bowl,bowls
53
+ 52:banana,bananas
54
+ 53:apple,apples,apple fruit
55
+ 54:sandwich,sandwiches
56
+ 55:orange fruit
57
+ 56:broccoli
58
+ 57:carrot,carrots
59
+ 58:hot dog
60
+ 59:pizza
61
+ 60:donut,donuts
62
+ 61:cake,cakes
63
+ 62:chair,chairs
64
+ 63:couch,sofa,sofas
65
+ 64:potted plant,potted plants,pottedplant,pottedplants,planter,planters
66
+ 65:bed,beds
67
+ 66:invalid_class_id
68
+ 67:dining table,dining tables,diningtable,diningtables,plate,plates,diningtable tablecloth
69
+ 68:invalid_class_id
70
+ 69:invalid_class_id
71
+ 70:toilet
72
+ 71:invalid_class_id
73
+ 72:tv
74
+ 73:laptop
75
+ 74:mouse
76
+ 75:tv remote,remote control
77
+ 76:keyboard
78
+ 77:cell phone,mobile
79
+ 78:microwave
80
+ 79:oven,ovens
81
+ 80:toaster
82
+ 81:sink,sinks
83
+ 82:refrigerator,fridge
84
+ 83:invalid_class_id
85
+ 84:book,books
86
+ 85:clock
87
+ 86:vase,vases
88
+ 87:scissor,scissors
89
+ 88:teddy bear,teddy bears
90
+ 89:hair drier
91
+ 90:toothbrush,toothbrushes
92
+ 91:invalid_class_id
93
+ 92:banner,banners
94
+ 93:blanket,blankets
95
+ 94:invalid_class_id
96
+ 95:bridge
97
+ 96:invalid_class_id
98
+ 97:invalid_class_id
99
+ 98:invalid_class_id
100
+ 99:invalid_class_id
101
+ 100:cardboard
102
+ 101:invalid_class_id
103
+ 102:invalid_class_id
104
+ 103:invalid_class_id
105
+ 104:invalid_class_id
106
+ 105:invalid_class_id
107
+ 106:invalid_class_id
108
+ 107:counter
109
+ 108:invalid_class_id
110
+ 109:curtain,curtains
111
+ 110:invalid_class_id
112
+ 111:invalid_class_id
113
+ 112:door,doors
114
+ 113:invalid_class_id
115
+ 114:invalid_class_id
116
+ 115:invalid_class_id
117
+ 116:invalid_class_id
118
+ 117:invalid_class_id
119
+ 118:wood floor
120
+ 119:flower,flowers
121
+ 120:invalid_class_id
122
+ 121:invalid_class_id
123
+ 122:fruit,fruits
124
+ 123:invalid_class_id
125
+ 124:invalid_class_id
126
+ 125:gravel
127
+ 126:invalid_class_id
128
+ 127:invalid_class_id
129
+ 128:house
130
+ 129:invalid_class_id
131
+ 130:lamp,bulb,lamps,bulbs
132
+ 131:invalid_class_id
133
+ 132:invalid_class_id
134
+ 133:mirror
135
+ 134:invalid_class_id
136
+ 135:invalid_class_id
137
+ 136:invalid_class_id
138
+ 137:invalid_class_id
139
+ 138:tennis net
140
+ 139:invalid_class_id
141
+ 140:invalid_class_id
142
+ 141:pillow,pillows
143
+ 142:invalid_class_id
144
+ 143:invalid_class_id
145
+ 144:platform
146
+ 145:playingfield,tennis court,baseball field,soccer field,tennis field
147
+ 146:invalid_class_id
148
+ 147:railroad
149
+ 148:river
150
+ 149:road
151
+ 150:invalid_class_id
152
+ 151:roof
153
+ 152:invalid_class_id
154
+ 153:invalid_class_id
155
+ 154:sand
156
+ 155:sea,sea wave,wave,waves
157
+ 156:shelf
158
+ 157:invalid_class_id
159
+ 158:invalid_class_id
160
+ 159:snow
161
+ 160:invalid_class_id
162
+ 161:stairs
163
+ 162:invalid_class_id
164
+ 163:invalid_class_id
165
+ 164:invalid_class_id
166
+ 165:invalid_class_id
167
+ 166:tent
168
+ 167:invalid_class_id
169
+ 168:towel
170
+ 169:invalid_class_id
171
+ 170:invalid_class_id
172
+ 171:brick wall
173
+ 172:invalid_class_id
174
+ 173:invalid_class_id
175
+ 174:invalid_class_id
176
+ 175:stone wall
177
+ 176:tile wall
178
+ 177:wood wall
179
+ 178:water
180
+ 179:invalid_class_id
181
+ 180:window blind
182
+ 181:window
183
+ 182:invalid_class_id
184
+ 183:invalid_class_id
185
+ 184:tree,trees,palm tree,bushes
186
+ 185:fence,fences
187
+ 186:ceiling
188
+ 187:sky,clouds
189
+ 188:cabinet,cabinets
190
+ 189:table
191
+ 190:floor,flooring,tile floor
192
+ 191:pavement
193
+ 192:mountain,mountains
194
+ 193:grass
195
+ 194:dirt
196
+ 195:paper
197
+ 196:food
198
+ 197:building,buildings
199
+ 198:rock
200
+ 199:wall,walls
201
+ 200:rug
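The COCO panoptic list above keeps the original, non-contiguous dataset ids and fills the unused ones with `invalid_class_id`. One way a consumer might handle this, sketched under the assumption that placeholder entries should simply be skipped, is to map the remaining ids to contiguous indices. The helper name `build_contiguous_classes` is hypothetical, and whether the repository remaps ids this way is not confirmed by this diff.

```python
from typing import Dict, List, Tuple

INVALID = "invalid_class_id"  # placeholder name used in the list above


def build_contiguous_classes(
    entries: Dict[int, List[str]]
) -> Tuple[Dict[int, int], List[List[str]]]:
    """Drop placeholder ids and map the remaining ids to 0..N-1.

    Hypothetical helper: returns (dataset_id -> contiguous index,
    list of synonym lists ordered by contiguous index).
    """
    id_map: Dict[int, int] = {}
    names: List[List[str]] = []
    for dataset_id in sorted(entries):
        synonyms = entries[dataset_id]
        if synonyms == [INVALID]:
            continue  # unused id in the original dataset
        id_map[dataset_id] = len(names)
        names.append(synonyms)
    return id_map, names


# A few entries taken from the list above.
entries = {
    0: ["invalid_class_id"],
    1: ["person", "people"],
    2: ["bicycle", "bike"],
    12: ["invalid_class_id"],
    13: ["stop sign"],
}
id_map, names = build_contiguous_classes(entries)
```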
fcclip/data/datasets/coco_stuff_with_prompt_eng.txt ADDED
@@ -0,0 +1,183 @@
1
+ 0:invalid_class_id
2
+ 1:person,child,girl,boy,woman,man,people,children,girls,boys,women,men,lady,guy,ladies,guys
3
+ 2:bicycle,bicycles,bike,bikes
4
+ 3:car,cars
5
+ 4:motorcycle,motorcycles
6
+ 5:airplane,airplanes
7
+ 6:bus,buses
8
+ 7:train,trains,locomotive,locomotives,freight train
9
+ 8:truck,trucks
10
+ 9:boat,boats
11
+ 10:traffic light
12
+ 11:fire hydrant
13
+ 12:invalid_class_id
14
+ 13:stop sign
15
+ 14:parking meter
16
+ 15:bench,benches
17
+ 16:bird,birds
18
+ 17:cat,cats,kitties,kitty
19
+ 18:dog,dogs,puppy,puppies
20
+ 19:horse,horses,foal
21
+ 20:sheep
22
+ 21:cow,cows,calf
23
+ 22:elephant,elephants
24
+ 23:bear,bears
25
+ 24:zebra,zebras
26
+ 25:giraffe,giraffes
27
+ 26:invalid_class_id
28
+ 27:backpack,backpacks
29
+ 28:umbrella,umbrellas
30
+ 29:invalid_class_id
31
+ 30:invalid_class_id
32
+ 31:handbag,handbags
33
+ 32:tie
34
+ 33:suitcase,suitcases
35
+ 34:frisbee
36
+ 35:skis
37
+ 36:snowboard
38
+ 37:sports ball
39
+ 38:kite,kites
40
+ 39:baseball bat
41
+ 40:baseball glove
42
+ 41:skateboard
43
+ 42:surfboard
44
+ 43:tennis racket
45
+ 44:bottle,bottles,water bottle
46
+ 45:invalid_class_id
47
+ 46:wine glass,wine glasses,wineglass
48
+ 47:cup,cups,water cup,water glass
49
+ 48:fork,forks
50
+ 49:knife,knives
51
+ 50:spoon,spoons
52
+ 51:bowl,bowls
53
+ 52:banana,bananas
54
+ 53:apple,apples,apple fruit
55
+ 54:sandwich,sandwiches
56
+ 55:orange,oranges,orange fruit
57
+ 56:broccoli
58
+ 57:carrot,carrots
59
+ 58:hot dog
60
+ 59:pizza
61
+ 60:donut,donuts
62
+ 61:cake,cakes
63
+ 62:chair,chairs
64
+ 63:couch,sofa,sofas
65
+ 64:potted plant,potted plants,pottedplant,pottedplants,planter,planters
66
+ 65:bed,beds
67
+ 66:invalid_class_id
68
+ 67:dining table,dining tables,diningtable,diningtables,plate,plates,diningtable tablecloth
69
+ 68:invalid_class_id
70
+ 69:invalid_class_id
71
+ 70:toilet
72
+ 71:invalid_class_id
73
+ 72:tv
74
+ 73:laptop
75
+ 74:mouse
76
+ 75:remote,tv remote,remote control
77
+ 76:keyboard
78
+ 77:cell phone,mobile
79
+ 78:microwave
80
+ 79:oven,ovens
81
+ 80:toaster
82
+ 81:sink,sinks
83
+ 82:refrigerator,fridge
84
+ 83:invalid_class_id
85
+ 84:book,books
86
+ 85:clock
87
+ 86:vase,vases
88
+ 87:scissors,scissor
89
+ 88:teddy bear,teddy bears
90
+ 89:hair drier
91
+ 90:toothbrush,toothbrushes
92
+ 91:invalid_class_id
93
+ 92:banner,banners
94
+ 93:blanket,blankets
95
+ 94:branch
96
+ 95:bridge
97
+ 96:building,buildings
98
+ 97:bush,bushes
99
+ 98:cabinet,cabinets
100
+ 99:cage,cages
101
+ 100:cardboard
102
+ 101:carpet,carpets
103
+ 102:ceiling-other,ceiling
104
+ 103:ceiling-tile,ceiling tile
105
+ 104:cloth
106
+ 105:clothes
107
+ 106:clouds
108
+ 107:counter
109
+ 108:cupboard,cupboards
110
+ 109:curtain,curtains
111
+ 110:desk-stuff,desk,desks
112
+ 111:dirt
113
+ 112:door-stuff,door,doors
114
+ 113:fence,fences
115
+ 114:floor-marble,marble floor,floor marble
116
+ 115:floor-other,floor
117
+ 116:floor-stone,stone floor,floor stone
118
+ 117:floor-tile,tile floor,floor tile
119
+ 118:floor-wood,wood floor,floor wood
120
+ 119:flower,flowers
121
+ 120:fog
122
+ 121:food-other,food
123
+ 122:fruit,fruits
124
+ 123:furniture-other,furniture
125
+ 124:grass
126
+ 125:gravel
127
+ 126:ground-other,ground
128
+ 127:hill
129
+ 128:house
130
+ 129:leaves
131
+ 130:light
132
+ 131:mat
133
+ 132:metal
134
+ 133:mirror-stuff,mirror
135
+ 134:moss
136
+ 135:mountain,mountains
137
+ 136:mud
138
+ 137:napkin
139
+ 138:net
140
+ 139:paper
141
+ 140:pavement
142
+ 141:pillow,pillows
143
+ 142:plant-other
144
+ 143:plastic
145
+ 144:platform
146
+ 145:playingfield,tennis court,baseball field,soccer field,tennis field
147
+ 146:railing
148
+ 147:railroad
149
+ 148:river
150
+ 149:road
151
+ 150:rock
152
+ 151:roof
153
+ 152:rug
154
+ 153:salad
155
+ 154:sand
156
+ 155:sea,sea wave,wave,waves
157
+ 156:shelf
158
+ 157:sky-other,sky
159
+ 158:skyscraper
160
+ 159:snow
161
+ 160:solid-other,solid
162
+ 161:stairs
163
+ 162:stone
164
+ 163:straw
165
+ 164:structural-other,structural
166
+ 165:table
167
+ 166:tent
168
+ 167:textile-other,textile
169
+ 168:towel
170
+ 169:tree,trees,palm tree
171
+ 170:vegetable
172
+ 171:wall-brick,brick wall,wall brick
173
+ 172:wall-concrete,concrete wall,wall concrete
174
+ 173:wall-other,wall
175
+ 174:wall-panel,wall panel,panel wall
176
+ 175:wall-stone,stone wall,wall stone
177
+ 176:wall-tile,wall tile,tile wall
178
+ 177:wall-wood,wood wall, wall wood
179
+ 178:water-other,water
180
+ 179:waterdrops
181
+ 180:window-blind,window blind
182
+ 181:window-other,window
183
+ 182:wood
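Each entry in these prompt files has the form `id:name,synonym,...`. Ids with no class are marked `invalid_class_id`, and a few entries have a space after a comma. The following is a minimal sketch of how such a file could be read, not the repository's own loader; the function name `parse_prompt_file` and the whitespace stripping are assumptions made for illustration.

```python
from typing import Dict, List


def parse_prompt_file(text: str) -> Dict[int, List[str]]:
    """Parse 'id:name,synonym,...' lines into {id: [names]} (hypothetical helper)."""
    classes: Dict[int, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, since names may contain punctuation.
        class_id, _, names = line.partition(":")
        # Strip whitespace around each synonym and drop empty entries.
        classes[int(class_id)] = [n.strip() for n in names.split(",") if n.strip()]
    return classes


sample = (
    "63:couch,sofa,sofas\n"
    "66:invalid_class_id\n"
    "177:wall-wood,wood wall, wall wood\n"
)
parsed = parse_prompt_file(sample)
```

A caller might then filter out placeholder ids with something like `{k: v for k, v in parsed.items() if v != ["invalid_class_id"]}` before building text prompts.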
fcclip/data/datasets/lvis_1203_with_prompt_eng.txt ADDED
@@ -0,0 +1,1203 @@
1
+ 1:aerosol can,spray can
2
+ 2:air conditioner
3
+ 3:airplane,aeroplane
4
+ 4:alarm clock
5
+ 5:alcohol,alcoholic beverage
6
+ 6:alligator,gator
7
+ 7:almond
8
+ 8:ambulance
9
+ 9:amplifier
10
+ 10:anklet,ankle bracelet
11
+ 11:antenna,aerial,transmitting aerial
12
+ 12:apple
13
+ 13:applesauce
14
+ 14:apricot
15
+ 15:apron
16
+ 16:aquarium,fish tank
17
+ 17:arctic (type of shoe),galosh,golosh,rubber (type of shoe),gumshoe
18
+ 18:armband
19
+ 19:armchair
20
+ 20:armoire
21
+ 21:armor,armour
22
+ 22:artichoke
23
+ 23:trash can,garbage can,wastebin,dustbin,trash barrel,trash bin
24
+ 24:ashtray
25
+ 25:asparagus
26
+ 26:atomizer,atomiser,spray,sprayer,nebulizer,nebuliser
27
+ 27:avocado
28
+ 28:award,accolade
29
+ 29:awning
30
+ 30:ax,axe
31
+ 31:baboon
32
+ 32:baby buggy,baby carriage,perambulator,pram,stroller
33
+ 33:basketball backboard
34
+ 34:backpack,knapsack,packsack,rucksack,haversack
35
+ 35:handbag,purse,pocketbook
36
+ 36:suitcase,baggage,luggage
37
+ 37:bagel,beigel
38
+ 38:bagpipe
39
+ 39:baguet,baguette
40
+ 40:bait,lure
41
+ 41:ball
42
+ 42:ballet skirt,tutu
43
+ 43:balloon
44
+ 44:bamboo
45
+ 45:banana
46
+ 46:Band Aid
47
+ 47:bandage
48
+ 48:bandanna,bandana
49
+ 49:banjo
50
+ 50:banner,streamer
51
+ 51:barbell
52
+ 52:barge
53
+ 53:barrel,cask
54
+ 54:barrette
55
+ 55:barrow,garden cart,lawn cart,wheelbarrow
56
+ 56:baseball base
57
+ 57:baseball
58
+ 58:baseball bat
59
+ 59:baseball cap,jockey cap,golf cap
60
+ 60:baseball glove,baseball mitt
61
+ 61:basket,handbasket
62
+ 62:basketball
63
+ 63:bass horn,sousaphone,tuba
64
+ 64:bat (animal)
65
+ 65:bath mat
66
+ 66:bath towel
67
+ 67:bathrobe
68
+ 68:bathtub,bathing tub
69
+ 69:batter (food)
70
+ 70:battery
71
+ 71:beachball
72
+ 72:bead
73
+ 73:bean curd,tofu
74
+ 74:beanbag
75
+ 75:beanie,beany
76
+ 76:bear
77
+ 77:bed
78
+ 78:bedpan
79
+ 79:bedspread,bedcover,bed covering,counterpane,spread
80
+ 80:cow
81
+ 81:beef (food),boeuf (food)
82
+ 82:beeper,pager
83
+ 83:beer bottle
84
+ 84:beer can
85
+ 85:beetle
86
+ 86:bell
87
+ 87:bell pepper,capsicum
88
+ 88:belt
89
+ 89:belt buckle
90
+ 90:bench
91
+ 91:beret
92
+ 92:bib
93
+ 93:Bible
94
+ 94:bicycle,bike (bicycle)
95
+ 95:visor,vizor
96
+ 96:billboard
97
+ 97:binder,ring-binder
98
+ 98:binoculars,field glasses,opera glasses
99
+ 99:bird
100
+ 100:birdfeeder
101
+ 101:birdbath
102
+ 102:birdcage
103
+ 103:birdhouse
104
+ 104:birthday cake
105
+ 105:birthday card
106
+ 106:pirate flag
107
+ 107:black sheep
108
+ 108:blackberry
109
+ 109:blackboard,chalkboard
110
+ 110:blanket
111
+ 111:blazer,sport jacket,sport coat,sports jacket,sports coat
112
+ 112:blender,liquidizer,liquidiser
113
+ 113:blimp
114
+ 114:blinker,flasher
115
+ 115:blouse
116
+ 116:blueberry
117
+ 117:gameboard
118
+ 118:boat,ship (boat)
119
+ 119:bob,bobber,bobfloat
120
+ 120:bobbin,spool,reel
121
+ 121:bobby pin,hairgrip
122
+ 122:boiled egg,coddled egg
123
+ 123:bolo tie,bolo,bola tie,bola
124
+ 124:deadbolt
125
+ 125:bolt
126
+ 126:bonnet
127
+ 127:book
128
+ 128:bookcase
129
+ 129:booklet,brochure,leaflet,pamphlet
130
+ 130:bookmark,bookmarker
131
+ 131:boom microphone,microphone boom
132
+ 132:boot
133
+ 133:bottle
134
+ 134:bottle opener
135
+ 135:bouquet
136
+ 136:bow (weapon)
137
+ 137:bow (decorative ribbons)
138
+ 138:bow-tie,bowtie
139
+ 139:bowl
140
+ 140:pipe bowl
141
+ 141:bowler hat,bowler,derby hat,derby,plug hat
142
+ 142:bowling ball
143
+ 143:box
144
+ 144:boxing glove
145
+ 145:suspenders
146
+ 146:bracelet,bangle
147
+ 147:brass plaque
148
+ 148:brassiere,bra,bandeau
149
+ 149:bread-bin,breadbox
150
+ 150:bread
151
+ 151:breechcloth,breechclout,loincloth
152
+ 152:bridal gown,wedding gown,wedding dress
153
+ 153:briefcase
154
+ 154:broccoli
155
+ 155:broach
156
+ 156:broom
157
+ 157:brownie
158
+ 158:brussels sprouts
159
+ 159:bubble gum
160
+ 160:bucket,pail
161
+ 161:horse buggy
162
+ 162:horned cow
163
+ 163:bulldog
164
+ 164:bulldozer,dozer
165
+ 165:bullet train
166
+ 166:bulletin board,notice board
167
+ 167:bulletproof vest
168
+ 168:bullhorn,megaphone
169
+ 169:bun,roll
170
+ 170:bunk bed
171
+ 171:buoy
172
+ 172:burrito
173
+ 173:bus (vehicle),autobus,charabanc,double-decker,motorbus,motorcoach
174
+ 174:business card
175
+ 175:butter
176
+ 176:butterfly
177
+ 177:button
178
+ 178:cab (taxi),taxi,taxicab
179
+ 179:cabana
180
+ 180:cabin car,caboose
181
+ 181:cabinet
182
+ 182:locker,storage locker
183
+ 183:cake
184
+ 184:calculator
185
+ 185:calendar
186
+ 186:calf
187
+ 187:camcorder
188
+ 188:camel
189
+ 189:camera
190
+ 190:camera lens
191
+ 191:camper (vehicle),camping bus,motor home
192
+ 192:can,tin can
193
+ 193:can opener,tin opener
194
+ 194:candle,candlestick
195
+ 195:candle holder
196
+ 196:candy bar
197
+ 197:candy cane
198
+ 198:walking cane
199
+ 199:canister,cannister
200
+ 200:canoe
201
+ 201:cantaloup,cantaloupe
202
+ 202:canteen
203
+ 203:cap (headwear)
204
+ 204:bottle cap,cap (container lid)
205
+ 205:cape
206
+ 206:cappuccino,coffee cappuccino
207
+ 207:car (automobile),auto (automobile),automobile
208
+ 208:railcar (part of a train),railway car (part of a train),railroad car (part of a train)
209
+ 209:elevator car
210
+ 210:car battery,automobile battery
211
+ 211:identity card
212
+ 212:card
213
+ 213:cardigan
214
+ 214:cargo ship,cargo vessel
215
+ 215:carnation
216
+ 216:horse carriage
217
+ 217:carrot
218
+ 218:tote bag
219
+ 219:cart
220
+ 220:carton
221
+ 221:cash register,register (for cash transactions)
222
+ 222:casserole
223
+ 223:cassette
224
+ 224:cast,plaster cast,plaster bandage
225
+ 225:cat
226
+ 226:cauliflower
227
+ 227:cayenne (spice),cayenne pepper (spice),red pepper (spice)
228
+ 228:CD player
229
+ 229:celery
230
+ 230:cellular telephone,cellular phone,cellphone,mobile phone,smart phone
231
+ 231:chain mail,ring mail,chain armor,chain armour,ring armor,ring armour
232
+ 232:chair
233
+ 233:chaise longue,chaise,daybed
234
+ 234:chalice
235
+ 235:chandelier
236
+ 236:chap
237
+ 237:checkbook,chequebook
238
+ 238:checkerboard
239
+ 239:cherry
240
+ 240:chessboard
241
+ 241:chicken (animal)
242
+ 242:chickpea,garbanzo
243
+ 243:chili (vegetable),chili pepper (vegetable),chilli (vegetable),chilly (vegetable),chile (vegetable)
244
+ 244:chime,gong
245
+ 245:chinaware
246
+ 246:crisp (potato chip),potato chip
247
+ 247:poker chip
248
+ 248:chocolate bar
249
+ 249:chocolate cake
250
+ 250:chocolate milk
251
+ 251:chocolate mousse
252
+ 252:choker,collar,neckband
253
+ 253:chopping board,cutting board,chopping block
254
+ 254:chopstick
255
+ 255:Christmas tree
256
+ 256:slide
257
+ 257:cider,cyder
258
+ 258:cigar box
259
+ 259:cigarette
260
+ 260:cigarette case,cigarette pack
261
+ 261:cistern,water tank
262
+ 262:clarinet
263
+ 263:clasp
264
+ 264:cleansing agent,cleanser,cleaner
265
+ 265:cleat (for securing rope)
266
+ 266:clementine
267
+ 267:clip
268
+ 268:clipboard
269
+ 269:clippers (for plants)
270
+ 270:cloak
271
+ 271:clock,timepiece,timekeeper
272
+ 272:clock tower
273
+ 273:clothes hamper,laundry basket,clothes basket
274
+ 274:clothespin,clothes peg
275
+ 275:clutch bag
276
+ 276:coaster
277
+ 277:coat
278
+ 278:coat hanger,clothes hanger,dress hanger
279
+ 279:coatrack,hatrack
280
+ 280:cock,rooster
281
+ 281:cockroach
282
+ 282:cocoa (beverage),hot chocolate (beverage),drinking chocolate
283
+ 283:coconut,cocoanut
284
+ 284:coffee maker,coffee machine
285
+ 285:coffee table,cocktail table
286
+ 286:coffeepot
287
+ 287:coil
288
+ 288:coin
289
+ 289:colander,cullender
290
+ 290:coleslaw,slaw
291
+ 291:coloring material,colouring material
292
+ 292:combination lock
293
+ 293:pacifier,teething ring
294
+ 294:comic book
295
+ 295:compass
296
+ 296:computer keyboard,keyboard (computer)
297
+ 297:condiment
298
+ 298:cone,traffic cone
299
+ 299:control,controller
300
+ 300:convertible (automobile)
301
+ 301:sofa bed
302
+ 302:cooker
303
+ 303:cookie,cooky,biscuit (cookie)
304
+ 304:cooking utensil
305
+ 305:cooler (for food),ice chest
306
+ 306:cork (bottle plug),bottle cork
307
+ 307:corkboard
308
+ 308:corkscrew,bottle screw
309
+ 309:edible corn,corn,maize
310
+ 310:cornbread
311
+ 311:cornet,horn,trumpet
312
+ 312:cornice,valance,valance board,pelmet
313
+ 313:cornmeal
314
+ 314:corset,girdle
315
+ 315:costume
316
+ 316:cougar,puma,catamount,mountain lion,panther
317
+ 317:coverall
318
+ 318:cowbell
319
+ 319:cowboy hat,ten-gallon hat
320
+ 320:crab (animal)
321
+ 321:crabmeat
322
+ 322:cracker
323
+ 323:crape,crepe,French pancake
324
+ 324:crate
325
+ 325:crayon,wax crayon
326
+ 326:cream pitcher
327
+ 327:crescent roll,croissant
328
+ 328:crib,cot
329
+ 329:crock pot,earthenware jar
330
+ 330:crossbar
331
+ 331:crouton
332
+ 332:crow
333
+ 333:crowbar,wrecking bar,pry bar
334
+ 334:crown
335
+ 335:crucifix
336
+ 336:cruise ship,cruise liner
337
+ 337:police cruiser,patrol car,police car,squad car
338
+ 338:crumb
339
+ 339:crutch
340
+ 340:cub (animal)
341
+ 341:cube,square block
342
+ 342:cucumber,cuke
343
+ 343:cufflink
344
+ 344:cup
345
+ 345:trophy cup
346
+ 346:cupboard,closet
347
+ 347:cupcake
348
+ 348:hair curler,hair roller,hair crimper
349
+ 349:curling iron
350
+ 350:curtain,drapery
351
+ 351:cushion
352
+ 352:cylinder
353
+ 353:cymbal
354
+ 354:dagger
355
+ 355:dalmatian
356
+ 356:dartboard
357
+ 357:date (fruit)
358
+ 358:deck chair,beach chair
359
+ 359:deer,cervid
360
+ 360:dental floss,floss
361
+ 361:desk
362
+ 362:detergent
363
+ 363:diaper
364
+ 364:diary,journal
365
+ 365:die,dice
366
+ 366:dinghy,dory,rowboat
367
+ 367:dining table
368
+ 368:tux,tuxedo
369
+ 369:dish
370
+ 370:dish antenna
371
+ 371:dishrag,dishcloth
372
+ 372:dishtowel,tea towel
373
+ 373:dishwasher,dishwashing machine
374
+ 374:dishwasher detergent,dishwashing detergent,dishwashing liquid,dishsoap
375
+ 375:dispenser
376
+ 376:diving board
377
+ 377:Dixie cup,paper cup
378
+ 378:dog
379
+ 379:dog collar
380
+ 380:doll
381
+ 381:dollar,dollar bill,one dollar bill
382
+ 382:dollhouse,doll's house
383
+ 383:dolphin
384
+ 384:domestic ass,donkey
385
+ 385:doorknob,doorhandle
386
+ 386:doormat,welcome mat
387
+ 387:doughnut,donut
388
+ 388:dove
389
+ 389:dragonfly
390
+ 390:drawer
391
+ 391:underdrawers,boxers,boxershorts
392
+ 392:dress,frock
393
+ 393:dress hat,high hat,opera hat,silk hat,top hat
394
+ 394:dress suit
395
+ 395:dresser
396
+ 396:drill
397
+ 397:drone
398
+ 398:dropper,eye dropper
399
+ 399:drum (musical instrument)
400
+ 400:drumstick
401
+ 401:duck
402
+ 402:duckling
403
+ 403:duct tape
404
+ 404:duffel bag,duffle bag,duffel,duffle
405
+ 405:dumbbell
406
+ 406:dumpster
407
+ 407:dustpan
408
+ 408:eagle
409
+ 409:earphone,earpiece,headphone
410
+ 410:earplug
411
+ 411:earring
412
+ 412:easel
413
+ 413:eclair
414
+ 414:eel
415
+ 415:egg,eggs
416
+ 416:egg roll,spring roll
417
+ 417:egg yolk,yolk (egg)
418
+ 418:eggbeater,eggwhisk
419
+ 419:eggplant,aubergine
420
+ 420:electric chair
421
+ 421:refrigerator
422
+ 422:elephant
423
+ 423:elk,moose
424
+ 424:envelope
425
+ 425:eraser
426
+ 426:escargot
427
+ 427:eyepatch
428
+ 428:falcon
429
+ 429:fan
430
+ 430:faucet,spigot,tap
431
+ 431:fedora
432
+ 432:ferret
433
+ 433:Ferris wheel
434
+ 434:ferry,ferryboat
435
+ 435:fig (fruit)
436
+ 436:fighter jet,fighter aircraft,attack aircraft
437
+ 437:figurine
438
+ 438:file cabinet,filing cabinet
439
+ 439:file (tool)
440
+ 440:fire alarm,smoke alarm
441
+ 441:fire engine,fire truck
442
+ 442:fire extinguisher,extinguisher
443
+ 443:fire hose
444
+ 444:fireplace
445
+ 445:fireplug,fire hydrant,hydrant
446
+ 446:first-aid kit
447
+ 447:fish
448
+ 448:fish (food)
449
+ 449:fishbowl,goldfish bowl
450
+ 450:fishing rod,fishing pole
451
+ 451:flag
452
+ 452:flagpole,flagstaff
453
+ 453:flamingo
454
+ 454:flannel
455
+ 455:flap
456
+ 456:flash,flashbulb
457
+ 457:flashlight,torch
458
+ 458:fleece
459
+ 459:flip-flop (sandal)
460
+ 460:flipper (footwear),fin (footwear)
461
+ 461:flower arrangement,floral arrangement
462
+ 462:flute glass,champagne flute
463
+ 463:foal
464
+ 464:folding chair
465
+ 465:food processor
466
+ 466:football (American)
467
+ 467:football helmet
468
+ 468:footstool,footrest
469
+ 469:fork
470
+ 470:forklift
471
+ 471:freight car
472
+ 472:French toast
473
+ 473:freshener,air freshener
474
+ 474:frisbee
475
+ 475:frog,toad,toad frog
476
+ 476:fruit juice
477
+ 477:frying pan,frypan,skillet
478
+ 478:fudge
479
+ 479:funnel
480
+ 480:futon
481
+ 481:gag,muzzle
482
+ 482:garbage
483
+ 483:garbage truck
484
+ 484:garden hose
485
+ 485:gargle,mouthwash
486
+ 486:gargoyle
487
+ 487:garlic,ail
488
+ 488:gasmask,respirator,gas helmet
489
+ 489:gazelle
490
+ 490:gelatin,jelly
491
+ 491:gemstone
492
+ 492:generator
493
+ 493:giant panda,panda,panda bear
494
+ 494:gift wrap
495
+ 495:ginger,gingerroot
496
+ 496:giraffe
497
+ 497:cincture,sash,waistband,waistcloth
498
+ 498:glass (drink container),drinking glass
499
+ 499:globe
500
+ 500:glove
501
+ 501:goat
502
+ 502:goggles
503
+ 503:goldfish
504
+ 504:golf club,golf-club
505
+ 505:golfcart
506
+ 506:gondola (boat)
507
+ 507:goose
508
+ 508:gorilla
509
+ 509:gourd
510
+ 510:grape
511
+ 511:grater
512
+ 512:gravestone,headstone,tombstone
513
+ 513:gravy boat,gravy holder
514
+ 514:green bean
515
+ 515:green onion,spring onion,scallion
516
+ 516:griddle
517
+ 517:grill,grille,grillwork,radiator grille
518
+ 518:grits,hominy grits
519
+ 519:grizzly,grizzly bear
520
+ 520:grocery bag
521
+ 521:guitar
522
+ 522:gull,seagull
523
+ 523:gun
524
+ 524:hairbrush
525
+ 525:hairnet
526
+ 526:hairpin
527
+ 527:halter top
528
+ 528:ham,jambon,gammon
529
+ 529:hamburger,beefburger,burger
530
+ 530:hammer
531
+ 531:hammock
532
+ 532:hamper
533
+ 533:hamster
534
+ 534:hair dryer
535
+ 535:hand glass,hand mirror
536
+ 536:hand towel,face towel
537
+ 537:handcart,pushcart,hand truck
538
+ 538:handcuff
539
+ 539:handkerchief
540
+ 540:handle,grip,handgrip
541
+ 541:handsaw,carpenter's saw
542
+ 542:hardback book,hardcover book
543
+ 543:harmonium,organ (musical instrument),reed organ (musical instrument)
544
+ 544:hat
545
+ 545:hatbox
546
+ 546:veil
547
+ 547:headband
548
+ 548:headboard
549
+ 549:headlight,headlamp
550
+ 550:headscarf
551
+ 551:headset
552
+ 552:headstall (for horses),headpiece (for horses)
553
+ 553:heart
554
+ 554:heater,warmer
555
+ 555:helicopter
556
+ 556:helmet
557
+ 557:heron
558
+ 558:highchair,feeding chair
559
+ 559:hinge
560
+ 560:hippopotamus
561
+ 561:hockey stick
562
+ 562:hog,pig
563
+ 563:home plate (baseball),home base (baseball)
564
+ 564:honey
565
+ 565:fume hood,exhaust hood
566
+ 566:hook
567
+ 567:hookah,narghile,nargileh,sheesha,shisha,water pipe
568
+ 568:hornet
569
+ 569:horse
570
+ 570:hose,hosepipe
571
+ 571:hot-air balloon
572
+ 572:hotplate
573
+ 573:hot sauce
574
+ 574:hourglass
575
+ 575:houseboat
576
+ 576:hummingbird
577
+ 577:hummus,humus,hommos,hoummos,humous
578
+ 578:polar bear
579
+ 579:icecream
580
+ 580:popsicle
581
+ 581:ice maker
582
+ 582:ice pack,ice bag
583
+ 583:ice skate
584
+ 584:igniter,ignitor,lighter
585
+ 585:inhaler,inhalator
586
+ 586:iPod
587
+ 587:iron (for clothing),smoothing iron (for clothing)
588
+ 588:ironing board
589
+ 589:jacket
590
+ 590:jam
591
+ 591:jar
592
+ 592:jean,blue jean,denim
593
+ 593:jeep,landrover
594
+ 594:jelly bean,jelly egg
595
+ 595:jersey,T-shirt,tee shirt
596
+ 596:jet plane,jet-propelled plane
597
+ 597:jewel,gem,precious stone
598
+ 598:jewelry,jewellery
599
+ 599:joystick
600
+ 600:jumpsuit
601
+ 601:kayak
602
+ 602:keg
603
+ 603:kennel,doghouse
604
+ 604:kettle,boiler
605
+ 605:key
606
+ 606:keycard
607
+ 607:kilt
608
+ 608:kimono
609
+ 609:kitchen sink
610
+ 610:kitchen table
611
+ 611:kite
612
+ 612:kitten,kitty
613
+ 613:kiwi fruit
614
+ 614:knee pad
615
+ 615:knife
616
+ 616:knitting needle
617
+ 617:knob
618
+ 618:knocker (on a door),doorknocker
619
+ 619:koala,koala bear
620
+ 620:lab coat,laboratory coat
621
+ 621:ladder
622
+ 622:ladle
623
+ 623:ladybug,ladybeetle,ladybird beetle
624
+ 624:lamb (animal)
625
+ 625:lamb-chop,lambchop
626
+ 626:lamp
627
+ 627:lamppost
628
+ 628:lampshade
629
+ 629:lantern
630
+ 630:lanyard,laniard
631
+ 631:laptop computer,notebook computer
632
+ 632:lasagna,lasagne
633
+ 633:latch
634
+ 634:lawn mower
635
+ 635:leather
636
+ 636:legging (clothing),leging (clothing),leg covering
637
+ 637:Lego,Lego set
638
+ 638:legume
639
+ 639:lemon
640
+ 640:lemonade
641
+ 641:lettuce
642
+ 642:license plate,numberplate
643
+ 643:life buoy,lifesaver,life belt,life ring
644
+ 644:life jacket,life vest
645
+ 645:lightbulb
646
+ 646:lightning rod,lightning conductor
647
+ 647:lime
648
+ 648:limousine
649
+ 649:lion
650
+ 650:lip balm
651
+ 651:liquor,spirits,hard liquor,liqueur,cordial
652
+ 652:lizard
653
+ 653:log
654
+ 654:lollipop
655
+ 655:speaker (stero equipment)
656
+ 656:loveseat
657
+ 657:machine gun
658
+ 658:magazine
659
+ 659:magnet
660
+ 660:mail slot
661
+ 661:mailbox (at home),letter box (at home)
662
+ 662:mallard
663
+ 663:mallet
664
+ 664:mammoth
665
+ 665:manatee
666
+ 666:mandarin orange
667
+ 667:manger,trough
668
+ 668:manhole
669
+ 669:map
670
+ 670:marker
671
+ 671:martini
672
+ 672:mascot
673
+ 673:mashed potato
674
+ 674:masher
675
+ 675:mask,facemask
676
+ 676:mast
677
+ 677:mat (gym equipment),gym mat
678
+ 678:matchbox
679
+ 679:mattress
680
+ 680:measuring cup
681
+ 681:measuring stick,ruler (measuring stick),measuring rod
682
+ 682:meatball
683
+ 683:medicine
684
+ 684:melon
685
+ 685:microphone
686
+ 686:microscope
687
+ 687:microwave oven
688
+ 688:milestone,milepost
689
+ 689:milk
690
+ 690:milk can
691
+ 691:milkshake
692
+ 692:minivan
693
+ 693:mint candy
694
+ 694:mirror
695
+ 695:mitten
696
+ 696:mixer (kitchen tool),stand mixer
697
+ 697:money
698
+ 698:monitor (computer equipment) computer monitor
699
+ 699:monkey
700
+ 700:motor
701
+ 701:motor scooter,scooter
702
+ 702:motor vehicle,automotive vehicle
703
+ 703:motorcycle
704
+ 704:mound (baseball),pitcher's mound
705
+ 705:mouse (computer equipment),computer mouse
706
+ 706:mousepad
707
+ 707:muffin
708
+ 708:mug
709
+ 709:mushroom
710
+ 710:music stool,piano stool
711
+ 711:musical instrument,instrument (musical)
712
+ 712:nailfile
713
+ 713:napkin,table napkin,serviette
714
+ 714:neckerchief
715
+ 715:necklace
716
+ 716:necktie,tie (necktie)
717
+ 717:needle
718
+ 718:nest
719
+ 719:newspaper,paper (newspaper)
720
+ 720:newsstand
721
+ 721:nightshirt,nightwear,sleepwear,nightclothes
722
+ 722:nosebag (for animals),feedbag
723
+ 723:noseband (for animals),nosepiece (for animals)
724
+ 724:notebook
725
+ 725:notepad
726
+ 726:nut
727
+ 727:nutcracker
728
+ 728:oar
729
+ 729:octopus (food)
730
+ 730:octopus (animal)
731
+ 731:oil lamp,kerosene lamp,kerosine lamp
732
+ 732:olive oil
733
+ 733:omelet,omelette
734
+ 734:onion
735
+ 735:orange (fruit)
736
+ 736:orange juice
737
+ 737:ostrich
738
+ 738:ottoman,pouf,pouffe,hassock
739
+ 739:oven
740
+ 740:overalls (clothing)
741
+ 741:owl
742
+ 742:packet
743
+ 743:inkpad,inking pad,stamp pad
744
+ 744:pad
745
+ 745:paddle,boat paddle
746
+ 746:padlock
747
+ 747:paintbrush
748
+ 748:painting
749
+ 749:pajamas,pyjamas
750
+ 750:palette,pallet
751
+ 751:pan (for cooking),cooking pan
752
+ 752:pan (metal container)
753
+ 753:pancake
754
+ 754:pantyhose
755
+ 755:papaya
756
+ 756:paper plate
757
+ 757:paper towel
758
+ 758:paperback book,paper-back book,softback book,soft-cover book
759
+ 759:paperweight
760
+ 760:parachute
761
+ 761:parakeet,parrakeet,parroket,paraquet,paroquet,parroquet
762
+ 762:parasail (sports)
763
+ 763:parasol,sunshade
764
+ 764:parchment
765
+ 765:parka,anorak
766
+ 766:parking meter
767
+ 767:parrot
768
+ 768:passenger car (part of a train),coach (part of a train)
769
+ 769:passenger ship
770
+ 770:passport
771
+ 771:pastry
772
+ 772:patty (food)
773
+ 773:pea (food)
774
+ 774:peach
775
+ 775:peanut butter
776
+ 776:pear
777
+ 777:peeler (tool for fruit and vegetables)
778
+ 778:wooden leg,pegleg
779
+ 779:pegboard
780
+ 780:pelican
781
+ 781:pen
782
+ 782:pencil
783
+ 783:pencil box,pencil case
784
+ 784:pencil sharpener
785
+ 785:pendulum
786
+ 786:penguin
787
+ 787:pennant
788
+ 788:penny (coin)
789
+ 789:pepper,peppercorn
790
+ 790:pepper mill,pepper grinder
791
+ 791:perfume
792
+ 792:persimmon
793
+ 793:person,baby,child,boy,girl,man,woman,human
794
+ 794:pet
795
+ 795:pew (church bench),church bench
796
+ 796:phonebook,telephone book,telephone directory
797
+ 797:phonograph record,phonograph recording,record (phonograph recording)
798
+ 798:piano
799
+ 799:pickle
800
+ 800:pickup truck
801
+ 801:pie
802
+ 802:pigeon
803
+ 803:piggy bank,penny bank
804
+ 804:pillow
805
+ 805:pin (non jewelry)
806
+ 806:pineapple
807
+ 807:pinecone
808
+ 808:ping-pong ball
809
+ 809:pinwheel
810
+ 810:tobacco pipe
811
+ 811:pipe,piping
812
+ 812:pistol,handgun
813
+ 813:pita (bread),pocket bread
814
+ 814:pitcher (vessel for liquid),ewer
815
+ 815:pitchfork
816
+ 816:pizza
817
+ 817:place mat
818
+ 818:plate
819
+ 819:platter
820
+ 820:playpen
821
+ 821:pliers,plyers
822
+ 822:plow (farm equipment),plough (farm equipment)
823
+ 823:plume
824
+ 824:pocket watch
825
+ 825:pocketknife
826
+ 826:poker (fire stirring tool),stove poker,fire hook
827
+ 827:pole,post
828
+ 828:polo shirt,sport shirt
829
+ 829:poncho
830
+ 830:pony
831
+ 831:pool table,billiard table,snooker table
832
+ 832:pop (soda),soda (pop),tonic,soft drink
833
+ 833:postbox (public),mailbox (public)
834
+ 834:postcard,postal card,mailing-card
835
+ 835:poster,placard
836
+ 836:pot
837
+ 837:flowerpot
838
+ 838:potato
839
+ 839:potholder
840
+ 840:pottery,clayware
841
+ 841:pouch
842
+ 842:power shovel,excavator,digger
843
+ 843:prawn,shrimp
844
+ 844:pretzel
845
+ 845:printer,printing machine
846
+ 846:projectile (weapon),missile
847
+ 847:projector
848
+ 848:propeller,propellor
849
+ 849:prune
850
+ 850:pudding
851
+ 851:puffer (fish),pufferfish,blowfish,globefish
852
+ 852:puffin
853
+ 853:pug-dog
854
+ 854:pumpkin
855
+ 855:puncher
856
+ 856:puppet,marionette
857
+ 857:puppy
858
+ 858:quesadilla
859
+ 859:quiche
860
+ 860:quilt,comforter
861
+ 861:rabbit
862
+ 862:race car,racing car
863
+ 863:racket,racquet
864
+ 864:radar
865
+ 865:radiator
866
+ 866:radio receiver,radio set,radio,tuner (radio)
867
+ 867:radish,daikon
868
+ 868:raft
869
+ 869:rag doll
870
+ 870:raincoat,waterproof jacket
871
+ 871:ram (animal)
872
+ 872:raspberry
873
+ 873:rat
874
+ 874:razorblade
875
+ 875:reamer (juicer),juicer,juice reamer
876
+ 876:rearview mirror
877
+ 877:receipt
878
+ 878:recliner,reclining chair,lounger (chair)
879
+ 879:record player,phonograph (record player),turntable
880
+ 880:reflector
881
+ 881:remote control
882
+ 882:rhinoceros
883
+ 883:rib (food)
884
+ 884:rifle
885
+ 885:ring
886
+ 886:river boat
887
+ 887:road map
888
+ 888:robe
889
+ 889:rocking chair
890
+ 890:rodent
891
+ 891:roller skate
892
+ 892:Rollerblade
893
+ 893:rolling pin
894
+ 894:root beer
895
+ 895:router (computer equipment)
896
+ 896:rubber band,elastic band
897
+ 897:runner (carpet)
898
+ 898:plastic bag,paper bag
899
+ 899:saddle (on an animal)
900
+ 900:saddle blanket,saddlecloth,horse blanket
901
+ 901:saddlebag
902
+ 902:safety pin
903
+ 903:sail
904
+ 904:salad
905
+ 905:salad plate,salad bowl
906
+ 906:salami
907
+ 907:salmon (fish)
908
+ 908:salmon (food)
909
+ 909:salsa
910
+ 910:saltshaker
911
+ 911:sandal (type of shoe)
912
+ 912:sandwich
913
+ 913:satchel
914
+ 914:saucepan
915
+ 915:saucer
916
+ 916:sausage
917
+ 917:sawhorse,sawbuck
918
+ 918:saxophone
919
+ 919:scale (measuring instrument)
920
+ 920:scarecrow,strawman
921
+ 921:scarf
922
+ 922:school bus
923
+ 923:scissors
924
+ 924:scoreboard
925
+ 925:scraper
926
+ 926:screwdriver
927
+ 927:scrubbing brush
928
+ 928:sculpture
929
+ 929:seabird,seafowl
930
+ 930:seahorse
931
+ 931:seaplane,hydroplane
932
+ 932:seashell
933
+ 933:sewing machine
934
+ 934:shaker
935
+ 935:shampoo
936
+ 936:shark
937
+ 937:sharpener
938
+ 938:Sharpie
939
+ 939:shaver (electric),electric shaver,electric razor
940
+ 940:shaving cream,shaving soap
941
+ 941:shawl
942
+ 942:shears
943
+ 943:sheep
944
+ 944:shepherd dog,sheepdog
945
+ 945:sherbert,sherbet
946
+ 946:shield
947
+ 947:shirt
948
+ 948:shoe,sneaker (type of shoe),tennis shoe
949
+ 949:shopping bag
950
+ 950:shopping cart
951
+ 951:short pants,shorts (clothing),trunks (clothing)
952
+ 952:shot glass
953
+ 953:shoulder bag
954
+ 954:shovel
955
+ 955:shower head
956
+ 956:shower cap
957
+ 957:shower curtain
958
+ 958:shredder (for paper)
959
+ 959:signboard
960
+ 960:silo
961
+ 961:sink
962
+ 962:skateboard
963
+ 963:skewer
964
+ 964:ski
965
+ 965:ski boot
966
+ 966:ski parka,ski jacket
967
+ 967:ski pole
968
+ 968:skirt
969
+ 969:skullcap
970
+ 970:sled,sledge,sleigh
971
+ 971:sleeping bag
972
+ 972:sling (bandage),triangular bandage
973
+ 973:slipper (footwear),carpet slipper (footwear)
974
+ 974:smoothie
975
+ 975:snake,serpent
976
+ 976:snowboard
977
+ 977:snowman
978
+ 978:snowmobile
979
+ 979:soap
980
+ 980:soccer ball
981
+ 981:sock
982
+ 982:sofa,couch,lounge
983
+ 983:softball
984
+ 984:solar array,solar battery,solar panel
985
+ 985:sombrero
986
+ 986:soup
987
+ 987:soup bowl
988
+ 988:soupspoon
989
+ 989:sour cream,soured cream
990
+ 990:soya milk,soybean milk,soymilk
991
+ 991:space shuttle
992
+ 992:sparkler (fireworks)
993
+ 993:spatula
994
+ 994:spear,lance
995
+ 995:spectacles,specs,eyeglasses,glasses
996
+ 996:spice rack
997
+ 997:spider
998
+ 998:crawfish,crayfish
999
+ 999:sponge
1000
+ 1000:spoon
1001
+ 1001:sportswear,athletic wear,activewear
1002
+ 1002:spotlight
1003
+ 1003:squid (food),calamari,calamary
1004
+ 1004:squirrel
1005
+ 1005:stagecoach
1006
+ 1006:stapler (stapling machine)
1007
+ 1007:starfish,sea star
1008
+ 1008:statue (sculpture)
1009
+ 1009:steak (food)
1010
+ 1010:steak knife
1011
+ 1011:steering wheel
1012
+ 1012:stepladder
1013
+ 1013:step stool
1014
+ 1014:stereo (sound system)
1015
+ 1015:stew
1016
+ 1016:stirrer
1017
+ 1017:stirrup
1018
+ 1018:stool
1019
+ 1019:stop sign
1020
+ 1020:brake light
1021
+ 1021:stove,kitchen stove,range (kitchen appliance),kitchen range,cooking stove
1022
+ 1022:strainer
1023
+ 1023:strap
1024
+ 1024:straw (for drinking),drinking straw
1025
+ 1025:strawberry
1026
+ 1026:street sign
1027
+ 1027:streetlight,street lamp
1028
+ 1028:string cheese
1029
+ 1029:stylus
1030
+ 1030:subwoofer
1031
+ 1031:sugar bowl
1032
+ 1032:sugarcane (plant)
1033
+ 1033:suit (clothing)
1034
+ 1034:sunflower
1035
+ 1035:sunglasses
+ 1036:sunhat
+ 1037:surfboard
+ 1038:sushi
+ 1039:mop
+ 1040:sweat pants
+ 1041:sweatband
+ 1042:sweater
+ 1043:sweatshirt
+ 1044:sweet potato
+ 1045:swimsuit,swimwear,bathing suit,swimming costume,bathing costume,swimming trunks,bathing trunks
+ 1046:sword
+ 1047:syringe
+ 1048:Tabasco sauce
+ 1049:table-tennis table,ping-pong table
+ 1050:table
+ 1051:table lamp
+ 1052:tablecloth
+ 1053:tachometer
+ 1054:taco
+ 1055:tag
+ 1056:taillight,rear light
+ 1057:tambourine
+ 1058:army tank,armored combat vehicle,armoured combat vehicle
+ 1059:tank (storage vessel),storage tank
+ 1060:tank top (clothing)
+ 1061:tape (sticky cloth or paper)
+ 1062:tape measure,measuring tape
+ 1063:tapestry
+ 1064:tarp
+ 1065:tartan,plaid
+ 1066:tassel
+ 1067:tea bag
+ 1068:teacup
+ 1069:teakettle
+ 1070:teapot
+ 1071:teddy bear
+ 1072:telephone,phone,telephone set
+ 1073:telephone booth,phone booth,call box,telephone box,telephone kiosk
+ 1074:telephone pole,telegraph pole,telegraph post
+ 1075:telephoto lens,zoom lens
+ 1076:television camera,tv camera
+ 1077:television set,tv,tv set
+ 1078:tennis ball
+ 1079:tennis racket
+ 1080:tequila
+ 1081:thermometer
+ 1082:thermos bottle
+ 1083:thermostat
+ 1084:thimble
+ 1085:thread,yarn
+ 1086:thumbtack,drawing pin,pushpin
+ 1087:tiara
+ 1088:tiger
+ 1089:tights (clothing),leotards
+ 1090:timer,stopwatch
+ 1091:tinfoil
+ 1092:tinsel
+ 1093:tissue paper
+ 1094:toast (food)
+ 1095:toaster
+ 1096:toaster oven
+ 1097:toilet
+ 1098:toilet tissue,toilet paper,bathroom tissue
+ 1099:tomato
+ 1100:tongs
+ 1101:toolbox
+ 1102:toothbrush
+ 1103:toothpaste
+ 1104:toothpick
+ 1105:cover
+ 1106:tortilla
+ 1107:tow truck
+ 1108:towel
+ 1109:towel rack,towel rail,towel bar
+ 1110:toy
+ 1111:tractor (farm equipment)
+ 1112:traffic light
+ 1113:dirt bike
+ 1114:trailer truck,tractor trailer,trucking rig,articulated lorry,semi truck
+ 1115:train (railroad vehicle),railroad train
+ 1116:trampoline
+ 1117:tray
+ 1118:trench coat
+ 1119:triangle (musical instrument)
+ 1120:tricycle
+ 1121:tripod
+ 1122:trousers,pants (clothing)
+ 1123:truck
+ 1124:truffle (chocolate),chocolate truffle
+ 1125:trunk
+ 1126:vat
+ 1127:turban
+ 1128:turkey (food)
+ 1129:turnip
+ 1130:turtle
+ 1131:turtleneck (clothing),polo-neck
+ 1132:typewriter
+ 1133:umbrella
+ 1134:underwear,underclothes,underclothing,underpants
+ 1135:unicycle
+ 1136:urinal
+ 1137:urn
+ 1138:vacuum cleaner
+ 1139:vase
+ 1140:vending machine
+ 1141:vent,blowhole,air vent
+ 1142:vest,waistcoat
+ 1143:videotape
+ 1144:vinegar
+ 1145:violin,fiddle
+ 1146:vodka
+ 1147:volleyball
+ 1148:vulture
+ 1149:waffle
+ 1150:waffle iron
+ 1151:wagon
+ 1152:wagon wheel
+ 1153:walking stick
+ 1154:wall clock
+ 1155:wall socket,wall plug,electric outlet,electrical outlet,outlet,electric receptacle
+ 1156:wallet,billfold
+ 1157:walrus
+ 1158:wardrobe
+ 1159:washbasin,basin (for washing),washbowl,washstand,handbasin
+ 1160:automatic washer,washing machine
+ 1161:watch,wristwatch
+ 1162:water bottle
+ 1163:water cooler
+ 1164:water faucet,water tap,tap (water faucet)
+ 1165:water heater,hot-water heater
+ 1166:water jug
+ 1167:water gun,squirt gun
+ 1168:water scooter,sea scooter,jet ski
+ 1169:water ski
+ 1170:water tower
+ 1171:watering can
+ 1172:watermelon
+ 1173:weathervane,vane (weathervane),wind vane
+ 1174:webcam
+ 1175:wedding cake,bridecake
+ 1176:wedding ring,wedding band
+ 1177:wet suit
+ 1178:wheel
+ 1179:wheelchair
+ 1180:whipped cream
+ 1181:whistle
+ 1182:wig
+ 1183:wind chime
+ 1184:windmill
+ 1185:window box (for plants)
+ 1186:windshield wiper,windscreen wiper,wiper (for windshield/screen)
+ 1187:windsock,air sock,air-sleeve,wind sleeve,wind cone
+ 1188:wine bottle
+ 1189:wine bucket,wine cooler
+ 1190:wineglass
+ 1191:blinder (for horses)
+ 1192:wok
+ 1193:wolf
+ 1194:wooden spoon
+ 1195:wreath
+ 1196:wrench,spanner
+ 1197:wristband
+ 1198:wristlet,wrist band
+ 1199:yacht
+ 1200:yogurt,yoghurt,yoghourt
+ 1201:yoke (animal equipment)
+ 1202:zebra
+ 1203:zucchini,courgette