# Run DeepLab2 on Cityscapes dataset

This page walks through the steps required to generate
[Cityscapes](https://www.cityscapes-dataset.com/) data for DeepLab2. DeepLab2
uses sharded TFRecords for efficient processing of the data.

## Prework

Before running any DeepLab2 scripts, the user should

1. register on the Cityscapes dataset
   [website](https://www.cityscapes-dataset.com) to download the dataset
   (gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip).

2. install cityscapesscripts via pip:

    ```bash
    # This will install the cityscapes scripts and its stand-alone tools.
    pip install cityscapesscripts
    ```

3. run the tools provided by Cityscapes to generate the training groundtruth.
   See the sample command lines below:

    ```bash
    # Set CITYSCAPES_DATASET to your dataset root.
    # Create train ID label images.
    CITYSCAPES_DATASET='.' csCreateTrainIdLabelImgs

    # To generate panoptic groundtruth, run the following command.
    CITYSCAPES_DATASET='.' csCreatePanopticImgs --use-train-id

    # [Optional] Generate panoptic groundtruth with EvalId to match evaluation
    # on the server. This step is not required for generating TFRecords.
    CITYSCAPES_DATASET='.' csCreatePanopticImgs
    ```

After running the above command lines, the expected directory structure should
be as follows:

```
cityscapes
+-- gtFine
|   +-- train
|   |   +-- aachen
|   |   |   +-- *_color.png
|   |   |   +-- *_instanceIds.png
|   |   |   +-- *_labelIds.png
|   |   |   +-- *_polygons.json
|   |   |   +-- *_labelTrainIds.png
|   |   ...
|   +-- val
|   +-- test
|   +-- cityscapes_panoptic_{train|val|test}_trainId.json
|   +-- cityscapes_panoptic_{train|val|test}_trainId
|   |   +-- *_panoptic.png
|   +-- cityscapes_panoptic_{train|val|test}.json
|   +-- cityscapes_panoptic_{train|val|test}
|       +-- *_panoptic.png
+-- leftImg8bit
    +-- train
    +-- val
    +-- test
```

## Convert prepared dataset to TFRecord

Note: the rest of this doc and the released DeepLab2 models use `TrainId`
instead of `EvalId` (which is used on the evaluation server). For evaluation
on the server, you would need to convert the predicted labels to `EvalId`
(see the conversion sketch at the end of this page).

Use the following command line to generate Cityscapes TFRecords:

```bash
# Assuming we are under the folder where deeplab2 is cloned to:

# For generating data for the semantic segmentation task only.
python deeplab2/data/build_cityscapes_data.py \
  --cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
  --output_dir=${OUTPUT_PATH_FOR_SEMANTIC} \
  --create_panoptic_data=false

# For generating data for the panoptic segmentation task.
python deeplab2/data/build_cityscapes_data.py \
  --cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
  --output_dir=${OUTPUT_PATH_FOR_PANOPTIC}
```

The command line above will output three sharded TFRecord files:
`{train|val|test}@10.tfrecord`. For the `train` and `val` sets, the TFRecords
contain the RGB image pixels as well as the corresponding annotations; for the
`test` set, they contain the RGB images only. These files will be used as the
input for model training and evaluation.

### TFExample proto format for Cityscapes

The Example proto contains the following fields:

* `image/encoded`: encoded image content.
* `image/filename`: image filename.
* `image/format`: image file format.
* `image/height`: image height.
* `image/width`: image width.
* `image/channels`: image channels.
* `image/segmentation/class/encoded`: encoded segmentation content.
* `image/segmentation/class/format`: segmentation encoding format.

For semantic segmentation (`--create_panoptic_data=false`), the encoded
segmentation map will be the same as the PNG file created by
`createTrainIdLabelImgs.py`.
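To sanity-check the generated shards, the snippet below parses a single
example. This is a minimal sketch rather than part of the DeepLab2 tooling:
the shard filename is hypothetical, and the feature keys follow the list
above.

```python
import tensorflow as tf

# Hypothetical shard name; the actual files live under --output_dir.
dataset = tf.data.TFRecordDataset('train-00000-of-00010.tfrecord')

features = {
    'image/encoded': tf.io.FixedLenFeature([], tf.string),
    'image/filename': tf.io.FixedLenFeature([], tf.string),
    'image/height': tf.io.FixedLenFeature([], tf.int64),
    'image/width': tf.io.FixedLenFeature([], tf.int64),
    'image/segmentation/class/encoded': tf.io.FixedLenFeature([], tf.string),
    'image/segmentation/class/format': tf.io.FixedLenFeature([], tf.string),
}

for record in dataset.take(1):
  example = tf.io.parse_single_example(record, features)
  image = tf.io.decode_png(example['image/encoded'])
  # With --create_panoptic_data=false, the label is itself a PNG holding
  # per-pixel TrainIds, so it can be decoded the same way.
  label = tf.io.decode_png(example['image/segmentation/class/encoded'])
  print(example['image/filename'].numpy(), image.shape, label.shape)
```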
For panoptic segmentation, the encoded segmentation map will be the raw bytes
of an int32 panoptic map, where each pixel is assigned a panoptic ID. Unlike
the ID used in the Cityscapes scripts (`json2instanceImg.py`), this panoptic
ID is computed by:

```
panoptic ID = semantic ID * label divisor + instance ID
```

where the semantic ID will be:

* the ignore label (255) for pixels not belonging to any segment
* for segments associated with the `iscrowd` label:
    * the ignore label (255) by default
    * the `category_id` (using TrainId), if `--treat_crowd_as_ignore=false` is
      set when running `build_cityscapes_data.py`
* the `category_id` (using TrainId) for all other segments

The instance ID will be 0 for pixels belonging to

* a `stuff` class,
* a `thing` class with the `iscrowd` label, or
* the ignore label,

and will lie in `[1, label divisor)` otherwise.
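Putting the pieces together, here is a sketch of decoding such a panoptic map
back into its semantic and instance components. The default label divisor
below is an assumption; it must match the divisor used when the TFRecords were
built (DeepLab2's Cityscapes configs commonly use 1000).

```python
import numpy as np

def decode_panoptic_map(encoded: bytes, height: int, width: int,
                        label_divisor: int = 1000):
  """Splits a raw int32 panoptic map into semantic and instance maps."""
  # The encoded field holds raw int32 pixels, one panoptic ID per pixel.
  panoptic = np.frombuffer(encoded, dtype=np.int32).reshape(height, width)
  semantic = panoptic // label_divisor  # TrainId, or 255 for ignored pixels.
  instance = panoptic % label_divisor   # 0 for stuff, crowd, and ignore.
  return semantic, instance

# Example: a 'person' pixel (TrainId 11) in instance 3 stores 11 * 1000 + 3.
```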
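Finally, for submitting results to the evaluation server (see the note above),
the predicted TrainIds need to be mapped back to EvalIds. Below is a sketch
using the label definitions shipped with cityscapesscripts; the helper name is
ours, not part of either library.

```python
import numpy as np
from cityscapesscripts.helpers.labels import trainId2label

def train_ids_to_eval_ids(semantic: np.ndarray) -> np.ndarray:
  """Maps a TrainId semantic map to EvalIds via a lookup table."""
  lookup = np.zeros(256, dtype=np.uint8)
  for train_id, label in trainId2label.items():
    if 0 <= train_id < 256:
      lookup[train_id] = label.id
  # Under this table, the ignore label (255) maps to id 0 ('unlabeled').
  return lookup[semantic]
```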