# Run DeepLab2 on Cityscapes dataset
This page walks through the steps required to generate
[Cityscapes](https://www.cityscapes-dataset.com/) data for DeepLab2. DeepLab2
uses sharded TFRecords for efficient processing of the data.
## Prework
Before running any DeepLab2 scripts, the user should:

1. Register on the Cityscapes dataset
   [website](https://www.cityscapes-dataset.com) to download the dataset
   (gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip).

2. Install cityscapesscripts via pip:

   ```bash
   # This will install the cityscapes scripts and its stand-alone tools.
   pip install cityscapesscripts
   ```

3. Run the tools provided by Cityscapes to generate the training groundtruth.
   See sample command lines below:
```bash
# Set CITYSCAPES_DATASET to your dataset root.
# Create train ID label images.
CITYSCAPES_DATASET='.' csCreateTrainIdLabelImgs
# To generate panoptic groundtruth, run the following command.
CITYSCAPES_DATASET='.' csCreatePanopticImgs --use-train-id
# [Optional] Generate panoptic groundtruth with EvalId to match evaluation
# on the server. This step is not required for generating TFRecords.
CITYSCAPES_DATASET='.' csCreatePanopticImgs
```
After running the above command lines, the expected directory structure is as
follows:
```
cityscapes
+-- gtFine
| |
| +-- train
| | |
| | +-- aachen
| | |
| | +-- *_color.png
| | +-- *_instanceIds.png
| | +-- *_labelIds.png
| | +-- *_polygons.json
| | +-- *_labelTrainIds.png
| | ...
| +-- val
| +-- test
| +-- cityscapes_panoptic_{train|val|test}_trainId.json
| +-- cityscapes_panoptic_{train|val|test}_trainId
| | |
| | +-- *_panoptic.png
| +-- cityscapes_panoptic_{train|val|test}.json
| +-- cityscapes_panoptic_{train|val|test}
| |
| +-- *_panoptic.png
|
+-- leftImg8bit
|
+-- train
+-- val
+-- test
```
## Convert prepared dataset to TFRecord
Note: the rest of this doc and the released DeepLab2 models use `TrainId`
instead of `EvalId` (which is used on the evaluation server). For evaluation on
the server, you would need to convert the predicted labels to `EvalId`.
Use the following command lines to generate Cityscapes TFRecords:
```bash
# Assuming we are under the folder where deeplab2 is cloned to:
# For generating data for semantic segmentation task only
python deeplab2/data/build_cityscapes_data.py \
--cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
--output_dir=${OUTPUT_PATH_FOR_SEMANTIC} \
--create_panoptic_data=false
# For generating data for panoptic segmentation task
python deeplab2/data/build_cityscapes_data.py \
--cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
--output_dir=${OUTPUT_PATH_FOR_PANOPTIC}
```
The command lines above will output three sharded TFRecord files:
`{train|val|test}@10.tfrecord`. For the `train` and `val` sets, the TFRecords
contain the RGB image pixels as well as the corresponding annotations; for the
`test` set, they contain RGB images only. These files will be used as the input
for model training and evaluation.
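The `@10` notation is a sharding spec rather than a literal filename. The sketch below shows how such a spec typically expands into per-shard filenames, assuming the common `name-XXXXX-of-YYYYY.tfrecord` naming convention; `expand_shard_spec` is our own illustrative helper, so check your output directory for the exact pattern.

```python
def expand_shard_spec(spec):
    """Expand e.g. 'train@10.tfrecord' into individual shard filenames.

    Assumes the widespread `<name>-%05d-of-%05d.<ext>` shard naming
    convention; verify against the files actually written to disk.
    """
    prefix, rest = spec.split("@")
    num_shards_str, ext = rest.split(".", 1)
    num_shards = int(num_shards_str)
    return ["%s-%05d-of-%05d.%s" % (prefix, i, num_shards, ext)
            for i in range(num_shards)]
```

For example, `expand_shard_spec("train@10.tfrecord")` yields ten names, starting with `train-00000-of-00010.tfrecord`.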
### TFExample proto format for Cityscapes
The Example proto contains the following fields:
* `image/encoded`: encoded image content.
* `image/filename`: image filename.
* `image/format`: image file format.
* `image/height`: image height.
* `image/width`: image width.
* `image/channels`: image channels.
* `image/segmentation/class/encoded`: encoded segmentation content.
* `image/segmentation/class/format`: segmentation encoding format.
For semantic segmentation (`--create_panoptic_data=false`), the encoded
segmentation map will be the same as the PNG file created by
`createTrainIdLabelImgs.py`.
For panoptic segmentation, the encoded segmentation map will be the raw bytes
of an int32 panoptic map, where each pixel is assigned a panoptic ID. Unlike
the ID used in the Cityscapes scripts (`json2instanceImg.py`), this panoptic ID
is computed by:
```
panoptic ID = semantic ID * label divisor + instance ID
```
where the semantic ID will be:

* the ignore label (255) for pixels not belonging to any segment
* for segments associated with the `iscrowd` label:
  * (default): the ignore label (255)
  * (if `--treat_crowd_as_ignore=false` is set while running
    `build_cityscapes_data.py`): `category_id` (using TrainId)
* `category_id` (using TrainId) for other segments

The instance ID will be 0 for pixels belonging to:

* a `stuff` class
* a `thing` class with the `iscrowd` label
* pixels with the ignore label

and will be in `[1, label divisor)` otherwise.
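The encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not DeepLab2's implementation: the label divisor value (1000), the little-endian byte order, the segment dict fields, and all function names are assumptions for illustration; check the dataset config used when building the TFRecords.

```python
import struct

IGNORE_LABEL = 255    # ignore label used by the Cityscapes TrainIds
LABEL_DIVISOR = 1000  # assumed; must match the dataset config used

def decode_panoptic_bytes(raw_bytes):
    """Decode raw little-endian int32 bytes into a flat list of panoptic IDs."""
    count = len(raw_bytes) // 4
    return list(struct.unpack("<%di" % count, raw_bytes))

def split_panoptic_id(panoptic_id):
    """Split a panoptic ID into its (semantic ID, instance ID) parts."""
    return panoptic_id // LABEL_DIVISOR, panoptic_id % LABEL_DIVISOR

def panoptic_id_for_segment(segment, instance_id, is_thing=True,
                            treat_crowd_as_ignore=True):
    """Compute the panoptic ID for one segment following the rules above.

    `segment` is a hypothetical dict with `category_id` (TrainId) and
    `iscrowd` fields; `None` means the pixel belongs to no segment.
    """
    if segment is None:
        # Pixels not belonging to any segment: ignore label, instance ID 0.
        return IGNORE_LABEL * LABEL_DIVISOR
    if segment.get("iscrowd"):
        if treat_crowd_as_ignore:
            # Default: crowd segments are mapped to the ignore label.
            return IGNORE_LABEL * LABEL_DIVISOR
        # --treat_crowd_as_ignore=false: keep category_id, instance ID 0.
        return segment["category_id"] * LABEL_DIVISOR
    if not is_thing:
        # `stuff` segments always get instance ID 0.
        return segment["category_id"] * LABEL_DIVISOR
    # `thing` segments get an instance ID in [1, LABEL_DIVISOR).
    assert 1 <= instance_id < LABEL_DIVISOR
    return segment["category_id"] * LABEL_DIVISOR + instance_id
```

For example, a non-crowd `thing` segment with TrainId 11 and instance ID 3 maps to panoptic ID 11003, which `split_panoptic_id` recovers as `(11, 3)`.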