# Run DeepLab2 on Cityscapes dataset

This page walks through the steps required to generate
[Cityscapes](https://www.cityscapes-dataset.com/) data for DeepLab2. DeepLab2
uses sharded TFRecords for efficient processing of the data.

## Prework

Before running any DeepLab2 scripts, the user should:

1.  Register on the Cityscapes dataset
    [website](https://www.cityscapes-dataset.com) to download the dataset
    (gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip).

2.  Install cityscapesscripts via pip:

    ```bash
    # This will install the cityscapes scripts and its stand-alone tools.
    pip install cityscapesscripts
    ```

3.  Run the tools provided by Cityscapes to generate the training groundtruth.
    See sample command lines below:

```bash
  # Set CITYSCAPES_DATASET to your dataset root.

  # Create train ID label images.
  CITYSCAPES_DATASET='.' csCreateTrainIdLabelImgs

  # To generate panoptic groundtruth, run the following command.
  CITYSCAPES_DATASET='.' csCreatePanopticImgs --use-train-id

  # [Optional] Generate panoptic groundtruth with EvalId to match evaluation
  # on the server. This step is not required for generating TFRecords.
  CITYSCAPES_DATASET='.' csCreatePanopticImgs
```

After running the above commands, the expected directory structure should be as
follows:

```
cityscapes
+-- gtFine
|   |
|   +-- train
|   |   |
|   |   +-- aachen
|   |       |
|   |       +-- *_color.png
|   |       +-- *_instanceIds.png
|   |       +-- *_labelIds.png
|   |       +-- *_polygons.json
|   |       +-- *_labelTrainIds.png
|   |   ...
|   +-- val
|   +-- test
|   +-- cityscapes_panoptic_{train|val|test}_trainId.json
|   +-- cityscapes_panoptic_{train|val|test}_trainId
|   |   |
|   |   +-- *_panoptic.png
|   +-- cityscapes_panoptic_{train|val|test}.json
|   +-- cityscapes_panoptic_{train|val|test}
|       |
|       +-- *_panoptic.png
|
+-- leftImg8bit
     |
     +-- train
     +-- val
     +-- test
```
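
To double check that the groundtruth generation succeeded, a quick script along
these lines can verify that every training image has a matching
`labelTrainIds` file (a minimal sketch, not part of DeepLab2; the
`cityscapes_root` value and the simple path rewriting are assumptions):

```python
import glob
import os

cityscapes_root = './cityscapes'  # assumed dataset root

# Every *_leftImg8bit.png should have a *_gtFine_labelTrainIds.png
# produced by csCreateTrainIdLabelImgs.
images = glob.glob(os.path.join(
    cityscapes_root, 'leftImg8bit', 'train', '*', '*_leftImg8bit.png'))
missing = []
for image_path in images:
  # Simple heuristic: swap the image directory and suffix for the label ones.
  label_path = image_path.replace('leftImg8bit', 'gtFine', 1).replace(
      '_leftImg8bit.png', '_gtFine_labelTrainIds.png')
  if not os.path.exists(label_path):
    missing.append(label_path)
print('%d train images, %d missing labelTrainIds files'
      % (len(images), len(missing)))
```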

## Convert prepared dataset to TFRecord

Note: the rest of this doc and the released DeepLab2 models use `TrainId`
instead of `EvalId` (which is used on the evaluation server). For evaluation on
the server, you would need to convert the predicted labels from `TrainId` to
`EvalId`.
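
One way to do that conversion is via the `trainId2label` mapping shipped with
cityscapesscripts (a minimal sketch; the helper name is ours, and `prediction`
is assumed to be a 2D numpy array of TrainIds):

```python
import numpy as np
from cityscapesscripts.helpers.labels import trainId2label

def train_id_to_eval_id(prediction):
  """Converts a 2D array of TrainIds into Cityscapes EvalIds."""
  eval_prediction = np.zeros_like(prediction)
  for train_id, label in trainId2label.items():
    # label.id holds the EvalId expected by the evaluation server.
    eval_prediction[prediction == train_id] = label.id
  return eval_prediction
```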

Use the following command lines to generate Cityscapes TFRecords:

```bash
# Assuming we are under the folder where deeplab2 is cloned to:

# For generating data for semantic segmentation task only
python deeplab2/data/build_cityscapes_data.py \
  --cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
  --output_dir=${OUTPUT_PATH_FOR_SEMANTIC} \
  --create_panoptic_data=false

# For generating data for panoptic segmentation task
python deeplab2/data/build_cityscapes_data.py \
  --cityscapes_root=${PATH_TO_CITYSCAPES_ROOT} \
  --output_dir=${OUTPUT_PATH_FOR_PANOPTIC}
```

The command lines above will output three sets of sharded TFRecord files:
`{train|val|test}@10.tfrecord`. For the `train` and `val` sets, the TFRecords
contain both the RGB image pixels and the corresponding annotations; for the
`test` set, they contain RGB images only. These files will be used as the input
for model training and evaluation.
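
To sanity check the generated shards, they can be inspected with TensorFlow (a
minimal sketch assuming TF2; the `@10` notation typically expands to filenames
such as `train-00000-of-00010.tfrecord`, so adjust the glob to whatever your
run actually produced):

```python
import tensorflow as tf

# Adjust the pattern to your --output_dir and shard naming.
filenames = tf.io.gfile.glob('/path/to/output/train-*.tfrecord')
dataset = tf.data.TFRecordDataset(filenames)
for raw_record in dataset.take(1):
  example = tf.train.Example()
  example.ParseFromString(raw_record.numpy())
  print(sorted(example.features.feature.keys()))
```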

### TFExample proto format for Cityscapes

The Example proto contains the following fields (a parsing sketch follows the
list):

*   `image/encoded`: encoded image content.
*   `image/filename`: image filename.
*   `image/format`: image file format.
*   `image/height`: image height.
*   `image/width`: image width.
*   `image/channels`: image channels.
*   `image/segmentation/class/encoded`: encoded segmentation content.
*   `image/segmentation/class/format`: segmentation encoding format.
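
A minimal sketch of decoding these fields with `tf.io.parse_single_example`
(assuming TF2; the segmentation fields get empty defaults so that
annotation-free `test` records still parse):

```python
import tensorflow as tf

_FEATURES = {
    'image/encoded': tf.io.FixedLenFeature((), tf.string),
    'image/filename': tf.io.FixedLenFeature((), tf.string),
    'image/format': tf.io.FixedLenFeature((), tf.string),
    'image/height': tf.io.FixedLenFeature((), tf.int64),
    'image/width': tf.io.FixedLenFeature((), tf.int64),
    'image/channels': tf.io.FixedLenFeature((), tf.int64),
    # Empty defaults: test-set records carry no annotations.
    'image/segmentation/class/encoded':
        tf.io.FixedLenFeature((), tf.string, default_value=''),
    'image/segmentation/class/format':
        tf.io.FixedLenFeature((), tf.string, default_value=''),
}

def parse_example(serialized):
  """Parses one serialized Example into the decoded image and raw fields."""
  parsed = tf.io.parse_single_example(serialized, _FEATURES)
  image = tf.io.decode_image(parsed['image/encoded'], channels=3)
  return image, parsed
```

Mapping `parse_example` over a `TFRecordDataset` of the generated shards yields
the decoded RGB images together with their metadata and (for `train`/`val`)
encoded annotations.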

For semantic segmentation (`--create_panoptic_data=false`), the encoded
segmentation map will be the same as the PNG file created by
`createTrainIdLabelImgs.py`.

For panoptic segmentation, the encoded segmentation map will be the raw bytes
of an int32 panoptic map, where each pixel is assigned a panoptic ID. Unlike
the ID used in the Cityscapes script (`json2instanceImg.py`), this panoptic ID
is computed by:

```
  panoptic ID = semantic ID * label divisor + instance ID
```

where the semantic ID will be:

*   the ignore label (255) for pixels not belonging to any segment
*   for segments associated with the `iscrowd` label:
    *   (default): the ignore label (255)
    *   (if `--treat_crowd_as_ignore=false` is set when running
        `build_cityscapes_data.py`): `category_id` (using TrainId)
*   `category_id` (using TrainId) for all other segments

The instance ID will be 0 for pixels belonging to

*   a `stuff` class,
*   a `thing` class with the `iscrowd` label, or
*   the ignore label,

and will lie in `[1, label divisor)` otherwise. A concrete encoding example is
sketched below.
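
As a concrete example (a small sketch; DeepLab2's Cityscapes configs typically
use a label divisor of 1000, but treat that value as an assumption and check
your dataset config):

```python
LABEL_DIVISOR = 1000  # assumed value; check your dataset config
IGNORE_LABEL = 255

def encode_panoptic(semantic_id, instance_id):
  """Packs semantic and instance IDs into one panoptic ID."""
  return semantic_id * LABEL_DIVISOR + instance_id

def decode_panoptic(panoptic_id):
  """Recovers (semantic ID, instance ID) from a panoptic ID."""
  return panoptic_id // LABEL_DIVISOR, panoptic_id % LABEL_DIVISOR

# The 7th car instance (car has TrainId 13): 13 * 1000 + 7 = 13007.
assert encode_panoptic(13, 7) == 13007
assert decode_panoptic(13007) == (13, 7)

# Stuff pixels and ignored pixels always carry instance ID 0,
# e.g. an ignored pixel encodes as 255 * 1000 + 0 = 255000.
assert encode_panoptic(IGNORE_LABEL, 0) == 255000
```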