## Prepare datasets
It is recommended to symlink the dataset root to `$MMSEGMENTATION/data`.
If your folder structure is different, you may need to change the corresponding paths in config files.
```none
mmsegmentation
├── mmseg
├── tools
├── configs
├── data
│   ├── cityscapes
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2012
│   │   │   ├── JPEGImages
│   │   │   ├── SegmentationClass
│   │   │   ├── ImageSets
│   │   │   │   ├── Segmentation
│   │   ├── VOC2010
│   │   │   ├── JPEGImages
│   │   │   ├── SegmentationClassContext
│   │   │   ├── ImageSets
│   │   │   │   ├── SegmentationContext
│   │   │   │   │   ├── train.txt
│   │   │   │   │   ├── val.txt
│   │   │   ├── trainval_merged.json
│   │   ├── VOCaug
│   │   │   ├── dataset
│   │   │   │   ├── cls
│   ├── ade
│   │   ├── ADEChallengeData2016
│   │   │   ├── annotations
│   │   │   │   ├── training
│   │   │   │   ├── validation
│   │   │   ├── images
│   │   │   │   ├── training
│   │   │   │   ├── validation
│   ├── CHASE_DB1
│   │   ├── images
│   │   │   ├── training
│   │   │   ├── validation
│   │   ├── annotations
│   │   │   ├── training
│   │   │   ├── validation
│   ├── DRIVE
│   │   ├── images
│   │   │   ├── training
│   │   │   ├── validation
│   │   ├── annotations
│   │   │   ├── training
│   │   │   ├── validation
│   ├── HRF
│   │   ├── images
│   │   │   ├── training
│   │   │   ├── validation
│   │   ├── annotations
│   │   │   ├── training
│   │   │   ├── validation
│   ├── STARE
│   │   ├── images
│   │   │   ├── training
│   │   │   ├── validation
│   │   ├── annotations
│   │   │   ├── training
│   │   │   ├── validation
```
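As a rough illustration of the two steps above (linking the dataset root, then confirming the layout), here is a minimal Python sketch. The helper names `link_data_root` and `missing_dirs`, and the `EXPECTED_DIRS` list, are hypothetical and not part of MMSegmentation; the list covers only a few entries of the tree and should be extended for the datasets you actually use.

```python
import os
from pathlib import Path

# Hypothetical subset of the layout shown in the tree; extend as needed.
EXPECTED_DIRS = [
    "cityscapes/leftImg8bit/train",
    "cityscapes/leftImg8bit/val",
    "cityscapes/gtFine/train",
    "cityscapes/gtFine/val",
    "ade/ADEChallengeData2016/images/training",
    "ade/ADEChallengeData2016/annotations/training",
]


def link_data_root(dataset_root, repo_root):
    """Create repo_root/data as a symlink to dataset_root unless it already exists."""
    link = Path(repo_root) / "data"
    if not link.exists() and not link.is_symlink():
        os.symlink(Path(dataset_root).resolve(), link, target_is_directory=True)
    return link


def missing_dirs(data_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```

For example, `missing_dirs(link_data_root("/path/to/datasets", "/path/to/mmsegmentation"))` would list whichever of the expected folders are still missing. An empty list means the checked part of the layout matches.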
### Cityscapes
The data can be found [here](https://www.cityscapes-dataset.com/downloads/) after registration.
By convention, `**labelTrainIds.png` files are used for Cityscapes training.
We provide a [script](https://github.com/open-mmlab/mmsegmentation/blob/master/tools/convert_datasets/cityscapes.py) based on [cityscapesscripts](https://github.com/mcordts/cityscapesScripts)
to generate `**labelTrainIds.png`.
```shell
# --nproc 8 runs the conversion with 8 processes; the flag can be omitted.
python tools/convert_datasets/cityscapes.py data/cityscapes --nproc 8
```
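To confirm that the conversion produced label files, one option is to count the `labelTrainIds.png` files under `gtFine`. The sketch below is only an illustration: `count_train_id_labels` is a hypothetical helper, and it assumes the generated files sit somewhere below `gtFine/<split>` as in the directory tree above.

```python
from pathlib import Path


def count_train_id_labels(cityscapes_root, split="train"):
    """Count files whose names end in labelTrainIds.png under gtFine/<split>."""
    split_dir = Path(cityscapes_root) / "gtFine" / split
    # rglob searches all sub-directories (for Cityscapes, one per city).
    return sum(1 for _ in split_dir.rglob("*labelTrainIds.png"))
```

Calling `count_train_id_labels("data/cityscapes", "val")` after the conversion should return a non-zero count. A result of zero suggests the script was pointed at the wrong folder.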
### Pascal VOC
Pascal VOC 2012 can be downloaded from [here](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar).
In addition, most recent work on the Pascal VOC dataset uses extra augmentation data, which can be found [here](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz).
If you would like to use the augmented VOC dataset, run the following command to convert the augmentation annotations into the proper format.
```shell
# --nproc 8 runs the conversion with 8 processes; the flag can be omitted.
python tools/convert_datasets/voc_aug.py data/VOCdevkit data/VOCdevkit/VOCaug --nproc 8
```
Please refer to [concat dataset](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/tutorials/new_dataset.md#concatenate-dataset) for details on how to concatenate the original and augmented annotations and train on them together.
### ADE20K
The training and validation sets of ADE20K can be downloaded from this [link](http://data.csail.mit.edu/places/ADEchallenge/ADEChallengeData2016.zip).
You may also download the test set from [here](http://data.csail.mit.edu/places/ADEchallenge/release_test.zip).
### Pascal Context
The training and validation sets of Pascal Context can be downloaded from [here](http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar). You may also download the test set from [here](http://host.robots.ox.ac.uk:8080/eval/downloads/VOC2010test.tar) after registration.
To split the original dataset into training and validation sets, download `trainval_merged.json` from [here](https://codalabuser.blob.core.windows.net/public/trainval_merged.json).
If you would like to use the Pascal Context dataset, install [Detail](https://github.com/zhanghang1989/detail-api) and then run the following command to convert the annotations into the proper format.
```shell
python tools/convert_datasets/pascal_context.py data/VOCdevkit data/VOCdevkit/VOC2010/trainval_merged.json
```
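After the conversion, a quick sanity check is to read a split file and confirm that every listed image exists. The following sketch is an assumption-laden illustration: `ids_without_image` is a hypothetical helper, and it assumes that each line of `train.txt` or `val.txt` holds one image id and that images are stored as `JPEGImages/<id>.jpg`.

```python
from pathlib import Path


def ids_without_image(voc2010_root, split="train"):
    """Return ids listed in the split file that have no matching JPEG image."""
    root = Path(voc2010_root)
    split_file = root / "ImageSets" / "SegmentationContext" / f"{split}.txt"
    # Assumption: one image id per non-empty line.
    ids = [line.strip() for line in split_file.read_text().splitlines() if line.strip()]
    return [i for i in ids if not (root / "JPEGImages" / f"{i}.jpg").is_file()]
```

An empty result from `ids_without_image("data/VOCdevkit/VOC2010", "val")` indicates that the split file and the image folder agree.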
### CHASE DB1
The training and validation sets of CHASE DB1 can be downloaded from [here](https://staffnet.kingston.ac.uk/~ku15565/CHASE_DB1/assets/CHASEDB1.zip).
To convert the CHASE DB1 dataset to MMSegmentation format, run the following command:
```shell
python tools/convert_datasets/chase_db1.py /path/to/CHASEDB1.zip
```
The script creates the directory structure automatically.
### DRIVE
The training and validation sets of DRIVE can be downloaded from [here](https://drive.grand-challenge.org/). You need to register an account first. Currently, '1st_manual' is not officially provided.
To convert the DRIVE dataset to MMSegmentation format, run the following command:
```shell
python tools/convert_datasets/drive.py /path/to/training.zip /path/to/test.zip
```
The script creates the directory structure automatically.
### HRF
First, download [healthy.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/healthy.zip), [glaucoma.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/glaucoma.zip), [diabetic_retinopathy.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/diabetic_retinopathy.zip), [healthy_manualsegm.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/healthy_manualsegm.zip), [glaucoma_manualsegm.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/glaucoma_manualsegm.zip) and [diabetic_retinopathy_manualsegm.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/diabetic_retinopathy_manualsegm.zip).
To convert the HRF dataset to MMSegmentation format, run the following command:
```shell
python tools/convert_datasets/hrf.py /path/to/healthy.zip /path/to/healthy_manualsegm.zip /path/to/glaucoma.zip /path/to/glaucoma_manualsegm.zip /path/to/diabetic_retinopathy.zip /path/to/diabetic_retinopathy_manualsegm.zip
```
The script creates the directory structure automatically.
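Because the HRF command takes six archives in a fixed order, it can help to build the argument list programmatically and fail early if a download is missing. This is a minimal sketch, not part of MMSegmentation: `hrf_convert_command` and `HRF_ARCHIVES` are hypothetical names, the archive order is copied from the command above, and all six archives are assumed to be in one download folder.

```python
import sys
from pathlib import Path

# Order copied from the command above: each image archive is followed
# by its manual-segmentation archive.
HRF_ARCHIVES = [
    "healthy.zip",
    "healthy_manualsegm.zip",
    "glaucoma.zip",
    "glaucoma_manualsegm.zip",
    "diabetic_retinopathy.zip",
    "diabetic_retinopathy_manualsegm.zip",
]


def hrf_convert_command(download_dir, script="tools/convert_datasets/hrf.py"):
    """Build the converter argument list, raising if any archive is missing."""
    paths = [Path(download_dir) / name for name in HRF_ARCHIVES]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing HRF archives: " + ", ".join(missing))
    return [sys.executable, script] + [str(p) for p in paths]
```

From the repository root, the result could be passed to `subprocess.run(hrf_convert_command("/path/to/downloads"), check=True)`.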
### STARE
First, download [stare-images.tar](http://cecas.clemson.edu/~ahoover/stare/probing/stare-images.tar), [labels-ah.tar](http://cecas.clemson.edu/~ahoover/stare/probing/labels-ah.tar) and [labels-vk.tar](http://cecas.clemson.edu/~ahoover/stare/probing/labels-vk.tar).
To convert the STARE dataset to MMSegmentation format, run the following command:
```shell
python tools/convert_datasets/stare.py /path/to/stare-images.tar /path/to/labels-ah.tar /path/to/labels-vk.tar
```
The script creates the directory structure automatically.