# Prepare Datasets for FrozenSeg
A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
[Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
and how to add new datasets to them.
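As a quick sanity check, a registered dataset can be inspected directly through these catalogs. A minimal sketch (the name `coco_2017_val_panoptic` is one of detectron2's builtin registrations and is used here only as an example; it requires the COCO files described below to be in place):
```bash
# Quick check that a registered dataset resolves through detectron2's catalogs.
python -c "
from detectron2.data import DatasetCatalog, MetadataCatalog
name = 'coco_2017_val_panoptic'               # example builtin dataset name
records = DatasetCatalog.get(name)            # list of per-image dicts (file_name, annotations, ...)
meta = MetadataCatalog.get(name)              # class names, colors, evaluator type, ...
print(len(records), 'images;', len(meta.thing_classes), 'thing classes')
"
```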
FrozenSeg has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable
`DETECTRON2_DATASETS`.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
```
$DETECTRON2_DATASETS/
  # panoptic datasets
  ADEChallengeData2016/
  coco/
  cityscapes/
  mapillary_vistas/
  bdd100k/
  # semantic datasets
  VOCdevkit/
  ADE20K_2021_17_01/
  pascal_ctx_d2/
  pascal_voc_d2/
```
You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
If left unset, the default is `./datasets` relative to your current working directory.
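For example (the path is illustrative):
```bash
# Point detectron2 / FrozenSeg at a shared dataset root instead of ./datasets.
export DETECTRON2_DATASETS=/path/to/datasets
mkdir -p "$DETECTRON2_DATASETS"
```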
## Expected dataset structure for [COCO](https://cocodataset.org/#download):
```
coco/
  annotations/
    instances_{train,val}2017.json
    panoptic_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
  panoptic_{train,val}2017/  # png annotations
  panoptic_semseg_{train,val}2017/  # generated by the script mentioned below
```
Install panopticapi by:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `python datasets/prepare_coco_semantic_annos_from_panoptic_annos.py` to extract semantic annotations from the panoptic annotations (only used for evaluation).
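Putting the COCO steps together, a sketch assuming `DETECTRON2_DATASETS` is set and the images and JSON annotations above are already downloaded under `$DETECTRON2_DATASETS/coco`:
```bash
# Sketch: COCO preparation once images and annotations are in place.
pip install git+https://github.com/cocodataset/panopticapi.git

# Writes coco/panoptic_semseg_{train,val}2017 (only needed for evaluation).
python datasets/prepare_coco_semantic_annos_from_panoptic_annos.py
```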
## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):
```
cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
    # below are generated Cityscapes panoptic annotations
    cityscapes_panoptic_train.json
    cityscapes_panoptic_train/
    cityscapes_panoptic_val.json
    cityscapes_panoptic_val/
    cityscapes_panoptic_test.json
    cityscapes_panoptic_test/
  leftImg8bit/
    train/
    val/
    test/
```
Install cityscapesScripts by:
```
pip install git+https://github.com/mcordts/cityscapesScripts.git
```
Note: to create labelTrainIds.png, first prepare the above structure, then run cityscapesScripts with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```
These files are not needed for instance segmentation.
Note: to generate the Cityscapes panoptic dataset, run cityscapesScripts with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
```
These files are not needed for semantic or instance segmentation.
## Expected dataset structure for [ADE20k (A150)](http://sceneparsing.csail.mit.edu/):
```
ADEChallengeData2016/
  images/
  annotations/
  objectInfo150.txt
  # download instance annotation
  annotations_instance/
  # generated by prepare_ade20k_sem_seg.py
  annotations_detectron2/
  # below are generated by prepare_ade20k_pan_seg.py
  ade20k_panoptic_{train,val}.json
  ade20k_panoptic_{train,val}/
  # below are generated by prepare_ade20k_ins_seg.py
  ade20k_instance_{train,val}.json
```
The directory `annotations_detectron2` is generated by running `python datasets/prepare_ade20k_sem_seg.py`.
Install panopticapi by:
```bash
pip install git+https://github.com/cocodataset/panopticapi.git
```
Download the instance annotation from http://sceneparsing.csail.mit.edu/:
```bash
wget http://sceneparsing.csail.mit.edu/data/ChallengeData2017/annotations_instance.tar
```
Then run `python datasets/prepare_ade20k_pan_seg.py` to combine the semantic and instance annotations into panoptic annotations, and `python datasets/prepare_ade20k_ins_seg.py` to extract instance annotations in COCO format.
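A sketch of the full ADE20k preparation, run from the FrozenSeg repository root and assuming `DETECTRON2_DATASETS` is set, `ADEChallengeData2016/` already contains `images/` and `annotations/`, and the instance tar unpacks into an `annotations_instance/` folder (an assumption; adjust the extraction target if the archive layout differs):
```bash
# Sketch: ADE20k (A-150) preparation, run from the FrozenSeg repository root.
cd "$DETECTRON2_DATASETS"
wget http://sceneparsing.csail.mit.edu/data/ChallengeData2017/annotations_instance.tar
# Assumed to unpack as annotations_instance/ inside ADEChallengeData2016/.
tar -xf annotations_instance.tar -C ADEChallengeData2016
cd -

python datasets/prepare_ade20k_sem_seg.py   # -> annotations_detectron2/
python datasets/prepare_ade20k_pan_seg.py   # -> ade20k_panoptic_{train,val}.json and folders
python datasets/prepare_ade20k_ins_seg.py   # -> ade20k_instance_{train,val}.json
```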
## Expected dataset structure for [Mapillary Vistas](https://www.mapillary.com/dataset/vistas):
```
mapillary_vistas/
  training/
    images/
    instances/
    labels/
    panoptic/
  validation/
    images/
    instances/
    labels/
    panoptic/
```
No preprocessing is needed for semantic or panoptic segmentation on Mapillary Vistas.
## Expected dataset structure for [BDD100K](https://doc.bdd100k.com/download.html#id1):
```
bdd100k/
  images/
    10k/
      train/
      val/
      test/
  json
  labels/
    pan_seg/
    sem_seg/
```
The COCO-format annotations are obtained by running:
```
cd $DETECTRON2_DATASETS
wget https://github.com/chenxi52/FrozenSeg/releases/download/latest/bdd100k_json.zip
unzip bdd100k_json.zip
```
## Expected dataset structure for [ADE20k-Full (A-847)](https://groups.csail.mit.edu/vision/datasets/ADE20K/):
```
ADE20K_2021_17_01/
  images/
  index_ade20k.pkl
  objects.txt
  # generated by prepare_ade20k_full_sem_seg.py
  images_detectron2/
  annotations_detectron2/
```
Register and download the dataset from https://groups.csail.mit.edu/vision/datasets/ADE20K/:
```bash
cd $DETECTRON2_DATASETS
wget your/personal/download/link/{username}_{hash}.zip
unzip {username}_{hash}.zip
```
Generate the directories `ADE20K_2021_17_01/images_detectron2` and `ADE20K_2021_17_01/annotations_detectron2` by running:
```bash
python datasets/prepare_ade20k_full_sem_seg.py
```
## Expected dataset structure for [PASCAL Context Full (PC-459)](https://www.cs.stanford.edu/~roozbeh/pascal-context/) and [PASCAL VOC (PAS-21)](http://host.robots.ox.ac.uk/pascal/VOC/):
```
VOCdevkit/
  VOC2012/
    Annotations/
    JPEGImages/
    ImageSets/
      Segmentation/
  VOC2010/
    JPEGImages/
    trainval/
    trainval_merged.json
# generated by prepare_pascal_voc_sem_seg.py
pascal_voc_d2/
  images/
  annotations_pascal21/
  # pascal 20 excludes the background class
  annotations_pascal20/
# generated by prepare_pascal_ctx_sem_seg.py
pascal_ctx_d2/
  images/
  annotations_ctx59/
  # generated by prepare_pascal_ctx_full_sem_seg.py
  annotations_ctx459/
```
### PASCAL VOC (PAS-21)
Download the dataset from http://host.robots.ox.ac.uk/pascal/VOC/:
```bash
cd $DETECTRON2_DATASETS
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
# generate folder VOCdevkit/VOC2012
tar -xvf VOCtrainval_11-May-2012.tar
```
Generate the directory `pascal_voc_d2` by running:
```bash
python datasets/prepare_pascal_voc_sem_seg.py
```
### PASCAL Context Full (PC-459)
Download the dataset from http://host.robots.ox.ac.uk/pascal/VOC/ and annotation from https://www.cs.stanford.edu/~roozbeh/pascal-context/:
```bash
cd $DETECTRON2_DATASETS
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar
# generate folder VOCdevkit/VOC2010
tar -xvf VOCtrainval_03-May-2010.tar
wget https://www.cs.stanford.edu/~roozbeh/pascal-context/trainval.tar.gz
# generate folder VOCdevkit/VOC2010/trainval
tar -xvzf trainval.tar.gz -C VOCdevkit/VOC2010
wget https://codalabuser.blob.core.windows.net/public/trainval_merged.json -P VOCdevkit/VOC2010/
```
Install [Detail API](https://github.com/zhanghang1989/detail-api) by:
```bash
git clone https://github.com/zhanghang1989/detail-api.git
rm detail-api/PythonAPI/detail/_mask.c
pip install -e detail-api/PythonAPI/
```
Generate the directory `pascal_ctx_d2/images` by running:
```bash
python datasets/prepare_pascal_ctx_sem_seg.py
```
Generate the directory `pascal_ctx_d2/annotations_ctx459` by running:
```bash
python datasets/prepare_pascal_ctx_full_sem_seg.py
```