	# 2: Train with customized datasets

In this note, you will learn how to train, test, and run inference with predefined models on customized datasets. We use the [balloon dataset](https://github.com/matterport/Mask_RCNN/tree/master/samples/balloon) as an example to describe the whole process.

	The basic steps are as below:

	1. Prepare the customized dataset
	2. Prepare a config
3. Train, test, and run inference with models on the customized dataset.

	## Prepare the customized dataset

	There are three ways to support a new dataset in MMDetection:

1. Reorganize the dataset into COCO format.
2. Reorganize the dataset into a middle format.
3. Implement a new dataset.

We usually recommend the first two methods, which are generally easier than the third.

	In this note, we give an example for converting the data into COCO format.

Note: MMDetection currently supports evaluating mask AP only for datasets in COCO format.
For instance segmentation tasks, users should therefore convert their data into COCO format.

	### COCO annotation format

The necessary keys of the COCO format for instance segmentation are listed below; for complete details, please refer to the [COCO format documentation](https://cocodataset.org/#format-data).

```json
{
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}


image = {
    "id": int,
    "width": int,
    "height": int,
    "file_name": str,
}

annotation = {
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}

categories = [{
    "id": int,
    "name": str,
    "supercategory": str,
}]
```
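As a concrete illustration of the keys listed above, the following minimal sketch builds a one-image, one-object COCO-style dictionary and checks that the required keys are present. The helper name `missing_coco_keys` and the sample values are assumptions made for this example; they are not part of MMDetection or the COCO API.

```python
# Required keys per section, taken from the schema above.
# 'supercategory' appears in the schema but is omitted by the balloon
# converter in this note, so this sketch does not enforce it.
REQUIRED_KEYS = {
    'images': {'id', 'width', 'height', 'file_name'},
    'annotations': {'id', 'image_id', 'category_id', 'segmentation',
                    'area', 'bbox', 'iscrowd'},
    'categories': {'id', 'name'},
}


def missing_coco_keys(coco):
    """Return a list of (section, index, missing_keys) tuples."""
    problems = []
    for section, required in REQUIRED_KEYS.items():
        for index, record in enumerate(coco.get(section, [])):
            missing = required - set(record)
            if missing:
                problems.append((section, index, sorted(missing)))
    return problems


# Hypothetical sample: one 640x480 image with one rectangular object.
example = {
    'images': [
        {'id': 0, 'width': 640, 'height': 480, 'file_name': 'example.jpg'},
    ],
    'annotations': [{
        'id': 0,
        'image_id': 0,
        'category_id': 0,
        # One polygon, flattened as [x1, y1, x2, y2, ...]
        'segmentation': [[10.0, 10.0, 110.0, 10.0, 110.0, 60.0, 10.0, 60.0]],
        'area': 5000.0,
        'bbox': [10.0, 10.0, 100.0, 50.0],  # [x, y, width, height]
        'iscrowd': 0,
    }],
    'categories': [{'id': 0, 'name': 'balloon'}],
}

print(missing_coco_keys(example))  # []
```

An empty list means every record carries the keys the schema asks for; any other result names the section, the record index, and the keys that are missing.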

Assume we use the balloon dataset.
After downloading the data, we need to implement a function that converts its annotation format into the COCO format. Then we can use the existing `CocoDataset` to load the data and perform training and evaluation.

If you take a look at the dataset, you will find that each annotation entry has the following format:

```json
{'base64_img_data': '',
 'file_attributes': {},
 'filename': '34020010494_e5cb88e1c4_k.jpg',
 'fileref': '',
 'regions': {'0': {'region_attributes': {},
                   'shape_attributes': {'all_points_x': [1020, 1000, 994, 1003, 1023, 1050, 1089, 1134,
                                                         1190, 1265, 1321, 1361, 1403, 1428, 1442, 1445,
                                                         1441, 1427, 1400, 1361, 1316, 1269, 1228, 1198,
                                                         1207, 1210, 1190, 1177, 1172, 1174, 1170, 1153,
                                                         1127, 1104, 1061, 1032, 1020],
                                        'all_points_y': [963, 899, 841, 787, 738, 700, 663, 638,
                                                         621, 619, 643, 672, 720, 765, 800, 860,
                                                         896, 942, 990, 1035, 1079, 1112, 1129, 1134,
                                                         1144, 1153, 1166, 1166, 1150, 1136, 1129, 1122,
                                                         1112, 1084, 1037, 989, 963],
                                        'name': 'polygon'}}},
 'size': 1115004}
```

The annotation is a JSON file in which each key maps to all annotations of one image.
The code to convert the balloon dataset into COCO format is as below.

```python
import os.path as osp

import mmcv


def convert_balloon_to_coco(ann_file, out_file, image_prefix):
    # Load the balloon annotation file: a dict with one entry per image
    data_infos = mmcv.load(ann_file)

    annotations = []
    images = []
    obj_count = 0
    for idx, v in enumerate(mmcv.track_iter_progress(data_infos.values())):
        filename = v['filename']
        img_path = osp.join(image_prefix, filename)
        # The image size is not stored in the annotation, so read the image
        height, width = mmcv.imread(img_path).shape[:2]

        images.append(dict(
            id=idx,
            file_name=filename,
            height=height,
            width=width))

        for obj in v['regions'].values():
            assert not obj['region_attributes']
            obj = obj['shape_attributes']
            px = obj['all_points_x']
            py = obj['all_points_y']
            # Shift each vertex by half a pixel, then flatten the polygon
            # into [x1, y1, x2, y2, ...] as COCO expects
            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]

            x_min, y_min, x_max, y_max = (
                min(px), min(py), max(px), max(py))

            data_anno = dict(
                image_id=idx,
                id=obj_count,
                category_id=0,
                # COCO boxes are [x, y, width, height]
                bbox=[x_min, y_min, x_max - x_min, y_max - y_min],
                # Bounding-box area, used here in place of the polygon area
                area=(x_max - x_min) * (y_max - y_min),
                segmentation=[poly],
                iscrowd=0)
            annotations.append(data_anno)
            obj_count += 1

    coco_format_json = dict(
        images=images,
        annotations=annotations,
        categories=[{'id': 0, 'name': 'balloon'}])
    mmcv.dump(coco_format_json, out_file)
```

Using the function above, users can convert the annotation file into COCO format, and then use `CocoDataset` to train and evaluate the model.
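The converter above stores the bounding-box area in the `area` field, which overestimates the size of a non-rectangular object such as a balloon. If a tighter value is wanted, the polygon area can be computed with the shoelace formula. The sketch below is one possible way to do that; `polygon_area` is a hypothetical helper, not part of the original script or of MMDetection.

```python
def polygon_area(xs, ys):
    """Area of a simple polygon from its vertex coordinates (shoelace formula)."""
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # wrap around so the last vertex connects to the first
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    # The sign depends on vertex order (clockwise or not), so take abs()
    return abs(twice_area) / 2.0


# A 100 x 50 rectangle, so the expected area is 5000
xs = [10, 110, 110, 10]
ys = [10, 10, 60, 60]
print(polygon_area(xs, ys))  # 5000.0
```

A polygon whose last vertex repeats the first, as in the balloon sample shown earlier, gives the same result because the repeated vertex adds a zero-length edge. Under these assumptions, `area=polygon_area(px, py)` could stand in for the bounding-box product in the converter.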

	## Prepare a config

The second step is to prepare a config so that the dataset can be loaded successfully. Assume that we want to use Mask R-CNN with FPN. The config to train the detector on the balloon dataset, placed under the directory `configs/balloon/` and named `mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py`, is as below.

```python
# The new config inherits a base config to highlight the necessary modification.
# The path is resolved relative to this file, which lives in configs/balloon/.
_base_ = '../mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py'

# We also need to change the num_classes in head to match the dataset's annotation
model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=1),
        mask_head=dict(num_classes=1)))

# Modify dataset related settings
dataset_type = 'CocoDataset'
classes = ('balloon',)
data = dict(
    train=dict(
        img_prefix='balloon/train/',
        classes=classes,
        ann_file='balloon/train/annotation_coco.json'),
    val=dict(
        img_prefix='balloon/val/',
        classes=classes,
        ann_file='balloon/val/annotation_coco.json'),
    test=dict(
        img_prefix='balloon/val/',
        classes=classes,
        ann_file='balloon/val/annotation_coco.json'))

# We can use the pre-trained Mask RCNN model to obtain higher performance
load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
```
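The config above only lists the fields that differ from the base file; everything else is inherited through `_base_`. The sketch below imitates that behaviour with a plain recursive dictionary merge, to show why `num_classes` can be overridden without restating the rest of `roi_head`. It is a simplified illustration and not MMCV's actual `Config` implementation; `merge_config` is a hypothetical helper, and the `base` dictionary is a trimmed stand-in rather than the real base config.

```python
def merge_config(base, override):
    """Recursively merge ``override`` into a copy of ``base``."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge field by field
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the child value replaces the base value
            merged[key] = value
    return merged


# Trimmed stand-in for the inherited base config (illustrative values)
base = dict(
    model=dict(
        roi_head=dict(
            bbox_head=dict(type='Shared2FCBBoxHead', num_classes=80),
            mask_head=dict(type='FCNMaskHead', num_classes=80))))

# The fields overridden by the balloon config
override = dict(
    model=dict(
        roi_head=dict(
            bbox_head=dict(num_classes=1),
            mask_head=dict(num_classes=1))))

merged = merge_config(base, override)
print(merged['model']['roi_head']['bbox_head'])
# {'type': 'Shared2FCBBoxHead', 'num_classes': 1}
```

Only `num_classes` changes in each head, while the inherited `type` field is kept.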

	## Train a new model

To train a model with the new config, you can simply run:

	```shell
	python tools/train.py configs/balloon/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py
	```

For more detailed usage, please refer to [Case 1](1_exist_data_model.md).

	## Test and inference

To test the trained model, you can simply run the command below. By default, the work directory is named after the config file without its `.py` extension.

```shell
python tools/test.py configs/balloon/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py work_dirs/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon/latest.pth --eval bbox segm
```

For more detailed usage, please refer to [Case 1](1_exist_data_model.md).
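For inference on individual images, MMDetection 2.x provides `init_detector` and `inference_detector` in `mmdet.apis`. The sketch below wraps them in a hypothetical helper, `detect_balloons`, and adds a small pure-Python function that counts detections above a score threshold. It assumes the MMDetection 2.x result layout for mask models: a `(bbox_results, segm_results)` tuple in which `bbox_results[c]` holds rows of `[x1, y1, x2, y2, score]` for class `c`. The device name, the threshold, and the stand-in result are placeholders.

```python
def detect_balloons(config_file, checkpoint_file, image_path, device='cuda:0'):
    """Load a trained detector and run it on one image (requires mmdet)."""
    # Imported lazily so the counting helper below works without mmdet
    from mmdet.apis import inference_detector, init_detector

    model = init_detector(config_file, checkpoint_file, device=device)
    return inference_detector(model, image_path)


def count_detections(result, score_thr=0.5):
    """Count, per class, the boxes whose score is at least ``score_thr``."""
    # Mask models return (bbox_results, segm_results); box-only models
    # return bbox_results directly.
    bbox_results = result[0] if isinstance(result, tuple) else result
    return [sum(1 for box in boxes if box[4] >= score_thr)
            for boxes in bbox_results]


# Stand-in result: one class ('balloon') with three candidate boxes
fake_bboxes = [[
    [0, 0, 10, 10, 0.9],
    [5, 5, 20, 20, 0.4],
    [1, 1, 2, 2, 0.75],
]]
fake_result = (fake_bboxes, [[None, None, None]])
print(count_detections(fake_result))  # [2]
```

With a trained checkpoint, `detect_balloons` would be called with the balloon config, the checkpoint under `work_dirs/`, and the path of an image of your choice, and its return value passed to `count_detections`.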