3rd place solution

#23
by eugene123tw - opened

Summary

In this contest, I trained a ship detection model using the mmdetection library. I implemented an image tiling technique to detect both small and large objects in the images. I trained three VFNet-ResNet50 models pre-trained on COCO, each with a different input setting (400x400 tiles, 600x600 tiles, and full images), and ensembled them using Weighted-Boxes-Fusion to improve accuracy.

Data

I created a single split of the annotations, with 80% of the data used for training and 20% for validation. I did not create K-fold splits due to time constraints, but I strongly recommend K-fold to achieve better results. The image tiling technique was implemented in mmdetection, following the approach described in this paper: https://openaccess.thecvf.com/content_CVPRW_2019/papers/UAVision/Unel_The_Power_of_Tiling_for_Small_Object_Detection_CVPRW_2019_paper.pdf. Both full images and tiled images were used during training and inference, so that large and small objects could both be detected. The tiling implementation can be found in the mmdet dataset config and my mmdet fork.
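A minimal sketch of the tiling idea, for readers who have not seen it before. This is not the code from the mmdet fork: the function names are hypothetical, and the 20% overlap is an assumed value that the write-up does not state. It computes overlapping tile windows for an image and shows how a box predicted inside a tile is shifted back to full-image coordinates.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def tile_origins(length: int, tile: int, overlap: float) -> List[int]:
    """Start offsets of tiles along one axis; the last tile sits flush with the edge."""
    if length <= tile:
        return [0]
    # Stride between neighbouring tiles; overlap is a fraction of the tile size.
    stride = max(1, int(round(tile * (1.0 - overlap))))
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile touches the image border
    return starts


def make_tiles(width: int, height: int, tile: int, overlap: float = 0.2) -> List[Box]:
    """Tile windows covering the image. overlap=0.2 is an assumption, not from the post."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


def to_image_coords(box: Box, tile_box: Box) -> Box:
    """Shift a box predicted in tile coordinates back into full-image coordinates."""
    tx, ty = tile_box[0], tile_box[1]
    x1, y1, x2, y2 = box
    return (x1 + tx, y1 + ty, x2 + tx, y2 + ty)


if __name__ == "__main__":
    tiles = make_tiles(width=1000, height=800, tile=400)
    print(len(tiles), "tiles; last window:", tiles[-1])
    print(to_image_coords((10, 20, 50, 60), tiles[-1]))
```

Detections from all tiles and from the full image end up in the same coordinate frame this way, so overlapping duplicates can then be merged by NMS or a box-fusion step.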

Model

I trained three VFNet-ResNet50 models pre-trained on COCO, each with a different input setting: 400x400 tiles, 600x600 tiles, and full images (without tiling). Each model was trained with its own mmdetection config, and their predictions were ensembled using Weighted-Boxes-Fusion. The mmdetection config for each model is available in the Links section below.

Results

The final predictions were produced by ensembling the three models with Weighted-Boxes-Fusion (WBF), a method introduced in https://arxiv.org/abs/1910.13302. An implementation of WBF can be found here: https://github.com/ZFTurbo/Weighted-Boxes-Fusion.
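A rough sketch of how per-model predictions might be fused with the `ensemble_boxes` package from the repository above. The helper names, the equal default weights, and the `iou_thr` and `skip_box_thr` values are assumptions for illustration, not the settings used in this solution. WBF expects boxes normalized to [0, 1], so pixel boxes are scaled before fusion and scaled back afterwards.

```python
from typing import List, Optional, Sequence


def normalize_boxes(boxes: Sequence[Sequence[float]], width: int, height: int) -> List[List[float]]:
    """Scale pixel boxes (x1, y1, x2, y2) to [0, 1] and clip to the image."""
    def clip(v: float) -> float:
        return min(max(v, 0.0), 1.0)
    return [
        [clip(x1 / width), clip(y1 / height), clip(x2 / width), clip(y2 / height)]
        for x1, y1, x2, y2 in boxes
    ]


def denormalize_boxes(boxes: Sequence[Sequence[float]], width: int, height: int) -> List[List[float]]:
    """Scale [0, 1] boxes back to pixel coordinates."""
    return [[x1 * width, y1 * height, x2 * width, y2 * height] for x1, y1, x2, y2 in boxes]


def fuse_models(
    per_model_boxes: Sequence[Sequence[Sequence[float]]],
    per_model_scores: Sequence[Sequence[float]],
    per_model_labels: Sequence[Sequence[int]],
    width: int,
    height: int,
    weights: Optional[Sequence[float]] = None,  # None means equal weights (assumed)
    iou_thr: float = 0.55,       # assumed value, not from the post
    skip_box_thr: float = 0.0,   # assumed value, not from the post
):
    """Fuse one image's predictions from several models with WBF."""
    # Imported here so the helpers above work without the package installed.
    # Install with: pip install ensemble-boxes
    from ensemble_boxes import weighted_boxes_fusion

    boxes_list = [normalize_boxes(b, width, height) for b in per_model_boxes]
    boxes, scores, labels = weighted_boxes_fusion(
        boxes_list,
        [list(s) for s in per_model_scores],
        [list(l) for l in per_model_labels],
        weights=weights,
        iou_thr=iou_thr,
        skip_box_thr=skip_box_thr,
    )
    return denormalize_boxes(boxes.tolist(), width, height), scores.tolist(), labels.tolist()
```

Here `per_model_boxes` would hold one list of pixel boxes per model (for example the 400x400-tile, 600x600-tile, and full-image models), all already mapped to full-image coordinates.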

Links
