MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO

We developed "YOLO Phantom" for object detection in low-light and occluded scenarios on resource-constrained IoT platforms. At its core is the novel "Phantom Convolution," which allows YOLO Phantom to achieve accuracy comparable to YOLOv8n with a 43% reduction in parameters and model size and a 19% reduction in GFLOPs. Transfer learning on our multimodal dataset further improves the model's robustness in adverse visual conditions. Deployed on a Raspberry Pi IoT platform with noIR cameras and integrated with AWS IoT Core and SNS, YOLO Phantom delivers a 17% and 14% increase in frames per second for thermal and RGB detection, respectively, compared to the baseline YOLOv8n model.
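
For reference, below is a minimal inference-and-alert sketch of the pipeline described above. It assumes the released weights are an Ultralytics-compatible checkpoint; the checkpoint filename, image path, and SNS topic ARN are hypothetical placeholders, not values taken from the paper or this model card.

```python
import boto3
from ultralytics import YOLO

# Load the YOLO Phantom checkpoint (hypothetical filename; adjust to the released weights).
model = YOLO("yolo_phantom.pt")

# Run detection on a single thermal or RGB frame (placeholder image path).
results = model("frame.jpg", conf=0.25)

# Collect the names of the detected classes from the first result.
detections = [model.names[int(c)] for c in results[0].boxes.cls]

if detections:
    # Publish a notification through AWS SNS (hypothetical topic ARN).
    sns = boto3.client("sns")
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:yolo-phantom-alerts",
        Subject="YOLO Phantom detection",
        Message=f"Detected: {', '.join(detections)}",
    )
```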

Figure: Comparison of small models

Figure: GFLOPs comparison

Figure: Detection in various low-light and occluded conditions

Figure: Occluded detections

To learn more about MODIPHY, please refer to the preprint available on arXiv.

Please refer to the yolo_phantom repository for the implementation.

Download the multimodal dataset

If you find this work useful, please consider citing us:

@article{mukherjee2024modiphy,
  title={MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO},
  author={Mukherjee, Shubhabrata and Beard, Cory and Li, Zhu},
  journal={arXiv preprint arXiv:2402.07894},
  year={2024}
}
