- ResUNet++: An Advanced Architecture for Medical Image Segmentation Accurate computer-aided polyp detection and segmentation during colonoscopy examinations can help endoscopists resect abnormal tissue and thereby decrease chances of polyps growing into cancer. Towards developing a fully automated model for pixel-wise polyp segmentation, we propose ResUNet++, which is an improved ResUNet architecture for colonoscopic image segmentation. Our experimental evaluations show that the suggested architecture produces good segmentation results on publicly available datasets. Furthermore, ResUNet++ significantly outperforms U-Net and ResUNet, two key state-of-the-art deep learning architectures, by achieving high evaluation scores with a dice coefficient of 81.33%, and a mean Intersection over Union (mIoU) of 79.27% for the Kvasir-SEG dataset and a dice coefficient of 79.55%, and a mIoU of 79.62% with CVC-612 dataset. 7 authors · Nov 16, 2019
- Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation Deep neural network based methods have been successfully applied to music source separation. They typically learn a mapping from a mixture spectrogram to a set of source spectrograms, all with magnitudes only. This approach has several limitations: 1) its incorrect phase reconstruction degrades the performance, 2) it limits the magnitude of masks between 0 and 1 while we observe that 22% of time-frequency bins have ideal ratio mask values of over~1 in a popular dataset, MUSDB18, 3) its potential on very deep architectures is under-explored. Our proposed system is designed to overcome these. First, we propose to estimate phases by estimating complex ideal ratio masks (cIRMs) where we decouple the estimation of cIRMs into magnitude and phase estimations. Second, we extend the separation method to effectively allow the magnitude of the mask to be larger than 1. Finally, we propose a residual UNet architecture with up to 143 layers. Our proposed system achieves a state-of-the-art MSS result on the MUSDB18 dataset, especially, a SDR of 8.98~dB on vocals, outperforming the previous best performance of 7.24~dB. The source code is available at: https://github.com/bytedance/music_source_separation 5 authors · Sep 11, 2021
- Adversarial Perturbations Prevail in the Y-Channel of the YCbCr Color Space Deep learning offers state of the art solutions for image recognition. However, deep models are vulnerable to adversarial perturbations in images that are subtle but significantly change the model's prediction. In a white-box attack, these perturbations are generally learned for deep models that operate on RGB images and, hence, the perturbations are equally distributed in the RGB color space. In this paper, we show that the adversarial perturbations prevail in the Y-channel of the YCbCr space. Our finding is motivated from the fact that the human vision and deep models are more responsive to shape and texture rather than color. Based on our finding, we propose a defense against adversarial images. Our defence, coined ResUpNet, removes perturbations only from the Y-channel by exploiting ResNet features in an upsampling framework without the need for a bottleneck. At the final stage, the untouched CbCr-channels are combined with the refined Y-channel to restore the clean image. Note that ResUpNet is model agnostic as it does not modify the DNN structure. ResUpNet is trained end-to-end in Pytorch and the results are compared to existing defence techniques in the input transformation category. Our results show that our approach achieves the best balance between defence against adversarial attacks such as FGSM, PGD and DDN and maintaining the original accuracies of VGG-16, ResNet50 and DenseNet121 on clean images. We perform another experiment to show that learning adversarial perturbations only for the Y-channel results in higher fooling rates for the same perturbation magnitude. 5 authors · Feb 24, 2020