Decoder Denoising Pretraining for Semantic Segmentation
Abstract
Semantic segmentation labels are expensive and time-consuming to acquire. Hence, pretraining is commonly used to improve the label-efficiency of segmentation models. Typically, the encoder of a segmentation model is pretrained as a classifier and the decoder is randomly initialized. Here, we argue that random initialization of the decoder can be suboptimal, especially when few labeled examples are available. We propose a decoder pretraining approach based on denoising, which can be combined with supervised pretraining of the encoder. We find that decoder denoising pretraining on the ImageNet dataset strongly outperforms encoder-only supervised pretraining. Despite its simplicity, decoder denoising pretraining achieves state-of-the-art results on label-efficient semantic segmentation and offers considerable gains on the Cityscapes, Pascal Context, and ADE20K datasets.
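The abstract describes the method only at a high level. Below is a minimal sketch of what a decoder denoising pretraining step could look like, assuming additive Gaussian corruption and a noise-prediction objective; the function names, the noise scale `sigma`, and the choice of predicting the noise rather than the clean image are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of decoder denoising pretraining (illustrative assumptions:
# additive Gaussian noise, MSE noise-prediction loss, generic encoder/decoder modules).
import torch
import torch.nn as nn


def decoder_denoising_step(encoder, decoder, images, optimizer, sigma=0.2):
    """One pretraining step: corrupt the input with Gaussian noise and train
    the decoder to predict the added noise from the encoder's features."""
    noise = torch.randn_like(images)
    noisy_images = images + sigma * noise        # additive Gaussian corruption
    features = encoder(noisy_images)             # encoder may be frozen or fine-tuned
    predicted_noise = decoder(features)          # per-pixel estimate of the noise
    loss = nn.functional.mse_loss(predicted_noise, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

After this denoising phase, the decoder weights would be used to initialize the segmentation decoder before supervised fine-tuning on the labeled segmentation data.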