
Unified Perceptual Parsing for Scene Understanding

Introduction

@inproceedings{xiao2018unified,
  title={Unified perceptual parsing for scene understanding},
  author={Xiao, Tete and Liu, Yingcheng and Zhou, Bolei and Jiang, Yuning and Sun, Jian},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={418--434},
  year={2018}
}

Results and models

Cityscapes

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UPerNet | R-50 | 512x1024 | 40000 | 6.4 | 4.25 | 77.10 | 78.37 | model \| log |
| UPerNet | R-101 | 512x1024 | 40000 | 7.4 | 3.79 | 78.69 | 80.11 | model \| log |
| UPerNet | R-50 | 769x769 | 40000 | 7.2 | 1.76 | 77.98 | 79.70 | model \| log |
| UPerNet | R-101 | 769x769 | 40000 | 8.4 | 1.56 | 79.03 | 80.77 | model \| log |
| UPerNet | R-50 | 512x1024 | 80000 | - | - | 78.19 | 79.19 | model \| log |
| UPerNet | R-101 | 512x1024 | 80000 | - | - | 79.40 | 80.46 | model \| log |
| UPerNet | R-50 | 769x769 | 80000 | - | - | 79.39 | 80.92 | model \| log |
| UPerNet | R-101 | 769x769 | 80000 | - | - | 80.10 | 81.49 | model \| log |

ADE20K

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UPerNet | R-50 | 512x512 | 80000 | 8.1 | 23.40 | 40.70 | 41.81 | model \| log |
| UPerNet | R-101 | 512x512 | 80000 | 9.1 | 20.34 | 42.91 | 43.96 | model \| log |
| UPerNet | R-50 | 512x512 | 160000 | - | - | 42.05 | 42.78 | model \| log |
| UPerNet | R-101 | 512x512 | 160000 | - | - | 43.82 | 44.85 | model \| log |

Pascal VOC 2012 + Aug

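| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UPerNet | R-50 | 512x512 | 20000 | 6.4 | 23.17 | 74.82 | 76.35 | model \| log |
| UPerNet | R-101 | 512x512 | 20000 | 7.5 | 19.98 | 77.10 | 78.29 | model \| log |
| UPerNet | R-50 | 512x512 | 40000 | - | - | 75.92 | 77.44 | model \| log |
| UPerNet | R-101 | 512x512 | 40000 | - | - | 77.43 | 78.56 | model \| log |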
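
For readers unfamiliar with the mIoU columns above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from predicted and ground-truth label maps. It is an illustration only, not the evaluation code used to produce these numbers: the function name `mean_iou`, the flattened-list inputs, and the ignore label `255` are assumptions made for the example, and benchmark scores are normally accumulated over an entire validation set rather than a single image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over classes that occur in the prediction or the ground truth.

    `pred` and `target` are flattened per-pixel class ids of equal length.
    Pixels whose ground-truth label equals `ignore_index` are skipped
    (255 is an assumed convention, not taken from this README).
    """
    intersect = [0] * num_classes    # pixels where pred == target == c
    pred_area = [0] * num_classes    # pixels predicted as class c
    target_area = [0] * num_classes  # pixels labelled as class c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersect[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersect[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersect[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]  # last pixel is ignored
    # Per-class IoU: class 0 = 1/2, class 1 = 2/3, class 2 = 1/1
    print(f"mIoU = {100 * mean_iou(pred, target, num_classes=3):.2f}")
```

The tables report mIoU as a percentage, which is why the usage example multiplies the result by 100.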