Non-local Neural Networks

Introduction

@inproceedings{wang2018non,
  title={Non-local neural networks},
  author={Wang, Xiaolong and Girshick, Ross and Gupta, Abhinav and He, Kaiming},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={7794--7803},
  year={2018}
}
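
For orientation, the non-local operation from the cited paper computes each output position as a weighted sum over all input positions, y_i = (1/C(x)) * sum_j f(x_i, x_j) g(x_j). Below is a minimal PyTorch sketch of the embedded-Gaussian variant, where the normalised affinity is a softmax over positions. The class name `NonLocalBlock2d`, the `reduction` parameter, and the zero-initialised output projection are illustrative assumptions (the paper zero-initialises a BatchNorm scale for the same identity-at-start effect). It is not the implementation behind the checkpoints listed here.

```python
import torch
import torch.nn as nn


class NonLocalBlock2d(nn.Module):
    """Embedded-Gaussian non-local block (illustrative sketch, hypothetical name)."""

    def __init__(self, in_channels: int, reduction: int = 2):
        super().__init__()
        inter = max(in_channels // reduction, 1)
        # 1x1 convolutions produce the theta, phi, and g embeddings.
        self.theta = nn.Conv2d(in_channels, inter, kernel_size=1)
        self.phi = nn.Conv2d(in_channels, inter, kernel_size=1)
        self.g = nn.Conv2d(in_channels, inter, kernel_size=1)
        # Projection back to in_channels. Zero-initialised (an assumption of
        # this sketch) so the block starts out as an identity mapping.
        self.out = nn.Conv2d(inter, in_channels, kernel_size=1)
        nn.init.zeros_(self.out.weight)
        nn.init.zeros_(self.out.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, _, h, w = x.shape
        theta = self.theta(x).flatten(2).transpose(1, 2)  # (N, HW, C')
        phi = self.phi(x).flatten(2)                      # (N, C', HW)
        g = self.g(x).flatten(2).transpose(1, 2)          # (N, HW, C')
        # Affinity between every pair of positions, softmax-normalised over j.
        attn = torch.softmax(theta @ phi, dim=-1)         # (N, HW, HW)
        y = (attn @ g).transpose(1, 2).reshape(n, -1, h, w)
        # Residual connection around the non-local operation.
        return x + self.out(y)
```

The attention matrix has HW x HW entries per image, so memory grows quadratically with the feature-map size. This is consistent with the memory column in the tables below rising with larger crops.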

Results and models

Cityscapes

Method    Backbone  Crop Size  Lr schd  Mem (GB)  Inf time (fps)  mIoU   mIoU(ms+flip)  download
NonLocal  R-50-D8   512x1024   40000    7.4       2.72            78.24  -              model | log
NonLocal  R-101-D8  512x1024   40000    10.9      1.95            78.66  -              model | log
NonLocal  R-50-D8   769x769    40000    8.9       1.52            78.33  79.92          model | log
NonLocal  R-101-D8  769x769    40000    12.8      1.05            78.57  80.29          model | log
NonLocal  R-50-D8   512x1024   80000    -         -               78.01  -              model | log
NonLocal  R-101-D8  512x1024   80000    -         -               78.93  -              model | log
NonLocal  R-50-D8   769x769    80000    -         -               79.05  80.68          model | log
NonLocal  R-101-D8  769x769    80000    -         -               79.40  80.85          model | log

ADE20K

Method    Backbone  Crop Size  Lr schd  Mem (GB)  Inf time (fps)  mIoU   mIoU(ms+flip)  download
NonLocal  R-50-D8   512x512    80000    9.1       21.37           40.75  42.05          model | log
NonLocal  R-101-D8  512x512    80000    12.6      13.97           42.90  44.27          model | log
NonLocal  R-50-D8   512x512    160000   -         -               42.03  43.04          model | log
NonLocal  R-101-D8  512x512    160000   -         -               43.36  44.83          model | log

Pascal VOC 2012 + Aug

Method    Backbone  Crop Size  Lr schd  Mem (GB)  Inf time (fps)  mIoU   mIoU(ms+flip)  download
NonLocal  R-50-D8   512x512    20000    6.4       21.21           76.20  77.12          model | log
NonLocal  R-101-D8  512x512    20000    9.8       14.01           78.15  78.86          model | log
NonLocal  R-50-D8   512x512    40000    -         -               76.65  77.47          model | log
NonLocal  R-101-D8  512x512    40000    -         -               78.27  79.12          model | log
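
The mIoU columns above report mean intersection-over-union. As a rough illustration of the metric, here is a minimal pure-Python sketch, assuming the standard definition (per-class IoU averaged over the classes that occur) and the common convention of skipping pixels labelled 255. The function name `mean_iou` and the `ignore_index` default are assumptions made for this example, not the evaluation code used to produce the tables.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the predictions or the labels.

    pred and target are flat sequences of integer class ids of equal length.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from the metric
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            inter[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")
```

Benchmark evaluations typically accumulate the per-class intersection and union counts over the whole validation set before dividing, rather than averaging per-image scores. The same function covers that case when the pixels of all images are passed in together.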