Mask2Former (Swin-Base)

Cheng et al., 2022 — Masked-attention Mask Transformer for Universal Image Segmentation (arXiv:2112.01527)

Lucid port of facebook/mask2former-swin-base-ade-semantic, converted to Lucid-native safetensors.

Available weights

Tag               mIoU (%)  Params  GFLOPs  Size       Source
ADE20K (default)  53.9      106.9M  -       407.98 MB  facebook
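
For readers unfamiliar with the mIoU column, the following is a minimal sketch of how mean intersection-over-union can be computed from a confusion matrix. The function name `mean_iou` and the toy arrays are illustrative only. This is not the evaluation script behind the number in the table, and benchmark protocols typically accumulate the confusion matrix over the whole validation set and handle ignore labels, which this sketch omits.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target (hypothetical helper)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # confusion[t, p] counts pixels whose true class is t and predicted class is p.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    present = union > 0  # skip classes that never appear, to avoid 0/0
    return float((intersection[present] / union[present]).mean())

# Toy 2x3 label maps with three classes.
pred = np.array([[0, 0, 1], [1, 2, 2]])
target = np.array([[0, 1, 1], [1, 2, 2]])
score = mean_iou(pred, target, num_classes=3)
```

Multiplying the result by 100 gives the percentage form used in the table.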

Usage

import lucid.models as models
from lucid.models.weights import Mask2FormerSwinBaseWeights

# default tag
model = models.mask2former_swin_base(pretrained=True)

# explicit tag (enum or string)
model = models.mask2former_swin_base(weights=Mask2FormerSwinBaseWeights.ADE20K)
model = models.mask2former_swin_base(pretrained="ADE20K")

# preprocessing travels with the weights
weights = Mask2FormerSwinBaseWeights.ADE20K
preprocess = weights.transforms()
# `image` is a single input image supplied by the caller; [None] adds the batch dimension
out = model(preprocess(image)[None])
# SemanticSegmentationOutput: per-pixel class logits (B, C, H, W)
seg = out.logits.argmax(axis=1)  # (B, H, W) class indices
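
To make the final `argmax` step concrete without downloading weights, here is a small sketch that uses a random NumPy array as a stand-in for `out.logits`. It assumes the logits behave like a NumPy array of shape (B, C, H, W) and that C is 150, the number of ADE20K semantic classes. Neither assumption is confirmed by this card, and the `coverage` summary is purely illustrative.

```python
import numpy as np

# Stand-in for out.logits: random scores with shape (B, C, H, W).
# C = 150 is assumed here (ADE20K semantic classes).
rng = np.random.default_rng(0)
logits = rng.standard_normal((1, 150, 4, 6))

# Same reduction as in the usage snippet: highest-scoring class per pixel.
seg = logits.argmax(axis=1)  # (B, H, W) class indices

# Fraction of pixels assigned to each predicted class in the first image.
class_ids, pixel_counts = np.unique(seg[0], return_counts=True)
coverage = dict(zip(class_ids.tolist(), (pixel_counts / seg[0].size).tolist()))
```

The keys of `coverage` are class indices. Mapping them to class names would need the label list shipped with the weights, which is not shown here.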

Conversion

Converted from facebook/mask2former-swin-base-ade-semantic by running python -m tools.convert_weights mask2former_swin_base --tag ADE20K. The key mapping and numerical parity were verified against the source checkpoint.
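
As a rough illustration of what a numerical parity check can look like, the sketch below compares two output arrays by their largest absolute difference. The helper names (`max_abs_diff`, `outputs_match`), the 1e-4 tolerance, and the synthetic arrays are all assumptions made for this example. It does not reproduce the check performed by tools.convert_weights.

```python
import numpy as np

def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two same-shaped arrays."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    return float(np.abs(reference - candidate).max())

def outputs_match(reference, candidate, atol=1e-4):
    """True when the two outputs agree within the (assumed) absolute tolerance."""
    return max_abs_diff(reference, candidate) <= atol

# Synthetic stand-ins for logits from the source model and from the port.
rng = np.random.default_rng(0)
source_logits = rng.standard_normal((1, 150, 8, 8)).astype(np.float32)
ported_logits = source_logits + np.float32(1e-6)  # tiny float noise

parity_ok = outputs_match(source_logits, ported_logits)
```

In practice both arrays would come from running the same preprocessed input through the original and the converted model.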

License

other: inherited from the original weights.

Citation

@inproceedings{cheng2022mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Cheng, Bowen and Misra, Ishan and Schwing, Alexander G. and Kirillov, Alexander and Girdhar, Rohit},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}