---
license: openrail
language:
- en
pipeline_tag: image-segmentation
---

## Description

Semantic segmentation is a computer vision technique that assigns a label to each pixel in an image, where the label is the semantic class of the object or region the pixel belongs to. It is a form of dense prediction: the output is a label for every pixel, rather than bounding boxes or key points as in object detection or keypoint detection. The goal is to recognize the objects and scenes in an image and to partition the image into segments that correspond to different entities.

## Parameters

```python
from transformers import SegformerForSemanticSegmentation

# Label names are not given in this card; these two are placeholders.
id2label = {0: "background", 1: "foreground"}
label2id = {name: idx for idx, name in id2label.items()}

model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b4",
    num_labels=2,
    id2label=id2label,
    label2id=label2id,
)
```

## Usage

```python
import albumentations as A
import matplotlib.pyplot as plt
import numpy as np
import torch
from albumentations.pytorch import ToTensorV2
from torch import nn

# `image` is assumed to be an RGB PIL.Image loaded beforehand,
# and `model` the SegFormer model from the Parameters section.

# Transforms
_transform = A.Compose([
    A.Resize(height=512, width=512),
    ToTensorV2(),
])

trans_image = _transform(image=np.array(image))
model.eval()
with torch.no_grad():
    outputs = model(trans_image['image'].float().unsqueeze(0))
logits = outputs.logits.cpu()
print(logits.shape)  # (batch_size, num_labels, height / 4, width / 4)

# First, rescale logits to original image size
upsampled_logits = nn.functional.interpolate(
    logits,
    size=image.size[::-1],  # PIL size is (width, height); reverse to (height, width)
    mode='bilinear',
    align_corners=False,
)

# Then take the most likely label for each pixel
seg = upsampled_logits.argmax(dim=1)[0].numpy()
color_seg = np.zeros((seg.shape[0], seg.shape[1], 3), dtype=np.uint8)  # height, width, 3
palette = np.array([[0, 0, 0], [255, 255, 255]])  # label 0 -> black, label 1 -> white
for label, color in enumerate(palette):
    color_seg[seg == label, :] = color

# Convert RGB to BGR
color_seg = color_seg[..., ::-1]
```

## Metric

Todo

## Note

This model was not built with a Hugging Face feature extractor, so the automatic inference API may not work with it.
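
To make the per-pixel labelling from the description concrete without downloading any weights, here is a minimal NumPy-only sketch of the last two steps of the usage code: taking the argmax over the class dimension and mapping labels to colors. The helper name `logits_to_mask` and the toy scores are made up for illustration and are not part of this model's code. The palette lookup uses array indexing as an alternative to the loop above.

```python
import numpy as np

def logits_to_mask(logits, palette):
    """Turn (num_labels, H, W) class scores into a label map and a color mask.

    Hypothetical helper for illustration only.
    """
    seg = logits.argmax(axis=0)   # (H, W): most likely label per pixel
    color_seg = palette[seg]      # (H, W, 3): look up each label's color
    return seg, color_seg

# Same two-color palette as in the usage code: label 0 -> black, label 1 -> white
palette = np.array([[0, 0, 0], [255, 255, 255]], dtype=np.uint8)

# Toy scores for a 2x2 image with two classes (values are invented)
toy_logits = np.array([
    [[2.0, 0.1], [0.3, 0.2]],   # scores for label 0
    [[1.0, 0.9], [0.1, 0.8]],   # scores for label 1
])

seg, color_seg = logits_to_mask(toy_logits, palette)
print(seg)
print(color_seg.shape)
```

With a real model, `toy_logits` would be replaced by one item of the upsampled logits converted to a NumPy array.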