SegFormer (b3-sized) model fine-tuned on CCAgT dataset

SegFormer model fine-tuned on the CCAgT dataset at resolution 400x300. It was introduced in the paper Semantic Segmentation for the Detection of Very Small Objects on Cervical Cell Samples Stained with the AgNOR Technique by J. G. A. Amorim et al.

This model was trained on a subset of the CCAgT dataset, so evaluating it on the dataset available on the Hugging Face Hub will give results that differ from those presented in the paper. For more information about how the model was trained, read the paper.

Disclaimer: This model card has been written based on the SegFormer model card by the Hugging Face team.

Model description

SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head to achieve great results on semantic segmentation benchmarks such as ADE20K and Cityscapes. The hierarchical Transformer is first pre-trained on ImageNet-1k, after which a decode head is added and the whole model is fine-tuned on a downstream dataset.
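
To make the "hierarchical" part concrete: the encoder produces feature maps at four progressively coarser resolutions, which the decode head then fuses. The sketch below estimates those spatial sizes with standard convolution arithmetic. The kernel sizes, strides, and padding are assumptions based on the default SegFormer configuration in `transformers`, not values read from this checkpoint, and the helper names `conv_out` and `stage_shapes` are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def stage_shapes(height, width, kernels=(7, 3, 3, 3), strides=(4, 2, 2, 2)):
    """Return the (height, width) of the feature map after each encoder stage.

    Assumes overlapping patch embeddings with padding = kernel // 2
    (an assumption about the default configuration).
    """
    shapes = []
    for kernel, stride in zip(kernels, strides):
        height = conv_out(height, kernel, stride, kernel // 2)
        width = conv_out(width, kernel, stride, kernel // 2)
        shapes.append((height, width))
    return shapes


# A 300 (height) x 400 (width) input, before any resizing by the preprocessor.
print(stage_shapes(300, 400))
```

Under these assumptions the first stage is roughly a quarter of the input resolution in each dimension. The logits returned by the model are at that reduced resolution too, which is why the usage example upsamples them back to the image size.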

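This repository contains the hierarchical Transformer together with the decode head, fine-tuned on the CCAgT dataset, hence it can be used directly for semantic segmentation or as a starting point for further fine-tuning.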

Intended uses & limitations

You can use the model for fine-tuning of semantic segmentation. See the model hub to look for fine-tuned versions on a task that interests you.

How to use

Here is how to use this model to segment an image from the CCAgT dataset:

from transformers import AutoFeatureExtractor, SegformerForSemanticSegmentation
from PIL import Image
from torch import nn
import requests

url = "https://huggingface.co/lapix/segformer-b3-finetuned-ccagt-400-300/resolve/main/sampleB.png"
image = Image.open(requests.get(url, stream=True).raw)

model = SegformerForSemanticSegmentation.from_pretrained("lapix/segformer-b3-finetuned-ccagt-400-300")
feature_extractor = AutoFeatureExtractor.from_pretrained("lapix/segformer-b3-finetuned-ccagt-400-300")

# The feature extractor returns a dict-like batch; the model needs the tensor itself
pixel_values = feature_extractor(images=image, return_tensors="pt").pixel_values
outputs = model(pixel_values=pixel_values)
logits = outputs.logits

# Rescale logits to the original image size
upsampled_logits = nn.functional.interpolate(
    logits,
    size=image.size[::-1],  # PIL reports (width, height); interpolate expects (height, width)
    mode="bilinear",
    align_corners=False,
)

segmentation_mask = upsampled_logits.argmax(dim=1)[0]

print("Predicted mask:", segmentation_mask)
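
Once you have a mask, a quick sanity check is to count how many pixels were assigned to each class. The sketch below works on a plain nested list, which you can obtain with `segmentation_mask.tolist()`. The helper name `class_pixel_counts` and the toy mask are illustrative only, and class ids are left as integers because the label names depend on the checkpoint's configuration.

```python
from collections import Counter


def class_pixel_counts(mask):
    """Count pixels per class id in a 2D mask given as a list of rows."""
    counts = Counter()
    for row in mask:
        counts.update(row)
    # Sort by class id so the result is stable and easy to read.
    return dict(sorted(counts.items()))


# Toy 3x4 mask standing in for segmentation_mask.tolist()
toy_mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 0, 0, 0],
]
print(class_pixel_counts(toy_mask))
```

With the real model, `model.config.id2label` can be used to map these ids to class names.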

For more code examples, we refer to the documentation.

License

The license for this model can be found here.

BibTeX entry and citation info

@article{AtkinsonSegmentationAgNORSSRN2022,  
    author= {Jo{\~{a}}o Gustavo Atkinson Amorim and Andr{\'{e}} Vict{\'{o}}ria Matias and Allan Cerentini and Fabiana Botelho de Miranda Onofre and Alexandre Sherlley Casimiro Onofre and Aldo von Wangenheim},
    doi = {10.2139/ssrn.4126881},
    url = {https://doi.org/10.2139/ssrn.4126881},
    year = {2022},
    publisher = {Elsevier {BV}},
    title = {Semantic Segmentation for the Detection of Very Small Objects on Cervical Cell Samples Stained with the {AgNOR} Technique},
    journal = {{SSRN} Electronic Journal}
}
