merve
posted an update Jan 18
Posting about a very underrated model that tops paperswithcode across different segmentation benchmarks: OneFormer πŸ‘‘

OneFormer is a "truly universal" model for semantic, instance and panoptic segmentation tasks βš”οΈ
What makes it truly universal is that it's a single model, trained only once, that can be used across all these tasks.
The enabler here is text conditioning: the model is given a text query stating the task type along with the appropriate input, and a contrastive loss teaches it to distinguish between the task types πŸ‘‡ (see the image below)

It's also super easy to use with transformers.
from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation
from PIL import Image

processor = OneFormerProcessor.from_pretrained("shi-labs/oneformer_ade20k_swin_large")
model = OneFormerForUniversalSegmentation.from_pretrained("shi-labs/oneformer_ade20k_swin_large")

# load any RGB image to segment
image = Image.open("your_image.jpg")

# swap the task_inputs and the post-processing call for different types of segmentation
semantic_inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt")
semantic_outputs = model(**semantic_inputs)
predicted_semantic_map = processor.post_process_semantic_segmentation(semantic_outputs, target_sizes=[image.size[::-1]])[0]
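To make the "swap" concrete, here is a minimal sketch of the panoptic and instance variants, reusing the same processor, model and image from the snippet above; only the task query and the post-processing call change:

# panoptic: change the task query and use the panoptic post-processing
panoptic_inputs = processor(images=image, task_inputs=["panoptic"], return_tensors="pt")
panoptic_outputs = model(**panoptic_inputs)
panoptic_result = processor.post_process_panoptic_segmentation(panoptic_outputs, target_sizes=[image.size[::-1]])[0]
# panoptic_result["segmentation"] is the segment map, panoptic_result["segments_info"] describes each segment

# instance segmentation works the same way
instance_inputs = processor(images=image, task_inputs=["instance"], return_tensors="pt")
instance_outputs = model(**instance_inputs)
instance_result = processor.post_process_instance_segmentation(instance_outputs, target_sizes=[image.size[::-1]])[0]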

I have drafted a notebook for you to try right away ✨ https://colab.research.google.com/drive/1wfJhoTFqUqcTAYAOUc6TXUubBTmOYaVa?usp=sharing
You can also try the Space without touching any code πŸ‘‰ shi-labs/OneFormer

AWESOME

Wow, love it, great work... Something like this would be great implemented in the bin-picking and work stations at my workplace.

I work as a Mechanical Engineer for a foundry and I would like to learn more, but I don't know anything about coding.

Is there any way that I could learn to train a mechanical robot with zero/minor coding abilities?

Thanks, Tay from Italy.

So good! This kind of architecture has been on my mind, thank you very much for sharing. The approach and method used for generating 'pixel tokens' look great.