tue-mps/coco_instance_eomt_large_640_dinov3

Image Segmentation


neikos00 committed on Sep 15

Commit 4083365 · verified · 1 Parent(s): b3308c4

Create README.md

Files changed (1)

README.md +21 -0

README.md ADDED

	@@ -0,0 +1,21 @@

+---
+library_name: transformers
+license: mit
+tags:
+- vision
+- image-segmentation
+- pytorch
+---
+# EoMT
+[![PyTorch](https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white)](https://pytorch.org/)
+**EoMT (Encoder-only Mask Transformer)** is a Vision Transformer (ViT) architecture designed for high-quality and efficient image segmentation. It was introduced in the CVPR 2025 highlight paper:
+**[Your ViT is Secretly an Image Segmentation Model](https://www.tue-mps.org/eomt)**
+by Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, and Daan de Geus.
+> **Key Insight**: Given sufficient scale and pretraining, a plain ViT with only a few additional parameters can perform segmentation without task-specific decoders or pixel fusion modules. The same model backbone supports semantic, instance, and panoptic segmentation with different post-processing 🤗
+The original implementation can be found in this [repository](https://github.com/tue-mps/eomt).
+---
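
The README notes that one backbone serves semantic, instance, and panoptic segmentation through different post-processing. The sketch below illustrates that idea under stated assumptions: it supposes the model emits per-query class logits (with a trailing "no object" class) and per-query mask logits, as is common for mask-classification models. The function names, tensor shapes, and thresholds are hypothetical and chosen for illustration; this is not the EoMT or `transformers` API, and the original repository may post-process differently.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def semantic_map(class_logits, mask_logits):
    """Combine all queries into one per-pixel class map.

    class_logits: (Q, C + 1), last column assumed to be "no object".
    mask_logits:  (Q, H, W).
    Returns an (H, W) array of class indices.
    """
    class_probs = softmax(class_logits)[:, :-1]   # (Q, C), drop "no object"
    mask_probs = sigmoid(mask_logits)             # (Q, H, W)
    # Weight each query's mask by its class probabilities, sum over queries.
    per_class = np.einsum("qc,qhw->chw", class_probs, mask_probs)
    return per_class.argmax(axis=0)


def instance_masks(class_logits, mask_logits, score_threshold=0.5):
    """Treat each confident query as one instance.

    Returns a list of (label, score, boolean mask) tuples.
    """
    class_probs = softmax(class_logits)[:, :-1]
    labels = class_probs.argmax(axis=1)
    scores = class_probs.max(axis=1)
    masks = sigmoid(mask_logits) > 0.5            # assumed binarization cutoff
    return [
        (int(labels[q]), float(scores[q]), masks[q])
        for q in range(len(scores))
        if scores[q] > score_threshold
    ]
```

Both functions consume the same two arrays, so only the final step differs: `semantic_map` merges all queries into a single label map, while `instance_masks` keeps each confident query as a separate object.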