SegFormer model with a DINOv3 ViT-B backbone, fine-tuned for The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs.
The simplest way to use this model to segment an image from the Coralscapes dataset is as follows:
Install dependencies:
pip install torch transformers safetensors huggingface_hub datasets pillow
Run the model:
from pathlib import Path
import importlib.util
import torch
from datasets import load_dataset
from huggingface_hub import snapshot_download
from PIL import Image
REPO_ID = "EPFL-ECEO/coralscapes-vit-b-dpt" # replace if needed
# 1) Download model repo snapshot
root = Path(snapshot_download(REPO_ID))
# 2) Load self-contained model code from the repo
spec = importlib.util.spec_from_file_location("coralscapes_hub_model", root / "coralscapes_hub_model.py")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
# 3) Build model + load weights
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = mod.Dinov3DPTSegmenter.from_pretrained(root, map_location=device).eval()
# 4) Load one test image from HF dataset
ds = load_dataset("EPFL-ECEO/coralscapes", split="test")
image = ds[42]["image"].convert("RGB") # PIL image
image = image.resize((1376, 768), resample=Image.BILINEAR) # (W, H), divisible by 16, agnostic to aspect ratio
# 5) Preprocess + inference
batch = model.processor(images=image, return_tensors="pt", do_resize=False)["pixel_values"].to(device)
with torch.no_grad():
    logits = model(batch) # shape [1, C, H, W]
pred = logits.argmax(dim=1)[0].cpu() # shape [H, W], class IDs
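The resize in step 4 uses a fixed size whose sides are both multiples of 16. For images with other dimensions, a small helper along these lines could pick the nearest compatible size. This is only a sketch: `snap_to_multiple` and `aligned_size` are hypothetical names that are not part of the model repo, and the multiple of 16 is taken from the comment in the snippet above.

```python
def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round `value` to the nearest positive multiple of `multiple` (ties round up)."""
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive")
    snapped = (value + multiple // 2) // multiple * multiple
    # Never return 0 for very small inputs.
    return max(multiple, snapped)


def aligned_size(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Return a (W, H) pair, the order PIL's resize expects, with both sides aligned."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)
```

It could be used as `image.resize(aligned_size(*image.size), resample=Image.BILINEAR)`. Each side is rounded independently, so the aspect ratio may shift slightly.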
The model is trained on extended versions of the train+val splits of the Coralscapes dataset, a general-purpose dense semantic segmentation dataset for coral reefs.
Single pass (768x1376 resolution):
Double pass (1024x1024 left and right half of image, as in the paper):
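The double-pass setting splits each image into its left and right halves and runs the model on each. Below is a minimal sketch of that split, assuming a 2048x1024 (W x H) input so that each half is 1024x1024. `half_boxes`, `double_pass`, and the `predict_half` callback are hypothetical names, and this is not necessarily the exact evaluation code used in the paper.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # PIL crop box: (left, upper, right, lower)


def half_boxes(width: int, height: int) -> List[Box]:
    """Return crop boxes for the left and right halves of a width x height image."""
    mid = width // 2
    return [(0, 0, mid, height), (mid, 0, width, height)]


def double_pass(image: Any, predict_half: Callable[[Any], Any]) -> List[Any]:
    """Run `predict_half` on each half of a PIL-style image, left half first.

    `predict_half` is assumed to wrap the preprocessing, forward pass, and
    argmax from the single-image example and to return an [H, W] map of class IDs.
    """
    return [predict_half(image.crop(box)) for box in half_boxes(*image.size)]
```

The two per-half maps can then be joined along the width axis, for example with `torch.cat(preds, dim=1)` when they are [H, W] tensors.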
If you use this model, cite:
@inproceedings{sauder2025coralscapes,
title={The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs},
author={Sauder, Jonathan and Domazetoski, Viktor and Banc-Prandi, Guilhem and Perna, Gabriela and Meibom, Anders and Tuia, Devis},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision: Joint Workshop on Marine Vision},
pages={2115--2122},
year={2025}
}
Base model: facebook/dinov3-vit7b16-pretrain-lvd1689m