dinovol: v2 backbone, patch size 6 (step 32350)

⚠️ Not used in the arXiv paper. This is an additional, exploratory patch-size-6 variant of the 3D DINOv2 representation model, released for completeness. The model used in the paper is the patch-size-8 scrollprize/dinovol_v2_ps8_with_paris4_352500.

A 3D, DINOv2/DINOv3-style self-supervised representation model for volumetric micro-CT of carbonized Herculaneum scrolls. This repository publishes the EMA teacher backbone from the patch-size-6 pretraining run (p6g240l120x6) at training step 32350. There is no task-specific head; you take its dense patch embeddings and use them downstream.

This slim file was consolidated from the run's FSDP2 rank-sharded checkpoint (4-way dp_shard, reassembled along dim 0 in dp_shard order) into a single inference checkpoint. The model config travels inside the weights, so the architecture is rebuilt automatically.
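The consolidation step described above can be pictured with a toy example. This is a minimal sketch, not the script used for this release: it assumes each rank holds a contiguous slice of rows (dim 0) of every parameter, represents tensors as plain lists of rows, and ignores any padding a sharding framework may add for uneven splits. The function name `consolidate_dim0` is hypothetical. With real tensors the per-parameter step would correspond to `torch.cat(shards, dim=0)`.

```python
def consolidate_dim0(shards_by_rank):
    """Concatenate per-rank shards of each parameter along dim 0, in rank order.

    shards_by_rank: list indexed by dp_shard rank; each item maps a parameter
    name to that rank's slice of rows (a list of rows).
    """
    full = {}
    for rank_state in shards_by_rank:            # ranks visited in order 0, 1, 2, 3
        for name, rows in rank_state.items():
            full.setdefault(name, []).extend(rows)  # append this rank's rows
    return full


# Toy 4-way example: an 8x2 "weight" split into four 2-row shards.
shards = [
    {"w": [[0, 0], [1, 1]]},
    {"w": [[2, 2], [3, 3]]},
    {"w": [[4, 4], [5, 5]]},
    {"w": [[6, 6], [7, 7]]},
]
full = consolidate_dim0(shards)
```

Rank order matters here: concatenating the same shards in a different order would yield a tensor of the right shape but with permuted rows.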

Model details

Backbone family: DINOv2/EVA ViT, 3D, with 3D RoPE (DINOv3-style)
model_type: v2
Embedding dim: 864
Depth: 24 blocks
Attention heads: 16
MLP: SwiGLU, mlp_ratio 8/3
Patch size: 6 × 6 × 6
Global / local crop size (train): 240³ / 120³
Input channels: 1 (grayscale CT)
Backbone parameters: ~215.6 M
Training step: 32350
W&B run: p6g240l120x6_…
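As a rough consistency check on the figures listed above, the parameter count can be estimated from the embedding dim, depth, and MLP ratio. This is a back-of-envelope sketch that assumes a standard ViT block layout (four dim × dim attention projections and a three-matrix SwiGLU MLP) and a linear patch embedding; it ignores biases, norms, and any extra tokens or scaling parameters, so it should land slightly below the reported ~215.6 M.

```python
dim, depth, patch, in_chans = 864, 24, 6, 1
hidden = dim * 8 // 3                      # SwiGLU hidden width for mlp_ratio 8/3

attn = 4 * dim * dim                       # q, k, v and output projections (assumed layout)
mlp = 3 * dim * hidden                     # SwiGLU: gate, up and down projections (assumed)
blocks = depth * (attn + mlp)
patch_embed = patch ** 3 * in_chans * dim  # projection of each 6x6x6 single-channel patch

total = blocks + patch_embed
print(f"{total / 1e6:.1f} M")              # prints "215.2 M"
```

The remaining gap to ~215.6 M is plausibly accounted for by the terms this estimate leaves out.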

Patch size 6 produces a finer token grid than the ps8 model (≈2.4× more tokens per unit volume, hence higher inference compute). Pretraining objective: DINO + iBOT + KoLeo (AMP).
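The ≈2.4× figure follows from the cube of the patch-size ratio, (8/6)³ ≈ 2.37. The short sketch below works this out and counts patch tokens for the training crop sizes. It counts patch tokens only (no class or register tokens), and the patch-size-8 line simply applies an 8-voxel grid to the same 240³ crop for comparison, which is an assumption and not a statement about how the ps8 model was trained.

```python
def tokens_per_crop(side, patch):
    """Number of patch tokens in a cubic crop of edge length `side` voxels."""
    assert side % patch == 0, "crop side must be a multiple of the patch size"
    return (side // patch) ** 3


ratio = (8 / 6) ** 3                    # tokens per unit volume, ps6 relative to ps8
global_ps6 = tokens_per_crop(240, 6)    # 40**3
local_ps6 = tokens_per_crop(120, 6)     # 20**3
global_ps8 = tokens_per_crop(240, 8)    # 30**3, same crop on an 8-voxel grid

print(f"ratio = {ratio:.2f}")                # prints "ratio = 2.37"
print(global_ps6, local_ps6, global_ps8)     # prints "64000 8000 27000"
```

Because self-attention cost grows faster than linearly in the token count, the compute gap on equal-sized inputs can exceed the 2.4× token ratio.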

Files

  • dinovol_v2_ps6_step032350_teacher_backbone.pt: slim EMA teacher backbone (single consolidated file); contains config + teacher backbone weights.
  • config.json β€” model configuration (for reference).

How to load

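from huggingface_hub import hf_hub_download
from dinovol_2.eval.embedding_utils import load_backbone_from_checkpoint

# Download the consolidated teacher checkpoint from the Hugging Face Hub.
path = hf_hub_download("scrollprize/dinovol_v2_ps6_step032350",
                       "dinovol_v2_ps6_step032350_teacher_backbone.pt")
# The model config is stored with the weights, so the architecture is rebuilt on load.
loaded = load_backbone_from_checkpoint(path, device="cuda")  # or "cpu"
# Put the backbone in eval mode for inference.
backbone = loaded.backbone.eval()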

Training / inference code: https://github.com/ScrollPrize/dinovol
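When preparing input volumes, it can help to work out the padded shape and resulting token grid in advance. The sketch below assumes, as is typical for ViT patch embeddings but not confirmed here for this backbone, that each spatial dimension must be a multiple of the patch size. The helper names `padded_shape` and `token_grid` and the example crop size are hypothetical, not part of the dinovol code.

```python
PATCH = 6  # patch size of this model


def padded_shape(shape, patch=PATCH):
    """Round each spatial dimension up to the next multiple of `patch`."""
    return tuple((s + patch - 1) // patch * patch for s in shape)


def token_grid(shape, patch=PATCH):
    """Token-grid shape for a volume whose dimensions are multiples of `patch`."""
    if any(s % patch for s in shape):
        raise ValueError("every dimension must be a multiple of the patch size")
    return tuple(s // patch for s in shape)


shape = (250, 240, 121)        # hypothetical crop, in voxels
target = padded_shape(shape)   # (252, 240, 126)
grid = token_grid(target)      # (42, 40, 21)
```

How the padding itself is filled (zeros, edge replication, or cropping instead of padding) is left open here; the repository linked above is the place to check what the authors' pipeline does.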

License

MIT, released by the Vesuvius Challenge. Underlying tomographic data are distributed under CC BY-NC 4.0.
