dinovol: v2 backbone, patch size 6 (step 32350)
⚠️ Not used in the arXiv paper. This is an additional, exploratory patch-size-6 variant of the 3D DINOv2 representation model, released for completeness. The model used in the paper is the patch-size-8 model, scrollprize/dinovol_v2_ps8_with_paris4_352500.
A 3D, DINOv2/DINOv3-style self-supervised representation model for volumetric micro-CT
of carbonized Herculaneum scrolls. This repository publishes the EMA teacher backbone
from the patch-size-6 pretraining run (p6g240l120x6) at training step 32350. There is
no task-specific head; you take its dense patch embeddings and use them downstream.
This slim file was consolidated from the run's FSDP2 rank-sharded checkpoint (4-way
dp_shard, reassembled along dim 0 in dp_shard order) into a single inference checkpoint.
The model config travels inside the weights, so the architecture is rebuilt automatically.
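The reassembly described above can be illustrated with a toy example. This is only a sketch of the idea (concatenating per-rank slices along dim 0 in rank order), not the consolidation script used for this release; the helper names `shard_rows` and `consolidate` are hypothetical, plain Python lists stand in for tensors, and the ceil-sized chunking is an assumption about how rows are divided across ranks.

```python
import math

def shard_rows(rows, world_size):
    """Split a list of rows (dim 0) into world_size contiguous chunks.

    Assumption: ceil-sized chunks, so the last rank may hold fewer rows.
    """
    chunk = math.ceil(len(rows) / world_size)
    return [rows[r * chunk:(r + 1) * chunk] for r in range(world_size)]

def consolidate(shards_by_rank):
    """Concatenate per-rank shards along dim 0, in ascending rank order."""
    full = []
    for rank in sorted(shards_by_rank):
        full.extend(shards_by_rank[rank])
    return full

# Toy "weight matrix" with 10 rows, sharded 4 ways as in the 4-way dp_shard run.
weight = [[float(i), float(i) + 0.5] for i in range(10)]
shards = dict(enumerate(shard_rows(weight, world_size=4)))

# Reassembling in rank order recovers the original row order.
restored = consolidate(shards)
print(restored == weight)
```

The order matters: concatenating the same shards in any other rank order would produce a tensor of the right shape but with scrambled rows.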
Model details
| Property | Value |
|---|---|
| Backbone family | DINOv2/EVA ViT, 3D, with 3D RoPE (DINOv3-style) |
| model_type | v2 |
| Embedding dim | 864 |
| Depth | 24 blocks |
| Attention heads | 16 |
| MLP | SwiGLU, mlp_ratio 8/3 |
| Patch size | 6 × 6 × 6 |
| Global / local crop size (train) | 240³ / 120³ |
| Input channels | 1 (grayscale CT) |
| Backbone parameters | ~215.6 M |
| Training step | 32350 |
| W&B run | p6g240l120x6_… |
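As a rough sanity check on the parameter count, the figures in the table can be combined into a back-of-the-envelope estimate. The sketch below assumes a standard layout that this card does not spell out: four dim × dim projection matrices per attention block (Q, K, V, output), three dim × hidden matrices per SwiGLU MLP, and a single linear patch embedding. Biases, norms, and any extra tokens are ignored, so the result is an approximation and not the exact count.

```python
# Values taken from the model details table.
DIM, DEPTH, PATCH, IN_CH = 864, 24, 6, 1

# Assumption: SwiGLU hidden width is exactly dim * 8/3 (no rounding to a multiple).
hidden = DIM * 8 // 3

attn_params = 4 * DIM * DIM              # Q, K, V and output projections
mlp_params = 3 * DIM * hidden            # gate, up and down projections
patch_embed = PATCH ** 3 * IN_CH * DIM   # one weight per voxel of a 6x6x6 patch

total = DEPTH * (attn_params + mlp_params) + patch_embed
print(f"estimated backbone parameters: {total / 1e6:.1f} M")
```

Under these assumptions the estimate lands within about half a percent of the reported ~215.6 M; the remainder is plausibly accounted for by the terms left out above.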
Patch size 6 produces a finer token grid than the ps8 model (≈2.4× more tokens per unit volume, hence higher inference compute). Pretraining objective: DINO + iBOT + KoLeo (AMP).
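The token ratio follows directly from the patch sizes: (8/6)³ ≈ 2.37. The short sketch below works this out for the training crop sizes listed above. Applying the same crop sizes to a patch-size-8 grid is purely for comparison and is not a statement about how the ps8 model was trained; `token_count` is a hypothetical helper.

```python
def token_count(size: int, patch: int) -> int:
    """Number of patch tokens for a cubic crop of edge `size` voxels."""
    if size % patch:
        raise ValueError(f"crop size {size} is not divisible by patch size {patch}")
    return (size // patch) ** 3

for name, size in (("global", 240), ("local", 120)):
    ps6 = token_count(size, 6)
    ps8 = token_count(size, 8)
    print(f"{name} crop {size}^3: ps6 = {ps6} tokens, ps8 = {ps8} tokens")

# Tokens per unit volume scale with the inverse cube of the patch edge.
print("ps6 / ps8 token ratio:", round((8 / 6) ** 3, 2))
```

Because self-attention cost grows faster than linearly with the number of tokens, the compute gap between the two models can be larger than the token ratio alone suggests.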
Files
- dinovol_v2_ps6_step032350_teacher_backbone.pt: slim EMA teacher backbone (single consolidated file); contains config + teacher backbone weights.
- config.json: model configuration (for reference).
How to load
from huggingface_hub import hf_hub_download
from dinovol_2.eval.embedding_utils import load_backbone_from_checkpoint
path = hf_hub_download("scrollprize/dinovol_v2_ps6_step032350",
                       "dinovol_v2_ps6_step032350_teacher_backbone.pt")
loaded = load_backbone_from_checkpoint(path, device="cuda") # or "cpu"
backbone = loaded.backbone.eval()
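Before running the backbone on your own data, it can help to check how a volume maps onto the 6 × 6 × 6 patch grid. The sketch below is a hedged helper, not part of the dinovol codebase: it assumes (the card does not state this) that each spatial dimension should be a multiple of the patch size, and the names `padded_shape` and `token_grid` are hypothetical.

```python
import math

PATCH = 6  # patch edge length from the model details

def padded_shape(shape, patch=PATCH):
    """Round each spatial dimension up to the next multiple of the patch size."""
    return tuple(math.ceil(s / patch) * patch for s in shape)

def token_grid(shape, patch=PATCH):
    """Token grid (per-axis patch counts) after padding to a patch multiple."""
    return tuple(s // patch for s in padded_shape(shape, patch))

volume_shape = (250, 240, 121)  # example (D, H, W) in voxels, chosen arbitrarily
print("padded shape:", padded_shape(volume_shape))
print("token grid:  ", token_grid(volume_shape))
```

Check the repository's inference code for how it actually handles volumes whose dimensions are not multiples of the patch size; this helper only illustrates the arithmetic.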
Training / inference code: https://github.com/ScrollPrize/dinovol
Related
- Paper representation model (patch size 8): scrollprize/dinovol_v2_ps8_with_paris4_352500
- Vesuvius Challenge: https://scrollprize.org
License
MIT, released by the Vesuvius Challenge. Underlying tomographic data are distributed under CC BY-NC 4.0.