G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation

🌐 Project Page   |   📄 arXiv   |   💻 Code (GitHub)

Recovering the relative 6-DoF pose between two image groups underlies cross-sequence relocalization, multi-camera rig odometry, and other multi-view tasks. Each group carries known intra-group geometry from a pre-built map, odometry, or rig calibration, and pretrained multi-view backbones already fuse such geometry into visual features. Yet current models treat all views as an unstructured set, leaving cross-group reasoning as the missing piece.

G2G keeps a multi-view foundation model entirely frozen and adds three lightweight trainable modules (32M parameters, under 6% of the full model) to bridge the two groups: a perceiver resampler, a cross-group bridge with merged self-attention, and a multi-frame pose head. Supervised only by relative poses, G2G attains state-of-the-art accuracy on both tasks (cross-sequence relocalization and multi-camera rig odometry) across four datasets.

This repository hosts the released artifacts for the paper. Code, installation, and full usage live on GitHub: https://github.com/WeiYuFei0217/G2G

Contents

| Path | Description |
| --- | --- |
| `release_weights/*.pth` | 10 pretrained G2G-only weights (frozen backbone excluded; each ~123 MB) |
| `map-anything-model/` | Frozen MapAnything backbone (DINOv2-large/1024, ~2.1 GB) |
| `examples.zip` | Sanity-check input bundles (`reloc/` + `rig/`) |
| `eval_results.zip` | Paper-subset per-pair evaluation CSVs |

MapAnything backbone (mirrored here). G2G runs on a frozen DINOv2-large / 1024-dim MapAnything backbone (MapAnything v1.0.1, commit fde8425). The exact compatible checkpoint (~2.1 GB) is included under map-anything-model/, because the current facebook/map-anything Hugging Face weights are the newer giant / 1536-dim variant and are incompatible with these G2G modules. MapAnything is a work by Meta AI (facebookresearch/map-anything); please also respect its original license when using this backbone.

Pretrained weights

| Weight | Task | Dataset |
| --- | --- | --- |
| HM3D-Reloc.pth | Reloc | HM3D |
| TartanGround-Reloc.pth | Reloc | TartanGround |
| NCLT-Reloc.pth | Reloc | NCLT |
| ZJH-Reloc.pth | Reloc | ZJH |
| HM3D-Rig-8.pth | Rig | HM3D (8-cam) |
| HM3D-Rig-4.pth | Rig | HM3D (4-cam) |
| TartanGround-Rig-4.pth | Rig | TartanGround (4-cam) |
| NCLT-Rig-Intra.pth | Rig | NCLT intra-season (5-cam) |
| NCLT-Rig-Cross.pth | Rig | NCLT cross-season (5-cam) |
| ZJH-Rig-4.pth | Rig | ZJH (4-cam) |

These are G2G-only weights (frozen backbone excluded). The evaluation scripts in the GitHub repo automatically handle partial loading.
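
For readers loading a checkpoint outside the provided scripts, the following is a minimal sketch of what partial loading could look like in PyTorch. It is not the repository's implementation. It assumes the `.pth` file holds a flat state dict and that the frozen backbone's parameters live under a `backbone.` prefix; both the prefix and the helper names `verify_partial_load` and `load_g2g_only` are hypothetical.

```python
def verify_partial_load(missing, unexpected, frozen_prefix="backbone."):
    """Check that a non-strict load skipped only frozen-backbone keys.

    `frozen_prefix` is an assumed naming convention, not taken from the G2G code.
    Returns the number of backbone keys that were (expectedly) missing.
    """
    stray = [k for k in missing if not k.startswith(frozen_prefix)]
    if stray or unexpected:
        raise RuntimeError(
            f"partial load mismatch: missing={stray}, unexpected={list(unexpected)}"
        )
    return len(missing)


def load_g2g_only(model, ckpt_path, frozen_prefix="backbone."):
    """Load G2G-only weights into a model whose backbone is already initialized."""
    # Imported lazily so the checker above can be used without PyTorch installed.
    import torch

    # Assumption: the checkpoint is a flat {parameter_name: tensor} mapping.
    state = torch.load(ckpt_path, map_location="cpu")
    # strict=False tolerates the backbone keys that are absent from the checkpoint.
    result = model.load_state_dict(state, strict=False)
    return verify_partial_load(
        result.missing_keys, result.unexpected_keys, frozen_prefix
    )
```

Checking the missing and unexpected keys after a non-strict load guards against silently leaving a trainable module at its random initialization.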

Download

Grab a single weight:

from huggingface_hub import hf_hub_download

ckpt = hf_hub_download(
    repo_id="feixue22/G2G",
    filename="release_weights/HM3D-Reloc.pth",
)
print(ckpt)

Or pull everything (weights + example/eval bundles):

from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="feixue22/G2G")
print(local_dir)
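
To fetch several checkpoints without also pulling the ~2.1 GB backbone, `snapshot_download` accepts an `allow_patterns` filter. The sketch below shows one way to use it; the helper names `weight_filename` and `download_weights` are illustrative and not part of the G2G code, and the checkpoint names are copied from the table above.

```python
REPO_ID = "feixue22/G2G"

# Checkpoint names from the "Pretrained weights" table (without the .pth suffix).
RELOC_WEIGHTS = ["HM3D-Reloc", "TartanGround-Reloc", "NCLT-Reloc", "ZJH-Reloc"]
RIG_WEIGHTS = [
    "HM3D-Rig-8", "HM3D-Rig-4", "TartanGround-Rig-4",
    "NCLT-Rig-Intra", "NCLT-Rig-Cross", "ZJH-Rig-4",
]


def weight_filename(name):
    """Return the in-repo path of a checkpoint (hypothetical helper)."""
    if name not in RELOC_WEIGHTS + RIG_WEIGHTS:
        raise KeyError(f"unknown checkpoint: {name}")
    return f"release_weights/{name}.pth"


def download_weights(names=None):
    """Download the selected checkpoints, or all ten if `names` is None.

    Returns the local snapshot directory. Requires network access.
    """
    # Imported lazily so the lookup helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    if names is None:
        patterns = ["release_weights/*.pth"]
    else:
        patterns = [weight_filename(n) for n in names]
    # allow_patterns limits the snapshot to files matching these patterns.
    return snapshot_download(repo_id=REPO_ID, allow_patterns=patterns)
```

For example, `download_weights(["HM3D-Reloc", "HM3D-Rig-4"])` would fetch only those two files, which then sit under `release_weights/` inside the returned directory.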

Usage

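These weights plug into the G2G code on GitHub. After cloning and installing (https://github.com/WeiYuFei0217/G2G), run evaluation with the downloaded checkpoint. Point --checkpoint at the file's actual location; the path below assumes the weights sit under release_weights/ in the working directory: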

python scripts/eval_reloc.py \
    --config configs/reloc/hm3d.yaml \
    --checkpoint release_weights/HM3D-Reloc.pth \
    --output-dir outputs/eval_HM3D-Reloc \
    --batch-size 16 --min-overlap 0.1

License

CC BY-NC 4.0.

Citation

@misc{wei2026g2gexploitingintragroupgeometry,
      title={G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation},
      author={Yufei Wei and Shuhao Ye and Chenxiao Hu and Yiyuan Pan and Dongyu Feng and Rong Xiong and Yue Wang and Yanmei Jiao},
      year={2026},
      eprint={2606.08284},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.08284},
}