GIM: Learning Generalizable Image Matcher From Internet Videos
| | Method | Mean AUC@5° (%) ↑ | GL3 | BLE | ETI | ETO | KIT | WEA | SEA | NIG | MUL | SCE | ICL | GTA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | **Handcrafted** | | | | | | | | | | | | | |
| | RootSIFT | 31.8 | 43.5 | 33.6 | 49.9 | 48.7 | 35.2 | 21.4 | 44.1 | 14.7 | 33.4 | 7.6 | 14.8 | 35.1 |
| | **Sparse Matching** | | | | | | | | | | | | | |
| | SuperGlue (in) | 21.6 | 19.2 | 16.0 | 38.2 | 37.7 | 22.0 | 20.8 | 40.8 | 13.7 | 21.4 | 0.8 | 9.6 | 18.8 |
| | SuperGlue (out) | 31.2 | 29.7 | 24.2 | 52.3 | 59.3 | 28.0 | 28.4 | 48.0 | 20.9 | 33.4 | 4.5 | 16.6 | 29.3 |
| | GIM_SuperGlue (50h) | 34.3 | 43.2 | 34.2 | 58.7 | 61.0 | 29.0 | 28.3 | 48.4 | 18.8 | 34.8 | 2.8 | 15.4 | 36.5 |
| | LightGlue | 31.7 | 28.9 | 23.9 | 51.6 | 56.3 | 32.1 | 29.5 | 48.9 | 22.2 | 37.4 | 3.0 | 16.2 | 30.4 |
| ✅ | GIM_LightGlue (100h) | 38.3 | 46.6 | 38.1 | 61.7 | 62.9 | 34.9 | 31.2 | 50.6 | 22.6 | 41.8 | 6.9 | 19.0 | 43.4 |
| | **Semi-dense Matching** | | | | | | | | | | | | | |
| | LoFTR (in) | 10.7 | 5.6 | 5.1 | 11.8 | 7.5 | 17.2 | 6.4 | 9.7 | 3.5 | 22.4 | 1.3 | 14.9 | 23.4 |
| | LoFTR (out) | 33.1 | 29.3 | 22.5 | 51.1 | 60.1 | 36.1 | 29.7 | 48.6 | 19.4 | 37.0 | 13.1 | 20.5 | 30.3 |
| | GIM_LoFTR (50h) | 39.1 | 50.6 | 43.9 | 62.6 | 61.6 | 35.9 | 26.8 | 47.5 | 17.6 | 41.4 | 10.2 | 25.6 | 45.0 |
| 🟩 | GIM_LoFTR (100h) | TODO | | | | | | | | | | | | |
| | **Dense Matching** | | | | | | | | | | | | | |
| | DKM (in) | 46.2 | 44.4 | 37.0 | 65.7 | 73.3 | 40.2 | 32.8 | 51.0 | 23.1 | 54.7 | 33.0 | 43.6 | 55.7 |
| | DKM (out) | 45.8 | 45.7 | 37.0 | 66.8 | 75.8 | 41.7 | 33.5 | 51.4 | 22.9 | 56.3 | 27.3 | 37.8 | 52.9 |
| | GIM_DKM (50h) | 49.4 | 58.3 | 47.8 | 72.7 | 74.5 | 42.1 | 34.6 | 52.0 | 25.1 | 53.7 | 32.3 | 38.8 | 60.6 |
| ✅ | GIM_DKM (100h) | 51.2 | 63.3 | 53.0 | 73.9 | 76.7 | 43.4 | 34.6 | 52.5 | 24.5 | 56.6 | 32.2 | 42.5 | 61.6 |
| | RoMa (in) | 46.7 | 46.0 | 39.3 | 68.8 | 77.2 | 36.5 | 31.1 | 50.4 | 20.8 | 57.8 | 33.8 | 41.7 | 57.6 |
| | RoMa (out) | 48.8 | 48.3 | 40.6 | 73.6 | 79.8 | 39.9 | 34.4 | 51.4 | 24.2 | 59.9 | 33.7 | 41.3 | 59.2 |
| 🟩 | GIM_RoMa | TODO | | | | | | | | | | | | |
The data in this table comes from the ZEB: Zero-shot Evaluation Benchmark for Image Matching proposed in the paper. This benchmark consists of 12 public datasets that cover a variety of scenes, weather conditions, and camera models, corresponding to the 12 test sequences starting from GL3 in the table. We will release ZEB as soon as possible.
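The table reports AUC@5°, but this README does not spell out how the metric is computed. The sketch below follows the convention commonly used in relative-pose evaluation: take one angular pose error per image pair, build the cumulative recall curve, and integrate it up to the threshold. The function name `pose_auc` and the assumption that each pair contributes a single error in degrees are illustrative and may differ from the benchmark's own evaluation code.

```python
from bisect import bisect_left


def pose_auc(errors_deg, threshold_deg=5.0):
    """Area under the recall-vs-error curve up to threshold_deg, normalized to [0, 1].

    errors_deg: one angular pose error (in degrees) per image pair (assumed input).
    """
    errs = sorted(errors_deg)
    n = len(errs)
    if n == 0:
        return 0.0

    # Recall curve: after the i-th smallest error, (i + 1) / n of the pairs are solved.
    xs = [0.0] + errs
    ys = [0.0] + [(i + 1) / n for i in range(n)]

    # Keep the points strictly below the threshold, then extend the curve
    # horizontally to the threshold itself.
    last = bisect_left(xs, threshold_deg)
    x = xs[:last] + [threshold_deg]
    y = ys[:last] + [ys[last - 1]]

    # Trapezoidal integration, normalized by the threshold.
    area = sum((x[i + 1] - x[i]) * (y[i + 1] + y[i]) / 2.0 for i in range(len(x) - 1))
    return area / threshold_deg


if __name__ == "__main__":
    # Hypothetical errors for four image pairs, in degrees.
    print(round(100 * pose_auc([0.5, 1.2, 3.0, 12.0]), 1))
```

Multiplying the result by 100 gives a percentage on the same scale as the table.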
✅ TODO List
- Inference code
  - gim_roma
  - gim_dkm
  - gim_loftr
  - gim_lightglue
- Training code
We are actively continuing with the remaining open-source work and appreciate everyone's attention.
🤗 Online demo
Go to Hugging Face to quickly try our model online.
⚙️ Environment
I set up the running environment on a new machine using the commands listed below.
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install albumentations==1.0.1 --no-binary=imgaug,albumentations
pip install pytorch-lightning==1.5.10
pip install opencv-python==4.5.3.56
pip install imagesize==1.2.0
pip install kornia==0.6.10
pip install einops==0.3.0
pip install loguru==0.5.3
pip install joblib==1.0.1
pip install yacs==0.1.8
pip install h5py==3.1.0
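After installation, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch using only the standard library. The helper `find_mismatches` is hypothetical and not part of this repository, and the distribution names in `PINNED` (for example `torch` for the conda package `pytorch`) are assumptions about how the packages register themselves.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the install commands above. The distribution names
# (for example "torch" for the conda package "pytorch") are assumptions.
PINNED = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "torchaudio": "0.10.1",
    "albumentations": "1.0.1",
    "pytorch-lightning": "1.5.10",
    "opencv-python": "4.5.3.56",
    "imagesize": "1.2.0",
    "kornia": "0.6.10",
    "einops": "0.3.0",
    "loguru": "0.5.3",
    "joblib": "1.0.1",
    "yacs": "0.1.8",
    "h5py": "3.1.0",
}


def find_mismatches(pinned, get_version=version):
    """Return {name: installed_version_or_None} for packages that differ from the pin."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Builds may carry a local tag such as "1.10.1+cu113"; compare the base part only.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found or 'not installed'}")
```

An empty output means every listed package was found at the pinned version.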
🔨 Usage
Clone the repository
git clone https://github.com/xuelunshen/gim.git
cd gim
Download the `gim_dkm` model weight from Google Drive.
Put it in the `weights` folder.
Run the following command
python demo.py --model gim_dkm
or
python demo.py --model gim_lightglue
The code will match `a1.png` and `a2.png` in the `assets/demo` folder, and output `a1_a2_match.png` and `a1_a2_warp.png`.
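The output names appear to combine the two input file stems. As a small illustration of that pattern, inferred only from the single `a1`/`a2` example here, the hypothetical helper below derives the expected output file names for an image pair. It is not a function from `demo.py`, and the directory the demo writes to is not assumed.

```python
from pathlib import Path


def demo_output_names(first_image, second_image):
    """Return (match_name, warp_name) following the a1_a2_match.png pattern."""
    a = Path(first_image).stem
    b = Path(second_image).stem
    return f"{a}_{b}_match.png", f"{a}_{b}_warp.png"


if __name__ == "__main__":
    print(demo_output_names("assets/demo/a1.png", "assets/demo/a2.png"))
```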
`a1_a2_match.png` is a visualization of the matches between the two images.
`a1_a2_warp.png` shows the effect of projecting image `a2` onto image `a1` using a homography.
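For readers unfamiliar with the term, a homography is a 3×3 matrix that maps points in one image to points in another through a projective division. The sketch below applies a given homography to a few pixel coordinates in plain Python. The function name `apply_homography` and the example matrix are illustrative only. How `demo.py` estimates and applies the homography is not described here; on full images, OpenCV's `cv2.findHomography` and `cv2.warpPerspective` are the usual tools for that.

```python
def apply_homography(H, points):
    """Map (x, y) points through a 3x3 homography H given as nested lists."""
    mapped = []
    for x, y in points:
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if abs(w) < 1e-12:
            raise ValueError("point maps to infinity under this homography")
        # Projective division brings the homogeneous result back to pixel coordinates.
        mapped.append((u / w, v / w))
    return mapped


if __name__ == "__main__":
    # Hypothetical homography: a pure translation by (10, -5) pixels.
    H = [[1.0, 0.0, 10.0],
         [0.0, 1.0, -5.0],
         [0.0, 0.0, 1.0]]
    print(apply_homography(H, [(0.0, 0.0), (100.0, 50.0)]))
```

Warping a whole image amounts to applying this mapping (or its inverse) to every pixel and resampling.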
There are more images in the `assets/demo` folder; you can try them out.
Citation
If the paper and code from `gim` help your research, please cite our paper ❤️. If you find this repository useful, giving it a star ⭐️ would be a wonderful way to support our work. Thank you very much.
@inproceedings{
xuelun2024gim,
title={GIM: Learning Generalizable Image Matcher From Internet Videos},
author={Xuelun Shen and Zhipeng Cai and Wei Yin and Matthias Müller and Zijun Li and Kaixuan Wang and Xiaozhi Chen and Cheng Wang},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024}
}
License
This repository is under the MIT License. This content/model is provided here for research purposes only. Any use beyond this is your sole responsibility and subject to your securing the necessary rights for your purpose.