GIM: Learning Generalizable Image Matcher From Internet Videos

ICLR 2024 Spotlight

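| Method | Mean AUC@5° (%) ↑ | GL3 | BLE | ETI | ETO | KIT | WEA | SEA | NIG | MUL | SCE | ICL | GTA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Handcrafted | | | | | | | | | | | | | |
| RootSIFT | 31.8 | 43.5 | 33.6 | 49.9 | 48.7 | 35.2 | 21.4 | 44.1 | 14.7 | 33.4 | 7.6 | 14.8 | 35.1 |
| Sparse Matching | | | | | | | | | | | | | |
| SuperGlue (in) | 21.6 | 19.2 | 16.0 | 38.2 | 37.7 | 22.0 | 20.8 | 40.8 | 13.7 | 21.4 | 0.8 | 9.6 | 18.8 |
| SuperGlue (out) | 31.2 | 29.7 | 24.2 | 52.3 | 59.3 | 28.0 | 28.4 | 48.0 | 20.9 | 33.4 | 4.5 | 16.6 | 29.3 |
| GIM_SuperGlue (50h) | 34.3 | 43.2 | 34.2 | 58.7 | 61.0 | 29.0 | 28.3 | 48.4 | 18.8 | 34.8 | 2.8 | 15.4 | 36.5 |
| LightGlue | 31.7 | 28.9 | 23.9 | 51.6 | 56.3 | 32.1 | 29.5 | 48.9 | 22.2 | 37.4 | 3.0 | 16.2 | 30.4 |
| ✅ GIM_LightGlue (100h) | 38.3 | 46.6 | 38.1 | 61.7 | 62.9 | 34.9 | 31.2 | 50.6 | 22.6 | 41.8 | 6.9 | 19.0 | 43.4 |
| Semi-dense Matching | | | | | | | | | | | | | |
| LoFTR (in) | 10.7 | 5.6 | 5.1 | 11.8 | 7.5 | 17.2 | 6.4 | 9.7 | 3.5 | 22.4 | 1.3 | 14.9 | 23.4 |
| LoFTR (out) | 33.1 | 29.3 | 22.5 | 51.1 | 60.1 | 36.1 | 29.7 | 48.6 | 19.4 | 37.0 | 13.1 | 20.5 | 30.3 |
| GIM_LoFTR (50h) | 39.1 | 50.6 | 43.9 | 62.6 | 61.6 | 35.9 | 26.8 | 47.5 | 17.6 | 41.4 | 10.2 | 25.6 | 45.0 |
| 🟩 GIM_LoFTR (100h) | TODO | | | | | | | | | | | | |
| Dense Matching | | | | | | | | | | | | | |
| DKM (in) | 46.2 | 44.4 | 37.0 | 65.7 | 73.3 | 40.2 | 32.8 | 51.0 | 23.1 | 54.7 | 33.0 | 43.6 | 55.7 |
| DKM (out) | 45.8 | 45.7 | 37.0 | 66.8 | 75.8 | 41.7 | 33.5 | 51.4 | 22.9 | 56.3 | 27.3 | 37.8 | 52.9 |
| GIM_DKM (50h) | 49.4 | 58.3 | 47.8 | 72.7 | 74.5 | 42.1 | 34.6 | 52.0 | 25.1 | 53.7 | 32.3 | 38.8 | 60.6 |
| ✅ GIM_DKM (100h) | 51.2 | 63.3 | 53.0 | 73.9 | 76.7 | 43.4 | 34.6 | 52.5 | 24.5 | 56.6 | 32.2 | 42.5 | 61.6 |
| RoMa (in) | 46.7 | 46.0 | 39.3 | 68.8 | 77.2 | 36.5 | 31.1 | 50.4 | 20.8 | 57.8 | 33.8 | 41.7 | 57.6 |
| RoMa (out) | 48.8 | 48.3 | 40.6 | 73.6 | 79.8 | 39.9 | 34.4 | 51.4 | 24.2 | 59.9 | 33.7 | 41.3 | 59.2 |
| 🟩 GIM_RoMa | TODO | | | | | | | | | | | | |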

The data in this table comes from the ZEB: Zero-shot Evaluation Benchmark for Image Matching proposed in the paper. This benchmark consists of 12 public datasets that cover a variety of scenes, weather conditions, and camera models, corresponding to the 12 test sequences starting from GL3 in the table. We will release ZEB as soon as possible.
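
The Mean column appears to be the unweighted average of the 12 per-dataset scores. That is an inference from the numbers, not something stated here, so the sketch below only checks it for two rows copied from the table. The names `ROWS` and `recomputed_mean` are illustrative and not part of the repository.

```python
from statistics import mean

# Reported mean and per-dataset AUC@5 values (GL3 ... GTA), copied from the table.
ROWS = {
    "RootSIFT": (
        31.8,
        [43.5, 33.6, 49.9, 48.7, 35.2, 21.4, 44.1, 14.7, 33.4, 7.6, 14.8, 35.1],
    ),
    "GIM_DKM (100h)": (
        51.2,
        [63.3, 53.0, 73.9, 76.7, 43.4, 34.6, 52.5, 24.5, 56.6, 32.2, 42.5, 61.6],
    ),
}


def recomputed_mean(per_dataset):
    """Average the per-dataset scores and round to one decimal, as the table does."""
    return round(mean(per_dataset), 1)


for method, (reported, per_dataset) in ROWS.items():
    print(f"{method}: reported {reported}, recomputed {recomputed_mean(per_dataset)}")
```

If the assumption holds, the reported and recomputed values agree for each row, which makes the Mean column a quick way to compare methods across all 12 test sequences at once.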

✅ TODO List

  • Inference code
    • gim_roma
    • gim_dkm
    • gim_loftr
    • gim_lightglue
  • Training code

We are actively continuing with the remaining open-source work and appreciate everyone's attention.

🤗 Online demo

Go to Huggingface to quickly try our model online.

⚙️ Environment

I set up the running environment on a new machine using the commands listed below.

conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install albumentations==1.0.1 --no-binary=imgaug,albumentations
pip install pytorch-lightning==1.5.10
pip install opencv-python==4.5.3.56
pip install imagesize==1.2.0
pip install kornia==0.6.10
pip install einops==0.3.0
pip install loguru==0.5.3
pip install joblib==1.0.1
pip install yacs==0.1.8
pip install h5py==3.1.0
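
After installation, it can help to confirm that the pinned versions are the ones actually in the environment. The following is a minimal sketch using the standard-library `importlib.metadata`. It covers only the pip-installed packages listed above, and the helper names (`PINNED`, `installed_version`, `find_mismatches`) are hypothetical, not part of the repository.

```python
from importlib import metadata

# Version pins copied from the pip commands above (conda packages are not checked here).
PINNED = {
    "albumentations": "1.0.1",
    "pytorch-lightning": "1.5.10",
    "opencv-python": "4.5.3.56",
    "imagesize": "1.2.0",
    "kornia": "0.6.10",
    "einops": "0.3.0",
    "loguru": "0.5.3",
    "joblib": "1.0.1",
    "yacs": "0.1.8",
    "h5py": "3.1.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty output means every pip pin matches. The `lookup` parameter exists only so the comparison can be exercised without touching the real environment.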

🔨 Usage

Clone the repository

git clone https://github.com/xuelunshen/gim.git
cd gim

Download the gim_dkm model weights from Google Drive

Put them in the weights folder

Run the following command

python demo.py --model gim_dkm

or

python demo.py --model gim_lightglue
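
To run both commands in one go, a small wrapper can help. This is only a sketch: it assumes it is started from the repository root with the matching weights already in place, and it uses only the two `--model` values shown above. `DEMO_MODELS`, `demo_command`, and `run_all` are hypothetical helper names.

```python
import subprocess
import sys

# Model names taken from the two commands above; other values are not covered here.
DEMO_MODELS = ("gim_dkm", "gim_lightglue")


def demo_command(model):
    """Build the argument list for one demo run, rejecting unknown model names."""
    if model not in DEMO_MODELS:
        raise ValueError(f"unsupported model {model!r}; expected one of {DEMO_MODELS}")
    return [sys.executable, "demo.py", "--model", model]


def run_all(runner=subprocess.run):
    """Run the demo once per model, stopping at the first failing run."""
    for model in DEMO_MODELS:
        runner(demo_command(model), check=True)


if __name__ == "__main__":
    run_all()
```

Passing `check=True` makes a failed run raise `subprocess.CalledProcessError` instead of continuing silently to the next model.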

The code will match a1.png and a2.png in the assets/demo folder and output a1_a2_match.png and a1_a2_warp.png.

a1_a2_match.png is a visualization of the match between the two images

a1_a2_warp.png shows the effect of projecting image a2 onto image a1 using homography
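
For readers unfamiliar with homographies, the sketch below shows how a 3x3 matrix maps a pixel coordinate from one image into the other's frame. It is purely illustrative: how the repository estimates the matrix from the matches is not described here, and `apply_homography`, `project_corners`, and the example matrix are assumptions made for the demonstration.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H, given as nested row-major lists."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return u / w, v / w


def project_corners(H, width, height):
    """Project the four corners of a width x height image through H."""
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    return [apply_homography(H, x, y) for x, y in corners]


if __name__ == "__main__":
    # Made-up matrix: a pure translation by (10, -5) pixels.
    H_example = [[1.0, 0.0, 10.0], [0.0, 1.0, -5.0], [0.0, 0.0, 1.0]]
    print(project_corners(H_example, width=640, height=480))
```

Warping a whole image amounts to applying this mapping to every pixel and resampling, which image libraries typically provide as a single call.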

The assets/demo folder contains more images that you can try.

📌 Citation

If the paper and code from gim help your research, please cite our paper ❤️. If you find this repository useful, giving it a star ⭐️ is a great way to support our work. Thank you very much.

@inproceedings{
xuelun2024gim,
title={GIM: Learning Generalizable Image Matcher From Internet Videos},
author={Xuelun Shen and Zhipeng Cai and Wei Yin and Matthias Müller and Zijun Li and Kaixuan Wang and Xiaozhi Chen and Cheng Wang},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024}
}

License

This repository is under the MIT License. This content/model is provided here for research purposes only. Any use beyond this is your sole responsibility and subject to your securing the necessary rights for your purpose.