# [AAAI-23] TopicFM: Robust and Interpretable Topic-Assisted Feature Matching

> This repository is also used as a submodule in the [hloc](https://github.com/Vincentqyw/Hierarchical-Localization) toolbox.
Our method first infers the latent topics (high-level context information) for each image and then uses them to explicitly learn robust feature representations for the matching task. Please check out the details in [our paper](https://arxiv.org/abs/2207.00328).
![Matching demo](demo/topicfm.gif)

**Overall Architecture:**

![Overall architecture](demo/architecture_v4.png)
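In pseudocode, the matching pipeline can be summarized as follows. This is only an illustrative sketch of the idea described above; the module names are hypothetical placeholders, not the actual classes in this repository:

```python
def match_images(img0, img1, backbone, topic_module, coarse_matcher, fine_matcher):
    """Illustrative sketch of topic-assisted matching; all module names are hypothetical."""
    # 1. extract coarse and fine feature maps for both images
    feat0_c, feat0_f = backbone(img0)
    feat1_c, feat1_f = backbone(img1)

    # 2. infer latent topics (high-level context) shared by the image pair
    #    and augment the coarse features with their topic assignments
    feat0_c, feat1_c = topic_module(feat0_c, feat1_c)

    # 3. match on topic-augmented coarse features, then refine with fine features
    coarse_matches = coarse_matcher(feat0_c, feat1_c)
    return fine_matcher(coarse_matches, feat0_f, feat1_f)
```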
## TODO List

- [x] Release training and evaluation code on MegaDepth and ScanNet
- [x] Evaluation on HPatches, Aachen Day&Night, and InLoc
- [x] Evaluation for the Image Matching Challenge
## Requirements

All experiments in this paper were run on Ubuntu with an NVIDIA driver of at least version 430.64 and CUDA 10.1.
First, create a virtual environment with Anaconda as follows:

```bash
conda create -n topicfm python=3.8
conda activate topicfm
conda install pytorch==1.8.1 torchvision==0.9.1 cudatoolkit=10.1 -c pytorch
pip install -r requirements.txt
# use pip to install any missing packages
```
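To quickly confirm the environment is set up correctly, you can check the installed versions and GPU visibility from Python (standard PyTorch calls, nothing repo-specific):

```python
import torch
import torchvision

# expect torch 1.8.1, torchvision 0.9.1, and a visible CUDA device
print(torch.__version__, torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```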
## Data Preparation

The proposed method is trained on MegaDepth and evaluated on the MegaDepth test set, ScanNet, HPatches, Aachen Day-Night (v1.1), and InLoc.
These datasets are large, so we cannot include them in this repository.
The following instructions describe how to download them.
### MegaDepth

This dataset is used for both training and evaluation (Li and Snavely 2018).
To use it with our code, please follow the [training instructions of LoFTR](https://github.com/zju3dv/LoFTR/blob/master/docs/TRAINING.md) (Sun et al. 2021).
### ScanNet

We only use the 1500 image pairs of ScanNet (Dai et al. 2017) for evaluation.
Please download and prepare the [test data](https://drive.google.com/drive/folders/1DOcOPZb3-5cWxLqn256AhwUVjBPifhuf) of ScanNet
provided by [LoFTR](https://github.com/zju3dv/LoFTR/blob/master/docs/TRAINING.md).
## Training

To train our model, we recommend using as many GPUs as possible, each with at least 12 GB of memory.
In our setting, we train on 4 GPUs with 12 GB each.
Please set up your hardware environment in `scripts/reproduce_train/outdoor.sh`,
then run this command to start training:

```bash
bash scripts/reproduce_train/outdoor.sh
```

We also provide a trained model at `pretrained/model_best.ckpt`.
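If you want to sanity-check the provided checkpoint before running evaluation, a plain `torch.load` is enough. A minimal sketch; the exact keys stored in the file are an assumption based on the usual PyTorch Lightning layout:

```python
import torch

# load on CPU so no GPU is needed just to inspect the file
ckpt = torch.load("pretrained/model_best.ckpt", map_location="cpu")
print(list(ckpt.keys()))  # Lightning checkpoints usually contain 'state_dict'

# fall back to the raw dict if the weights are stored at the top level
state_dict = ckpt.get("state_dict", ckpt)
print(len(state_dict), "tensors in the checkpoint")
```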
## Evaluation

### MegaDepth (relative pose estimation)

```bash
bash scripts/reproduce_test/outdoor.sh
```

### ScanNet (relative pose estimation)

```bash
bash scripts/reproduce_test/indoor.sh
```
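Both scripts report the AUC of the relative pose error at several thresholds, following LoFTR's protocol. For reference, the commonly used AUC computation looks like this (a generic sketch, not code copied from this repository):

```python
import numpy as np

def pose_auc(errors, thresholds=(5, 10, 20)):
    """AUC of the pose-error recall curve up to each threshold (in degrees)."""
    errors = np.sort(np.asarray(errors, dtype=np.float64))
    recall = np.arange(1, len(errors) + 1) / len(errors)
    # prepend the origin so the curve starts at (error=0, recall=0)
    errors = np.concatenate(([0.0], errors))
    recall = np.concatenate(([0.0], recall))
    aucs = []
    for t in thresholds:
        idx = np.searchsorted(errors, t)   # first error >= t
        e = np.concatenate((errors[:idx], [t]))
        r = np.concatenate((recall[:idx], [recall[idx - 1]]))
        aucs.append(np.trapz(r, x=e) / t)  # normalized area under the curve
    return aucs
```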
### HPatches, Aachen v1.1, InLoc

To evaluate on these datasets, we integrate our code into the image-matching-toolbox provided by Zhou et al. (2021).
The updated code is available [here](https://github.com/TruongKhang/image-matching-toolbox).
After cloning it, please follow the instructions of image-matching-toolbox to install all required packages and prepare the evaluation data.
Then run the commands below (note that all hyperparameter settings are in `configs/topicfm.yml`).
**HPatches (homography estimation)**

```bash
python -m immatch.eval_hpatches --gpu 0 --config 'topicfm' --task 'both' --h_solver 'cv' --ransac_thres 3 --root_dir . --odir 'outputs/hpatches'
```
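For context, `--h_solver 'cv'` estimates homographies with OpenCV, so conceptually this step reduces to RANSAC over the predicted correspondences. A generic sketch with placeholder matches, not the toolbox's actual code:

```python
import cv2
import numpy as np

# pts0, pts1: (N, 2) arrays of matched keypoints from the two images
pts0 = np.random.rand(100, 2).astype(np.float32) * 480   # placeholder matches
pts1 = pts0 + np.random.randn(100, 2).astype(np.float32)  # near-identity warp

# RANSAC with a 3 px reprojection threshold, matching --ransac_thres 3
H, inlier_mask = cv2.findHomography(pts0, pts1, cv2.RANSAC, ransacReprojThreshold=3.0)
print("Inliers:", int(inlier_mask.sum()), "/", len(pts0))
```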
**Aachen Day-Night v1.1 (visual localization)**

```bash
python -m immatch.eval_aachen --gpu 0 --config 'topicfm' --colmap <path to colmap> --benchmark_name 'aachen_v1.1'
```
**InLoc (visual localization)**

```bash
python -m immatch.eval_inloc --gpu 0 --config 'topicfm'
```
### Image Matching Challenge 2022 (IMC-2022)

IMC-2022 was held on [Kaggle](https://www.kaggle.com/competitions/image-matching-challenge-2022/overview).
Most high-ranking entries used ensembles that combine the matching results of
several state-of-the-art methods, including LoFTR, SuperPoint+SuperGlue, MatchFormer, and QuadTree Attention.
In this evaluation, we submit the results produced by our method (TopicFM) alone; please refer to [this notebook](https://www.kaggle.com/code/khangtg09121995/topicfm-eval).
The table below compares our results with LoFTR (ref. [here](https://www.kaggle.com/code/mcwema/imc-2022-kornia-loftr-score-plateau-0-726)) and
SP+SuperGlue (ref. [here](https://www.kaggle.com/code/yufei12/superglue-baseline)).
| Method         | Public Score | Private Score |
|----------------|--------------|---------------|
| SP + SuperGlue | 0.678        | 0.677         |
| LoFTR          | 0.726        | 0.736         |
| TopicFM (ours) | **0.804**    | **0.811**     |
### Runtime comparison

The runtime reported in the paper is averaged over the 1500 image pairs of the ScanNet evaluation dataset.
The image size can be changed in `configs/data/scannet_test_1500.py`.

```bash
python visualization.py --method <method_name> --dataset_name "scannet" --measure_time --no_viz
# note that method_name is one of ["topicfm", "loftr"]
```
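If you measure runtime yourself, note that CUDA kernels launch asynchronously, so naive wall-clock differences around a forward pass can under-report. A generic sketch of a fair GPU timing loop (standard PyTorch, not this repo's exact measurement code):

```python
import time
import torch

def time_on_gpu(fn, n_warmup=5, n_runs=50):
    """Average wall-clock time of fn() with proper CUDA synchronization."""
    for _ in range(n_warmup):          # warm up kernels / cudnn autotuning
        fn()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    torch.cuda.synchronize()           # wait for all queued kernels to finish
    return (time.perf_counter() - start) / n_runs
```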
To measure the runtime of LoFTR, please download LoFTR's code as follows:

```bash
git submodule update --init
# download the pretrained model
mkdir third_party/loftr/pretrained
gdown --id 1M-VD35-qdB5Iw-AtbDBCKC7hPolFW9UY -O third_party/loftr/pretrained/outdoor_ds.ckpt
```
## Citations

If you find this work useful, please cite:

```bibtex
@article{giang2022topicfm,
  title={TopicFM: Robust and Interpretable Topic-assisted Feature Matching},
  author={Giang, Khang Truong and Song, Soohwan and Jo, Sungho},
  journal={arXiv preprint arXiv:2207.00328},
  year={2022}
}
```
## Acknowledgement

This code is built on top of [LoFTR](https://github.com/zju3dv/LoFTR). We thank the authors for their useful source code.