# Submodule used in [hloc](https://github.com/Vincentqyw/Hierarchical-Localization) toolbox

# ASpanFormer Implementation

![Framework](assets/teaser.png)

This is a PyTorch implementation of ASpanFormer for the ECCV'22 [paper](https://arxiv.org/abs/2208.14201) “ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer”. It can be used to reproduce the results in the paper.

This work focuses on detector-free image matching. We propose a hierarchical attention framework for cross-view feature update that adaptively adjusts the attention span based on region-wise matchability.

This repo contains the training, evaluation, and basic demo scripts used in our paper.

A large part of the code base is borrowed from the [LoFTR Repository](https://github.com/zju3dv/LoFTR) under its own separate license, terms and conditions.  The authors of this software are not responsible for the contents of third-party websites.

## Installation 
```bash
conda env create -f environment.yaml
conda activate ASpanFormer
```

## Get started
Download the model weights from [here](https://drive.google.com/file/d/1eavM9dTkw9nbc-JqlVVfGPU5UvTTfc6k/view?usp=share_link).

Extract the weights with:
```bash
tar -xvf weights_aspanformer.tar
```

A demo that matches a single image pair is provided. For a quick start, run:

```bash
cd demo
python demo.py
```


## Data Preparation
Please follow the [training doc](docs/TRAINING.md) for data organization.



## Evaluation


### 1. ScanNet Evaluation 
```bash
cd scripts/reproduce_test
bash indoor.sh
```
Results similar to the following should be obtained:
```
'auc@10': 0.46640095171012563,
'auc@20': 0.6407042320049785,
'auc@5': 0.26241231577189295,
'prec@5e-04': 0.8827665604024288,
'prec_flow@2e-03': 0.810938751342228
```

### 2. MegaDepth Evaluation
```bash
cd scripts/reproduce_test
bash outdoor.sh
```
Results similar to the following should be obtained:
```
'auc@10': 0.7184113573584142,
'auc@20': 0.8333835724453831,
'auc@5': 0.5567622479156181,
'prec@5e-04': 0.9901741341790503,
'prec_flow@2e-03': 0.7188964321862907
```
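
To check whether a run is "similar" to the reference numbers listed for ScanNet and MegaDepth, a small comparison helper can be handy. The following is a minimal sketch, not part of this repo: the function name `compare_metrics`, the `REFERENCE` dictionary layout, and the absolute tolerance of `0.01` are all assumptions chosen for illustration, and the evaluation scripts may report metrics in a different form.

```python
import math

# Reference metrics copied from the evaluation results in this README.
REFERENCE = {
    "indoor": {  # ScanNet
        "auc@5": 0.26241231577189295,
        "auc@10": 0.46640095171012563,
        "auc@20": 0.6407042320049785,
        "prec@5e-04": 0.8827665604024288,
        "prec_flow@2e-03": 0.810938751342228,
    },
    "outdoor": {  # MegaDepth
        "auc@5": 0.5567622479156181,
        "auc@10": 0.7184113573584142,
        "auc@20": 0.8333835724453831,
        "prec@5e-04": 0.9901741341790503,
        "prec_flow@2e-03": 0.7188964321862907,
    },
}


def compare_metrics(obtained, reference, abs_tol=0.01):
    """Compare obtained metrics against reference values.

    Returns {metric: (obtained_value, reference_value, within_tolerance)}.
    abs_tol is a hypothetical threshold; pick one that suits your setup.
    A metric missing from `obtained` is reported as not within tolerance.
    """
    report = {}
    for name, ref_value in reference.items():
        value = obtained.get(name)
        ok = value is not None and math.isclose(
            value, ref_value, rel_tol=0.0, abs_tol=abs_tol
        )
        report[name] = (value, ref_value, ok)
    return report


if __name__ == "__main__":
    # Replace this copy with the values printed by your own run.
    obtained = dict(REFERENCE["indoor"])
    for name, (value, ref, ok) in compare_metrics(obtained, REFERENCE["indoor"]).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: obtained={value} reference={ref} {status}")
```

Small deviations are expected across hardware and library versions, which is why the sketch uses an absolute tolerance rather than exact equality.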


## Training

### 1. ScanNet Training
```bash
cd scripts/reproduce_train
bash indoor.sh
```

### 2. MegaDepth Training
```bash
cd scripts/reproduce_train
bash outdoor.sh
```
      

If you find this project useful, please cite:

```
@article{chen2022aspanformer,
  title={ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer},
  author={Chen, Hongkai and Luo, Zixin and Zhou, Lei and Tian, Yurun and Zhen, Mingmin and Fang, Tian and McKinnon, David and Tsin, Yanghai and Quan, Long},
  journal={European Conference on Computer Vision (ECCV)},
  year={2022}
}
```