Realcat committed
Commit · 0bf7151
Parent(s): 816b9f6

add: thirdparty

This view is limited to 50 files because the commit contains too many changes.
- third_party/ALIKE/LICENSE +29 -0
- third_party/ALIKE/README.md +131 -0
- third_party/ALIKE/alike.py +198 -0
- third_party/ALIKE/alnet.py +194 -0
- third_party/ALIKE/demo.py +201 -0
- third_party/ALIKE/hseq/cache/alike-l-ms.npy +3 -0
- third_party/ALIKE/hseq/cache/alike-l.npy +3 -0
- third_party/ALIKE/hseq/cache/alike-n-ms.npy +3 -0
- third_party/ALIKE/hseq/cache/alike-n.npy +3 -0
- third_party/ALIKE/hseq/cache/aslfeat.npy +3 -0
- third_party/ALIKE/hseq/cache/d2.npy +3 -0
- third_party/ALIKE/hseq/cache/disk.npy +3 -0
- third_party/ALIKE/hseq/cache/lfnet.npy +3 -0
- third_party/ALIKE/hseq/cache/r2d2.npy +3 -0
- third_party/ALIKE/hseq/cache/superpoint.npy +3 -0
- third_party/ALIKE/hseq/eval.py +197 -0
- third_party/ALIKE/hseq/extract.py +175 -0
- third_party/ALIKE/matlab/createfigure.m +75 -0
- third_party/ALIKE/matlab/peakloss_rect.m +19 -0
- third_party/ALIKE/requirements.txt +6 -0
- third_party/ALIKE/soft_detect.py +234 -0
- third_party/ASpanFormer/.github/workflows/sync.yml +39 -0
- third_party/ASpanFormer/.gitignore +32 -0
- third_party/ASpanFormer/CODE_OF_CONDUCT.md +71 -0
- third_party/ASpanFormer/CONTRIBUTING.md +7 -0
- third_party/ASpanFormer/LICENSE +9 -0
- third_party/ASpanFormer/README.md +98 -0
- third_party/ASpanFormer/configs/aspan/indoor/aspan_test.py +11 -0
- third_party/ASpanFormer/configs/aspan/indoor/aspan_train.py +12 -0
- third_party/ASpanFormer/configs/aspan/outdoor/aspan_test.py +22 -0
- third_party/ASpanFormer/configs/aspan/outdoor/aspan_train.py +21 -0
- third_party/ASpanFormer/configs/data/__init__.py +0 -0
- third_party/ASpanFormer/configs/data/base.py +36 -0
- third_party/ASpanFormer/configs/data/debug/.gitignore +3 -0
- third_party/ASpanFormer/configs/data/megadepth_test_1500.py +13 -0
- third_party/ASpanFormer/configs/data/megadepth_trainval_832.py +26 -0
- third_party/ASpanFormer/configs/data/scannet_test_1500.py +11 -0
- third_party/ASpanFormer/configs/data/scannet_trainval.py +21 -0
- third_party/ASpanFormer/data/megadepth/index/.gitignore +4 -0
- third_party/ASpanFormer/data/megadepth/test/.gitignore +4 -0
- third_party/ASpanFormer/data/megadepth/train/.gitignore +4 -0
- third_party/ASpanFormer/data/scannet/index/.gitignore +4 -0
- third_party/ASpanFormer/data/scannet/test/.gitignore +3 -0
- third_party/ASpanFormer/data/scannet/train/.gitignore +4 -0
- third_party/ASpanFormer/demo/demo.py +91 -0
- third_party/ASpanFormer/demo/demo_utils.py +88 -0
- third_party/ASpanFormer/docs/TRAINING.md +72 -0
- third_party/ASpanFormer/environment.yaml +12 -0
- third_party/ASpanFormer/requirements.txt +18 -0
- third_party/ASpanFormer/scripts/reproduce_test/indoor.sh +31 -0
third_party/ALIKE/LICENSE
ADDED
@@ -0,0 +1,29 @@
+BSD 3-Clause License
+
+Copyright (c) 2022, Zhao Xiaoming
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, this
+   list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+
+3. Neither the name of the copyright holder nor the names of its
+   contributors may be used to endorse or promote products derived from
+   this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
third_party/ALIKE/README.md
ADDED
@@ -0,0 +1,131 @@
+# News
+
+- [ALIKED](https://github.com/Shiaoming/ALIKED) is released.
+- The [ALIKE training code](https://github.com/Shiaoming/ALIKE/raw/main/assets/ALIKE_code.zip) is released.
+
+# ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction
+
+ALIKE applies a differentiable keypoint detection module to detect accurate sub-pixel keypoints. The network runs at 95 frames per second for 640x480 images on an NVIDIA Titan X (Pascal) GPU while achieving performance comparable to the state of the art. ALIKE benefits real-time applications on resource-limited platforms/devices. Technical details are described in [this paper](https://arxiv.org/pdf/2112.02906.pdf).
+
+> ```
+> Xiaoming Zhao, Xingming Wu, Jinyu Miao, Weihai Chen, Peter C. Y. Chen, Zhengguo Li, "ALIKE: Accurate and Lightweight Keypoint
+> Detection and Descriptor Extraction," IEEE Transactions on Multimedia, 2022.
+> ```
+
+![](./assets/alike.png)
+
+If you use ALIKE in an academic work, please cite:
+
+```
+@article{Zhao2023ALIKED,
+  title = {ALIKED: A Lighter Keypoint and Descriptor Extraction Network via Deformable Transformation},
+  url = {https://arxiv.org/pdf/2304.03608.pdf},
+  doi = {10.1109/TIM.2023.3271000},
+  journal = {IEEE Transactions on Instrumentation \& Measurement},
+  author = {Zhao, Xiaoming and Wu, Xingming and Chen, Weihai and Chen, Peter C. Y. and Xu, Qingsong and Li, Zhengguo},
+  year = {2023},
+  volume = {72},
+  pages = {1-16},
+}
+
+@article{Zhao2022ALIKE,
+  title = {ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction},
+  url = {http://arxiv.org/abs/2112.02906},
+  doi = {10.1109/TMM.2022.3155927},
+  journal = {IEEE Transactions on Multimedia},
+  author = {Zhao, Xiaoming and Wu, Xingming and Miao, Jinyu and Chen, Weihai and Chen, Peter C. Y. and Li, Zhengguo},
+  month = mar,
+  year = {2022},
+}
+```
+
+## 1. Prerequisites
+
+The required packages are listed in `requirements.txt`:
+
+```shell
+pip install -r requirements.txt
+```
+
+## 2. Models
+
+The off-the-shelf weights of the four ALIKE model variants are provided in `models/`.
+
+## 3. Run demo
+
+```shell
+$ python demo.py -h
+usage: demo.py [-h] [--model {alike-t,alike-s,alike-n,alike-l}]
+               [--device DEVICE] [--top_k TOP_K] [--scores_th SCORES_TH]
+               [--n_limit N_LIMIT] [--no_display] [--no_sub_pixel]
+               input
+
+ALike Demo.
+
+positional arguments:
+  input                 Image directory or movie file or "camera0" (for
+                        webcam0).
+
+optional arguments:
+  -h, --help            show this help message and exit
+  --model {alike-t,alike-s,alike-n,alike-l}
+                        The model configuration
+  --device DEVICE       Running device (default: cuda).
+  --top_k TOP_K         Detect top K keypoints. -1 for threshold based mode,
+                        >0 for top K mode. (default: -1)
+  --scores_th SCORES_TH
+                        Detector score threshold (default: 0.2).
+  --n_limit N_LIMIT     Maximum number of keypoints to be detected (default:
+                        5000).
+  --no_display          Do not display images to screen. Useful if running
+                        remotely (default: False).
+  --no_sub_pixel        Do not detect sub-pixel keypoints (default: False).
+```
+
+## 4. Examples
+
+### KITTI example
+```shell
+python demo.py assets/kitti
+```
+![](./assets/kitti.gif)
+
+### TUM example
+```shell
+python demo.py assets/tum
+```
+![](./assets/tum.gif)
+
+## 5. Efficiency and performance
+
+| Models      | Parameters | GFLOPs(640x480) | MHA@3 on Hpatches | mAA(10°) on [IMW2020-test](https://www.cs.ubc.ca/research/image-matching-challenge/2021/leaderboard) (Stereo) |
+|:-----------:|:----------:|:---------------:|:-----------------:|:---------:|
+| D2-Net(MS)  | 7653KB     | 889.40          | 38.33%            | 12.27%    |
+| LF-Net(MS)  | 2642KB     | 24.37           | 57.78%            | 23.44%    |
+| SuperPoint  | 1301KB     | 26.11           | 70.19%            | 28.97%    |
+| R2D2(MS)    | 484KB      | 464.55          | 71.48%            | 39.02%    |
+| ASLFeat(MS) | 823KB      | 77.58           | 73.52%            | 33.65%    |
+| DISK        | 1092KB     | 98.97           | 70.56%            | 51.22%    |
+| ALike-N     | 318KB      | 7.909           | 75.74%            | 47.18%    |
+| ALike-L     | 653KB      | 19.685          | 76.85%            | 49.58%    |
+
+### Evaluation on Hpatches
+
+- Download [hpatches-sequences-release](https://hpatches.github.io/) and put it into `hseq/hpatches-sequences-release`.
+- Remove the unreliable sequences, as in D2-Net.
+- Run the following command to evaluate the performance:
+```shell
+python hseq/eval.py
+```
+
+For more details, please refer to the [paper](https://arxiv.org/abs/2112.02906).
third_party/ALIKE/alike.py
ADDED
@@ -0,0 +1,198 @@
+import logging
+import os
+import cv2
+import torch
+from copy import deepcopy
+import torch.nn.functional as F
+from torchvision.transforms import ToTensor
+import math
+
+from alnet import ALNet
+from soft_detect import DKD
+import time
+
+configs = {
+    "alike-t": {
+        "c1": 8,
+        "c2": 16,
+        "c3": 32,
+        "c4": 64,
+        "dim": 64,
+        "single_head": True,
+        "radius": 2,
+        "model_path": os.path.join(os.path.split(__file__)[0], "models", "alike-t.pth"),
+    },
+    "alike-s": {
+        "c1": 8,
+        "c2": 16,
+        "c3": 48,
+        "c4": 96,
+        "dim": 96,
+        "single_head": True,
+        "radius": 2,
+        "model_path": os.path.join(os.path.split(__file__)[0], "models", "alike-s.pth"),
+    },
+    "alike-n": {
+        "c1": 16,
+        "c2": 32,
+        "c3": 64,
+        "c4": 128,
+        "dim": 128,
+        "single_head": True,
+        "radius": 2,
+        "model_path": os.path.join(os.path.split(__file__)[0], "models", "alike-n.pth"),
+    },
+    "alike-l": {
+        "c1": 32,
+        "c2": 64,
+        "c3": 128,
+        "c4": 128,
+        "dim": 128,
+        "single_head": False,
+        "radius": 2,
+        "model_path": os.path.join(os.path.split(__file__)[0], "models", "alike-l.pth"),
+    },
+}
+
+
+class ALike(ALNet):
+    def __init__(
+        self,
+        # ================================== feature encoder
+        c1: int = 32,
+        c2: int = 64,
+        c3: int = 128,
+        c4: int = 128,
+        dim: int = 128,
+        single_head: bool = False,
+        # ================================== detect parameters
+        radius: int = 2,
+        top_k: int = 500,
+        scores_th: float = 0.5,
+        n_limit: int = 5000,
+        device: str = "cpu",
+        model_path: str = "",
+    ):
+        super().__init__(c1, c2, c3, c4, dim, single_head)
+        self.radius = radius
+        self.top_k = top_k
+        self.n_limit = n_limit
+        self.scores_th = scores_th
+        self.dkd = DKD(
+            radius=self.radius,
+            top_k=self.top_k,
+            scores_th=self.scores_th,
+            n_limit=self.n_limit,
+        )
+        self.device = device
+
+        if model_path != "":
+            state_dict = torch.load(model_path, self.device)
+            self.load_state_dict(state_dict)
+            self.to(self.device)
+            self.eval()
+            logging.info(f"Loaded model parameters from {model_path}")
+            logging.info(
+                f"Number of model parameters: {sum(p.numel() for p in self.parameters() if p.requires_grad) / 1e3}KB"
+            )
+
+    def extract_dense_map(self, image, ret_dict=False):
+        # ====================================================
+        # check image size: it should be an integer multiple of 2^5;
+        # if it is not, pad it with zeros
+        device = image.device
+        b, c, h, w = image.shape
+        h_ = math.ceil(h / 32) * 32 if h % 32 != 0 else h
+        w_ = math.ceil(w / 32) * 32 if w % 32 != 0 else w
+        if h_ != h:
+            h_padding = torch.zeros(b, c, h_ - h, w, device=device)
+            image = torch.cat([image, h_padding], dim=2)
+        if w_ != w:
+            w_padding = torch.zeros(b, c, h_, w_ - w, device=device)
+            image = torch.cat([image, w_padding], dim=3)
+        # ====================================================
+
+        scores_map, descriptor_map = super().forward(image)
+
+        # ====================================================
+        if h_ != h or w_ != w:
+            descriptor_map = descriptor_map[:, :, :h, :w]
+            scores_map = scores_map[:, :, :h, :w]  # Bx1xHxW
+        # ====================================================
+
+        # BxCxHxW
+        descriptor_map = torch.nn.functional.normalize(descriptor_map, p=2, dim=1)
+
+        if ret_dict:
+            return {
+                "descriptor_map": descriptor_map,
+                "scores_map": scores_map,
+            }
+        else:
+            return descriptor_map, scores_map
+
+    def forward(self, img, image_size_max=99999, sort=False, sub_pixel=False):
+        """
+        :param img: np.array HxWx3, RGB
+        :param image_size_max: maximum image size; larger images will be resized
+        :param sort: sort keypoints by scores
+        :param sub_pixel: whether to use sub-pixel accuracy
+        :return: a dictionary with 'keypoints', 'descriptors', 'scores', and 'time'
+        """
+        H, W, three = img.shape
+        assert three == 3, "input image shape should be [HxWx3]"
+
+        # ==================== image size constraint
+        image = deepcopy(img)
+        max_hw = max(H, W)
+        if max_hw > image_size_max:
+            ratio = float(image_size_max / max_hw)
+            image = cv2.resize(image, dsize=None, fx=ratio, fy=ratio)
+
+        # ==================== convert image to tensor
+        image = (
+            torch.from_numpy(image)
+            .to(self.device)
+            .to(torch.float32)
+            .permute(2, 0, 1)[None]
+            / 255.0
+        )
+
+        # ==================== extract keypoints
+        start = time.time()
+
+        with torch.no_grad():
+            descriptor_map, scores_map = self.extract_dense_map(image)
+            keypoints, descriptors, scores, _ = self.dkd(
+                scores_map, descriptor_map, sub_pixel=sub_pixel
+            )
+            keypoints, descriptors, scores = keypoints[0], descriptors[0], scores[0]
+            keypoints = (keypoints + 1) / 2 * keypoints.new_tensor([[W - 1, H - 1]])
+
+            if sort:
+                indices = torch.argsort(scores, descending=True)
+                keypoints = keypoints[indices]
+                descriptors = descriptors[indices]
+                scores = scores[indices]
+
+        end = time.time()
+
+        return {
+            "keypoints": keypoints.cpu().numpy(),
+            "descriptors": descriptors.cpu().numpy(),
+            "scores": scores.cpu().numpy(),
+            "scores_map": scores_map.cpu().numpy(),
+            "time": end - start,
+        }
+
+
+if __name__ == "__main__":
+    import numpy as np
+    from thop import profile
+
+    net = ALike(c1=32, c2=64, c3=128, c4=128, dim=128, single_head=False)
+
+    image = np.random.random((640, 480, 3)).astype(np.float32)
+    flops, params = profile(net, inputs=(image, 9999, False), verbose=False)
+    print("{:<30} {:<8} GFLops".format("Computational complexity: ", flops / 1e9))
+    print("{:<30} {:<8} KB".format("Number of parameters: ", params / 1e3))
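For orientation, a minimal usage sketch of the `ALike` class added above (not part of the commit). It assumes the packaged weights exist under `models/`, as the `configs` paths expect, and the image path is a placeholder:

```python
import cv2
from alike import ALike, configs  # assumes we run from third_party/ALIKE

# Build the smallest variant; configs supplies the weight path under models/.
model = ALike(**configs["alike-t"], device="cpu",
              top_k=-1, scores_th=0.2, n_limit=5000)

# Any RGB image as an HxWx3 array; the path is a placeholder.
img = cv2.cvtColor(cv2.imread("some_image.png"), cv2.COLOR_BGR2RGB)

pred = model(img, sub_pixel=True)  # forward() returns a dict
print(pred["keypoints"].shape)     # Nx2 pixel coordinates (x, y)
print(pred["descriptors"].shape)   # NxD unit-norm descriptors
print(pred["time"])                # extraction time in seconds
```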
third_party/ALIKE/alnet.py
ADDED
@@ -0,0 +1,194 @@
+import torch
+from torch import nn
+from torchvision.models import resnet
+from typing import Optional, Callable
+
+
+class ConvBlock(nn.Module):
+    def __init__(
+        self,
+        in_channels,
+        out_channels,
+        gate: Optional[Callable[..., nn.Module]] = None,
+        norm_layer: Optional[Callable[..., nn.Module]] = None,
+    ):
+        super().__init__()
+        if gate is None:
+            self.gate = nn.ReLU(inplace=True)
+        else:
+            self.gate = gate
+        if norm_layer is None:
+            norm_layer = nn.BatchNorm2d
+        self.conv1 = resnet.conv3x3(in_channels, out_channels)
+        self.bn1 = norm_layer(out_channels)
+        self.conv2 = resnet.conv3x3(out_channels, out_channels)
+        self.bn2 = norm_layer(out_channels)
+
+    def forward(self, x):
+        x = self.gate(self.bn1(self.conv1(x)))  # B x in_channels x H x W
+        x = self.gate(self.bn2(self.conv2(x)))  # B x out_channels x H x W
+        return x
+
+
+# copied from torchvision\models\resnet.py#27->BasicBlock
+class ResBlock(nn.Module):
+    expansion: int = 1
+
+    def __init__(
+        self,
+        inplanes: int,
+        planes: int,
+        stride: int = 1,
+        downsample: Optional[nn.Module] = None,
+        groups: int = 1,
+        base_width: int = 64,
+        dilation: int = 1,
+        gate: Optional[Callable[..., nn.Module]] = None,
+        norm_layer: Optional[Callable[..., nn.Module]] = None,
+    ) -> None:
+        super(ResBlock, self).__init__()
+        if gate is None:
+            self.gate = nn.ReLU(inplace=True)
+        else:
+            self.gate = gate
+        if norm_layer is None:
+            norm_layer = nn.BatchNorm2d
+        if groups != 1 or base_width != 64:
+            raise ValueError("ResBlock only supports groups=1 and base_width=64")
+        if dilation > 1:
+            raise NotImplementedError("Dilation > 1 not supported in ResBlock")
+        # Both self.conv1 and self.downsample layers downsample the input when stride != 1
+        self.conv1 = resnet.conv3x3(inplanes, planes, stride)
+        self.bn1 = norm_layer(planes)
+        self.conv2 = resnet.conv3x3(planes, planes)
+        self.bn2 = norm_layer(planes)
+        self.downsample = downsample
+        self.stride = stride
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        identity = x
+
+        out = self.conv1(x)
+        out = self.bn1(out)
+        out = self.gate(out)
+
+        out = self.conv2(out)
+        out = self.bn2(out)
+
+        if self.downsample is not None:
+            identity = self.downsample(x)
+
+        out += identity
+        out = self.gate(out)
+
+        return out
+
+
+class ALNet(nn.Module):
+    def __init__(
+        self,
+        c1: int = 32,
+        c2: int = 64,
+        c3: int = 128,
+        c4: int = 128,
+        dim: int = 128,
+        single_head: bool = True,
+    ):
+        super().__init__()
+
+        self.gate = nn.ReLU(inplace=True)
+
+        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
+        self.pool4 = nn.MaxPool2d(kernel_size=4, stride=4)
+
+        self.block1 = ConvBlock(3, c1, self.gate, nn.BatchNorm2d)
+
+        self.block2 = ResBlock(
+            inplanes=c1,
+            planes=c2,
+            stride=1,
+            downsample=nn.Conv2d(c1, c2, 1),
+            gate=self.gate,
+            norm_layer=nn.BatchNorm2d,
+        )
+        self.block3 = ResBlock(
+            inplanes=c2,
+            planes=c3,
+            stride=1,
+            downsample=nn.Conv2d(c2, c3, 1),
+            gate=self.gate,
+            norm_layer=nn.BatchNorm2d,
+        )
+        self.block4 = ResBlock(
+            inplanes=c3,
+            planes=c4,
+            stride=1,
+            downsample=nn.Conv2d(c3, c4, 1),
+            gate=self.gate,
+            norm_layer=nn.BatchNorm2d,
+        )
+
+        # ================================== feature aggregation
+        self.conv1 = resnet.conv1x1(c1, dim // 4)
+        self.conv2 = resnet.conv1x1(c2, dim // 4)
+        self.conv3 = resnet.conv1x1(c3, dim // 4)
+        self.conv4 = resnet.conv1x1(dim, dim // 4)
+        self.upsample2 = nn.Upsample(
+            scale_factor=2, mode="bilinear", align_corners=True
+        )
+        self.upsample4 = nn.Upsample(
+            scale_factor=4, mode="bilinear", align_corners=True
+        )
+        self.upsample8 = nn.Upsample(
+            scale_factor=8, mode="bilinear", align_corners=True
+        )
+        self.upsample32 = nn.Upsample(
+            scale_factor=32, mode="bilinear", align_corners=True
+        )
+
+        # ================================== detector and descriptor head
+        self.single_head = single_head
+        if not self.single_head:
+            self.convhead1 = resnet.conv1x1(dim, dim)
+        self.convhead2 = resnet.conv1x1(dim, dim + 1)
+
+    def forward(self, image):
+        # ================================== feature encoder
+        x1 = self.block1(image)  # B x c1 x H x W
+        x2 = self.pool2(x1)
+        x2 = self.block2(x2)  # B x c2 x H/2 x W/2
+        x3 = self.pool4(x2)
+        x3 = self.block3(x3)  # B x c3 x H/8 x W/8
+        x4 = self.pool4(x3)
+        x4 = self.block4(x4)  # B x dim x H/32 x W/32
+
+        # ================================== feature aggregation
+        x1 = self.gate(self.conv1(x1))  # B x dim//4 x H x W
+        x2 = self.gate(self.conv2(x2))  # B x dim//4 x H//2 x W//2
+        x3 = self.gate(self.conv3(x3))  # B x dim//4 x H//8 x W//8
+        x4 = self.gate(self.conv4(x4))  # B x dim//4 x H//32 x W//32
+        x2_up = self.upsample2(x2)  # B x dim//4 x H x W
+        x3_up = self.upsample8(x3)  # B x dim//4 x H x W
+        x4_up = self.upsample32(x4)  # B x dim//4 x H x W
+        x1234 = torch.cat([x1, x2_up, x3_up, x4_up], dim=1)
+
+        # ================================== detector and descriptor head
+        if not self.single_head:
+            x1234 = self.gate(self.convhead1(x1234))
+        x = self.convhead2(x1234)  # B x dim+1 x H x W
+
+        descriptor_map = x[:, :-1, :, :]
+        scores_map = torch.sigmoid(x[:, -1, :, :]).unsqueeze(1)
+
+        return scores_map, descriptor_map
+
+
+if __name__ == "__main__":
+    from thop import profile
+
+    net = ALNet(c1=16, c2=32, c3=64, c4=128, dim=128, single_head=True)
+
+    image = torch.randn(1, 3, 640, 480)
+    flops, params = profile(net, inputs=(image,), verbose=False)
+    print("{:<30} {:<8} GFLops".format("Computational complexity: ", flops / 1e9))
+    print("{:<30} {:<8} KB".format("Number of parameters: ", params / 1e3))
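A quick shape check for `ALNet` (an illustrative sketch, assuming it is run from the `third_party/ALIKE` directory): the encoder downsamples by 2, 8, and 32, and the aggregation upsamples each level back to full resolution, so an input whose sides are multiples of 32 round-trips cleanly:

```python
import torch
from alnet import ALNet  # assumes we run from third_party/ALIKE

net = ALNet(c1=16, c2=32, c3=64, c4=128, dim=128, single_head=True).eval()

# Sides must be multiples of 32 so that upsample2/8/32 restore the input
# resolution (alike.py pads to the next multiple of 32 before this forward).
with torch.no_grad():
    scores_map, descriptor_map = net(torch.randn(1, 3, 64, 96))

print(scores_map.shape)      # torch.Size([1, 1, 64, 96])
print(descriptor_map.shape)  # torch.Size([1, 128, 64, 96])
```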
third_party/ALIKE/demo.py
ADDED
@@ -0,0 +1,201 @@
+import copy
+import os
+import cv2
+import glob
+import logging
+import argparse
+import numpy as np
+from tqdm import tqdm
+from alike import ALike, configs
+
+
+class ImageLoader(object):
+    def __init__(self, filepath: str):
+        self.N = 3000
+        if filepath.startswith("camera"):
+            camera = int(filepath[6:])
+            self.cap = cv2.VideoCapture(camera)
+            if not self.cap.isOpened():
+                raise IOError(f"Can't open camera {camera}!")
+            logging.info(f"Opened camera {camera}")
+            self.mode = "camera"
+        elif os.path.exists(filepath):
+            if os.path.isfile(filepath):
+                self.cap = cv2.VideoCapture(filepath)
+                if not self.cap.isOpened():
+                    raise IOError(f"Can't open video {filepath}!")
+                rate = self.cap.get(cv2.CAP_PROP_FPS)
+                self.N = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT)) - 1
+                duration = self.N / rate
+                logging.info(f"Opened video {filepath}")
+                logging.info(f"Frames: {self.N}, FPS: {rate}, Duration: {duration}s")
+                self.mode = "video"
+            else:
+                self.images = (
+                    glob.glob(os.path.join(filepath, "*.png"))
+                    + glob.glob(os.path.join(filepath, "*.jpg"))
+                    + glob.glob(os.path.join(filepath, "*.ppm"))
+                )
+                self.images.sort()
+                self.N = len(self.images)
+                logging.info(f"Loading {self.N} images")
+                self.mode = "images"
+        else:
+            raise IOError(
+                "Error filepath (camerax/path of images/path of videos): ", filepath
+            )
+
+    def __getitem__(self, item):
+        if self.mode == "camera" or self.mode == "video":
+            if item > self.N:
+                return None
+            ret, img = self.cap.read()
+            if not ret:
+                raise IOError("Can't read image from camera")
+            if self.mode == "video":
+                self.cap.set(cv2.CAP_PROP_POS_FRAMES, item)
+        elif self.mode == "images":
+            filename = self.images[item]
+            img = cv2.imread(filename)
+            if img is None:
+                raise Exception("Error reading image %s" % filename)
+        return img
+
+    def __len__(self):
+        return self.N
+
+
+class SimpleTracker(object):
+    def __init__(self):
+        self.pts_prev = None
+        self.desc_prev = None
+
+    def update(self, img, pts, desc):
+        N_matches = 0
+        if self.pts_prev is None:
+            self.pts_prev = pts
+            self.desc_prev = desc
+
+            out = copy.deepcopy(img)
+            for pt1 in pts:
+                p1 = (int(round(pt1[0])), int(round(pt1[1])))
+                cv2.circle(out, p1, 1, (0, 0, 255), -1, lineType=16)
+        else:
+            matches = self.mnn_mather(self.desc_prev, desc)
+            mpts1, mpts2 = self.pts_prev[matches[:, 0]], pts[matches[:, 1]]
+            N_matches = len(matches)
+
+            out = copy.deepcopy(img)
+            for pt1, pt2 in zip(mpts1, mpts2):
+                p1 = (int(round(pt1[0])), int(round(pt1[1])))
+                p2 = (int(round(pt2[0])), int(round(pt2[1])))
+                cv2.line(out, p1, p2, (0, 255, 0), lineType=16)
+                cv2.circle(out, p2, 1, (0, 0, 255), -1, lineType=16)
+
+            self.pts_prev = pts
+            self.desc_prev = desc
+
+        return out, N_matches
+
+    def mnn_mather(self, desc1, desc2):
+        sim = desc1 @ desc2.transpose()
+        sim[sim < 0.9] = 0
+        nn12 = np.argmax(sim, axis=1)
+        nn21 = np.argmax(sim, axis=0)
+        ids1 = np.arange(0, sim.shape[0])
+        mask = ids1 == nn21[nn12]
+        matches = np.stack([ids1[mask], nn12[mask]])
+        return matches.transpose()
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="ALike Demo.")
+    parser.add_argument(
+        "input",
+        type=str,
+        default="",
+        help='Image directory or movie file or "camera0" (for webcam0).',
+    )
+    parser.add_argument(
+        "--model",
+        choices=["alike-t", "alike-s", "alike-n", "alike-l"],
+        default="alike-t",
+        help="The model configuration",
+    )
+    parser.add_argument(
+        "--device", type=str, default="cuda", help="Running device (default: cuda)."
+    )
+    parser.add_argument(
+        "--top_k",
+        type=int,
+        default=-1,
+        help="Detect top K keypoints. -1 for threshold based mode, >0 for top K mode. (default: -1)",
+    )
+    parser.add_argument(
+        "--scores_th",
+        type=float,
+        default=0.2,
+        help="Detector score threshold (default: 0.2).",
+    )
+    parser.add_argument(
+        "--n_limit",
+        type=int,
+        default=5000,
+        help="Maximum number of keypoints to be detected (default: 5000).",
+    )
+    parser.add_argument(
+        "--no_display",
+        action="store_true",
+        help="Do not display images to screen. Useful if running remotely (default: False).",
+    )
+    parser.add_argument(
+        "--no_sub_pixel",
+        action="store_true",
+        help="Do not detect sub-pixel keypoints (default: False).",
+    )
+    args = parser.parse_args()
+
+    logging.basicConfig(level=logging.INFO)
+
+    image_loader = ImageLoader(args.input)
+    model = ALike(
+        **configs[args.model],
+        device=args.device,
+        top_k=args.top_k,
+        scores_th=args.scores_th,
+        n_limit=args.n_limit,
+    )
+    tracker = SimpleTracker()
+
+    if not args.no_display:
+        logging.info("Press 'q' to stop!")
+        cv2.namedWindow(args.model)
+
+    runtime = []
+    progress_bar = tqdm(image_loader)
+    for img in progress_bar:
+        if img is None:
+            break
+
+        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+        pred = model(img_rgb, sub_pixel=not args.no_sub_pixel)
+        kpts = pred["keypoints"]
+        desc = pred["descriptors"]
+        runtime.append(pred["time"])
+
+        out, N_matches = tracker.update(img, kpts, desc)
+
+        ave_fps = (1.0 / np.stack(runtime)).mean()
+        status = f"Fps:{ave_fps:.1f}, Keypoints/Matches: {len(kpts)}/{N_matches}"
+        progress_bar.set_description(status)
+
+        if not args.no_display:
+            cv2.setWindowTitle(args.model, args.model + ": " + status)
+            cv2.imshow(args.model, out)
+            if cv2.waitKey(1) == ord("q"):
+                break
+
+    logging.info("Finished!")
+    if not args.no_display:
+        logging.info("Press any key to exit!")
+        cv2.waitKey()
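`SimpleTracker.mnn_mather` above performs mutual-nearest-neighbor matching: since the descriptors are unit-normalized, `desc1 @ desc2.T` is cosine similarity, and a pair is kept only if each descriptor is the other's best match and the similarity exceeds 0.9. A standalone sketch of the same logic on toy data (the data here is illustrative, not from the commit):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unit-norm descriptors: 5 from the previous frame, 6 from the current one;
# current rows 0-2 are exact copies of previous rows 3, 0, 4.
d1 = rng.normal(size=(5, 8))
d1 /= np.linalg.norm(d1, axis=1, keepdims=True)
d2 = np.vstack([d1[[3, 0, 4]], rng.normal(size=(3, 8))])
d2 /= np.linalg.norm(d2, axis=1, keepdims=True)

sim = d1 @ d2.T                # cosine similarity, since rows are unit norm
sim[sim < 0.9] = 0             # same hard threshold as mnn_mather
nn12 = np.argmax(sim, axis=1)  # best match in d2 for each row of d1
nn21 = np.argmax(sim, axis=0)  # best match in d1 for each row of d2
mutual = np.arange(len(d1)) == nn21[nn12]
matches = np.stack([np.arange(len(d1))[mutual], nn12[mutual]]).T
print(matches)  # the planted pairs: [[0 1] [3 0] [4 2]]
```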
third_party/ALIKE/hseq/cache/alike-l-ms.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1350ab826afdd9b7542a556e2fda9ad9f94388a875c8edb7874e4bcdfebc63ca
+size 13124
third_party/ALIKE/hseq/cache/alike-l.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:999daff1155f3d4736bb7374fb2058f520b0cb4c75b5d7d87fc1e7025a7d2a7d
+size 13124
third_party/ALIKE/hseq/cache/alike-n-ms.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1e5967048eddb61e423bf2ea05a2a626e18d8a716b6a0ad42471059aec0b934c
+size 13124
third_party/ALIKE/hseq/cache/alike-n.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e2eba5ff96b25d0a100b6c7273549de91586e6069dcb5320a20edbb24ea462e
+size 13124
third_party/ALIKE/hseq/cache/aslfeat.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ce06fd1b6265e09ed3b26768b68f624e2d556358ab98addd8ebdb7a5a076abe8
+size 15352
third_party/ALIKE/hseq/cache/d2.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:976d81c6b51a98f89eac60c6d25990130c1df571ef6536280f4b00577eab56f0
+size 15352
third_party/ALIKE/hseq/cache/disk.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:df2d9e0dfd0baa19f2af12f4604368ca65a1643159e7e3438e25efc41ab15357
+size 15352
third_party/ALIKE/hseq/cache/lfnet.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:417327dee726cffccc6dfbc9b0e6b3c06b277ea8878ccf87b87475d1cd6e65ca
+size 15352
third_party/ALIKE/hseq/cache/r2d2.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1375a21adcc932db2c9e210e52f633c1903cca6d37066391eb9d645ff87d0120
+size 15352
third_party/ALIKE/hseq/cache/superpoint.npy
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e4d4a4ca79518af47467e9ddd69fe159c9305a580dadc4fdab6ffde6f8b48c2
+size 15352
third_party/ALIKE/hseq/eval.py
ADDED
@@ -0,0 +1,197 @@
+import cv2
+import os
+from tqdm import tqdm
+import torch
+import numpy as np
+from extract import extract_method
+
+use_cuda = torch.cuda.is_available()
+device = torch.device("cuda" if use_cuda else "cpu")
+
+methods = [
+    "d2",
+    "lfnet",
+    "superpoint",
+    "r2d2",
+    "aslfeat",
+    "disk",
+    "alike-n",
+    "alike-l",
+    "alike-n-ms",
+    "alike-l-ms",
+]
+names = [
+    "D2-Net(MS)",
+    "LF-Net(MS)",
+    "SuperPoint",
+    "R2D2(MS)",
+    "ASLFeat(MS)",
+    "DISK",
+    "ALike-N",
+    "ALike-L",
+    "ALike-N(MS)",
+    "ALike-L(MS)",
+]
+
+top_k = None
+n_i = 52
+n_v = 56
+cache_dir = "hseq/cache"
+dataset_path = "hseq/hpatches-sequences-release"
+
+
+def generate_read_function(method, extension="ppm"):
+    def read_function(seq_name, im_idx):
+        aux = np.load(
+            os.path.join(
+                dataset_path, seq_name, "%d.%s.%s" % (im_idx, extension, method)
+            )
+        )
+        if top_k is None:
+            return aux["keypoints"], aux["descriptors"]
+        else:
+            assert "scores" in aux
+            ids = np.argsort(aux["scores"])[-top_k:]
+            return aux["keypoints"][ids, :], aux["descriptors"][ids, :]
+
+    return read_function
+
+
+def mnn_matcher(descriptors_a, descriptors_b):
+    device = descriptors_a.device
+    sim = descriptors_a @ descriptors_b.t()
+    nn12 = torch.max(sim, dim=1)[1]
+    nn21 = torch.max(sim, dim=0)[1]
+    ids1 = torch.arange(0, sim.shape[0], device=device)
+    mask = ids1 == nn21[nn12]
+    matches = torch.stack([ids1[mask], nn12[mask]])
+    return matches.t().data.cpu().numpy()
+
+
+def homo_trans(coord, H):
+    kpt_num = coord.shape[0]
+    homo_coord = np.concatenate((coord, np.ones((kpt_num, 1))), axis=-1)
+    proj_coord = np.matmul(H, homo_coord.T).T
+    proj_coord = proj_coord / proj_coord[:, 2][..., None]
+    proj_coord = proj_coord[:, 0:2]
+    return proj_coord
+
+
+def benchmark_features(read_feats):
+    lim = [1, 5]
+    rng = np.arange(lim[0], lim[1] + 1)
+
+    seq_names = sorted(os.listdir(dataset_path))
+
+    n_feats = []
+    n_matches = []
+    seq_type = []
+    i_err = {thr: 0 for thr in rng}
+    v_err = {thr: 0 for thr in rng}
+
+    i_err_homo = {thr: 0 for thr in rng}
+    v_err_homo = {thr: 0 for thr in rng}
+
+    for seq_idx, seq_name in tqdm(enumerate(seq_names), total=len(seq_names)):
+        keypoints_a, descriptors_a = read_feats(seq_name, 1)
+        n_feats.append(keypoints_a.shape[0])
+
+        # =========== compute homography
+        ref_img = cv2.imread(os.path.join(dataset_path, seq_name, "1.ppm"))
+        ref_img_shape = ref_img.shape
+
+        for im_idx in range(2, 7):
+            keypoints_b, descriptors_b = read_feats(seq_name, im_idx)
+            n_feats.append(keypoints_b.shape[0])
+
+            matches = mnn_matcher(
+                torch.from_numpy(descriptors_a).to(device=device),
+                torch.from_numpy(descriptors_b).to(device=device),
+            )
+
+            homography = np.loadtxt(
+                os.path.join(dataset_path, seq_name, "H_1_" + str(im_idx))
+            )
+
+            pos_a = keypoints_a[matches[:, 0], :2]
+            pos_a_h = np.concatenate([pos_a, np.ones([matches.shape[0], 1])], axis=1)
+            pos_b_proj_h = np.transpose(np.dot(homography, np.transpose(pos_a_h)))
+            pos_b_proj = pos_b_proj_h[:, :2] / pos_b_proj_h[:, 2:]
+
+            pos_b = keypoints_b[matches[:, 1], :2]
+
+            dist = np.sqrt(np.sum((pos_b - pos_b_proj) ** 2, axis=1))
+
+            n_matches.append(matches.shape[0])
+            seq_type.append(seq_name[0])
+
+            if dist.shape[0] == 0:
+                dist = np.array([float("inf")])
+
+            for thr in rng:
+                if seq_name[0] == "i":
+                    i_err[thr] += np.mean(dist <= thr)
+                else:
+                    v_err[thr] += np.mean(dist <= thr)
+
+            # =========== compute homography
+            gt_homo = homography
+            pred_homo, _ = cv2.findHomography(
+                keypoints_a[matches[:, 0], :2],
+                keypoints_b[matches[:, 1], :2],
+                cv2.RANSAC,
+            )
+            if pred_homo is None:
+                homo_dist = np.array([float("inf")])
+            else:
+                corners = np.array(
+                    [
+                        [0, 0],
+                        [ref_img_shape[1] - 1, 0],
+                        [0, ref_img_shape[0] - 1],
+                        [ref_img_shape[1] - 1, ref_img_shape[0] - 1],
+                    ]
+                )
+                real_warped_corners = homo_trans(corners, gt_homo)
+                warped_corners = homo_trans(corners, pred_homo)
+                homo_dist = np.mean(
+                    np.linalg.norm(real_warped_corners - warped_corners, axis=1)
+                )
+
+            for thr in rng:
+                if seq_name[0] == "i":
+                    i_err_homo[thr] += np.mean(homo_dist <= thr)
+                else:
+                    v_err_homo[thr] += np.mean(homo_dist <= thr)
+
+    seq_type = np.array(seq_type)
+    n_feats = np.array(n_feats)
+    n_matches = np.array(n_matches)
+
+    return i_err, v_err, i_err_homo, v_err_homo, [seq_type, n_feats, n_matches]
+
+
+if __name__ == "__main__":
+    errors = {}
+    for method in methods:
+        output_file = os.path.join(cache_dir, method + ".npy")
+        read_function = generate_read_function(method)
+        if os.path.exists(output_file):
+            errors[method] = np.load(output_file, allow_pickle=True)
+        else:
+            extract_method(method)
+            errors[method] = benchmark_features(read_function)
+            np.save(output_file, errors[method])
+
+    for name, method in zip(names, methods):
+        i_err, v_err, i_err_hom, v_err_hom, _ = errors[method]
+
+        print(f"====={name}=====")
+        print(f"MMA@1 MMA@2 MMA@3 MHA@1 MHA@2 MHA@3: ", end="")
+        for thr in range(1, 4):
+            err = (i_err[thr] + v_err[thr]) / ((n_i + n_v) * 5)
+            print(f"{err * 100:.2f}%", end=" ")
+        for thr in range(1, 4):
+            err_hom = (i_err_hom[thr] + v_err_hom[thr]) / ((n_i + n_v) * 5)
+            print(f"{err_hom * 100:.2f}%", end=" ")
+        print("")
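The final loop prints mean matching accuracy (MMA) and mean homography accuracy (MHA). With 5 image pairs per sequence and `n_i + n_v` sequences, the MMA computed by `benchmark_features` can be written as:

```latex
% M_k: mutual-nearest-neighbor matches between images 1 and k of a sequence;
% H_{1k}: ground-truth homography from image 1 to image k; t: pixel threshold.
\mathrm{MMA}@t = \frac{1}{5\,(n_i + n_v)} \sum_{\mathrm{seq}} \sum_{k=2}^{6}
  \frac{1}{\lvert M_k \rvert} \sum_{(a,b)\in M_k}
  \mathbf{1}\!\left[\, \lVert p_b - H_{1k}\, p_a \rVert_2 \le t \,\right]
```

MHA@t replaces the per-match test with a per-pair test on the mean corner error between the RANSAC-estimated homography and the ground truth.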
third_party/ALIKE/hseq/extract.py
ADDED
@@ -0,0 +1,175 @@
+import os
+import sys
+import cv2
+from pathlib import Path
+import numpy as np
+import torch
+import torch.utils.data as data
+from tqdm import tqdm
+from copy import deepcopy
+from torchvision.transforms import ToTensor
+
+sys.path.append(os.path.join(os.path.dirname(__file__), ".."))
+from alike import ALike, configs
+
+dataset_root = "hseq/hpatches-sequences-release"
+use_cuda = torch.cuda.is_available()
+device = "cuda" if use_cuda else "cpu"
+methods = ["alike-n", "alike-l", "alike-n-ms", "alike-l-ms"]
+
+
+class HPatchesDataset(data.Dataset):
+    def __init__(self, root: str = dataset_root, alteration: str = "all"):
+        """
+        Args:
+            root: dataset root path
+            alteration: 'all', 'i' for illumination, or 'v' for viewpoint
+        """
+        assert Path(root).exists(), f"Dataset root path {root} does not exist!"
+        self.root = root
+
+        # get all image file names
+        self.image0_list = []
+        self.image1_list = []
+        self.homographies = []
+        folders = [x for x in Path(self.root).iterdir() if x.is_dir()]
+        self.seqs = []
+        for folder in folders:
+            if alteration == "i" and folder.stem[0] != "i":
+                continue
+            if alteration == "v" and folder.stem[0] != "v":
+                continue
+
+            self.seqs.append(folder)
+
+        self.len = len(self.seqs)
+        assert self.len > 0, f"Cannot find PatchDataset in path {self.root}"
+
+    def __getitem__(self, item):
+        folder = self.seqs[item]
+
+        imgs = []
+        homos = []
+        for i in range(1, 7):
+            img = cv2.imread(str(folder / f"{i}.ppm"), cv2.IMREAD_COLOR)
+            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # HxWxC
+            imgs.append(img)
+
+            if i != 1:
+                homo = np.loadtxt(str(folder / f"H_1_{i}")).astype("float32")
+                homos.append(homo)
+
+        return imgs, homos, folder.stem
+
+    def __len__(self):
+        return self.len
+
+    def name(self):
+        return self.__class__
+
+
+def extract_multiscale(
+    model,
+    img,
+    scale_f=2**0.5,
+    min_scale=1.0,
+    max_scale=1.0,
+    min_size=0.0,
+    max_size=99999.0,
+    image_size_max=99999,
+    n_k=0,
+    sort=False,
+):
+    H_, W_, three = img.shape
+    assert three == 3, "input image shape should be [HxWx3]"
+
+    old_bm = torch.backends.cudnn.benchmark
+    torch.backends.cudnn.benchmark = False  # speedup
+
+    # ==================== image size constraint
+    image = deepcopy(img)
+    max_hw = max(H_, W_)
+    if max_hw > image_size_max:
+        ratio = float(image_size_max / max_hw)
+        image = cv2.resize(image, dsize=None, fx=ratio, fy=ratio)
+
+    # ==================== convert image to tensor
+    H, W, three = image.shape
+    image = ToTensor()(image).unsqueeze(0)
+    image = image.to(device)
+
+    s = 1.0  # current scale factor
+    keypoints, descriptors, scores, scores_maps, descriptor_maps = [], [], [], [], []
+    while s + 0.001 >= max(min_scale, min_size / max(H, W)):
+        if s - 0.001 <= min(max_scale, max_size / max(H, W)):
+            nh, nw = image.shape[2:]
+
+            # extract descriptors
+            with torch.no_grad():
+                descriptor_map, scores_map = model.extract_dense_map(image)
+                keypoints_, descriptors_, scores_, _ = model.dkd(
+                    scores_map, descriptor_map
+                )
+
+            keypoints.append(keypoints_[0])
+            descriptors.append(descriptors_[0])
+            scores.append(scores_[0])
+
+        s /= scale_f
+
+        # down-scale the image for the next iteration
+        nh, nw = round(H * s), round(W * s)
+        image = torch.nn.functional.interpolate(
+            image, (nh, nw), mode="bilinear", align_corners=False
+        )
+
+    # restore value
+    torch.backends.cudnn.benchmark = old_bm
+
+    keypoints = torch.cat(keypoints)
+    descriptors = torch.cat(descriptors)
+    scores = torch.cat(scores)
+    keypoints = (keypoints + 1) / 2 * keypoints.new_tensor([[W_ - 1, H_ - 1]])
+
+    if sort or 0 < n_k < len(keypoints):
+        indices = torch.argsort(scores, descending=True)
+        keypoints = keypoints[indices]
+        descriptors = descriptors[indices]
+        scores = scores[indices]
+
+    if 0 < n_k < len(keypoints):
+        keypoints = keypoints[0:n_k]
+        descriptors = descriptors[0:n_k]
+        scores = scores[0:n_k]
+
+    return {"keypoints": keypoints, "descriptors": descriptors, "scores": scores}
+
+
+def extract_method(m):
+    hpatches = HPatchesDataset(root=dataset_root, alteration="all")
+    model = m[:7]
+    min_scale = 0.3 if m[8:] == "ms" else 1.0
+
+    model = ALike(**configs[model], device=device, top_k=0, scores_th=0.2, n_limit=5000)
+
+    progbar = tqdm(hpatches, desc="Extracting for {}".format(m))
+    for imgs, homos, seq_name in progbar:
+        for i in range(1, 7):
+            img = imgs[i - 1]
+            pred = extract_multiscale(
+                model, img, min_scale=min_scale, max_scale=1, sort=False, n_k=5000
+            )
+            kpts, descs, scores = pred["keypoints"], pred["descriptors"], pred["scores"]
+
+            with open(os.path.join(dataset_root, seq_name, f"{i}.ppm.{m}"), "wb") as f:
+                np.savez(
+                    f,
+                    keypoints=kpts.cpu().numpy(),
+                    scores=scores.cpu().numpy(),
+                    descriptors=descs.cpu().numpy(),
+                )
+
+
+if __name__ == "__main__":
+    for method in methods:
+        extract_method(method)
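The multi-scale loop in `extract_multiscale` starts at `s = 1.0` and divides by `scale_f = sqrt(2)` until `s` drops below `min_scale` (0.3 for the `-ms` variants, with the default `min_size = 0` and a large `max_size`). A small sketch (not part of the commit) that mirrors the loop condition and prints the scales visited:

```python
scale_f, min_scale, max_scale = 2 ** 0.5, 0.3, 1.0

s, scales = 1.0, []
while s + 0.001 >= min_scale:      # same tolerance as extract_multiscale
    if s - 0.001 <= max_scale:
        scales.append(round(s, 4))
    s /= scale_f

print(scales)  # [1.0, 0.7071, 0.5, 0.3536] -> four pyramid levels for "-ms"
```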
third_party/ALIKE/matlab/createfigure.m
ADDED
@@ -0,0 +1,75 @@
+function createfigure(X1, YMatrix1, Y1, l1, l2, l3)
+%CREATEFIGURE(X1, YMatrix1, Y1)
+%  X1: vector of x data
+%  YMATRIX1: matrix of y data
+%  Y1: vector of y data
+
+%  Auto-generated by MATLAB on 29-Oct-2021 15:42:14
+
+% Create figure
+figure1 = figure;
+
+% Create axes
+axes1 = axes('Parent',figure1);
+hold(axes1,'on');
+
+% Create multiple lines using matrix input to plot
+plot1 = plot(X1,YMatrix1,'Parent',axes1,'LineWidth',1);
+set(plot1(1),'LineStyle','-.','Color',[1 0 0]);
+set(plot1(2),'Color',[0 1 0]);
+set(plot1(3),'LineStyle','--',...
+    'Color',[0.87058824300766 0.490196079015732 0]);
+
+% Uncomment the following line to preserve the X-limits of the axes
+% xlim(axes1,[-1.1 1.1]);
+% Uncomment the following line to preserve the Y-limits of the axes
+ylim(axes1,[0 2.2]);
+box(axes1,'on');
+hold(axes1,'off');
+% Set the remaining axes properties
+set(axes1,'XColor',[0 0 0],'YColor',[0 0 0],'YTick',[0 0.5 1 1.5 2 2.5]);
+% Create axes
+axes2 = axes('Parent',figure1);
+hold(axes2,'on');
+colororder([0.494 0.184 0.556;0.466 0.674 0.188;0.301 0.745 0.933;0.635 0.078 0.184;0 0.447 0.741;0.85 0.325 0.098;0.929 0.694 0.125]);
+
+% Create plot
+plot(X1,Y1,'Parent',axes2,'LineWidth',1,'LineStyle',':','Color',[0 0 1]);
+
+% Uncomment the following line to preserve the X-limits of the axes
+% xlim(axes2,[-1.1 1.1]);
+% Uncomment the following line to preserve the Y-limits of the axes
+ylim(axes2,[0 1.6]);
+hold(axes2,'off');
+% Set the remaining axes properties
+set(axes2,'Color','none','HitTest','off','XColor',[0 0 0],'YAxisLocation',...
+    'right','YColor',[0 0 0],'YTick',[0 0.5 1 1.5]);
+% Create textbox
+annotation(figure1,'textbox',...
+    [0.255427607968038,0.605539475745798,0.304947448327989,0.235148519909872],...
+    'Color',[0.8 0 0],...
+    'String',{sprintf('peak loss=%.4f',l1)},...
+    'EdgeColor','none');
+
+% Create textbox
+annotation(figure1,'textbox',...
+    [0.631790371410027,0.083530640355914,0.178879315581032,0.235148519909871],...
+    'Color',[0 0 1],...
+    'String',{'keypoint'},...
+    'EdgeColor','none');
+
+% Create textbox
+annotation(figure1,'textbox',...
+    [0.59663112557549,0.640686239621974,0.318247136419826,0.22093023731067],...
+    'Color',[0 0.498039215803146 0],...
+    'String',{sprintf('peak loss=%.4f',l2)},...
+    'EdgeColor','none');
+
+% Create textbox
+annotation(figure1,'textbox',...
+    [0.595423071596731,0.415858983920567,0.318247136419826,0.235148519909871],...
+    'Color',[0.87058824300766 0.490196079015732 0],...
+    'String',{sprintf('peak loss=%.4f',l3)},...
+    'FitBoxToText','off',...
+    'EdgeColor','none');
+
third_party/ALIKE/matlab/peakloss_rect.m
ADDED
@@ -0,0 +1,19 @@
+clear;
+close all;
+
+x = -1:0.01:1;
+
+p0 = 0.5;
+p1 = -0.5;
+
+d = abs(x - p0);
+
+c0 = 2 .* (x>=-0.75 & x <= -0.25);
+c1 = 2 .* (x>=0.25 & x <= 0.75);
+c2 = 1.25 .* (x>=0.1 & x <= 0.9);
+
+peak_loss0 = sum(d.*c0) / length(x)
+peak_loss1 = sum(d.*c1) / length(x)
+peak_loss2 = sum(d.*c2) / length(x)
+
+createfigure(x, [c0;c1;c2], d, peak_loss0,peak_loss1, peak_loss2);
|
third_party/ALIKE/requirements.txt
ADDED
@@ -0,0 +1,6 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
opencv-python~=4.5.1.48
|
2 |
+
numpy~=1.19.5
|
3 |
+
tqdm~=4.60.0
|
4 |
+
torch~=1.8.0
|
5 |
+
torchvision~=0.9.0
|
6 |
+
thop~=0.0.31-2005241907
|
third_party/ALIKE/soft_detect.py
ADDED
@@ -0,0 +1,234 @@
+import torch
+from torch import nn
+import torch.nn.functional as F
+
+
+# coordinates system
+#  ------------------------------>  [ x: range=-1.0~1.0; w: range=0~W ]
+#  | -----------------------------
+#  | |                           |
+#  | |                           |
+#  | |                           |
+#  | |          image            |
+#  | |                           |
+#  | |                           |
+#  | |                           |
+#  | |---------------------------|
+#  v
+# [ y: range=-1.0~1.0; h: range=0~H ]
+
+
+def simple_nms(scores, nms_radius: int):
+    """Fast Non-maximum suppression to remove nearby points"""
+    assert nms_radius >= 0
+
+    def max_pool(x):
+        return torch.nn.functional.max_pool2d(
+            x, kernel_size=nms_radius * 2 + 1, stride=1, padding=nms_radius
+        )
+
+    zeros = torch.zeros_like(scores)
+    max_mask = scores == max_pool(scores)
+
+    for _ in range(2):
+        supp_mask = max_pool(max_mask.float()) > 0
+        supp_scores = torch.where(supp_mask, zeros, scores)
+        new_max_mask = supp_scores == max_pool(supp_scores)
+        max_mask = max_mask | (new_max_mask & (~supp_mask))
+    return torch.where(max_mask, scores, zeros)
+
+
+def sample_descriptor(descriptor_map, kpts, bilinear_interp=False):
+    """
+    :param descriptor_map: BxCxHxW
+    :param kpts: list, len=B, each is Nx2 (keypoints) [h,w]
+    :param bilinear_interp: bool, whether to use bilinear interpolation
+    :return: descriptors: list, len=B, each is NxD
+    """
+    batch_size, channel, height, width = descriptor_map.shape
+
+    descriptors = []
+    for index in range(batch_size):
+        kptsi = kpts[index]  # Nx2, (x,y)
+
+        if bilinear_interp:
+            descriptors_ = torch.nn.functional.grid_sample(
+                descriptor_map[index].unsqueeze(0),
+                kptsi.view(1, 1, -1, 2),
+                mode="bilinear",
+                align_corners=True,
+            )[
+                0, :, 0, :
+            ]  # CxN
+        else:
+            kptsi = (kptsi + 1) / 2 * kptsi.new_tensor([[width - 1, height - 1]])
+            kptsi = kptsi.long()
+            descriptors_ = descriptor_map[index, :, kptsi[:, 1], kptsi[:, 0]]  # CxN
+
+        descriptors_ = torch.nn.functional.normalize(descriptors_, p=2, dim=0)
+        descriptors.append(descriptors_.t())
+
+    return descriptors
+
+
+class DKD(nn.Module):
+    def __init__(self, radius=2, top_k=0, scores_th=0.2, n_limit=20000):
+        """
+        Args:
+            radius: soft detection radius, kernel size is (2 * radius + 1)
+            top_k: top_k > 0: return top k keypoints
+            scores_th: top_k <= 0 threshold mode: scores_th > 0: return keypoints with scores>scores_th
+                       else: return keypoints with scores > scores.mean()
+            n_limit: max number of keypoints in threshold mode
+        """
+        super().__init__()
+        self.radius = radius
+        self.top_k = top_k
+        self.scores_th = scores_th
+        self.n_limit = n_limit
+        self.kernel_size = 2 * self.radius + 1
+        self.temperature = 0.1  # tuned temperature
+        self.unfold = nn.Unfold(kernel_size=self.kernel_size, padding=self.radius)
+
+        # local xy grid
+        x = torch.linspace(-self.radius, self.radius, self.kernel_size)
+        # (kernel_size*kernel_size) x 2 : (w,h)
+        self.hw_grid = torch.stack(torch.meshgrid([x, x])).view(2, -1).t()[:, [1, 0]]
+
+    def detect_keypoints(self, scores_map, sub_pixel=True):
+        b, c, h, w = scores_map.shape
+        scores_nograd = scores_map.detach()
+        # nms_scores = simple_nms(scores_nograd, self.radius)
+        nms_scores = simple_nms(scores_nograd, 2)
+
+        # remove border
+        nms_scores[:, :, : self.radius + 1, :] = 0
+        nms_scores[:, :, :, : self.radius + 1] = 0
+        nms_scores[:, :, h - self.radius :, :] = 0
+        nms_scores[:, :, :, w - self.radius :] = 0
+
+        # detect keypoints without grad
+        if self.top_k > 0:
+            topk = torch.topk(nms_scores.view(b, -1), self.top_k)
+            indices_keypoints = topk.indices  # B x top_k
+        else:
+            if self.scores_th > 0:
+                masks = nms_scores > self.scores_th
+                if masks.sum() == 0:
+                    th = scores_nograd.reshape(b, -1).mean(dim=1)  # th = self.scores_th
+                    masks = nms_scores > th.reshape(b, 1, 1, 1)
+            else:
+                th = scores_nograd.reshape(b, -1).mean(dim=1)  # th = self.scores_th
+                masks = nms_scores > th.reshape(b, 1, 1, 1)
+            masks = masks.reshape(b, -1)
+
+            indices_keypoints = []  # list, B x (any size)
+            scores_view = scores_nograd.reshape(b, -1)
+            for mask, scores in zip(masks, scores_view):
+                indices = mask.nonzero(as_tuple=False)[:, 0]
+                if len(indices) > self.n_limit:
+                    kpts_sc = scores[indices]
+                    sort_idx = kpts_sc.sort(descending=True)[1]
+                    sel_idx = sort_idx[: self.n_limit]
+                    indices = indices[sel_idx]
+                indices_keypoints.append(indices)
+
+        keypoints = []
+        scoredispersitys = []
+        kptscores = []
+        if sub_pixel:
+            # detect soft keypoints with grad backpropagation
+            patches = self.unfold(scores_map)  # B x (kernel**2) x (H*W)
+            self.hw_grid = self.hw_grid.to(patches)  # to device
+            for b_idx in range(b):
+                patch = patches[b_idx].t()  # (H*W) x (kernel**2)
+                indices_kpt = indices_keypoints[
+                    b_idx
+                ]  # one dimension vector, say its size is M
+                patch_scores = patch[indices_kpt]  # M x (kernel**2)
+
+                # max is detached to prevent undesired backprop loops in the graph
+                max_v = patch_scores.max(dim=1).values.detach()[:, None]
+                x_exp = (
+                    (patch_scores - max_v) / self.temperature
+                ).exp()  # M * (kernel**2), in [0, 1]
+
+                # \frac{ \sum{(i,j) \times \exp(x/T)} }{ \sum{\exp(x/T)} }
+                xy_residual = (
+                    x_exp @ self.hw_grid / x_exp.sum(dim=1)[:, None]
+                )  # Soft-argmax, Mx2
+
+                hw_grid_dist2 = (
+                    torch.norm(
+                        (self.hw_grid[None, :, :] - xy_residual[:, None, :])
+                        / self.radius,
+                        dim=-1,
+                    )
+                    ** 2
+                )
+                scoredispersity = (x_exp * hw_grid_dist2).sum(dim=1) / x_exp.sum(dim=1)
+
+                # compute result keypoints
+                keypoints_xy_nms = torch.stack(
+                    [indices_kpt % w, indices_kpt // w], dim=1
+                )  # Mx2
+                keypoints_xy = keypoints_xy_nms + xy_residual
+                keypoints_xy = (
+                    keypoints_xy / keypoints_xy.new_tensor([w - 1, h - 1]) * 2 - 1
+                )  # (w,h) -> (-1~1,-1~1)
+
+                kptscore = torch.nn.functional.grid_sample(
+                    scores_map[b_idx].unsqueeze(0),
+                    keypoints_xy.view(1, 1, -1, 2),
+                    mode="bilinear",
+                    align_corners=True,
+                )[
+                    0, 0, 0, :
+                ]  # N
+
+                keypoints.append(keypoints_xy)
+                scoredispersitys.append(scoredispersity)
+                kptscores.append(kptscore)
+        else:
+            for b_idx in range(b):
+                indices_kpt = indices_keypoints[
+                    b_idx
+                ]  # one dimension vector, say its size is M
+                keypoints_xy_nms = torch.stack(
+                    [indices_kpt % w, indices_kpt // w], dim=1
+                )  # Mx2
+                keypoints_xy = (
+                    keypoints_xy_nms / keypoints_xy_nms.new_tensor([w - 1, h - 1]) * 2
+                    - 1
+                )  # (w,h) -> (-1~1,-1~1)
+                kptscore = torch.nn.functional.grid_sample(
+                    scores_map[b_idx].unsqueeze(0),
+                    keypoints_xy.view(1, 1, -1, 2),
+                    mode="bilinear",
+                    align_corners=True,
+                )[
+                    0, 0, 0, :
+                ]  # N
+                keypoints.append(keypoints_xy)
+                scoredispersitys.append(None)
+                kptscores.append(kptscore)
+
+        return keypoints, scoredispersitys, kptscores
+
+    def forward(self, scores_map, descriptor_map, sub_pixel=False):
+        """
+        :param scores_map: Bx1xHxW
+        :param descriptor_map: BxCxHxW
+        :param sub_pixel: whether to use sub-pixel keypoint detection
+        :return: kpts: list[Nx2,...]; kptscores: list[N,...] normalised position: -1.0 ~ 1.0
+        """
+        keypoints, scoredispersitys, kptscores = self.detect_keypoints(
+            scores_map, sub_pixel
+        )
+
+        descriptors = sample_descriptor(descriptor_map, keypoints, sub_pixel)
+
+        # keypoints: B M 2
+        # descriptors: B M D
+        # scoredispersitys:
+        return keypoints, descriptors, kptscores, scoredispersitys
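
As a quick sanity check of the DKD interface above, the following hedged sketch runs the detector on random tensors; the shapes and the top_k value are illustrative assumptions (in the real pipeline the score and descriptor maps come from the ALIKE network), and the import assumes the working directory is third_party/ALIKE:

```python
import torch
from soft_detect import DKD

# illustrative shapes: batch of 1, 64-D descriptors, 240x320 maps
scores_map = torch.rand(1, 1, 240, 320)       # Bx1xHxW keypoint scores
descriptor_map = torch.rand(1, 64, 240, 320)  # BxCxHxW dense descriptors

detector = DKD(radius=2, top_k=500)  # keep the 500 strongest keypoints
keypoints, descriptors, kptscores, dispersitys = detector(
    scores_map, descriptor_map, sub_pixel=True
)
# keypoints[0]: 500x2 positions in normalized (-1, 1) coordinates
# descriptors[0]: 500x64 L2-normalized descriptors
print(keypoints[0].shape, descriptors[0].shape)
```
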
third_party/ASpanFormer/.github/workflows/sync.yml
ADDED
@@ -0,0 +1,39 @@
+name: Upstream Sync
+
+permissions:
+  contents: write
+
+on:
+  schedule:
+    - cron: "0 0 * * *" # every day
+  workflow_dispatch:
+
+jobs:
+  sync_latest_from_upstream:
+    name: Sync latest commits from upstream repo
+    runs-on: ubuntu-latest
+    if: ${{ github.event.repository.fork }}
+
+    steps:
+      # Step 1: run a standard checkout action
+      - name: Checkout target repo
+        uses: actions/checkout@v3
+
+      # Step 2: run the sync action
+      - name: Sync upstream changes
+        id: sync
+        uses: aormsby/Fork-Sync-With-Upstream-action@v3.4
+        with:
+          upstream_sync_repo: apple/ml-aspanformer
+          upstream_sync_branch: main
+          target_sync_branch: main
+          target_repo_token: ${{ secrets.GITHUB_TOKEN }} # automatically generated, no need to set
+
+          # Set test_mode true to run tests instead of the true action!!
+          test_mode: false
+
+      - name: Sync check
+        if: failure()
+        run: |
+          echo "::error::Due to insufficient permissions, synchronization failed (as expected). Please go to the repository homepage and manually perform [Sync fork]."
+          exit 1
third_party/ASpanFormer/.gitignore
ADDED
@@ -0,0 +1,32 @@
+.vscode/
+__pycache__/
+*.pyc
+*.DS_Store
+*.swp
+*.pth
+tmp.*
+*/.ipynb_checkpoints/*
+
+logs/
+# weights/
+dump/
+demo/*.mp4
+demo/demo_images/
+src/loftr/utils/superglue.py
+demo/utils.py
+
+demo/*.jpg
+demo/*.png
+
+notebooks/QccDayNight.ipynb
+notebooks/westlake.ipynb
+assets/westlake
+assets/qcc_pairs.txt
+configs/.petrel*
+tools/draw_QccDayNights.py
+
+scripts/slurm/
+scripts/sbatch_submit.sh
+src/utils/client.py
+
+scannet_indices/
third_party/ASpanFormer/CODE_OF_CONDUCT.md
ADDED
@@ -0,0 +1,71 @@
+# Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as
+contributors and maintainers pledge to making participation in our project and
+our community a harassment-free experience for everyone, regardless of age, body
+size, disability, ethnicity, sex characteristics, gender identity and expression,
+level of experience, education, socio-economic status, nationality, personal
+appearance, race, religion, or sexual identity and orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment
+include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or
+  advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic
+  address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable
+behavior and are expected to take appropriate and fair corrective action in
+response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or
+reject comments, commits, code, wiki edits, issues, and other contributions
+that are not aligned to this Code of Conduct, or to ban temporarily or
+permanently any contributor for other behaviors that they deem inappropriate,
+threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies within all project spaces, and it also applies when
+an individual is representing the project or its community in public spaces.
+Examples of representing a project or community include using an official
+project e-mail address, posting via an official social media account, or acting
+as an appointed representative at an online or offline event. Representation of
+a project may be further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported by contacting the open source team at [opensource-conduct@group.apple.com](mailto:opensource-conduct@group.apple.com). All
+complaints will be reviewed and investigated and will result in a response that
+is deemed necessary and appropriate to the circumstances. The project team is
+obligated to maintain confidentiality with regard to the reporter of an incident.
+Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good
+faith may face temporary or permanent repercussions as determined by other
+members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 1.4,
+available at [https://www.contributor-covenant.org/version/1/4/code-of-conduct.html](https://www.contributor-covenant.org/version/1/4/code-of-conduct.html)
third_party/ASpanFormer/CONTRIBUTING.md
ADDED
@@ -0,0 +1,7 @@
+# Contribution Guide
+
+Thanks for your interest in contributing. This project was released to accompany a research paper for purposes of reproducibility, and beyond its publication there are limited plans for future development of the repository.
+
+## Before you get started
+
+We ask that all community members read and observe our [Code of Conduct](CODE_OF_CONDUCT.md).
third_party/ASpanFormer/LICENSE
ADDED
@@ -0,0 +1,9 @@
+Copyright (C) 2021, 2022 Apple Inc. All Rights Reserved.
+
+IMPORTANT: This Apple software is supplied to you by Apple Inc. ("Apple") in consideration of your agreement to the following terms, and your use, installation, modification or redistribution of this Apple software constitutes acceptance of these terms. If you do not agree with these terms, please do not use, install, modify or redistribute this Apple software.
+
+In consideration of your agreement to abide by the following terms, and subject to these terms, Apple grants you a personal, non-commercial, non-exclusive license, under Apple's copyrights in this original Apple software (the "Apple Software"), to use, reproduce, modify and redistribute the Apple Software, with or without modifications, in source and/or binary forms for non-commercial purposes only; provided that if you redistribute the Apple Software in its entirety and without modifications, you must retain this notice and the following text and disclaimers in all such redistributions of the Apple Software. Neither the name, trademarks, service marks or logos of Apple Inc. may be used to endorse or promote products derived from the Apple Software without specific prior written permission from Apple. Except as expressly stated in this notice, no other rights or licenses, express or implied, are granted by Apple herein, including but not limited to any patent rights that may be infringed by your derivative works or by other works in which the Apple Software may be incorporated.
+
+The Apple Software is provided by Apple on an "AS IS" basis. APPLE MAKES NO WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, REGARDING THE APPLE SOFTWARE OR ITS USE AND OPERATION ALONE OR IN COMBINATION WITH YOUR PRODUCTS.
+
+IN NO EVENT SHALL APPLE BE LIABLE FOR ANY SPECIAL, INDIRECT, INCIDENTAL OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) ARISING IN ANY WAY OUT OF THE USE, REPRODUCTION, MODIFICATION AND/OR DISTRIBUTION OF THE APPLE SOFTWARE, HOWEVER CAUSED AND WHETHER UNDER THEORY OF CONTRACT, TORT (INCLUDING NEGLIGENCE), STRICT LIABILITY OR OTHERWISE, EVEN IF APPLE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
third_party/ASpanFormer/README.md
ADDED
@@ -0,0 +1,98 @@
+# Submodule used in [hloc](https://github.com/Vincentqyw/Hierarchical-Localization) toolbox
+
+# ASpanFormer Implementation
+
+![Framework](assets/teaser.png)
+
+This is a PyTorch implementation of ASpanFormer for the ECCV'22 [paper](https://arxiv.org/abs/2208.14201), “ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer”, and can be used to reproduce the results in the paper.
+
+This work focuses on detector-free image matching. We propose a hierarchical attention framework for cross-view feature update, which adaptively adjusts attention span based on region-wise matchability.
+
+This repo contains training, evaluation and basic demo scripts used in our paper.
+
+A large part of the code base is borrowed from the [LoFTR repository](https://github.com/zju3dv/LoFTR) under its own separate license, terms and conditions. The authors of this software are not responsible for the contents of third-party websites.
+
+## Installation
+```bash
+conda env create -f environment.yaml
+conda activate ASpanFormer
+```
+
+## Get started
+Download the model weights from [here](https://drive.google.com/file/d/1eavM9dTkw9nbc-JqlVVfGPU5UvTTfc6k/view?usp=share_link).
+
+Extract the weights with
+```bash
+tar -xvf weights_aspanformer.tar
+```
+
+A demo to match one image pair is provided. For a quick start,
+
+```bash
+cd demo
+python demo.py
+```
+
+
+## Data Preparation
+Please follow the [training doc](docs/TRAINING.md) for data organization.
+
+
+
+## Evaluation
+
+
+### 1. ScanNet Evaluation
+```bash
+cd scripts/reproduce_test
+bash indoor.sh
+```
+Results similar to the following should be obtained:
+```bash
+'auc@10': 0.46640095171012563,
+'auc@20': 0.6407042320049785,
+'auc@5': 0.26241231577189295,
+'prec@5e-04': 0.8827665604024288,
+'prec_flow@2e-03': 0.810938751342228
+```
+
+### 2. MegaDepth Evaluation
+```bash
+cd scripts/reproduce_test
+bash outdoor.sh
+```
+Results similar to the following should be obtained:
+```bash
+'auc@10': 0.7184113573584142,
+'auc@20': 0.8333835724453831,
+'auc@5': 0.5567622479156181,
+'prec@5e-04': 0.9901741341790503,
+'prec_flow@2e-03': 0.7188964321862907
+```
+
+
+## Training
+
+### 1. ScanNet Training
+```bash
+cd scripts/reproduce_train
+bash indoor.sh
+```
+
+### 2. MegaDepth Training
+```bash
+cd scripts/reproduce_train
+bash outdoor.sh
+```
+
+
+If you find this project useful, please cite:
+
+```
+@article{chen2022aspanformer,
+  title={ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer},
+  author={Chen, Hongkai and Luo, Zixin and Zhou, Lei and Tian, Yurun and Zhen, Mingmin and Fang, Tian and McKinnon, David and Tsin, Yanghai and Quan, Long},
+  journal={European Conference on Computer Vision (ECCV)},
+  year={2022}
+}
+```
third_party/ASpanFormer/configs/aspan/indoor/aspan_test.py
ADDED
@@ -0,0 +1,11 @@
+import sys
+from pathlib import Path
+
+sys.path.append(str(Path(__file__).parent / "../../../"))
+from src.config.default import _CN as cfg
+
+cfg.ASPAN.MATCH_COARSE.MATCH_TYPE = "dual_softmax"
+
+cfg.ASPAN.MATCH_COARSE.BORDER_RM = 0
+cfg.ASPAN.COARSE.COARSEST_LEVEL = [15, 20]
+cfg.ASPAN.COARSE.TRAIN_RES = [480, 640]
third_party/ASpanFormer/configs/aspan/indoor/aspan_train.py
ADDED
@@ -0,0 +1,12 @@
+import sys
+from pathlib import Path
+
+sys.path.append(str(Path(__file__).parent / "../../../"))
+from src.config.default import _CN as cfg
+
+cfg.ASPAN.COARSE.COARSEST_LEVEL = [15, 20]
+cfg.ASPAN.MATCH_COARSE.MATCH_TYPE = "dual_softmax"
+
+cfg.ASPAN.MATCH_COARSE.SPARSE_SPVS = False
+cfg.ASPAN.MATCH_COARSE.BORDER_RM = 0
+cfg.TRAINER.MSLR_MILESTONES = [3, 6, 9, 12, 17, 20, 23, 26, 29]
third_party/ASpanFormer/configs/aspan/outdoor/aspan_test.py
ADDED
@@ -0,0 +1,22 @@
+import sys
+from pathlib import Path
+
+sys.path.append(str(Path(__file__).parent / "../../../"))
+from src.config.default import _CN as cfg
+
+cfg.ASPAN.COARSE.COARSEST_LEVEL = [36, 36]
+cfg.ASPAN.COARSE.TRAIN_RES = [832, 832]
+cfg.ASPAN.COARSE.TEST_RES = [1152, 1152]
+cfg.ASPAN.MATCH_COARSE.MATCH_TYPE = "dual_softmax"
+
+cfg.TRAINER.CANONICAL_LR = 8e-3
+cfg.TRAINER.WARMUP_STEP = 1875  # 3 epochs
+cfg.TRAINER.WARMUP_RATIO = 0.1
+cfg.TRAINER.MSLR_MILESTONES = [8, 12, 16, 20, 24]
+
+# pose estimation
+cfg.TRAINER.RANSAC_PIXEL_THR = 0.5
+
+cfg.TRAINER.OPTIMIZER = "adamw"
+cfg.TRAINER.ADAMW_DECAY = 0.1
+cfg.ASPAN.MATCH_COARSE.TRAIN_COARSE_PERCENT = 0.3
third_party/ASpanFormer/configs/aspan/outdoor/aspan_train.py
ADDED
@@ -0,0 +1,21 @@
+import sys
+from pathlib import Path
+
+sys.path.append(str(Path(__file__).parent / "../../../"))
+from src.config.default import _CN as cfg
+
+cfg.ASPAN.COARSE.COARSEST_LEVEL = [26, 26]
+cfg.ASPAN.MATCH_COARSE.MATCH_TYPE = "dual_softmax"
+cfg.ASPAN.MATCH_COARSE.SPARSE_SPVS = False
+
+cfg.TRAINER.CANONICAL_LR = 8e-3
+cfg.TRAINER.WARMUP_STEP = 1875  # 3 epochs
+cfg.TRAINER.WARMUP_RATIO = 0.1
+cfg.TRAINER.MSLR_MILESTONES = [8, 12, 16, 20, 24]
+
+# pose estimation
+cfg.TRAINER.RANSAC_PIXEL_THR = 0.5
+
+cfg.TRAINER.OPTIMIZER = "adamw"
+cfg.TRAINER.ADAMW_DECAY = 0.1
+cfg.ASPAN.MATCH_COARSE.TRAIN_COARSE_PERCENT = 0.3
third_party/ASpanFormer/configs/data/__init__.py
ADDED
File without changes
third_party/ASpanFormer/configs/data/base.py
ADDED
@@ -0,0 +1,36 @@
+"""
+The data config will be the last one merged into the main config.
+Setups in data configs will override all existing setups!
+"""
+
+from yacs.config import CfgNode as CN
+
+_CN = CN()
+_CN.DATASET = CN()
+_CN.TRAINER = CN()
+
+# training data config
+_CN.DATASET.TRAIN_DATA_ROOT = None
+_CN.DATASET.TRAIN_POSE_ROOT = None
+_CN.DATASET.TRAIN_NPZ_ROOT = None
+_CN.DATASET.TRAIN_LIST_PATH = None
+_CN.DATASET.TRAIN_INTRINSIC_PATH = None
+# validation set config
+_CN.DATASET.VAL_DATA_ROOT = None
+_CN.DATASET.VAL_POSE_ROOT = None
+_CN.DATASET.VAL_NPZ_ROOT = None
+_CN.DATASET.VAL_LIST_PATH = None
+_CN.DATASET.VAL_INTRINSIC_PATH = None
+
+# testing data config
+_CN.DATASET.TEST_DATA_ROOT = None
+_CN.DATASET.TEST_POSE_ROOT = None
+_CN.DATASET.TEST_NPZ_ROOT = None
+_CN.DATASET.TEST_LIST_PATH = None
+_CN.DATASET.TEST_INTRINSIC_PATH = None
+
+# dataset config
+_CN.DATASET.MIN_OVERLAP_SCORE_TRAIN = 0.4
+_CN.DATASET.MIN_OVERLAP_SCORE_TEST = 0.0  # for both test and val
+
+cfg = _CN
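
The docstring in base.py above says the data config is merged last and therefore wins. A small hedged sketch of that yacs merge order follows; the variable names and values are illustrative, not part of the repo:

```python
from yacs.config import CfgNode as CN

main_cfg = CN()
main_cfg.DATASET = CN()
main_cfg.DATASET.TEST_DATA_ROOT = "some/default/path"

data_cfg = CN()
data_cfg.DATASET = CN()
data_cfg.DATASET.TEST_DATA_ROOT = "data/megadepth/test"

# merging the data config last overrides the earlier value
main_cfg.merge_from_other_cfg(data_cfg)
assert main_cfg.DATASET.TEST_DATA_ROOT == "data/megadepth/test"
```
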
third_party/ASpanFormer/configs/data/debug/.gitignore
ADDED
@@ -0,0 +1,3 @@
+*
+*/
+!.gitignore
third_party/ASpanFormer/configs/data/megadepth_test_1500.py
ADDED
@@ -0,0 +1,13 @@
+from configs.data.base import cfg
+
+TEST_BASE_PATH = "assets/megadepth_test_1500_scene_info"
+
+cfg.DATASET.TEST_DATA_SOURCE = "MegaDepth"
+cfg.DATASET.TEST_DATA_ROOT = "data/megadepth/test"
+cfg.DATASET.TEST_NPZ_ROOT = f"{TEST_BASE_PATH}"
+cfg.DATASET.TEST_LIST_PATH = f"{TEST_BASE_PATH}/megadepth_test_1500.txt"
+
+cfg.DATASET.MGDPT_IMG_RESIZE = 1152
+cfg.DATASET.MGDPT_IMG_PAD = True
+cfg.DATASET.MGDPT_DF = 8
+cfg.DATASET.MIN_OVERLAP_SCORE_TEST = 0.0
third_party/ASpanFormer/configs/data/megadepth_trainval_832.py
ADDED
@@ -0,0 +1,26 @@
+from configs.data.base import cfg
+
+
+TRAIN_BASE_PATH = "data/megadepth/index"
+cfg.DATASET.TRAINVAL_DATA_SOURCE = "MegaDepth"
+cfg.DATASET.TRAIN_DATA_ROOT = "data/megadepth/train"
+cfg.DATASET.TRAIN_NPZ_ROOT = f"{TRAIN_BASE_PATH}/scene_info_0.1_0.7"
+cfg.DATASET.TRAIN_LIST_PATH = f"{TRAIN_BASE_PATH}/trainvaltest_list/train_list.txt"
+cfg.DATASET.MIN_OVERLAP_SCORE_TRAIN = 0.0
+
+TEST_BASE_PATH = "data/megadepth/index"
+cfg.DATASET.TEST_DATA_SOURCE = "MegaDepth"
+cfg.DATASET.VAL_DATA_ROOT = cfg.DATASET.TEST_DATA_ROOT = "data/megadepth/test"
+cfg.DATASET.VAL_NPZ_ROOT = (
+    cfg.DATASET.TEST_NPZ_ROOT
+) = f"{TEST_BASE_PATH}/scene_info_val_1500"
+cfg.DATASET.VAL_LIST_PATH = (
+    cfg.DATASET.TEST_LIST_PATH
+) = f"{TEST_BASE_PATH}/trainvaltest_list/val_list.txt"
+cfg.DATASET.MIN_OVERLAP_SCORE_TEST = 0.0  # for both test and val
+
+# 368 scenes in total for MegaDepth
+# (with difficulty balanced (further split each scene to 3 sub-scenes))
+cfg.TRAINER.N_SAMPLES_PER_SUBSET = 100
+
+cfg.DATASET.MGDPT_IMG_RESIZE = 832  # for training on 32GB memory GPUs
third_party/ASpanFormer/configs/data/scannet_test_1500.py
ADDED
@@ -0,0 +1,11 @@
+from configs.data.base import cfg
+
+TEST_BASE_PATH = "assets/scannet_test_1500"
+
+cfg.DATASET.TEST_DATA_SOURCE = "ScanNet"
+cfg.DATASET.TEST_DATA_ROOT = "data/scannet/test"
+cfg.DATASET.TEST_NPZ_ROOT = f"{TEST_BASE_PATH}"
+cfg.DATASET.TEST_LIST_PATH = f"{TEST_BASE_PATH}/scannet_test.txt"
+cfg.DATASET.TEST_INTRINSIC_PATH = f"{TEST_BASE_PATH}/intrinsics.npz"
+
+cfg.DATASET.MIN_OVERLAP_SCORE_TEST = 0.0
third_party/ASpanFormer/configs/data/scannet_trainval.py
ADDED
@@ -0,0 +1,21 @@
+from configs.data.base import cfg
+
+
+TRAIN_BASE_PATH = "data/scannet/index"
+cfg.DATASET.TRAINVAL_DATA_SOURCE = "ScanNet"
+cfg.DATASET.TRAIN_DATA_ROOT = "data/scannet/train"
+cfg.DATASET.TRAIN_NPZ_ROOT = f"{TRAIN_BASE_PATH}/scene_data/train"
+cfg.DATASET.TRAIN_LIST_PATH = f"{TRAIN_BASE_PATH}/scene_data/train_list/scannet_all.txt"
+cfg.DATASET.TRAIN_INTRINSIC_PATH = f"{TRAIN_BASE_PATH}/intrinsics.npz"
+
+TEST_BASE_PATH = "assets/scannet_test_1500"
+cfg.DATASET.TEST_DATA_SOURCE = "ScanNet"
+cfg.DATASET.VAL_DATA_ROOT = cfg.DATASET.TEST_DATA_ROOT = "data/scannet/test"
+cfg.DATASET.VAL_NPZ_ROOT = cfg.DATASET.TEST_NPZ_ROOT = TEST_BASE_PATH
+cfg.DATASET.VAL_LIST_PATH = (
+    cfg.DATASET.TEST_LIST_PATH
+) = f"{TEST_BASE_PATH}/scannet_test.txt"
+cfg.DATASET.VAL_INTRINSIC_PATH = (
+    cfg.DATASET.TEST_INTRINSIC_PATH
+) = f"{TEST_BASE_PATH}/intrinsics.npz"
+cfg.DATASET.MIN_OVERLAP_SCORE_TEST = 0.0  # for both test and val
third_party/ASpanFormer/data/megadepth/index/.gitignore
ADDED
@@ -0,0 +1,4 @@
+# Ignore everything in this directory
+*
+# Except this file
+!.gitignore

third_party/ASpanFormer/data/megadepth/test/.gitignore
ADDED
@@ -0,0 +1,4 @@
+# Ignore everything in this directory
+*
+# Except this file
+!.gitignore

third_party/ASpanFormer/data/megadepth/train/.gitignore
ADDED
@@ -0,0 +1,4 @@
+# Ignore everything in this directory
+*
+# Except this file
+!.gitignore

third_party/ASpanFormer/data/scannet/index/.gitignore
ADDED
@@ -0,0 +1,4 @@
+# Ignore everything in this directory
+*
+# Except this file
+!.gitignore

third_party/ASpanFormer/data/scannet/test/.gitignore
ADDED
@@ -0,0 +1,3 @@
+*
+*/
+!.gitignore

third_party/ASpanFormer/data/scannet/train/.gitignore
ADDED
@@ -0,0 +1,4 @@
+# Ignore everything in this directory
+*
+# Except this file
+!.gitignore
third_party/ASpanFormer/demo/demo.py
ADDED
@@ -0,0 +1,91 @@
+import os
+import sys
+
+ROOT_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+sys.path.insert(0, ROOT_DIR)
+
+from src.ASpanFormer.aspanformer import ASpanFormer
+from src.config.default import get_cfg_defaults
+from src.utils.misc import lower_config
+import demo_utils
+
+import cv2
+import torch
+import numpy as np
+
+import argparse
+
+parser = argparse.ArgumentParser()
+parser.add_argument(
+    "--config_path",
+    type=str,
+    default="../configs/aspan/outdoor/aspan_test.py",
+    help="path for config file.",
+)
+parser.add_argument(
+    "--img0_path",
+    type=str,
+    default="../assets/phototourism_sample_images/piazza_san_marco_06795901_3725050516.jpg",
+    help="path for image0.",
+)
+parser.add_argument(
+    "--img1_path",
+    type=str,
+    default="../assets/phototourism_sample_images/piazza_san_marco_15148634_5228701572.jpg",
+    help="path for image1.",
+)
+parser.add_argument(
+    "--weights_path",
+    type=str,
+    default="../weights/outdoor.ckpt",
+    help="path for model weights.",
+)
+parser.add_argument(
+    "--long_dim0", type=int, default=1024, help="resize for longest dim of image0."
+)
+parser.add_argument(
+    "--long_dim1", type=int, default=1024, help="resize for longest dim of image1."
+)
+
+args = parser.parse_args()
+
+
+if __name__ == "__main__":
+    config = get_cfg_defaults()
+    config.merge_from_file(args.config_path)
+    _config = lower_config(config)
+    matcher = ASpanFormer(config=_config["aspan"])
+    state_dict = torch.load(args.weights_path, map_location="cpu")["state_dict"]
+    matcher.load_state_dict(state_dict, strict=False)
+    matcher.cuda(), matcher.eval()
+
+    img0, img1 = cv2.imread(args.img0_path), cv2.imread(args.img1_path)
+    img0_g, img1_g = cv2.imread(args.img0_path, 0), cv2.imread(args.img1_path, 0)
+    img0, img1 = demo_utils.resize(img0, args.long_dim0), demo_utils.resize(
+        img1, args.long_dim1
+    )
+    img0_g, img1_g = demo_utils.resize(img0_g, args.long_dim0), demo_utils.resize(
+        img1_g, args.long_dim1
+    )
+    data = {
+        "image0": torch.from_numpy(img0_g / 255.0)[None, None].cuda().float(),
+        "image1": torch.from_numpy(img1_g / 255.0)[None, None].cuda().float(),
+    }
+    with torch.no_grad():
+        matcher(data, online_resize=True)
+        corr0, corr1 = data["mkpts0_f"].cpu().numpy(), data["mkpts1_f"].cpu().numpy()
+
+    F_hat, mask_F = cv2.findFundamentalMat(
+        corr0, corr1, method=cv2.FM_RANSAC, ransacReprojThreshold=1
+    )
+    if mask_F is not None:
+        mask_F = mask_F[:, 0].astype(bool)
+    else:
+        mask_F = np.zeros_like(corr0[:, 0]).astype(bool)
+
+    # visualize match
+    display = demo_utils.draw_match(img0, img1, corr0, corr1)
+    display_ransac = demo_utils.draw_match(img0, img1, corr0[mask_F], corr1[mask_F])
+    cv2.imwrite("match.png", display)
+    cv2.imwrite("match_ransac.png", display_ransac)
+    print(len(corr1), len(corr1[mask_F]))
third_party/ASpanFormer/demo/demo_utils.py
ADDED
@@ -0,0 +1,88 @@
+import cv2
+import numpy as np
+
+
+def resize(image, long_dim):
+    h, w = image.shape[0], image.shape[1]
+    image = cv2.resize(
+        image, (int(w * long_dim / max(h, w)), int(h * long_dim / max(h, w)))
+    )
+    return image
+
+
+def draw_points(img, points, color=(0, 255, 0), radius=3):
+    dp = [(int(points[i, 0]), int(points[i, 1])) for i in range(points.shape[0])]
+    for i in range(points.shape[0]):
+        cv2.circle(img, dp[i], radius=radius, color=color)
+    return img
+
+
+def draw_match(
+    img1,
+    img2,
+    corr1,
+    corr2,
+    inlier=[True],
+    color=None,
+    radius1=1,
+    radius2=1,
+    resize=None,
+):
+    if resize is not None:
+        scale1, scale2 = [img1.shape[1] / resize[0], img1.shape[0] / resize[1]], [
+            img2.shape[1] / resize[0],
+            img2.shape[0] / resize[1],
+        ]
+        img1, img2 = cv2.resize(img1, resize, interpolation=cv2.INTER_AREA), cv2.resize(
+            img2, resize, interpolation=cv2.INTER_AREA
+        )
+        corr1, corr2 = (
+            corr1 / np.asarray(scale1)[np.newaxis],
+            corr2 / np.asarray(scale2)[np.newaxis],
+        )
+    corr1_key = [
+        cv2.KeyPoint(corr1[i, 0], corr1[i, 1], radius1) for i in range(corr1.shape[0])
+    ]
+    corr2_key = [
+        cv2.KeyPoint(corr2[i, 0], corr2[i, 1], radius2) for i in range(corr2.shape[0])
+    ]
+
+    assert len(corr1) == len(corr2)
+
+    draw_matches = [cv2.DMatch(i, i, 0) for i in range(len(corr1))]
+    if color is None:
+        color = [(0, 255, 0) if cur_inlier else (0, 0, 255) for cur_inlier in inlier]
+    if len(color) == 1:
+        display = cv2.drawMatches(
+            img1,
+            corr1_key,
+            img2,
+            corr2_key,
+            draw_matches,
+            None,
+            matchColor=color[0],
+            singlePointColor=color[0],
+            flags=4,
+        )
+    else:
+        height, width = max(img1.shape[0], img2.shape[0]), img1.shape[1] + img2.shape[1]
+        display = np.zeros([height, width, 3], np.uint8)
+        display[: img1.shape[0], : img1.shape[1]] = img1
+        display[: img2.shape[0], img1.shape[1] :] = img2
+        for i in range(len(corr1)):
+            left_x, left_y, right_x, right_y = (
+                int(corr1[i][0]),
+                int(corr1[i][1]),
+                int(corr2[i][0] + img1.shape[1]),
+                int(corr2[i][1]),
+            )
+            cur_color = (int(color[i][0]), int(color[i][1]), int(color[i][2]))
+            cv2.line(
+                display,
+                (left_x, left_y),
+                (right_x, right_y),
+                cur_color,
+                1,
+                lineType=cv2.LINE_AA,
+            )
+    return display
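
A hedged sketch of exercising `resize()` and `draw_match()` above on synthetic data follows; the image sizes and correspondences are made-up illustrations, not values from the repo:

```python
import numpy as np
import demo_utils

# two blank BGR images of different sizes
img0 = np.zeros((480, 640, 3), np.uint8)
img1 = np.zeros((360, 640, 3), np.uint8)

# resize keeps aspect ratio; the longest side becomes 1024 px
img0 = demo_utils.resize(img0, 1024)
img1 = demo_utils.resize(img1, 1024)

# two fabricated correspondences (x, y) per image
corr0 = np.array([[100.0, 120.0], [300.0, 200.0]])
corr1 = np.array([[110.0, 125.0], [310.0, 190.0]])

# default inlier=[True] draws all matches in green via cv2.drawMatches
vis = demo_utils.draw_match(img0, img1, corr0, corr1)
```
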
third_party/ASpanFormer/docs/TRAINING.md
ADDED
@@ -0,0 +1,72 @@
+
+# Training ASpanFormer
+
+## Dataset setup
+Generally, two parts of data are needed for training ASpanFormer: the original datasets, i.e., ScanNet and MegaDepth, and the offline generated dataset indices. The dataset indices store scenes, image pairs, and other metadata within each dataset used for training/validation/testing. For the MegaDepth dataset, the relative poses between images used for training are directly cached in the indexing files. However, the relative poses of ScanNet image pairs are not stored due to the enormous resulting file size.
+
+### Download datasets
+#### MegaDepth
+We use depth maps provided in the [original MegaDepth dataset](https://www.cs.cornell.edu/projects/megadepth/) as well as undistorted images, corresponding camera intrinsics and extrinsics preprocessed by [D2-Net](https://github.com/mihaidusmanu/d2-net#downloading-and-preprocessing-the-megadepth-dataset). You can download them separately from the following links.
+- [MegaDepth undistorted images and processed depths](https://www.cs.cornell.edu/projects/megadepth/dataset/Megadepth_v1/MegaDepth_v1.tar.gz)
+    - Note that we only use depth maps.
+    - The path of the downloaded data will be referred to as `/path/to/megadepth`
+- [D2-Net preprocessed images](https://drive.google.com/drive/folders/1hxpOsqOZefdrba_BqnW490XpNX_LgXPB)
+    - Images are undistorted manually in D2-Net since the undistorted images from MegaDepth do not come with corresponding intrinsics.
+    - The path of the downloaded data will be referred to as `/path/to/megadepth_d2net`
+
+#### ScanNet
+Please set up the ScanNet dataset following [the official guide](https://github.com/ScanNet/ScanNet#scannet-data)
+> NOTE: We use the [python exported data](https://github.com/ScanNet/ScanNet/tree/master/SensReader/python),
+instead of the [c++ exported one](https://github.com/ScanNet/ScanNet/tree/master/SensReader/c%2B%2B).
+
+### Download the dataset indices
+
+You can download the required dataset indices from the [following link](https://drive.google.com/drive/folders/1DOcOPZb3-5cWxLqn256AhwUVjBPifhuf).
+After downloading, unzip the required files.
+```shell
+unzip downloaded-file.zip
+
+# extract dataset indices
+tar xf train-data/megadepth_indices.tar
+tar xf train-data/scannet_indices.tar
+
+# extract testing data (optional)
+tar xf testdata/megadepth_test_1500.tar
+tar xf testdata/scannet_test_1500.tar
+```
+
+### Build the dataset symlinks
+
+We symlink the datasets to the `data` directory under the main ASpanFormer project directory.
+
+```shell
+# scannet
+# -- # train and test dataset
+ln -s /path/to/scannet_train/* /path/to/ASpanFormer/data/scannet/train
+ln -s /path/to/scannet_test/* /path/to/ASpanFormer/data/scannet/test
+# -- # dataset indices
+ln -s /path/to/scannet_indices/* /path/to/ASpanFormer/data/scannet/index
+
+# megadepth
+# -- # train and test dataset (train and test share the same dataset)
+ln -sv /path/to/megadepth/phoenix /path/to/megadepth_d2net/Undistorted_SfM /path/to/ASpanFormer/data/megadepth/train
+ln -sv /path/to/megadepth/phoenix /path/to/megadepth_d2net/Undistorted_SfM /path/to/ASpanFormer/data/megadepth/test
+# -- # dataset indices
+ln -s /path/to/megadepth_indices/* /path/to/ASpanFormer/data/megadepth/index
+```
+
+
+## Training
+We provide training scripts for ScanNet and MegaDepth. The results in the ASpanFormer paper can be reproduced with 8 V100 GPUs. For a different setup, we scale the learning rate and its warm-up linearly, but the final evaluation results might vary due to the different batch size & learning rate used. Thus the reproduction of results in our paper is not guaranteed.
+
+
+### Training on ScanNet
+```shell
+scripts/reproduce_train/indoor.sh
+```
+
+
+### Training on MegaDepth
+```shell
+scripts/reproduce_train/outdoor.sh
+```
third_party/ASpanFormer/environment.yaml
ADDED
@@ -0,0 +1,12 @@
+name: ASpanFormer
+channels:
+  - pytorch
+  - conda-forge
+  - defaults
+dependencies:
+  - python=3.8
+  - cudatoolkit=10.2
+  - pytorch=1.8.1
+  - pip
+  - pip:
+    - -r requirements.txt
third_party/ASpanFormer/requirements.txt
ADDED
@@ -0,0 +1,18 @@
+#opencv_python==4.4.0.46
+albumentations==0.5.1 --no-binary=imgaug,albumentations
+ray>=1.0.1
+einops==0.3.0
+kornia==0.4.1
+loguru==0.5.3
+yacs>=0.1.8
+tqdm
+autopep8
+pylint
+ipython
+jupyterlab
+matplotlib
+h5py
+pytorch-lightning==1.3.5
+loguru
+joblib>=1.0.1
+torchmetrics==0.4
third_party/ASpanFormer/scripts/reproduce_test/indoor.sh
ADDED
@@ -0,0 +1,31 @@
+#!/bin/bash -l
+# an indoor_ds model with the pos_enc impl bug fixed.
+
+SCRIPTPATH=$(dirname $(readlink -f "$0"))
+PROJECT_DIR="${SCRIPTPATH}/../../"
+
+# conda activate loftr
+export PYTHONPATH=$PROJECT_DIR:$PYTHONPATH
+cd $PROJECT_DIR
+
+data_cfg_path="configs/data/scannet_test_1500.py"
+main_cfg_path="configs/aspan/indoor/aspan_test.py"
+ckpt_path='weights/indoor.ckpt'
+dump_dir="dump/indoor_dump"
+profiler_name="inference"
+n_nodes=1  # manually keep this the same with --nodes
+n_gpus_per_node=-1
+torch_num_workers=4
+batch_size=1  # per gpu
+
+python -u ./test.py \
+    ${data_cfg_path} \
+    ${main_cfg_path} \
+    --ckpt_path=${ckpt_path} \
+    --dump_dir=${dump_dir} \
+    --gpus=${n_gpus_per_node} --num_nodes=${n_nodes} --accelerator="ddp" \
+    --batch_size=${batch_size} --num_workers=${torch_num_workers}\
+    --profiler_name=${profiler_name} \
+    --benchmark \
+    --mode integrated