Vincentqyw committed
Commit 2673dcd
1 Parent(s): 45354a0

add: lightglue

pre-requirements.txt CHANGED
@@ -1,4 +1,3 @@
- # python>=3.10.4
  torch>=1.12.1
  torchvision>=0.13.1
  torchmetrics>=0.6.0
@@ -9,5 +8,3 @@ einops>=0.3.0
  kornia>=0.6
  gradio
  gradio_client==0.2.7
- # datasets[vision]>=2.4.0
-
third_party/LightGlue/.gitattributes ADDED
@@ -0,0 +1 @@
+ *.ipynb linguist-documentation
third_party/LightGlue/.gitignore ADDED
@@ -0,0 +1,10 @@
+ *.egg-info
+ *.pyc
+ /.idea/
+ /data/
+ /outputs/
+ __pycache__
+ /lightglue/weights/
+ lightglue/_flash/
+ *-checkpoint.ipynb
+ *.pth
third_party/LightGlue/LICENSE ADDED
@@ -0,0 +1,201 @@
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+
9
+ "License" shall mean the terms and conditions for use, reproduction,
10
+ and distribution as defined by Sections 1 through 9 of this document.
11
+
12
+ "Licensor" shall mean the copyright owner or entity authorized by
13
+ the copyright owner that is granting the License.
14
+
15
+ "Legal Entity" shall mean the union of the acting entity and all
16
+ other entities that control, are controlled by, or are under common
17
+ control with that entity. For the purposes of this definition,
18
+ "control" means (i) the power, direct or indirect, to cause the
19
+ direction or management of such entity, whether by contract or
20
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
21
+ outstanding shares, or (iii) beneficial ownership of such entity.
22
+
23
+ "You" (or "Your") shall mean an individual or Legal Entity
24
+ exercising permissions granted by this License.
25
+
26
+ "Source" form shall mean the preferred form for making modifications,
27
+ including but not limited to software source code, documentation
28
+ source, and configuration files.
29
+
30
+ "Object" form shall mean any form resulting from mechanical
31
+ transformation or translation of a Source form, including but
32
+ not limited to compiled object code, generated documentation,
33
+ and conversions to other media types.
34
+
35
+ "Work" shall mean the work of authorship, whether in Source or
36
+ Object form, made available under the License, as indicated by a
37
+ copyright notice that is included in or attached to the work
38
+ (an example is provided in the Appendix below).
39
+
40
+ "Derivative Works" shall mean any work, whether in Source or Object
41
+ form, that is based on (or derived from) the Work and for which the
42
+ editorial revisions, annotations, elaborations, or other modifications
43
+ represent, as a whole, an original work of authorship. For the purposes
44
+ of this License, Derivative Works shall not include works that remain
45
+ separable from, or merely link (or bind by name) to the interfaces of,
46
+ the Work and Derivative Works thereof.
47
+
48
+ "Contribution" shall mean any work of authorship, including
49
+ the original version of the Work and any modifications or additions
50
+ to that Work or Derivative Works thereof, that is intentionally
51
+ submitted to Licensor for inclusion in the Work by the copyright owner
52
+ or by an individual or Legal Entity authorized to submit on behalf of
53
+ the copyright owner. For the purposes of this definition, "submitted"
54
+ means any form of electronic, verbal, or written communication sent
55
+ to the Licensor or its representatives, including but not limited to
56
+ communication on electronic mailing lists, source code control systems,
57
+ and issue tracking systems that are managed by, or on behalf of, the
58
+ Licensor for the purpose of discussing and improving the Work, but
59
+ excluding communication that is conspicuously marked or otherwise
60
+ designated in writing by the copyright owner as "Not a Contribution."
61
+
62
+ "Contributor" shall mean Licensor and any individual or Legal Entity
63
+ on behalf of whom a Contribution has been received by Licensor and
64
+ subsequently incorporated within the Work.
65
+
66
+ 2. Grant of Copyright License. Subject to the terms and conditions of
67
+ this License, each Contributor hereby grants to You a perpetual,
68
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69
+ copyright license to reproduce, prepare Derivative Works of,
70
+ publicly display, publicly perform, sublicense, and distribute the
71
+ Work and such Derivative Works in Source or Object form.
72
+
73
+ 3. Grant of Patent License. Subject to the terms and conditions of
74
+ this License, each Contributor hereby grants to You a perpetual,
75
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76
+ (except as stated in this section) patent license to make, have made,
77
+ use, offer to sell, sell, import, and otherwise transfer the Work,
78
+ where such license applies only to those patent claims licensable
79
+ by such Contributor that are necessarily infringed by their
80
+ Contribution(s) alone or by combination of their Contribution(s)
81
+ with the Work to which such Contribution(s) was submitted. If You
82
+ institute patent litigation against any entity (including a
83
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
84
+ or a Contribution incorporated within the Work constitutes direct
85
+ or contributory patent infringement, then any patent licenses
86
+ granted to You under this License for that Work shall terminate
87
+ as of the date such litigation is filed.
88
+
89
+ 4. Redistribution. You may reproduce and distribute copies of the
90
+ Work or Derivative Works thereof in any medium, with or without
91
+ modifications, and in Source or Object form, provided that You
92
+ meet the following conditions:
93
+
94
+ (a) You must give any other recipients of the Work or
95
+ Derivative Works a copy of this License; and
96
+
97
+ (b) You must cause any modified files to carry prominent notices
98
+ stating that You changed the files; and
99
+
100
+ (c) You must retain, in the Source form of any Derivative Works
101
+ that You distribute, all copyright, patent, trademark, and
102
+ attribution notices from the Source form of the Work,
103
+ excluding those notices that do not pertain to any part of
104
+ the Derivative Works; and
105
+
106
+ (d) If the Work includes a "NOTICE" text file as part of its
107
+ distribution, then any Derivative Works that You distribute must
108
+ include a readable copy of the attribution notices contained
109
+ within such NOTICE file, excluding those notices that do not
110
+ pertain to any part of the Derivative Works, in at least one
111
+ of the following places: within a NOTICE text file distributed
112
+ as part of the Derivative Works; within the Source form or
113
+ documentation, if provided along with the Derivative Works; or,
114
+ within a display generated by the Derivative Works, if and
115
+ wherever such third-party notices normally appear. The contents
116
+ of the NOTICE file are for informational purposes only and
117
+ do not modify the License. You may add Your own attribution
118
+ notices within Derivative Works that You distribute, alongside
119
+ or as an addendum to the NOTICE text from the Work, provided
120
+ that such additional attribution notices cannot be construed
121
+ as modifying the License.
122
+
123
+ You may add Your own copyright statement to Your modifications and
124
+ may provide additional or different license terms and conditions
125
+ for use, reproduction, or distribution of Your modifications, or
126
+ for any such Derivative Works as a whole, provided Your use,
127
+ reproduction, and distribution of the Work otherwise complies with
128
+ the conditions stated in this License.
129
+
130
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
131
+ any Contribution intentionally submitted for inclusion in the Work
132
+ by You to the Licensor shall be under the terms and conditions of
133
+ this License, without any additional terms or conditions.
134
+ Notwithstanding the above, nothing herein shall supersede or modify
135
+ the terms of any separate license agreement you may have executed
136
+ with Licensor regarding such Contributions.
137
+
138
+ 6. Trademarks. This License does not grant permission to use the trade
139
+ names, trademarks, service marks, or product names of the Licensor,
140
+ except as required for reasonable and customary use in describing the
141
+ origin of the Work and reproducing the content of the NOTICE file.
142
+
143
+ 7. Disclaimer of Warranty. Unless required by applicable law or
144
+ agreed to in writing, Licensor provides the Work (and each
145
+ Contributor provides its Contributions) on an "AS IS" BASIS,
146
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147
+ implied, including, without limitation, any warranties or conditions
148
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149
+ PARTICULAR PURPOSE. You are solely responsible for determining the
150
+ appropriateness of using or redistributing the Work and assume any
151
+ risks associated with Your exercise of permissions under this License.
152
+
153
+ 8. Limitation of Liability. In no event and under no legal theory,
154
+ whether in tort (including negligence), contract, or otherwise,
155
+ unless required by applicable law (such as deliberate and grossly
156
+ negligent acts) or agreed to in writing, shall any Contributor be
157
+ liable to You for damages, including any direct, indirect, special,
158
+ incidental, or consequential damages of any character arising as a
159
+ result of this License or out of the use or inability to use the
160
+ Work (including but not limited to damages for loss of goodwill,
161
+ work stoppage, computer failure or malfunction, or any and all
162
+ other commercial damages or losses), even if such Contributor
163
+ has been advised of the possibility of such damages.
164
+
165
+ 9. Accepting Warranty or Additional Liability. While redistributing
166
+ the Work or Derivative Works thereof, You may choose to offer,
167
+ and charge a fee for, acceptance of support, warranty, indemnity,
168
+ or other liability obligations and/or rights consistent with this
169
+ License. However, in accepting such obligations, You may act only
170
+ on Your own behalf and on Your sole responsibility, not on behalf
171
+ of any other Contributor, and only if You agree to indemnify,
172
+ defend, and hold each Contributor harmless for any liability
173
+ incurred by, or claims asserted against, such Contributor by reason
174
+ of your accepting any such warranty or additional liability.
175
+
176
+ END OF TERMS AND CONDITIONS
177
+
178
+ APPENDIX: How to apply the Apache License to your work.
179
+
180
+ To apply the Apache License to your work, attach the following
181
+ boilerplate notice, with the fields enclosed by brackets "[]"
182
+ replaced with your own identifying information. (Don't include
183
+ the brackets!) The text should be enclosed in the appropriate
184
+ comment syntax for the file format. We also recommend that a
185
+ file or class name and description of purpose be included on the
186
+ same "printed page" as the copyright notice for easier
187
+ identification within third-party archives.
188
+
189
+ Copyright [yyyy] [name of copyright owner]
190
+
191
+ Licensed under the Apache License, Version 2.0 (the "License");
192
+ you may not use this file except in compliance with the License.
193
+ You may obtain a copy of the License at
194
+
195
+ http://www.apache.org/licenses/LICENSE-2.0
196
+
197
+ Unless required by applicable law or agreed to in writing, software
198
+ distributed under the License is distributed on an "AS IS" BASIS,
199
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200
+ See the License for the specific language governing permissions and
201
+ limitations under the License.
third_party/LightGlue/README.md ADDED
@@ -0,0 +1,134 @@
+ <p align="center">
+ <h1 align="center"><ins>LightGlue ⚡️</ins><br>Local Feature Matching at Light Speed</h1>
+ <p align="center">
+ <a href="https://www.linkedin.com/in/philipplindenberger/">Philipp Lindenberger</a>
+ ·
+ <a href="https://psarlin.com/">Paul-Edouard&nbsp;Sarlin</a>
+ ·
+ <a href="https://www.microsoft.com/en-us/research/people/mapoll/">Marc&nbsp;Pollefeys</a>
+ </p>
+ <!-- <p align="center">
+ <img src="assets/larchitecture.svg" alt="Logo" height="40">
+ </p> -->
+ <!-- <h2 align="center">PrePrint 2023</h2> -->
+ <h2 align="center"><p>
+ <a href="https://arxiv.org/pdf/2306.13643.pdf" align="center">Paper</a> |
+ <a href="https://colab.research.google.com/github/cvg/LightGlue/blob/main/demo.ipynb" align="center">Colab</a>
+ </p></h2>
+ <div align="center"></div>
+ </p>
+ <p align="center">
+ <a href="https://arxiv.org/abs/2306.13643"><img src="assets/easy_hard.jpg" alt="example" width=80%></a>
+ <br>
+ <em>LightGlue is a deep neural network that matches sparse local features across image pairs.<br>An adaptive mechanism makes it fast for easy pairs (top) and reduces the computational complexity for difficult ones (bottom).</em>
+ </p>
+
+ ##
+
+ This repository hosts the inference code of LightGlue, a lightweight feature matcher with high accuracy and blazing-fast inference. It takes as input a set of keypoints and descriptors for each image and returns the indices of corresponding points. The architecture is based on adaptive pruning techniques in both network width and depth - [check out the paper for more details](https://arxiv.org/pdf/2306.13643.pdf).
+
+ We release pretrained weights of LightGlue with [SuperPoint](https://arxiv.org/abs/1712.07629) and [DISK](https://arxiv.org/abs/2006.13566) local features.
+ The training and evaluation code will be released in July in a separate repo. To be notified, subscribe to [issue #6](https://github.com/cvg/LightGlue/issues/6).
+
+ ## Installation and demo [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cvg/LightGlue/blob/main/demo.ipynb)
+
+ Install this repo using pip:
+
+ ```bash
+ git clone https://github.com/cvg/LightGlue.git && cd LightGlue
+ python -m pip install -e .
+ ```
+
+ We provide a [demo notebook](demo.ipynb) which shows how to perform feature extraction and matching on an image pair.
+
+ Here is a minimal script to match two images:
+
+ ```python
+ from lightglue import LightGlue, SuperPoint, DISK
+ from lightglue.utils import load_image, rbd
+
+ # SuperPoint+LightGlue
+ extractor = SuperPoint(max_num_keypoints=2048).eval().cuda() # load the extractor
+ matcher = LightGlue(features='superpoint').eval().cuda() # load the matcher
+
+ # or DISK+LightGlue
+ extractor = DISK(max_num_keypoints=2048).eval().cuda() # load the extractor
+ matcher = LightGlue(features='disk').eval().cuda() # load the matcher
+
+ # load each image as a torch.Tensor on GPU with shape (3,H,W), normalized in [0,1]
+ image0 = load_image('path/to/image_0.jpg').cuda()
+ image1 = load_image('path/to/image_1.jpg').cuda()
+
+ # extract local features
+ feats0 = extractor.extract(image0) # auto-resize the image, disable with resize=None
+ feats1 = extractor.extract(image1)
+
+ # match the features
+ matches01 = matcher({'image0': feats0, 'image1': feats1})
+ feats0, feats1, matches01 = [rbd(x) for x in [feats0, feats1, matches01]] # remove batch dimension
+ matches = matches01['matches'] # indices with shape (K,2)
+ points0 = feats0['keypoints'][matches[..., 0]] # coordinates in image #0, shape (K,2)
+ points1 = feats1['keypoints'][matches[..., 1]] # coordinates in image #1, shape (K,2)
+ ```
+
+ We also provide a convenience method to match a pair of images:
+
+ ```python
+ from lightglue import match_pair
+ feats0, feats1, matches01 = match_pair(extractor, matcher, image0, image1)
+ ```
+
+ ##
+
+ <p align="center">
+ <a href="https://arxiv.org/abs/2306.13643"><img src="assets/teaser.svg" alt="Logo" width=50%></a>
+ <br>
+ <em>LightGlue can adjust its depth (number of layers) and width (number of keypoints) per image pair, with a marginal impact on accuracy.</em>
+ </p>
+
+ ## Advanced configuration
+
+ The default values give a good trade-off between speed and accuracy. To maximize the accuracy, use all keypoints and disable the adaptive mechanisms:
+ ```python
+ extractor = SuperPoint(max_num_keypoints=None)
+ matcher = LightGlue(features='superpoint', depth_confidence=-1, width_confidence=-1)
+ ```
+
+ To increase the speed with a small drop in accuracy, decrease the number of keypoints and lower the adaptive thresholds:
+ ```python
+ extractor = SuperPoint(max_num_keypoints=1024)
+ matcher = LightGlue(features='superpoint', depth_confidence=0.9, width_confidence=0.95)
+ ```
+ The maximum speed is obtained with [FlashAttention](https://arxiv.org/abs/2205.14135), which is automatically used when ```torch >= 2.0``` or if it is [installed from source](https://github.com/HazyResearch/flash-attention#installation-and-features).
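A quick way to see which attention backend will be picked on a given machine is to reproduce the availability check from `lightglue/lightglue.py`; a minimal sketch that only inspects the environment:

```python
# Sketch: mirror the FLASH_AVAILABLE check from lightglue/lightglue.py.
import torch
import torch.nn.functional as F

try:
    from flash_attn.modules.mha import FlashCrossAttention  # optional dependency
except ModuleNotFoundError:
    FlashCrossAttention = None

# torch >= 2.0 ships F.scaled_dot_product_attention, which LightGlue uses
# as its FlashAttention path when flash-attn is not installed.
flash_available = bool(FlashCrossAttention) or hasattr(F, 'scaled_dot_product_attention')
print(f'torch {torch.__version__}, FlashAttention path available: {flash_available}')
```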
+
+ <details>
+ <summary>[Detail of all parameters - click to expand]</summary>
+
+ - [```n_layers```](https://github.com/cvg/LightGlue/blob/main/lightglue/lightglue.py#L261): Number of stacked self+cross attention layers. Reduce this value for faster inference at the cost of accuracy (continuous red line in the plot above). Default: 9 (all layers).
+ - [```flash```](https://github.com/cvg/LightGlue/blob/main/lightglue/lightglue.py#L263): Enable FlashAttention. Significantly increases the speed and reduces the memory consumption without any impact on accuracy. Default: True (LightGlue automatically detects if FlashAttention is available).
+ - [```mp```](https://github.com/cvg/LightGlue/blob/main/lightglue/lightglue.py#L264): Enable mixed precision inference. Default: False (off).
+ - [```depth_confidence```](https://github.com/cvg/LightGlue/blob/main/lightglue/lightglue.py#L265): Controls the early stopping. A lower value stops more often at earlier layers. Default: 0.95, disable with -1.
+ - [```width_confidence```](https://github.com/cvg/LightGlue/blob/main/lightglue/lightglue.py#L266): Controls the iterative point pruning. A lower value prunes more points earlier. Default: 0.99, disable with -1.
+ - [```filter_threshold```](https://github.com/cvg/LightGlue/blob/main/lightglue/lightglue.py#L267): Match confidence. Increase this value to obtain fewer but stronger matches. Default: 0.1.
+
+ </details>
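Taken together, these knobs map one-to-one onto the `LightGlue` constructor. A minimal sketch that spells out every default explicitly (values copied from the list above, `features='superpoint'` assumed), as a starting point for tuning:

```python
# Sketch: all tuning parameters listed above, passed explicitly with their
# documented defaults; change individual values from here.
from lightglue import LightGlue

matcher = LightGlue(
    features='superpoint',   # or 'disk'
    n_layers=9,              # fewer layers -> faster, less accurate
    flash=True,              # use FlashAttention when available
    mp=False,                # mixed-precision inference
    depth_confidence=0.95,   # early stopping; -1 disables it
    width_confidence=0.99,   # point pruning; -1 disables it
    filter_threshold=0.1,    # keep matches with confidence above this
).eval()
```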
+
+ ## Other links
+ - [hloc - the visual localization toolbox](https://github.com/cvg/Hierarchical-Localization/): run LightGlue for Structure-from-Motion and visual localization.
+ - [LightGlue-ONNX](https://github.com/fabio-sim/LightGlue-ONNX): export LightGlue to the Open Neural Network Exchange format.
+ - [Image Matching WebUI](https://github.com/Vincentqyw/image-matching-webui): a web GUI to easily compare different matchers, including LightGlue.
+ - [kornia](https://kornia.readthedocs.io/) now exposes LightGlue via the interfaces [`LightGlue`](https://kornia.readthedocs.io/en/latest/feature.html#kornia.feature.LightGlue) and [`LightGlueMatcher`](https://kornia.readthedocs.io/en/latest/feature.html#kornia.feature.LightGlueMatcher).
+
+ ## BibTeX Citation
+ If you use any ideas from the paper or code from this repo, please consider citing:
+
+ ```txt
+ @inproceedings{lindenberger23lightglue,
+   author    = {Philipp Lindenberger and
+                Paul-Edouard Sarlin and
+                Marc Pollefeys},
+   title     = {{LightGlue: Local Feature Matching at Light Speed}},
+   booktitle = {ICCV},
+   year      = {2023}
+ }
+ ```
third_party/LightGlue/assets/DSC_0410.JPG ADDED
third_party/LightGlue/assets/DSC_0411.JPG ADDED
third_party/LightGlue/assets/architecture.svg ADDED
third_party/LightGlue/assets/easy_hard.jpg ADDED
third_party/LightGlue/assets/sacre_coeur1.jpg ADDED
third_party/LightGlue/assets/sacre_coeur2.jpg ADDED
third_party/LightGlue/assets/teaser.svg ADDED
third_party/LightGlue/demo.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
third_party/LightGlue/lightglue/__init__.py ADDED
@@ -0,0 +1,4 @@
+ from .lightglue import LightGlue
+ from .superpoint import SuperPoint
+ from .disk import DISK
+ from .utils import match_pair
third_party/LightGlue/lightglue/disk.py ADDED
@@ -0,0 +1,70 @@
1
+ import torch
2
+ import torch.nn as nn
3
+ import kornia
4
+ from types import SimpleNamespace
5
+ from .utils import ImagePreprocessor
6
+
7
+
8
+ class DISK(nn.Module):
9
+ default_conf = {
10
+ 'weights': 'depth',
11
+ 'max_num_keypoints': None,
12
+ 'desc_dim': 128,
13
+ 'nms_window_size': 5,
14
+ 'detection_threshold': 0.0,
15
+ 'pad_if_not_divisible': True,
16
+ }
17
+
18
+ preprocess_conf = {
19
+ **ImagePreprocessor.default_conf,
20
+ 'resize': 1024,
21
+ 'grayscale': False,
22
+ }
23
+
24
+ required_data_keys = ['image']
25
+
26
+ def __init__(self, **conf) -> None:
27
+ super().__init__()
28
+ self.conf = {**self.default_conf, **conf}
29
+ self.conf = SimpleNamespace(**self.conf)
30
+ self.model = kornia.feature.DISK.from_pretrained(self.conf.weights)
31
+
32
+ def forward(self, data: dict) -> dict:
33
+ """ Compute keypoints, scores, descriptors for image """
34
+ for key in self.required_data_keys:
35
+ assert key in data, f'Missing key {key} in data'
36
+ image = data['image']
37
+ features = self.model(
38
+ image,
39
+ n=self.conf.max_num_keypoints,
40
+ window_size=self.conf.nms_window_size,
41
+ score_threshold=self.conf.detection_threshold,
42
+ pad_if_not_divisible=self.conf.pad_if_not_divisible
43
+ )
44
+ keypoints = [f.keypoints for f in features]
45
+ scores = [f.detection_scores for f in features]
46
+ descriptors = [f.descriptors for f in features]
47
+ del features
48
+
49
+ keypoints = torch.stack(keypoints, 0)
50
+ scores = torch.stack(scores, 0)
51
+ descriptors = torch.stack(descriptors, 0)
52
+
53
+ return {
54
+ 'keypoints': keypoints.to(image),
55
+ 'keypoint_scores': scores.to(image),
56
+ 'descriptors': descriptors.to(image),
57
+ }
58
+
59
+ def extract(self, img: torch.Tensor, **conf) -> dict:
60
+ """ Perform extraction with online resizing"""
61
+ if img.dim() == 3:
62
+ img = img[None] # add batch dim
63
+ assert img.dim() == 4 and img.shape[0] == 1
64
+ shape = img.shape[-2:][::-1]
65
+ img, scales = ImagePreprocessor(
66
+ **{**self.preprocess_conf, **conf})(img)
67
+ feats = self.forward({'image': img})
68
+ feats['image_size'] = torch.tensor(shape)[None].to(img).float()
69
+ feats['keypoints'] = (feats['keypoints'] + .5) / scales[None] - .5
70
+ return feats
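A minimal sketch of how this DISK wrapper is meant to be driven, assuming it runs from the LightGlue repo root so the bundled sample image resolves; shapes follow the code above, with 128-dimensional descriptors:

```python
# Sketch: exercise DISK.extract as defined above on one of the bundled images.
import torch
from lightglue import DISK
from lightglue.utils import load_image

extractor = DISK(max_num_keypoints=2048).eval()
image = load_image('assets/sacre_coeur1.jpg')    # (3, H, W), values in [0, 1]

with torch.no_grad():
    feats = extractor.extract(image)             # resizes to 1024 px internally

print(feats['keypoints'].shape)        # (1, K, 2), original-image pixel coordinates
print(feats['keypoint_scores'].shape)  # (1, K)
print(feats['descriptors'].shape)      # (1, K, 128)
print(feats['image_size'])             # (1, 2), original (W, H)
```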
third_party/LightGlue/lightglue/lightglue.py ADDED
@@ -0,0 +1,466 @@
1
+ from pathlib import Path
2
+ from types import SimpleNamespace
3
+ import warnings
4
+ import numpy as np
5
+ import torch
6
+ from torch import nn
7
+ import torch.nn.functional as F
8
+ from typing import Optional, List, Callable
9
+
10
+ try:
11
+ from flash_attn.modules.mha import FlashCrossAttention
12
+ except ModuleNotFoundError:
13
+ FlashCrossAttention = None
14
+
15
+ if FlashCrossAttention or hasattr(F, 'scaled_dot_product_attention'):
16
+ FLASH_AVAILABLE = True
17
+ else:
18
+ FLASH_AVAILABLE = False
19
+
20
+ torch.backends.cudnn.deterministic = True
21
+
22
+
23
+ @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
24
+ def normalize_keypoints(
25
+ kpts: torch.Tensor,
26
+ size: torch.Tensor) -> torch.Tensor:
27
+ if isinstance(size, torch.Size):
28
+ size = torch.tensor(size)[None]
29
+ shift = size.float().to(kpts) / 2
30
+ scale = size.max(1).values.float().to(kpts) / 2
31
+ kpts = (kpts - shift[:, None]) / scale[:, None, None]
32
+ return kpts
33
+
34
+
35
+ def rotate_half(x: torch.Tensor) -> torch.Tensor:
36
+ x = x.unflatten(-1, (-1, 2))
37
+ x1, x2 = x.unbind(dim=-1)
38
+ return torch.stack((-x2, x1), dim=-1).flatten(start_dim=-2)
39
+
40
+
41
+ def apply_cached_rotary_emb(
42
+ freqs: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
43
+ return (t * freqs[0]) + (rotate_half(t) * freqs[1])
44
+
45
+
46
+ class LearnableFourierPositionalEncoding(nn.Module):
47
+ def __init__(self, M: int, dim: int, F_dim: int = None,
48
+ gamma: float = 1.0) -> None:
49
+ super().__init__()
50
+ F_dim = F_dim if F_dim is not None else dim
51
+ self.gamma = gamma
52
+ self.Wr = nn.Linear(M, F_dim // 2, bias=False)
53
+ nn.init.normal_(self.Wr.weight.data, mean=0, std=self.gamma ** -2)
54
+
55
+ def forward(self, x: torch.Tensor) -> torch.Tensor:
56
+ """ encode position vector """
57
+ projected = self.Wr(x)
58
+ cosines, sines = torch.cos(projected), torch.sin(projected)
59
+ emb = torch.stack([cosines, sines], 0).unsqueeze(-3)
60
+ return emb.repeat_interleave(2, dim=-1)
61
+
62
+
63
+ class TokenConfidence(nn.Module):
64
+ def __init__(self, dim: int) -> None:
65
+ super().__init__()
66
+ self.token = nn.Sequential(
67
+ nn.Linear(dim, 1),
68
+ nn.Sigmoid()
69
+ )
70
+
71
+ def forward(self, desc0: torch.Tensor, desc1: torch.Tensor):
72
+ """ get confidence tokens """
73
+ return (
74
+ self.token(desc0.detach().float()).squeeze(-1),
75
+ self.token(desc1.detach().float()).squeeze(-1))
76
+
77
+
78
+ class Attention(nn.Module):
79
+ def __init__(self, allow_flash: bool) -> None:
80
+ super().__init__()
81
+ if allow_flash and not FLASH_AVAILABLE:
82
+ warnings.warn(
83
+ 'FlashAttention is not available. For optimal speed, '
84
+ 'consider installing torch >= 2.0 or flash-attn.',
85
+ stacklevel=2,
86
+ )
87
+ self.enable_flash = allow_flash and FLASH_AVAILABLE
88
+ if allow_flash and FlashCrossAttention:
89
+ self.flash_ = FlashCrossAttention()
90
+
91
+ def forward(self, q, k, v) -> torch.Tensor:
92
+ if self.enable_flash and q.device.type == 'cuda':
93
+ if FlashCrossAttention:
94
+ q, k, v = [x.transpose(-2, -3) for x in [q, k, v]]
95
+ m = self.flash_(q.half(), torch.stack([k, v], 2).half())
96
+ return m.transpose(-2, -3).to(q.dtype)
97
+ else: # use torch 2.0 scaled_dot_product_attention with flash
98
+ args = [x.half().contiguous() for x in [q, k, v]]
99
+ with torch.backends.cuda.sdp_kernel(enable_flash=True):
100
+ return F.scaled_dot_product_attention(*args).to(q.dtype)
101
+ elif hasattr(F, 'scaled_dot_product_attention'):
102
+ args = [x.contiguous() for x in [q, k, v]]
103
+ return F.scaled_dot_product_attention(*args).to(q.dtype)
104
+ else:
105
+ s = q.shape[-1] ** -0.5
106
+ attn = F.softmax(torch.einsum('...id,...jd->...ij', q, k) * s, -1)
107
+ return torch.einsum('...ij,...jd->...id', attn, v)
108
+
109
+
110
+ class Transformer(nn.Module):
111
+ def __init__(self, embed_dim: int, num_heads: int,
112
+ flash: bool = False, bias: bool = True) -> None:
113
+ super().__init__()
114
+ self.embed_dim = embed_dim
115
+ self.num_heads = num_heads
116
+ assert self.embed_dim % num_heads == 0
117
+ self.head_dim = self.embed_dim // num_heads
118
+ self.Wqkv = nn.Linear(embed_dim, 3*embed_dim, bias=bias)
119
+ self.inner_attn = Attention(flash)
120
+ self.out_proj = nn.Linear(embed_dim, embed_dim, bias=bias)
121
+ self.ffn = nn.Sequential(
122
+ nn.Linear(2*embed_dim, 2*embed_dim),
123
+ nn.LayerNorm(2*embed_dim, elementwise_affine=True),
124
+ nn.GELU(),
125
+ nn.Linear(2*embed_dim, embed_dim)
126
+ )
127
+
128
+ def _forward(self, x: torch.Tensor,
129
+ encoding: Optional[torch.Tensor] = None):
130
+ qkv = self.Wqkv(x)
131
+ qkv = qkv.unflatten(-1, (self.num_heads, -1, 3)).transpose(1, 2)
132
+ q, k, v = qkv[..., 0], qkv[..., 1], qkv[..., 2]
133
+ if encoding is not None:
134
+ q = apply_cached_rotary_emb(encoding, q)
135
+ k = apply_cached_rotary_emb(encoding, k)
136
+ context = self.inner_attn(q, k, v)
137
+ message = self.out_proj(
138
+ context.transpose(1, 2).flatten(start_dim=-2))
139
+ return x + self.ffn(torch.cat([x, message], -1))
140
+
141
+ def forward(self, x0, x1, encoding0=None, encoding1=None):
142
+ return self._forward(x0, encoding0), self._forward(x1, encoding1)
143
+
144
+
145
+ class CrossTransformer(nn.Module):
146
+ def __init__(self, embed_dim: int, num_heads: int,
147
+ flash: bool = False, bias: bool = True) -> None:
148
+ super().__init__()
149
+ self.heads = num_heads
150
+ dim_head = embed_dim // num_heads
151
+ self.scale = dim_head ** -0.5
152
+ inner_dim = dim_head * num_heads
153
+ self.to_qk = nn.Linear(embed_dim, inner_dim, bias=bias)
154
+ self.to_v = nn.Linear(embed_dim, inner_dim, bias=bias)
155
+ self.to_out = nn.Linear(inner_dim, embed_dim, bias=bias)
156
+ self.ffn = nn.Sequential(
157
+ nn.Linear(2*embed_dim, 2*embed_dim),
158
+ nn.LayerNorm(2*embed_dim, elementwise_affine=True),
159
+ nn.GELU(),
160
+ nn.Linear(2*embed_dim, embed_dim)
161
+ )
162
+
163
+ if flash and FLASH_AVAILABLE:
164
+ self.flash = Attention(True)
165
+ else:
166
+ self.flash = None
167
+
168
+ def map_(self, func: Callable, x0: torch.Tensor, x1: torch.Tensor):
169
+ return func(x0), func(x1)
170
+
171
+ def forward(self, x0: torch.Tensor, x1: torch.Tensor) -> List[torch.Tensor]:
172
+ qk0, qk1 = self.map_(self.to_qk, x0, x1)
173
+ v0, v1 = self.map_(self.to_v, x0, x1)
174
+ qk0, qk1, v0, v1 = map(
175
+ lambda t: t.unflatten(-1, (self.heads, -1)).transpose(1, 2),
176
+ (qk0, qk1, v0, v1))
177
+ if self.flash is not None:
178
+ m0 = self.flash(qk0, qk1, v1)
179
+ m1 = self.flash(qk1, qk0, v0)
180
+ else:
181
+ qk0, qk1 = qk0 * self.scale**0.5, qk1 * self.scale**0.5
182
+ sim = torch.einsum('b h i d, b h j d -> b h i j', qk0, qk1)
183
+ attn01 = F.softmax(sim, dim=-1)
184
+ attn10 = F.softmax(sim.transpose(-2, -1).contiguous(), dim=-1)
185
+ m0 = torch.einsum('bhij, bhjd -> bhid', attn01, v1)
186
+ m1 = torch.einsum('bhji, bhjd -> bhid', attn10.transpose(-2, -1), v0)
187
+ m0, m1 = self.map_(lambda t: t.transpose(1, 2).flatten(start_dim=-2),
188
+ m0, m1)
189
+ m0, m1 = self.map_(self.to_out, m0, m1)
190
+ x0 = x0 + self.ffn(torch.cat([x0, m0], -1))
191
+ x1 = x1 + self.ffn(torch.cat([x1, m1], -1))
192
+ return x0, x1
193
+
194
+
195
+ def sigmoid_log_double_softmax(
196
+ sim: torch.Tensor, z0: torch.Tensor, z1: torch.Tensor) -> torch.Tensor:
197
+ """ create the log assignment matrix from logits and similarity"""
198
+ b, m, n = sim.shape
199
+ certainties = F.logsigmoid(z0) + F.logsigmoid(z1).transpose(1, 2)
200
+ scores0 = F.log_softmax(sim, 2)
201
+ scores1 = F.log_softmax(
202
+ sim.transpose(-1, -2).contiguous(), 2).transpose(-1, -2)
203
+ scores = sim.new_full((b, m+1, n+1), 0)
204
+ scores[:, :m, :n] = (scores0 + scores1 + certainties)
205
+ scores[:, :-1, -1] = F.logsigmoid(-z0.squeeze(-1))
206
+ scores[:, -1, :-1] = F.logsigmoid(-z1.squeeze(-1))
207
+ return scores
208
+
209
+
210
+ class MatchAssignment(nn.Module):
211
+ def __init__(self, dim: int) -> None:
212
+ super().__init__()
213
+ self.dim = dim
214
+ self.matchability = nn.Linear(dim, 1, bias=True)
215
+ self.final_proj = nn.Linear(dim, dim, bias=True)
216
+
217
+ def forward(self, desc0: torch.Tensor, desc1: torch.Tensor):
218
+ """ build assignment matrix from descriptors """
219
+ mdesc0, mdesc1 = self.final_proj(desc0), self.final_proj(desc1)
220
+ _, _, d = mdesc0.shape
221
+ mdesc0, mdesc1 = mdesc0 / d**.25, mdesc1 / d**.25
222
+ sim = torch.einsum('bmd,bnd->bmn', mdesc0, mdesc1)
223
+ z0 = self.matchability(desc0)
224
+ z1 = self.matchability(desc1)
225
+ scores = sigmoid_log_double_softmax(sim, z0, z1)
226
+ return scores, sim
227
+
228
+ def scores(self, desc0: torch.Tensor, desc1: torch.Tensor):
229
+ m0 = torch.sigmoid(self.matchability(desc0)).squeeze(-1)
230
+ m1 = torch.sigmoid(self.matchability(desc1)).squeeze(-1)
231
+ return m0, m1
232
+
233
+
234
+ def filter_matches(scores: torch.Tensor, th: float):
235
+ """ obtain matches from a log assignment matrix [Bx M+1 x N+1]"""
236
+ max0, max1 = scores[:, :-1, :-1].max(2), scores[:, :-1, :-1].max(1)
237
+ m0, m1 = max0.indices, max1.indices
238
+ mutual0 = torch.arange(m0.shape[1]).to(m0)[None] == m1.gather(1, m0)
239
+ mutual1 = torch.arange(m1.shape[1]).to(m1)[None] == m0.gather(1, m1)
240
+ max0_exp = max0.values.exp()
241
+ zero = max0_exp.new_tensor(0)
242
+ mscores0 = torch.where(mutual0, max0_exp, zero)
243
+ mscores1 = torch.where(mutual1, mscores0.gather(1, m1), zero)
244
+ if th is not None:
245
+ valid0 = mutual0 & (mscores0 > th)
246
+ else:
247
+ valid0 = mutual0
248
+ valid1 = mutual1 & valid0.gather(1, m1)
249
+ m0 = torch.where(valid0, m0, m0.new_tensor(-1))
250
+ m1 = torch.where(valid1, m1, m1.new_tensor(-1))
251
+ return m0, m1, mscores0, mscores1
252
+
253
+
254
+ class LightGlue(nn.Module):
255
+ default_conf = {
256
+ 'name': 'lightglue', # just for interfacing
257
+ 'input_dim': 256, # input descriptor dimension (autoselected from weights)
258
+ 'descriptor_dim': 256,
259
+ 'n_layers': 9,
260
+ 'num_heads': 4,
261
+ 'flash': True, # enable FlashAttention if available.
262
+ 'mp': False, # enable mixed precision
263
+ 'depth_confidence': 0.95, # early stopping, disable with -1
264
+ 'width_confidence': 0.99, # point pruning, disable with -1
265
+ 'filter_threshold': 0.1, # match threshold
266
+ 'weights': None,
267
+ }
268
+
269
+ required_data_keys = [
270
+ 'image0', 'image1']
271
+
272
+ version = "v0.1_arxiv"
273
+ url = "https://github.com/cvg/LightGlue/releases/download/{}/{}_lightglue.pth"
274
+
275
+ features = {
276
+ 'superpoint': ('superpoint_lightglue', 256),
277
+ 'disk': ('disk_lightglue', 128)
278
+ }
279
+
280
+ def __init__(self, features='superpoint', **conf) -> None:
281
+ super().__init__()
282
+ self.conf = {**self.default_conf, **conf}
283
+ if features is not None:
284
+ assert (features in list(self.features.keys()))
285
+ self.conf['weights'], self.conf['input_dim'] = \
286
+ self.features[features]
287
+ self.conf = conf = SimpleNamespace(**self.conf)
288
+
289
+ if conf.input_dim != conf.descriptor_dim:
290
+ self.input_proj = nn.Linear(
291
+ conf.input_dim, conf.descriptor_dim, bias=True)
292
+ else:
293
+ self.input_proj = nn.Identity()
294
+
295
+ head_dim = conf.descriptor_dim // conf.num_heads
296
+ self.posenc = LearnableFourierPositionalEncoding(2, head_dim, head_dim)
297
+
298
+ h, n, d = conf.num_heads, conf.n_layers, conf.descriptor_dim
299
+ self.self_attn = nn.ModuleList(
300
+ [Transformer(d, h, conf.flash) for _ in range(n)])
301
+ self.cross_attn = nn.ModuleList(
302
+ [CrossTransformer(d, h, conf.flash) for _ in range(n)])
303
+ self.log_assignment = nn.ModuleList(
304
+ [MatchAssignment(d) for _ in range(n)])
305
+ self.token_confidence = nn.ModuleList([
306
+ TokenConfidence(d) for _ in range(n-1)])
307
+
308
+ if features is not None:
309
+ fname = f'{conf.weights}_{self.version}.pth'.replace('.', '-')
310
+ state_dict = torch.hub.load_state_dict_from_url(
311
+ self.url.format(self.version, features), file_name=fname)
312
+ self.load_state_dict(state_dict, strict=False)
313
+ elif conf.weights is not None:
314
+ path = Path(__file__).parent
315
+ path = path / 'weights/{}.pth'.format(self.conf.weights)
316
+ state_dict = torch.load(str(path), map_location='cpu')
317
+ self.load_state_dict(state_dict, strict=False)
318
+
319
+ print('Loaded LightGlue model')
320
+
321
+ def forward(self, data: dict) -> dict:
322
+ """
323
+ Match keypoints and descriptors between two images
324
+
325
+ Input (dict):
326
+ image0: dict
327
+ keypoints: [B x M x 2]
328
+ descriptors: [B x M x D]
329
+ image: [B x C x H x W] or image_size: [B x 2]
330
+ image1: dict
331
+ keypoints: [B x N x 2]
332
+ descriptors: [B x N x D]
333
+ image: [B x C x H x W] or image_size: [B x 2]
334
+ Output (dict):
335
+ log_assignment: [B x M+1 x N+1]
336
+ matches0: [B x M]
337
+ matching_scores0: [B x M]
338
+ matches1: [B x N]
339
+ matching_scores1: [B x N]
340
+ matches: List[[Si x 2]], scores: List[[Si]]
341
+ """
342
+ with torch.autocast(enabled=self.conf.mp, device_type='cuda'):
343
+ return self._forward(data)
344
+
345
+ def _forward(self, data: dict) -> dict:
346
+ for key in self.required_data_keys:
347
+ assert key in data, f'Missing key {key} in data'
348
+ data0, data1 = data['image0'], data['image1']
349
+ kpts0_, kpts1_ = data0['keypoints'], data1['keypoints']
350
+ b, m, _ = kpts0_.shape
351
+ b, n, _ = kpts1_.shape
352
+ size0, size1 = data0.get('image_size'), data1.get('image_size')
353
+ size0 = size0 if size0 is not None else data0['image'].shape[-2:][::-1]
354
+ size1 = size1 if size1 is not None else data1['image'].shape[-2:][::-1]
355
+ kpts0 = normalize_keypoints(kpts0_, size=size0)
356
+ kpts1 = normalize_keypoints(kpts1_, size=size1)
357
+
358
+ assert torch.all(kpts0 >= -1) and torch.all(kpts0 <= 1)
359
+ assert torch.all(kpts1 >= -1) and torch.all(kpts1 <= 1)
360
+
361
+ desc0 = data0['descriptors'].detach()
362
+ desc1 = data1['descriptors'].detach()
363
+
364
+ assert desc0.shape[-1] == self.conf.input_dim
365
+ assert desc1.shape[-1] == self.conf.input_dim
366
+
367
+ if torch.is_autocast_enabled():
368
+ desc0 = desc0.half()
369
+ desc1 = desc1.half()
370
+
371
+ desc0 = self.input_proj(desc0)
372
+ desc1 = self.input_proj(desc1)
373
+
374
+ # cache positional embeddings
375
+ encoding0 = self.posenc(kpts0)
376
+ encoding1 = self.posenc(kpts1)
377
+
378
+ # GNN + final_proj + assignment
379
+ ind0 = torch.arange(0, m).to(device=kpts0.device)[None]
380
+ ind1 = torch.arange(0, n).to(device=kpts0.device)[None]
381
+ prune0 = torch.ones_like(ind0) # store layer where pruning is detected
382
+ prune1 = torch.ones_like(ind1)
383
+ dec, wic = self.conf.depth_confidence, self.conf.width_confidence
384
+ token0, token1 = None, None
385
+ for i in range(self.conf.n_layers):
386
+ # self+cross attention
387
+ desc0, desc1 = self.self_attn[i](
388
+ desc0, desc1, encoding0, encoding1)
389
+ desc0, desc1 = self.cross_attn[i](desc0, desc1)
390
+ if i == self.conf.n_layers - 1:
391
+ continue # no early stopping or adaptive width at last layer
392
+ if dec > 0: # early stopping
393
+ token0, token1 = self.token_confidence[i](desc0, desc1)
394
+ if self.stop(token0, token1, self.conf_th(i), dec, m+n):
395
+ break
396
+ if wic > 0: # point pruning
397
+ match0, match1 = self.log_assignment[i].scores(desc0, desc1)
398
+ mask0 = self.get_mask(token0, match0, self.conf_th(i), 1-wic)
399
+ mask1 = self.get_mask(token1, match1, self.conf_th(i), 1-wic)
400
+ ind0, ind1 = ind0[mask0][None], ind1[mask1][None]
401
+ desc0, desc1 = desc0[mask0][None], desc1[mask1][None]
402
+ if desc0.shape[-2] == 0 or desc1.shape[-2] == 0:
403
+ break
404
+ encoding0 = encoding0[:, :, mask0][:, None]
405
+ encoding1 = encoding1[:, :, mask1][:, None]
406
+ prune0[:, ind0] += 1
407
+ prune1[:, ind1] += 1
408
+
409
+ if wic > 0: # scatter with indices after pruning
410
+ scores_, _ = self.log_assignment[i](desc0, desc1)
411
+ dt, dev = scores_.dtype, scores_.device
412
+ scores = torch.zeros(b, m+1, n+1, dtype=dt, device=dev)
413
+ scores[:, :-1, :-1] = -torch.inf
414
+ scores[:, ind0[0], -1] = scores_[:, :-1, -1]
415
+ scores[:, -1, ind1[0]] = scores_[:, -1, :-1]
416
+ x, y = torch.meshgrid(ind0[0], ind1[0], indexing='ij')
417
+ scores[:, x, y] = scores_[:, :-1, :-1]
418
+ else:
419
+ scores, _ = self.log_assignment[i](desc0, desc1)
420
+
421
+ m0, m1, mscores0, mscores1 = filter_matches(
422
+ scores, self.conf.filter_threshold)
423
+
424
+ matches, mscores = [], []
425
+ for k in range(b):
426
+ valid = m0[k] > -1
427
+ matches.append(torch.stack([torch.where(valid)[0], m0[k][valid]], -1))
428
+ mscores.append(mscores0[k][valid])
429
+
430
+ return {
431
+ 'log_assignment': scores,
432
+ 'matches0': m0,
433
+ 'matches1': m1,
434
+ 'matching_scores0': mscores0,
435
+ 'matching_scores1': mscores1,
436
+ 'stop': i+1,
437
+ 'prune0': prune0,
438
+ 'prune1': prune1,
439
+ 'matches': matches,
440
+ 'scores': mscores,
441
+ }
442
+
443
+ def conf_th(self, i: int) -> float:
444
+ """ scaled confidence threshold """
445
+ return np.clip(
446
+ 0.8 + 0.1 * np.exp(-4.0 * i / self.conf.n_layers), 0, 1)
447
+
448
+ def get_mask(self, confidence: torch.Tensor, match: torch.Tensor,
449
+ conf_th: float, match_th: float) -> torch.Tensor:
450
+ """ mask points which should be removed """
451
+ if conf_th and confidence is not None:
452
+ mask = torch.where(confidence > conf_th, match,
453
+ match.new_tensor(1.0)) > match_th
454
+ else:
455
+ mask = match > match_th
456
+ return mask
457
+
458
+ def stop(self, token0: torch.Tensor, token1: torch.Tensor,
459
+ conf_th: float, inl_th: float, seql: int) -> torch.Tensor:
460
+ """ evaluate stopping condition"""
461
+ tokens = torch.cat([token0, token1], -1)
462
+ if conf_th:
463
+ pos = 1.0 - (tokens < conf_th).float().sum() / seql
464
+ return pos > inl_th
465
+ else:
466
+ return tokens.mean() > inl_th
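For readers skimming the matcher above, a small sketch of how its output dictionary is consumed; it mirrors the matches/mscores construction at the end of `_forward`. It assumes `matcher`, `feats0` and `feats1` have been set up as in the README's minimal script:

```python
# Sketch: interpret LightGlue's output for a batch of size 1.
import torch

out = matcher({'image0': feats0, 'image1': feats1})

m0 = out['matches0'][0]               # (M,) index into image1 keypoints, -1 = unmatched
scores0 = out['matching_scores0'][0]  # (M,) confidence of each proposed match

valid = m0 > -1
pairs = torch.stack([torch.where(valid)[0], m0[valid]], -1)  # (K, 2) keypoint indices
print(f'{valid.sum().item()} matches, '
      f'stopped after {out["stop"]} of {matcher.conf.n_layers} layers')
```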
third_party/LightGlue/lightglue/superpoint.py ADDED
@@ -0,0 +1,230 @@
1
+ # %BANNER_BEGIN%
2
+ # ---------------------------------------------------------------------
3
+ # %COPYRIGHT_BEGIN%
4
+ #
5
+ # Magic Leap, Inc. ("COMPANY") CONFIDENTIAL
6
+ #
7
+ # Unpublished Copyright (c) 2020
8
+ # Magic Leap, Inc., All Rights Reserved.
9
+ #
10
+ # NOTICE: All information contained herein is, and remains the property
11
+ # of COMPANY. The intellectual and technical concepts contained herein
12
+ # are proprietary to COMPANY and may be covered by U.S. and Foreign
13
+ # Patents, patents in process, and are protected by trade secret or
14
+ # copyright law. Dissemination of this information or reproduction of
15
+ # this material is strictly forbidden unless prior written permission is
16
+ # obtained from COMPANY. Access to the source code contained herein is
17
+ # hereby forbidden to anyone except current COMPANY employees, managers
18
+ # or contractors who have executed Confidentiality and Non-disclosure
19
+ # agreements explicitly covering such access.
20
+ #
21
+ # The copyright notice above does not evidence any actual or intended
22
+ # publication or disclosure of this source code, which includes
23
+ # information that is confidential and/or proprietary, and is a trade
24
+ # secret, of COMPANY. ANY REPRODUCTION, MODIFICATION, DISTRIBUTION,
25
+ # PUBLIC PERFORMANCE, OR PUBLIC DISPLAY OF OR THROUGH USE OF THIS
26
+ # SOURCE CODE WITHOUT THE EXPRESS WRITTEN CONSENT OF COMPANY IS
27
+ # STRICTLY PROHIBITED, AND IN VIOLATION OF APPLICABLE LAWS AND
28
+ # INTERNATIONAL TREATIES. THE RECEIPT OR POSSESSION OF THIS SOURCE
29
+ # CODE AND/OR RELATED INFORMATION DOES NOT CONVEY OR IMPLY ANY RIGHTS
30
+ # TO REPRODUCE, DISCLOSE OR DISTRIBUTE ITS CONTENTS, OR TO MANUFACTURE,
31
+ # USE, OR SELL ANYTHING THAT IT MAY DESCRIBE, IN WHOLE OR IN PART.
32
+ #
33
+ # %COPYRIGHT_END%
34
+ # ----------------------------------------------------------------------
35
+ # %AUTHORS_BEGIN%
36
+ #
37
+ # Originating Authors: Paul-Edouard Sarlin
38
+ #
39
+ # %AUTHORS_END%
40
+ # --------------------------------------------------------------------*/
41
+ # %BANNER_END%
42
+
43
+ # Adapted by Remi Pautrat, Philipp Lindenberger
44
+
45
+ import torch
46
+ from torch import nn
47
+ from .utils import ImagePreprocessor
48
+
49
+
50
+ def simple_nms(scores, nms_radius: int):
51
+ """ Fast Non-maximum suppression to remove nearby points """
52
+ assert (nms_radius >= 0)
53
+
54
+ def max_pool(x):
55
+ return torch.nn.functional.max_pool2d(
56
+ x, kernel_size=nms_radius*2+1, stride=1, padding=nms_radius)
57
+
58
+ zeros = torch.zeros_like(scores)
59
+ max_mask = scores == max_pool(scores)
60
+ for _ in range(2):
61
+ supp_mask = max_pool(max_mask.float()) > 0
62
+ supp_scores = torch.where(supp_mask, zeros, scores)
63
+ new_max_mask = supp_scores == max_pool(supp_scores)
64
+ max_mask = max_mask | (new_max_mask & (~supp_mask))
65
+ return torch.where(max_mask, scores, zeros)
66
+
67
+
68
+ def top_k_keypoints(keypoints, scores, k):
69
+ if k >= len(keypoints):
70
+ return keypoints, scores
71
+ scores, indices = torch.topk(scores, k, dim=0, sorted=True)
72
+ return keypoints[indices], scores
73
+
74
+
75
+ def sample_descriptors(keypoints, descriptors, s: int = 8):
76
+ """ Interpolate descriptors at keypoint locations """
77
+ b, c, h, w = descriptors.shape
78
+ keypoints = keypoints - s / 2 + 0.5
79
+ keypoints /= torch.tensor([(w*s - s/2 - 0.5), (h*s - s/2 - 0.5)],
80
+ ).to(keypoints)[None]
81
+ keypoints = keypoints*2 - 1 # normalize to (-1, 1)
82
+ args = {'align_corners': True} if torch.__version__ >= '1.3' else {}
83
+ descriptors = torch.nn.functional.grid_sample(
84
+ descriptors, keypoints.view(b, 1, -1, 2), mode='bilinear', **args)
85
+ descriptors = torch.nn.functional.normalize(
86
+ descriptors.reshape(b, c, -1), p=2, dim=1)
87
+ return descriptors
88
+
89
+
90
+ class SuperPoint(nn.Module):
91
+ """SuperPoint Convolutional Detector and Descriptor
92
+
93
+ SuperPoint: Self-Supervised Interest Point Detection and
94
+ Description. Daniel DeTone, Tomasz Malisiewicz, and Andrew
95
+ Rabinovich. In CVPRW, 2019. https://arxiv.org/abs/1712.07629
96
+
97
+ """
98
+ default_conf = {
99
+ 'descriptor_dim': 256,
100
+ 'nms_radius': 4,
101
+ 'max_num_keypoints': None,
102
+ 'detection_threshold': 0.0005,
103
+ 'remove_borders': 4,
104
+ }
105
+
106
+ preprocess_conf = {
107
+ **ImagePreprocessor.default_conf,
108
+ 'resize': 1024,
109
+ 'grayscale': True,
110
+ }
111
+
112
+ required_data_keys = ['image']
113
+
114
+ def __init__(self, **conf):
115
+ super().__init__()
116
+ self.conf = {**self.default_conf, **conf}
117
+
118
+ self.relu = nn.ReLU(inplace=True)
119
+ self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
120
+ c1, c2, c3, c4, c5 = 64, 64, 128, 128, 256
121
+
122
+ self.conv1a = nn.Conv2d(1, c1, kernel_size=3, stride=1, padding=1)
123
+ self.conv1b = nn.Conv2d(c1, c1, kernel_size=3, stride=1, padding=1)
124
+ self.conv2a = nn.Conv2d(c1, c2, kernel_size=3, stride=1, padding=1)
125
+ self.conv2b = nn.Conv2d(c2, c2, kernel_size=3, stride=1, padding=1)
126
+ self.conv3a = nn.Conv2d(c2, c3, kernel_size=3, stride=1, padding=1)
127
+ self.conv3b = nn.Conv2d(c3, c3, kernel_size=3, stride=1, padding=1)
128
+ self.conv4a = nn.Conv2d(c3, c4, kernel_size=3, stride=1, padding=1)
129
+ self.conv4b = nn.Conv2d(c4, c4, kernel_size=3, stride=1, padding=1)
130
+
131
+ self.convPa = nn.Conv2d(c4, c5, kernel_size=3, stride=1, padding=1)
132
+ self.convPb = nn.Conv2d(c5, 65, kernel_size=1, stride=1, padding=0)
133
+
134
+ self.convDa = nn.Conv2d(c4, c5, kernel_size=3, stride=1, padding=1)
135
+ self.convDb = nn.Conv2d(
136
+ c5, self.conf['descriptor_dim'],
137
+ kernel_size=1, stride=1, padding=0)
138
+
139
+ url = "https://github.com/cvg/LightGlue/releases/download/v0.1_arxiv/superpoint_v1.pth"
140
+ self.load_state_dict(torch.hub.load_state_dict_from_url(url))
141
+
142
+ mk = self.conf['max_num_keypoints']
143
+ if mk is not None and mk <= 0:
144
+ raise ValueError('max_num_keypoints must be positive or None')
145
+
146
+ print('Loaded SuperPoint model')
147
+
148
+ def forward(self, data: dict) -> dict:
149
+ """ Compute keypoints, scores, descriptors for image """
150
+ for key in self.required_data_keys:
151
+ assert key in data, f'Missing key {key} in data'
152
+ image = data['image']
153
+ if image.shape[1] == 3: # RGB
154
+ scale = image.new_tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)
155
+ image = (image*scale).sum(1, keepdim=True)
156
+ # Shared Encoder
157
+ x = self.relu(self.conv1a(image))
158
+ x = self.relu(self.conv1b(x))
159
+ x = self.pool(x)
160
+ x = self.relu(self.conv2a(x))
161
+ x = self.relu(self.conv2b(x))
162
+ x = self.pool(x)
163
+ x = self.relu(self.conv3a(x))
164
+ x = self.relu(self.conv3b(x))
165
+ x = self.pool(x)
166
+ x = self.relu(self.conv4a(x))
167
+ x = self.relu(self.conv4b(x))
168
+
169
+ # Compute the dense keypoint scores
170
+ cPa = self.relu(self.convPa(x))
171
+ scores = self.convPb(cPa)
172
+ scores = torch.nn.functional.softmax(scores, 1)[:, :-1]
173
+ b, _, h, w = scores.shape
174
+ scores = scores.permute(0, 2, 3, 1).reshape(b, h, w, 8, 8)
175
+ scores = scores.permute(0, 1, 3, 2, 4).reshape(b, h*8, w*8)
176
+ scores = simple_nms(scores, self.conf['nms_radius'])
177
+
178
+ # Discard keypoints near the image borders
179
+ if self.conf['remove_borders']:
180
+ pad = self.conf['remove_borders']
181
+ scores[:, :pad] = -1
182
+ scores[:, :, :pad] = -1
183
+ scores[:, -pad:] = -1
184
+ scores[:, :, -pad:] = -1
185
+
186
+ # Extract keypoints
187
+ best_kp = torch.where(scores > self.conf['detection_threshold'])
188
+ scores = scores[best_kp]
189
+
190
+ # Separate into batches
191
+ keypoints = [torch.stack(best_kp[1:3], dim=-1)[best_kp[0] == i]
192
+ for i in range(b)]
193
+ scores = [scores[best_kp[0] == i] for i in range(b)]
194
+
195
+ # Keep the k keypoints with highest score
196
+ if self.conf['max_num_keypoints'] is not None:
197
+ keypoints, scores = list(zip(*[
198
+ top_k_keypoints(k, s, self.conf['max_num_keypoints'])
199
+ for k, s in zip(keypoints, scores)]))
200
+
201
+ # Convert (h, w) to (x, y)
202
+ keypoints = [torch.flip(k, [1]).float() for k in keypoints]
203
+
204
+ # Compute the dense descriptors
205
+ cDa = self.relu(self.convDa(x))
206
+ descriptors = self.convDb(cDa)
207
+ descriptors = torch.nn.functional.normalize(descriptors, p=2, dim=1)
208
+
209
+ # Extract descriptors
210
+ descriptors = [sample_descriptors(k[None], d[None], 8)[0]
211
+ for k, d in zip(keypoints, descriptors)]
212
+
213
+ return {
214
+ 'keypoints': torch.stack(keypoints, 0),
215
+ 'keypoint_scores': torch.stack(scores, 0),
216
+ 'descriptors': torch.stack(descriptors, 0).transpose(-1, -2),
217
+ }
218
+
219
+ def extract(self, img: torch.Tensor, **conf) -> dict:
220
+ """ Perform extraction with online resizing"""
221
+ if img.dim() == 3:
222
+ img = img[None] # add batch dim
223
+ assert img.dim() == 4 and img.shape[0] == 1
224
+ shape = img.shape[-2:][::-1]
225
+ img, scales = ImagePreprocessor(
226
+ **{**self.preprocess_conf, **conf})(img)
227
+ feats = self.forward({'image': img})
228
+ feats['image_size'] = torch.tensor(shape)[None].to(img).float()
229
+ feats['keypoints'] = (feats['keypoints'] + .5) / scales[None] - .5
230
+ return feats
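A companion sketch for the SuperPoint wrapper, analogous to the DISK one above (the image path again assumes the LightGlue repo root; descriptors are 256-dimensional here):

```python
# Sketch: run SuperPoint.extract as defined above; RGB input is converted
# to grayscale inside forward(), and keypoints come back as (x, y) pixels.
import torch
from lightglue import SuperPoint
from lightglue.utils import load_image

extractor = SuperPoint(max_num_keypoints=1024, detection_threshold=0.0005).eval()
image = load_image('assets/sacre_coeur2.jpg')   # (3, H, W) in [0, 1]

with torch.no_grad():
    feats = extractor.extract(image)

print(feats['keypoints'].shape)    # (1, K, 2), K <= 1024, original-image pixels
print(feats['descriptors'].shape)  # (1, K, 256), L2-normalized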
third_party/LightGlue/lightglue/utils.py ADDED
@@ -0,0 +1,135 @@
1
+ from pathlib import Path
2
+ import torch
3
+ import kornia
4
+ import cv2
5
+ import numpy as np
6
+ from typing import Union, List, Optional, Callable, Tuple
7
+ import collections.abc as collections
8
+ from types import SimpleNamespace
9
+
10
+
11
+ class ImagePreprocessor:
12
+ default_conf = {
13
+         'resize': None,  # target edge length, None for no resizing
+         'side': 'long',
+         'interpolation': 'bilinear',
+         'align_corners': None,
+         'antialias': True,
+         'grayscale': False,  # convert rgb to grayscale
+     }
+
+     def __init__(self, **conf) -> None:
+         super().__init__()
+         self.conf = {**self.default_conf, **conf}
+         self.conf = SimpleNamespace(**self.conf)
+
+     def __call__(self, img: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
+         """Resize and preprocess an image, return image and resize scale"""
+         h, w = img.shape[-2:]
+         if self.conf.resize is not None:
+             img = kornia.geometry.transform.resize(
+                 img, self.conf.resize, side=self.conf.side,
+                 antialias=self.conf.antialias,
+                 align_corners=self.conf.align_corners)
+         scale = torch.Tensor([img.shape[-1] / w, img.shape[-2] / h]).to(img)
+         if self.conf.grayscale and img.shape[-3] == 3:
+             img = kornia.color.rgb_to_grayscale(img)
+         elif not self.conf.grayscale and img.shape[-3] == 1:
+             img = kornia.color.grayscale_to_rgb(img)
+         return img, scale
+
+
+ def map_tensor(input_, func: Callable):
+     string_classes = (str, bytes)
+     if isinstance(input_, string_classes):
+         return input_
+     elif isinstance(input_, collections.Mapping):
+         return {k: map_tensor(sample, func) for k, sample in input_.items()}
+     elif isinstance(input_, collections.Sequence):
+         return [map_tensor(sample, func) for sample in input_]
+     elif isinstance(input_, torch.Tensor):
+         return func(input_)
+     else:
+         return input_
+
+
+ def batch_to_device(batch: dict, device: str = 'cpu',
+                     non_blocking: bool = True):
+     """Move batch (dict) to device"""
+     def _func(tensor):
+         return tensor.to(device=device, non_blocking=non_blocking).detach()
+     return map_tensor(batch, _func)
+
+
+ def rbd(data: dict) -> dict:
+     """Remove batch dimension from elements in data"""
+     return {k: v[0] if isinstance(v, (torch.Tensor, np.ndarray, list)) else v
+             for k, v in data.items()}
+
+
+ def read_image(path: Path, grayscale: bool = False) -> np.ndarray:
+     """Read an image from path as RGB or grayscale"""
+     if not Path(path).exists():
+         raise FileNotFoundError(f'No image at path {path}.')
+     mode = cv2.IMREAD_GRAYSCALE if grayscale else cv2.IMREAD_COLOR
+     image = cv2.imread(str(path), mode)
+     if image is None:
+         raise IOError(f'Could not read image at {path}.')
+     if not grayscale:
+         image = image[..., ::-1]  # BGR to RGB
+     return image
+
+
+ def numpy_image_to_torch(image: np.ndarray) -> torch.Tensor:
+     """Normalize the image tensor and reorder the dimensions."""
+     if image.ndim == 3:
+         image = image.transpose((2, 0, 1))  # HxWxC to CxHxW
+     elif image.ndim == 2:
+         image = image[None]  # add channel axis
+     else:
+         raise ValueError(f'Not an image: {image.shape}')
+     return torch.tensor(image / 255., dtype=torch.float)
+
+
+ def resize_image(image: np.ndarray, size: Union[List[int], int],
+                  fn: str = 'max', interp: Optional[str] = 'area',
+                  ) -> Tuple[np.ndarray, Tuple[float, float]]:
+     """Resize an image to a fixed size, or according to max or min edge."""
+     h, w = image.shape[:2]
+
+     fn = {'max': max, 'min': min}[fn]
+     if isinstance(size, int):
+         scale = size / fn(h, w)
+         h_new, w_new = int(round(h*scale)), int(round(w*scale))
+         scale = (w_new / w, h_new / h)
+     elif isinstance(size, (tuple, list)):
+         h_new, w_new = size
+         scale = (w_new / w, h_new / h)
+     else:
+         raise ValueError(f'Incorrect new size: {size}')
+     mode = {
+         'linear': cv2.INTER_LINEAR,
+         'cubic': cv2.INTER_CUBIC,
+         'nearest': cv2.INTER_NEAREST,
+         'area': cv2.INTER_AREA}[interp]
+     return cv2.resize(image, (w_new, h_new), interpolation=mode), scale
+
+
+ def load_image(path: Path, resize: Optional[int] = None, **kwargs) -> torch.Tensor:
+     """Read an image from path and convert it to a normalized torch tensor."""
+     image = read_image(path)
+     if resize is not None:
+         image, _ = resize_image(image, resize, **kwargs)
+     return numpy_image_to_torch(image)
+
+
+ def match_pair(extractor, matcher,
+                image0: torch.Tensor, image1: torch.Tensor,
+                device: str = 'cpu', **preprocess):
+     """Match a pair of images (image0, image1) with an extractor and matcher"""
+     feats0 = extractor.extract(image0, **preprocess)
+     feats1 = extractor.extract(image1, **preprocess)
+     matches01 = matcher({'image0': feats0, 'image1': feats1})
+     data = [feats0, feats1, matches01]
+     # remove batch dim and move to target device
+     feats0, feats1, matches01 = [batch_to_device(rbd(x), device) for x in data]
+     return feats0, feats1, matches01
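
The helpers above compose into a full matching pipeline. A minimal usage sketch follows; it assumes the `SuperPoint` extractor and `LightGlue` matcher classes shipped elsewhere in this package, and the image paths are placeholders.

    import torch
    from lightglue import LightGlue, SuperPoint
    from lightglue.utils import load_image, match_pair

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    extractor = SuperPoint(max_num_keypoints=2048).eval().to(device)  # assumed extractor class
    matcher = LightGlue(features='superpoint').eval().to(device)      # assumed matcher class

    image0 = load_image('path/to/image0.jpg', resize=1024).to(device)  # placeholder paths
    image1 = load_image('path/to/image1.jpg', resize=1024).to(device)

    # extract, match, strip the batch dimension, and move everything to `device`
    feats0, feats1, matches01 = match_pair(extractor, matcher, image0, image1, device=device)
    matches = matches01['matches']  # (M, 2) indices into the two keypoint sets
    points0 = feats0['keypoints'][matches[:, 0]]
    points1 = feats1['keypoints'][matches[:, 1]]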
third_party/LightGlue/lightglue/viz2d.py ADDED
@@ -0,0 +1,161 @@
+ """
+ 2D visualization primitives based on Matplotlib.
+ 1) Plot images with `plot_images`.
+ 2) Call `plot_keypoints` or `plot_matches` any number of times.
+ 3) Optionally: save a .png or .pdf plot (nice in papers!) with `save_plot`.
+ """
+
+ import matplotlib
+ import matplotlib.pyplot as plt
+ import matplotlib.patheffects as path_effects
+ import numpy as np
+ import torch
+
+
+ def cm_RdGn(x):
+     """Custom colormap: red (0) -> yellow (0.5) -> green (1)."""
+     x = np.clip(x, 0, 1)[..., None]*2
+     c = x*np.array([[0, 1., 0]]) + (2-x)*np.array([[1., 0, 0]])
+     return np.clip(c, 0, 1)
+
+
+ def cm_BlRdGn(x_):
+     """Custom colormap: blue (-1) -> red (0.0) -> green (1)."""
+     x = np.clip(x_, 0, 1)[..., None]*2
+     c = x*np.array([[0, 1., 0, 1.]]) + (2-x)*np.array([[1., 0, 0, 1.]])
+
+     xn = -np.clip(x_, -1, 0)[..., None]*2
+     cn = xn*np.array([[0, 0.1, 1, 1.]]) + (2-xn)*np.array([[1., 0, 0, 1.]])
+     out = np.clip(np.where(x_[..., None] < 0, cn, c), 0, 1)
+     return out
+
+
+ def cm_prune(x_):
+     """Custom colormap to visualize pruning."""
+     if isinstance(x_, torch.Tensor):
+         x_ = x_.cpu().numpy()
+     max_i = max(x_)
+     norm_x = np.where(x_ == max_i, -1, (x_-1) / 9)
+     return cm_BlRdGn(norm_x)
+
+
+ def plot_images(imgs, titles=None, cmaps='gray', dpi=100, pad=.5,
+                 adaptive=True):
+     """Plot a set of images horizontally.
+     Args:
+         imgs: list of NumPy RGB (H, W, 3) or PyTorch RGB (3, H, W) or mono (H, W).
+         titles: a list of strings, as titles for each image.
+         cmaps: colormaps for monochrome images.
+         adaptive: whether the figure size should fit the image aspect ratios.
+     """
+     # conversion to (H, W, 3) for torch.Tensor
+     imgs = [img.permute(1, 2, 0).cpu().numpy()
+             if (isinstance(img, torch.Tensor) and img.dim() == 3) else img
+             for img in imgs]
+
+     n = len(imgs)
+     if not isinstance(cmaps, (list, tuple)):
+         cmaps = [cmaps] * n
+
+     if adaptive:
+         ratios = [i.shape[1] / i.shape[0] for i in imgs]  # W / H
+     else:
+         ratios = [4/3] * n
+     figsize = [sum(ratios)*4.5, 4.5]
+     fig, ax = plt.subplots(
+         1, n, figsize=figsize, dpi=dpi, gridspec_kw={'width_ratios': ratios})
+     if n == 1:
+         ax = [ax]
+     for i in range(n):
+         ax[i].imshow(imgs[i], cmap=plt.get_cmap(cmaps[i]))
+         ax[i].get_yaxis().set_ticks([])
+         ax[i].get_xaxis().set_ticks([])
+         ax[i].set_axis_off()
+         for spine in ax[i].spines.values():  # remove frame
+             spine.set_visible(False)
+         if titles:
+             ax[i].set_title(titles[i])
+     fig.tight_layout(pad=pad)
+
+
+ def plot_keypoints(kpts, colors='lime', ps=4, axes=None, a=1.0):
+     """Plot keypoints for existing images.
+     Args:
+         kpts: list of ndarrays of size (N, 2).
+         colors: a single color, or a list of colors (one per image).
+         ps: size of the keypoints as float.
+     """
+     if not isinstance(colors, list):
+         colors = [colors] * len(kpts)
+     if not isinstance(a, list):
+         a = [a] * len(kpts)
+     if axes is None:
+         axes = plt.gcf().axes
+     for ax, k, c, alpha in zip(axes, kpts, colors, a):
+         if isinstance(k, torch.Tensor):
+             k = k.cpu().numpy()
+         ax.scatter(k[:, 0], k[:, 1], c=c, s=ps, linewidths=0, alpha=alpha)
+
+
+ def plot_matches(kpts0, kpts1, color=None, lw=1.5, ps=4, a=1., labels=None,
+                  axes=None):
+     """Plot matches for a pair of existing images.
+     Args:
+         kpts0, kpts1: corresponding keypoints of size (N, 2).
+         color: color of each match, string or RGB tuple. Random if not given.
+         lw: width of the lines.
+         ps: size of the end points (no endpoint if ps=0).
+         axes: pair of axes to draw the matches on (defaults to the first two).
+         a: alpha opacity of the match lines.
+     """
+     fig = plt.gcf()
+     if axes is None:
+         ax = fig.axes
+         ax0, ax1 = ax[0], ax[1]
+     else:
+         ax0, ax1 = axes
+     if isinstance(kpts0, torch.Tensor):
+         kpts0 = kpts0.cpu().numpy()
+     if isinstance(kpts1, torch.Tensor):
+         kpts1 = kpts1.cpu().numpy()
+     assert len(kpts0) == len(kpts1)
+     if color is None:
+         color = matplotlib.cm.hsv(np.random.rand(len(kpts0))).tolist()
+     elif len(color) > 0 and not isinstance(color[0], (tuple, list)):
+         color = [color] * len(kpts0)
+
+     if lw > 0:
+         for i in range(len(kpts0)):
+             line = matplotlib.patches.ConnectionPatch(
+                 xyA=(kpts0[i, 0], kpts0[i, 1]), xyB=(kpts1[i, 0], kpts1[i, 1]),
+                 coordsA=ax0.transData, coordsB=ax1.transData,
+                 axesA=ax0, axesB=ax1,
+                 zorder=1, color=color[i], linewidth=lw, clip_on=True,
+                 alpha=a, label=None if labels is None else labels[i],
+                 picker=5.0)
+             line.set_annotation_clip(True)
+             fig.add_artist(line)
+
+     # freeze the axes to prevent the transforms from changing
+     ax0.autoscale(enable=False)
+     ax1.autoscale(enable=False)
+
+     if ps > 0:
+         ax0.scatter(kpts0[:, 0], kpts0[:, 1], c=color, s=ps)
+         ax1.scatter(kpts1[:, 0], kpts1[:, 1], c=color, s=ps)
+
+
+ def add_text(idx, text, pos=(0.01, 0.99), fs=15, color='w',
+              lcolor='k', lwidth=2, ha='left', va='top'):
+     """Add text to the axis at index `idx`, with an optional outline."""
+     ax = plt.gcf().axes[idx]
+     t = ax.text(*pos, text, fontsize=fs, ha=ha, va=va,
+                 color=color, transform=ax.transAxes)
+     if lcolor is not None:
+         t.set_path_effects([
+             path_effects.Stroke(linewidth=lwidth, foreground=lcolor),
+             path_effects.Normal()])
+
+
+ def save_plot(path, **kw):
+     """Save the current figure without any white margin."""
+     plt.savefig(path, bbox_inches='tight', pad_inches=0, **kw)
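
Following the three-step recipe in the module docstring, a short sketch of how these primitives chain together (reusing image0/image1 and the matched points0/points1 from the earlier example; the output file name is a placeholder):

    from lightglue import viz2d

    viz2d.plot_images([image0, image1])                          # 1) draw the pair side by side
    viz2d.plot_matches(points0, points1, color='lime', lw=0.2)   # 2) overlay the matches
    viz2d.add_text(0, f'{len(points0)} matches')
    viz2d.save_plot('matches.png')                               # 3) save without white margins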
third_party/LightGlue/requirements.txt ADDED
@@ -0,0 +1,6 @@
+ torch>=1.9.1
+ torchvision>=0.3
+ numpy
+ opencv-python
+ matplotlib
+ kornia>=0.6.11
third_party/LightGlue/setup.py ADDED
@@ -0,0 +1,27 @@
+ from pathlib import Path
+ from setuptools import setup
+
+ description = 'LightGlue'
+
+ with open(str(Path(__file__).parent / 'README.md'), 'r', encoding='utf-8') as f:
+     readme = f.read()
+ with open(str(Path(__file__).parent / 'requirements.txt'), 'r') as f:
+     dependencies = f.read().split('\n')
+
+ setup(
+     name='lightglue',
+     version='0.0',
+     packages=['lightglue'],
+     python_requires='>=3.6',
+     install_requires=dependencies,
+     author='Philipp Lindenberger, Paul-Edouard Sarlin',
+     description=description,
+     long_description=readme,
+     long_description_content_type="text/markdown",
+     url='https://github.com/cvg/LightGlue/',
+     classifiers=[
+         "Programming Language :: Python :: 3",
+         "License :: OSI Approved :: Apache Software License",
+         "Operating System :: OS Independent",
+     ],
+ )
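
With this setup.py in place, the vendored copy can presumably be installed in editable mode (e.g. `pip install -e third_party/LightGlue`) so that `import lightglue` resolves to this checkout rather than a separately installed package.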