Depth Pro checkpoint
- README.md +85 -0
- depth_pro.pt +3 -0
README.md
ADDED
@@ -0,0 +1,85 @@
---
license: apple-ascl
---

# Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

![Depth Pro Demo Image](https://github.com/apple/ml-depth-pro/raw/main/data/depth-pro-teaser.jpg)

We present a foundation model for zero-shot metric monocular depth estimation. Our model, Depth Pro, synthesizes high-resolution depth maps with unparalleled sharpness and high-frequency details. The predictions are metric, with absolute scale, without relying on the availability of metadata such as camera intrinsics. And the model is fast, producing a 2.25-megapixel depth map in 0.3 seconds on a standard GPU. These characteristics are enabled by a number of technical contributions, including an efficient multi-scale vision transformer for dense prediction, a training protocol that combines real and synthetic datasets to achieve high metric accuracy alongside fine boundary tracing, dedicated evaluation metrics for boundary accuracy in estimated depth maps, and state-of-the-art focal length estimation from a single image.

Depth Pro was introduced in **Depth Pro: Sharp Monocular Metric Depth in Less Than a Second**, by *Aleksei Bochkovskii, Amaël Delaunoy, Hugo Germain, Marcel Santos, Yichao Zhou, Stephan R. Richter, and Vladlen Koltun*.

The checkpoint in this repository is a reference implementation, which has been re-trained. Its performance is close to the model reported in the paper but does not match it exactly.

## How to Use

Please follow the steps in the [code repository](https://github.com/apple/ml-depth-pro) to set up your environment. Then download the checkpoint from the _Files and versions_ tab above, or use the `huggingface-hub` CLI:

```bash
pip install huggingface-hub
huggingface-cli download --local-dir checkpoints pcuenq/Depth-Pro
```
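
The same download can be scripted from Python. The sketch below assumes the `huggingface_hub` package (installed by the `pip` command above) and its `hf_hub_download` function. The `download_checkpoint` wrapper is an illustrative name, not part of the code repository; its defaults are taken from the CLI example and from the `depth_pro.pt` file name in this repository.

```python
def download_checkpoint(
    repo_id: str = "pcuenq/Depth-Pro",
    filename: str = "depth_pro.pt",
    local_dir: str = "checkpoints",
) -> str:
    """Download a single file from the Hub and return its local path."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir)
```

Calling `download_checkpoint()` should fetch the checkpoint (about 1.9 GB) into the `checkpoints` directory.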

### Running from the command line

The code repository provides a helper script to run the model on a single image:

```bash
# Run prediction on a single image:
depth-pro-run -i ./data/example.jpg
# Run `depth-pro-run -h` for available options.
```

### Running from Python

```python
from PIL import Image
import depth_pro

# Load model and preprocessing transform.
model, transform = depth_pro.create_model_and_transforms()
model.eval()

# Load and preprocess an image.
image_path = "./data/example.jpg"  # Path to the input image.
image, _, f_px = depth_pro.load_rgb(image_path)
image = transform(image)

# Run inference.
prediction = model.infer(image, f_px=f_px)
depth = prediction["depth"]  # Depth in [m].
focallength_px = prediction["focallength_px"]  # Focal length in pixels.
```
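
A focal length in pixels is easier to interpret once converted to a field of view. The helper below is a minimal sketch of the standard pinhole relation, FOV = 2 * atan(width / (2 * focal length)). The name `horizontal_fov_degrees` is illustrative and not part of `depth_pro`, and the conversion assumes a pinhole camera with the principal point at the image centre.

```python
import math


def horizontal_fov_degrees(focal_length_px: float, image_width_px: int) -> float:
    """Horizontal field of view of a pinhole camera, in degrees."""
    if focal_length_px <= 0 or image_width_px <= 0:
        raise ValueError("focal length and image width must be positive")
    # Half the image width subtends half the field of view at the focal distance.
    return math.degrees(2.0 * math.atan(image_width_px / (2.0 * focal_length_px)))
```

With the outputs of the inference example, this could be called as `horizontal_fov_degrees(float(focallength_px), depth.shape[-1])`, assuming `depth` has shape `(H, W)`.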

### Evaluation (boundary metrics)

Boundary metrics are implemented in `eval/boundary_metrics.py` and can be used as follows:

```python
# SI_boundary_F1 and SI_boundary_Recall are defined in eval/boundary_metrics.py.

# For a depth-based dataset:
boundary_f1 = SI_boundary_F1(predicted_depth, target_depth)

# For a mask-based dataset (image matting / segmentation):
boundary_recall = SI_boundary_Recall(predicted_depth, target_mask)
```
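
For a rough intuition of what a ratio-based boundary score measures, the sketch below marks an edge wherever two horizontally adjacent pixels differ in depth by more than a given ratio, then compares predicted and target edges with an F1 score. This is a simplified illustration under stated assumptions (a single threshold, horizontal neighbours only, strictly positive depths, hypothetical function names); it is not the implementation in `eval/boundary_metrics.py`.

```python
def depth_edges(depth, ratio_threshold):
    """Flag horizontally adjacent pixel pairs whose depth ratio exceeds the threshold."""
    return [
        [max(a, b) / min(a, b) > ratio_threshold for a, b in zip(row, row[1:])]
        for row in depth
    ]


def boundary_f1_sketch(predicted_depth, target_depth, ratio_threshold=1.25):
    """F1 score between predicted and target edge maps (depths must be positive)."""
    pred = [e for row in depth_edges(predicted_depth, ratio_threshold) for e in row]
    true = [e for row in depth_edges(target_depth, ratio_threshold) for e in row]
    true_positives = sum(p and t for p, t in zip(pred, true))
    precision = true_positives / sum(pred) if any(pred) else 0.0
    recall = true_positives / sum(true) if any(true) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Toy 2x4 depth maps: the prediction has the wrong scale and one misplaced edge.
target = [[1.0, 1.0, 4.0, 4.0], [1.0, 1.0, 1.0, 4.0]]
predicted = [[2.0, 2.0, 8.0, 8.0], [2.0, 2.0, 8.0, 8.0]]
score = boundary_f1_sketch(predicted, target)
```

Multiplying `predicted` by any positive constant leaves `score` unchanged, because only depth ratios enter the edge test.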

## Citation

If you find our work useful, please cite the following paper:

```bibtex
@article{Bochkovskii2024:arxiv,
  author     = {Aleksei Bochkovskii and Ama\"{e}l Delaunoy and Hugo Germain and Marcel Santos and
               Yichao Zhou and Stephan R. Richter and Vladlen Koltun},
  title      = {Depth Pro: Sharp Monocular Metric Depth in Less Than a Second},
  journal    = {arXiv},
  year       = {2024},
}
```

## Acknowledgements

Our codebase builds on multiple open-source contributions; please see [Acknowledgements](https://github.com/apple/ml-depth-pro/blob/main/ACKNOWLEDGEMENTS.md) for more details.

Please check the paper for a complete list of references and datasets used in this work.
depth_pro.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3eb35ca68168ad3d14cb150f8947a4edf85589941661fdb2686259c80685c0ce
size 1904446787
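
The `oid` and `size` fields of the Git LFS pointer can be used to check a downloaded checkpoint, since the `oid` is the SHA-256 digest of the file contents. The following sketch hashes a file in chunks with the standard-library `hashlib` and compares the result against the pointer. `matches_lfs_pointer` is an illustrative helper name, not part of the code repository.

```python
import hashlib
import os

EXPECTED_SHA256 = "3eb35ca68168ad3d14cb150f8947a4edf85589941661fdb2686259c80685c0ce"
EXPECTED_SIZE = 1904446787  # bytes


def matches_lfs_pointer(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """Return True if the file at `path` has the expected size and SHA-256 digest."""
    # Cheap size check first, so a truncated download is rejected without hashing.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so the large checkpoint is never fully in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

For example, `matches_lfs_pointer("checkpoints/depth_pro.pt")` would verify a checkpoint downloaded with `--local-dir checkpoints`, assuming the file was saved under that path.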