NightRaven109 committed
Commit 6ffe57a · verified · 1 Parent(s): 32d734b

Update README.md

Files changed (1)
  1. README.md +14 -114
README.md CHANGED
@@ -1,114 +1,14 @@
- # Depth Anything V2 for Metric Depth Estimation
-
- ![teaser](./assets/compare_zoedepth.png)
-
- We here provide a simple codebase to fine-tune our Depth Anything V2 pre-trained encoder for metric depth estimation. Built on our powerful encoder, we use a simple DPT head to regress the depth. We fine-tune our pre-trained encoder on synthetic Hypersim / Virtual KITTI datasets for indoor / outdoor metric depth estimation, respectively.
-
-
- # Pre-trained Models
-
- We provide **six metric depth models** of three scales for indoor and outdoor scenes, respectively.
-
- | Base Model | Params | Indoor (Hypersim) | Outdoor (Virtual KITTI 2) |
- |:-|-:|:-:|:-:|
- | Depth-Anything-V2-Small | 24.8M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Hypersim-Small/resolve/main/depth_anything_v2_metric_hypersim_vits.pth?download=true) | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-VKITTI-Small/resolve/main/depth_anything_v2_metric_vkitti_vits.pth?download=true) |
- | Depth-Anything-V2-Base | 97.5M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Hypersim-Base/resolve/main/depth_anything_v2_metric_hypersim_vitb.pth?download=true) | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-VKITTI-Base/resolve/main/depth_anything_v2_metric_vkitti_vitb.pth?download=true) |
- | Depth-Anything-V2-Large | 335.3M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-Hypersim-Large/resolve/main/depth_anything_v2_metric_hypersim_vitl.pth?download=true) | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Metric-VKITTI-Large/resolve/main/depth_anything_v2_metric_vkitti_vitl.pth?download=true) |
-
- *We recommend first trying our larger models (if the computational cost is affordable) and the indoor version.*
-
- ## Usage
-
- ### Preparation
-
- ```bash
- git clone https://github.com/DepthAnything/Depth-Anything-V2
- cd Depth-Anything-V2/metric_depth
- pip install -r requirements.txt
- ```
-
- Download the checkpoints listed [here](#pre-trained-models) and put them under the `checkpoints` directory.
-
- ### Use our models
- ```python
- import cv2
- import torch
-
- from depth_anything_v2.dpt import DepthAnythingV2
-
- model_configs = {
-     'vits': {'encoder': 'vits', 'features': 64, 'out_channels': [48, 96, 192, 384]},
-     'vitb': {'encoder': 'vitb', 'features': 128, 'out_channels': [96, 192, 384, 768]},
-     'vitl': {'encoder': 'vitl', 'features': 256, 'out_channels': [256, 512, 1024, 1024]}
- }
-
- encoder = 'vitl' # or 'vits', 'vitb'
- dataset = 'hypersim' # 'hypersim' for indoor model, 'vkitti' for outdoor model
- max_depth = 20 # 20 for indoor model, 80 for outdoor model
-
- model = DepthAnythingV2(**{**model_configs[encoder], 'max_depth': max_depth})
- model.load_state_dict(torch.load(f'checkpoints/depth_anything_v2_metric_{dataset}_{encoder}.pth', map_location='cpu'))
- model.eval()
-
- raw_img = cv2.imread('your/image/path')
- depth = model.infer_image(raw_img) # HxW depth map in meters in numpy
- ```
-
- ### Running script on images
-
- Here, we take the `vitl` encoder as an example. You can also use `vitb` or `vits` encoders.
-
- ```bash
- # indoor scenes
- python run.py \
-     --encoder vitl \
-     --load-from checkpoints/depth_anything_v2_metric_hypersim_vitl.pth \
-     --max-depth 20 \
-     --img-path <path> --outdir <outdir> [--input-size <size>] [--save-numpy]
-
- # outdoor scenes
- python run.py \
-     --encoder vitl \
-     --load-from checkpoints/depth_anything_v2_metric_vkitti_vitl.pth \
-     --max-depth 80 \
-     --img-path <path> --outdir <outdir> [--input-size <size>] [--save-numpy]
- ```
-
- ### Project 2D images to point clouds:
-
- ```bash
- python depth_to_pointcloud.py \
-     --encoder vitl \
-     --load-from checkpoints/depth_anything_v2_metric_hypersim_vitl.pth \
-     --max-depth 20 \
-     --img-path <path> --outdir <outdir>
- ```
-
- ### Reproduce training
-
- Please first prepare the [Hypersim](https://github.com/apple/ml-hypersim) and [Virtual KITTI 2](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/) datasets. Then:
-
- ```bash
- bash dist_train.sh
- ```
-
-
- ## Citation
-
- If you find this project useful, please consider citing:
-
- ```bibtex
- @article{depth_anything_v2,
-   title={Depth Anything V2},
-   author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
-   journal={arXiv:2406.09414},
-   year={2024}
- }
-
- @inproceedings{depth_anything_v1,
-   title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
-   author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
-   booktitle={CVPR},
-   year={2024}
- }
- ```
 
+ ---
+ title: Diffuse2PBR
+ emoji: 🏆
+ colorFrom: pink
+ colorTo: yellow
+ sdk: gradio
+ sdk_version: 5.6.0
+ app_file: app.py
+ pinned: false
+ license: cc-by-nc-2.0
+ short_description: Convert Diffuse Textures to Height and Normal maps
+ ---
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference