Update README.md

README.md

The LDM3D model was proposed in ["LDM3D: Latent Diffusion Model for 3D"](https://arxiv.org/abs/2305.10853).

LDM3D was accepted to [CVPRW'23](https://cvpr2023.thecvf.com/).

Here is how to use this model to generate an RGB image and a depth map from a given text prompt in PyTorch:

```python
from diffusers import StableDiffusionLDM3DPipeline

# load the panoramic LDM3D checkpoint and move it to the GPU
pipe = StableDiffusionLDM3DPipeline.from_pretrained("Intel/ldm3d-pano")
pipe.to("cuda")

prompt = "360 view of a large bedroom"
name = "bedroom_pano"

# generate the panoramic RGB image and the matching depth map
output = pipe(prompt, width=1024, height=512)
rgb_image, depth_image = output.rgb, output.depth
rgb_image[0].save(name + "_ldm3d_rgb.jpg")
depth_image[0].save(name + "_ldm3d_depth.png")
```
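
Passing `width=1024, height=512` requests the 2:1 aspect ratio of an equirectangular panorama, matching the panoramic datasets described under Finetuning below.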

This is the result:

![ldm3d_results](ldm3d_pano_results.png)
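
The depth map is saved as a lossless PNG (while the RGB image is a JPEG), so it can be loaded back as an array for downstream use. A minimal sketch, assuming the file written by the example above:

```python
import numpy as np
from PIL import Image

# load the depth map written by the usage example above (hypothetical path)
depth = np.array(Image.open("bedroom_pano_ldm3d_depth.png"))
print(depth.dtype, depth.shape, depth.min(), depth.max())
```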
### Limitations and bias

The LDM3D model was finetuned on a dataset constructed from a subset of the LAION-400M dataset.

### Finetuning

This checkpoint was obtained by finetuning the previous [ldm3d-4c](https://huggingface.co/Intel/ldm3d-4c) checkpoint on two panoramic-image datasets:
- [polyhaven](https://polyhaven.com/): 585 images for the training set, 66 images for the validation set.
- [ihdri](https://www.ihdri.com/hdri-skies-outdoor/): 57 outdoor images for the training set, 7 outdoor images for the validation set.

These datasets were augmented using [Text2Light](https://frozenburning.github.io/projects/text2light/) to create a dataset containing 13852 training samples and 1606 validation samples.

To generate the depth maps for these samples we used [DPT-Large](https://github.com/isl-org/MiDaS), and to generate the captions we used [BLIP-2](https://huggingface.co/docs/transformers/main/model_doc/blip-2).
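
For illustration, here is a minimal sketch of how such depth maps and captions can be produced with the `transformers` pipeline API. The checkpoint names (`Intel/dpt-large`, `Salesforce/blip2-opt-2.7b`) and the input file are assumptions for the example, not necessarily the authors' exact preprocessing setup:

```python
from PIL import Image
from transformers import pipeline

# assumed checkpoints: DPT-Large for depth, BLIP-2 (OPT-2.7B) for captions
depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")
captioner = pipeline("image-to-text", model="Salesforce/blip2-opt-2.7b")

image = Image.open("panorama_sample.jpg")  # hypothetical training image

depth = depth_estimator(image)["depth"]          # PIL image with the predicted depth map
caption = captioner(image)[0]["generated_text"]  # generated caption string

depth.save("panorama_sample_depth.png")
print(caption)
```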
### BibTeX entry and citation info
```bibtex