nielsr (HF staff) committed
Commit e553fa2
1 Parent(s): f609b93

Update README.md (#2)


- Update README.md (172c440f08454205a35ec8663aabd820bd741301)

Files changed (1)
  1. README.md +3 -4
README.md CHANGED

@@ -5,15 +5,14 @@ license: mit
 # DPT 3.1 (BEiT backbone)
 
 DPT (Dense Prediction Transformer) model trained on 1.4 million images for monocular depth estimation. It was introduced in the paper [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413) by Ranftl et al. (2021) and first released in [this repository](https://github.com/isl-org/DPT).
-DPT uses the [BEiT](https://huggingface.co/docs/transformers/model_doc/beit) model as backbone and adds a neck + head on top for monocular depth estimation.
-
-![model image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/dpt_architecture.jpg)
 
 Disclaimer: The team releasing DPT did not write a model card for this model so this model card has been written by the Hugging Face team.
 
 ## Model description
 
-The Table Transformer is equivalent to [DETR](https://huggingface.co/docs/transformers/model_doc/detr), a Transformer-based object detection model. Note that the authors decided to use the "normalize before" setting of DETR, which means that layernorm is applied before self- and cross-attention.
+This DPT model uses the [BEiT](https://huggingface.co/docs/transformers/model_doc/beit) model as backbone and adds a neck + head on top for monocular depth estimation.
+
+![model image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/dpt_architecture.jpg)
 
 ## How to use
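The diff cuts off at the "How to use" heading, so the updated usage section is not shown here. As a minimal sketch of monocular depth estimation with the `transformers` DPT classes: the checkpoint name `Intel/dpt-beit-large-512`, the sample image URL, and the `depth_to_image` helper are illustrative assumptions, not taken from this model card.

```python
import numpy as np


def depth_to_image(depth: np.ndarray) -> np.ndarray:
    """Normalize a raw predicted depth map to an 8-bit grayscale image."""
    d = depth - depth.min()
    denom = float(d.max()) or 1.0  # avoid division by zero on flat maps
    return (255.0 * d / denom).astype(np.uint8)


if __name__ == "__main__":
    import requests
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, DPTForDepthEstimation

    # Checkpoint name is an assumption; substitute the DPT 3.1 BEiT
    # checkpoint you intend to use.
    ckpt = "Intel/dpt-beit-large-512"
    processor = AutoImageProcessor.from_pretrained(ckpt)
    model = DPTForDepthEstimation.from_pretrained(ckpt)

    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image = Image.open(requests.get(url, stream=True).raw)

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # predicted_depth has shape (batch, height, width); upsample it back
    # to the input image resolution before visualizing.
    depth = torch.nn.functional.interpolate(
        outputs.predicted_depth.unsqueeze(1),
        size=image.size[::-1],
        mode="bicubic",
        align_corners=False,
    ).squeeze().cpu().numpy()
    Image.fromarray(depth_to_image(depth)).save("depth.png")
```

The helper keeps the visualization step separate from inference, so the normalized grayscale output can be reused with any depth estimator that returns a 2-D array.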