---
title: Opdmulti Demo
emoji: 🌍
colorFrom: gray
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: mit
---

# OPDMulti: Openable Part Detection for Multiple Objects
[Xiaohao Sun*](https://sun-xh.github.io/), [Hanxiao Jiang*](https://jianghanxiao.github.io/), [Manolis Savva](https://msavva.github.io/), [Angel Xuan Chang](http://angelxuanchang.github.io/)

This repository is intended as a deployment of a demo for the [OPDMulti](https://github.com/3dlg-hcvc/OPDMulti) project.
Please refer there for more information about the proect and implementation.

[arXiv](https://arxiv.org/abs/2303.14087)&nbsp; [Website](https://3dlg-hcvc.github.io/OPDMulti/)

## Installation

## Requirements

For the docker build, you will just need docker in order to build and run the container, else you will need

* python 3.10 (this definitely does not work with 3.11, and you may need to downgrade some packages to work with earlier versions of Python)
* git
* cmake
* libosmesa6-dev (for open3d headless rendering)

A full list of other packages can be found in the Dockerfile, or in `Open3D/util/install_deps_ubuntu.sh`.

**BEFORE BUILDING** as of writing, you will need to copy the model file manually to `.data/models/motion_state_pred_opdformerp_rgb.pth`
in the repository. This step must occur before the docker build, or if building locally then before running. Future
work will make this step no longer required.

### Docker (preferred)

To build the docker container, run
```
docker build -f Dockerfile -t opdmulti-demo .
```

### Local

To setup the environment, run the following (recommended in a virtual environment):
```
# install base requirements
python3.10 -m pip install -r requirements.txt

# install detectron2 (must be done after some of the libraries in requirements.txt)
python3.10 -m pip install git+https://github.com/facebookresearch/detectron2.git@fc9c33b1f6e5d4c37bbb46dde19af41afc1ddb2a

# build library for model
cd mask2former/modeling/pixel_decoder/ops
python setup.py build install

# INSTALL OPEN3D
# --------------
# Option A: running locally only
pip install open3d==0.17.0

# Option B: running over ssh connection / headless environment
# in a separate folder
git clone https://github.com/isl-org/Open3D.git
cd Open3D/
mkdir build && cd build
cmake -DENABLE_HEADLESS_RENDERING=ON -DBUILD_GUI=OFF -DBUILD_WEBRTC=OFF -DUSE_SYSTEM_GLEW=OFF -DUSE_SYSTEM_GLFW=OFF ..
make -j$(nproc)
make install-pip-package
# to test custom build
cd ../examples/python/visualization/
python headless_rendering.py
```

## Usage

### Docker (preferred)

To run the docker container, execute
```
docker run -d --network host -t opdmulti-demo
```
If you want to see the output of the container or interact with it,

* use `-it` to run in interactive mode, and remove the `-d` option
* add `bash` to the end to open into a console rather than running the app directly

### Local

To startup the application locally, run
```
gradio app.py
```

You can view the app on the specified port (usually 7860). To run over an ssh connection, setup port forwarding using
`-L 7860:localhost:7860` when you create your ssh connection. Note that you will need to install Open3D in headless 
rendering for this to work, as described above.

## Citation
If you find this code useful, please consider citing:
```bibtex
@article{sun2023opdmulti,
  title={OPDMulti: Openable Part Detection for Multiple Objects},
  author={Sun, Xiaohao and Jiang, Hanxiao and Savva, Manolis and Chang, Angel Xuan},
  journal={arXiv preprint arXiv:2303.14087},
  year={2023}
}

@article{mao2022multiscan,
  title={MultiScan: Scalable RGBD scanning for 3D environments with articulated objects},
  author={Mao, Yongsen and Zhang, Yiming and Jiang, Hanxiao and Chang, Angel and Savva, Manolis},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={9058--9071},
  year={2022}
}

@inproceedings{jiang2022opd,
  title={OPD: Single-view 3D openable part detection},
  author={Jiang, Hanxiao and Mao, Yongsen and Savva, Manolis and Chang, Angel X},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XXXIX},
  pages={410--426},
  year={2022},
  organization={Springer}
}

@inproceedings{cheng2022masked,
  title={Masked-attention mask transformer for universal image segmentation},
  author={Cheng, Bowen and Misra, Ishan and Schwing, Alexander G and Kirillov, Alexander and Girdhar, Rohit},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1290--1299},
  year={2022}
}
```