Spaces:
Running
Running
## Generation of synthetic image pairs using Habitat-Sim | |
These instructions allow to generate pre-training pairs from the Habitat simulator. | |
As we did not save metadata of the pairs used in the original paper, they are not strictly the same, but these data use the same setting and are equivalent. | |
### Download Habitat-Sim scenes | |
Download Habitat-Sim scenes: | |
- Download links can be found here: https://github.com/facebookresearch/habitat-sim/blob/main/DATASETS.md | |
- We used scenes from the HM3D, habitat-test-scenes, Replica, ReplicaCad and ScanNet datasets. | |
- Please put the scenes under `./data/habitat-sim-data/scene_datasets/` following the structure below, or update manually paths in `paths.py`. | |
``` | |
./data/ | |
└──habitat-sim-data/ | |
└──scene_datasets/ | |
├──hm3d/ | |
├──gibson/ | |
├──habitat-test-scenes/ | |
├──replica_cad_baked_lighting/ | |
├──replica_cad/ | |
├──ReplicaDataset/ | |
└──scannet/ | |
``` | |
### Image pairs generation | |
We provide metadata to generate reproducible images pairs for pretraining and validation. | |
Experiments described in the paper used similar data, but whose generation was not reproducible at the time. | |
Specifications: | |
- 256x256 resolution images, with 60 degrees field of view . | |
- Up to 1000 image pairs per scene. | |
- Number of scenes considered/number of images pairs per dataset: | |
- Scannet: 1097 scenes / 985 209 pairs | |
- HM3D: | |
- hm3d/train: 800 / 800k pairs | |
- hm3d/val: 100 scenes / 100k pairs | |
- hm3d/minival: 10 scenes / 10k pairs | |
- habitat-test-scenes: 3 scenes / 3k pairs | |
- replica_cad_baked_lighting: 13 scenes / 13k pairs | |
- Scenes from hm3d/val and hm3d/minival pairs were not used for the pre-training but kept for validation purposes. | |
Download metadata and extract it: | |
```bash | |
mkdir -p data/habitat_release_metadata/ | |
cd data/habitat_release_metadata/ | |
wget https://download.europe.naverlabs.com/ComputerVision/CroCo/data/habitat_release_metadata/multiview_habitat_metadata.tar.gz | |
tar -xvf multiview_habitat_metadata.tar.gz | |
cd ../.. | |
# Location of the metadata | |
METADATA_DIR="./data/habitat_release_metadata/multiview_habitat_metadata" | |
``` | |
Generate image pairs from metadata: | |
- The following command will print a list of commandlines to generate image pairs for each scene: | |
```bash | |
# Target output directory | |
PAIRS_DATASET_DIR="./data/habitat_release/" | |
python datasets/habitat_sim/generate_from_metadata_files.py --input_dir=$METADATA_DIR --output_dir=$PAIRS_DATASET_DIR | |
``` | |
- One can launch multiple of such commands in parallel e.g. using GNU Parallel: | |
```bash | |
python datasets/habitat_sim/generate_from_metadata_files.py --input_dir=$METADATA_DIR --output_dir=$PAIRS_DATASET_DIR | parallel -j 16 | |
``` | |
## Metadata generation | |
Image pairs were randomly sampled using the following commands, whose outputs contain randomness and are thus not exactly reproducible: | |
```bash | |
# Print commandlines to generate image pairs from the different scenes available. | |
PAIRS_DATASET_DIR=MY_CUSTOM_PATH | |
python datasets/habitat_sim/generate_multiview_images.py --list_commands --output_dir=$PAIRS_DATASET_DIR | |
# Once a dataset is generated, pack metadata files for reproducibility. | |
METADATA_DIR=MY_CUSTON_PATH | |
python datasets/habitat_sim/pack_metadata_files.py $PAIRS_DATASET_DIR $METADATA_DIR | |
``` | |