Note: The benchmarks in their original form can be found in the original GitHub repo.
Instructions to run
Docker
All of the commands below should be run in a Docker container built from the Dockerfile in the repo, with the dataset and the repo mounted as volumes in the container.
To build:
docker build -t benchmarks_img .
To run an interactive shell:
docker run -it --shm-size=2G --gpus all -v /path/to/neurips2023-benchmarks:/neurips2023-benchmarks -v /path/to/datasets/:/data benchmarks_img
Analysis Task
The analysis part contains two folders: one for the classification comparison, whose results are in the appendix, and one for the regression comparison in the benchmark section (section 5.1).
First, the path to the dataset must be set in the config.py file in both analysis/regression and analysis/classification: the variable DATA_DIR must point to the dataset directory containing the folders images/ and metadata/ and the file metadata.json.
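For reference, a minimal sketch of what that setting could look like; DATA_DIR is the variable named above, while the example path is an assumption matching the Docker volume mount from the docker run command:
# Hypothetical contents of analysis/regression/config.py
# (and analysis/classification/config.py). The /data path below assumes
# the volume mount used in the docker run command above.
DATA_DIR = "/data"  # must contain images/, metadata/ and metadata.json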
To run the code for regression or classification:
python3 train.py --model_name resnet18 --size 224 --cropped True --device 0
Replace the arguments with your desired values:
--device: The index of the GPU device to use (0 by default), i.e. a number between 0 and the number of available GPUs minus one (run nvidia-smi for more info).
--size: Input image size (e.g., 512).
--cropped: Whether to use cropped images (True or False).
The train.py files are the entry points of the code; they follow the standard PyTorch Lightning workflow: https://lightning.ai/docs/pytorch/stable/starter/introduction.html
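For orientation, here is a minimal, self-contained sketch of that workflow (a toy LightningModule trained with a Trainer); the model and data below are placeholders, not the ones used in the repo:
import torch
from torch import nn
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class LitRegressor(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(32, 1)  # placeholder for e.g. a ResNet

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.backbone(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Dummy tensors so the sketch is runnable end to end.
dataset = TensorDataset(torch.randn(64, 32), torch.randn(64, 1))
trainer = pl.Trainer(max_epochs=1, accelerator="auto", devices=1)
trainer.fit(LitRegressor(), DataLoader(dataset, batch_size=8))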
Regression
More specifically, to run the regression code:
cd analysis/regression
python3 train.py --model_name resnet18 --labels wind --size 224 --cropped True --device 0
In this case, additional arguments can be set:
--model_name: The architecture of the model (e.g., "resnet18", "resnet50").
--labels: The target label for regression (e.g., "wind", "pressure").
By varying these arguments you can reproduce the results of Tables 2 and 3 in the paper.
A pipeline is also provided that trains all the benchmarks in sequence:
cd analysis/regression
python3 pipeline.py --device 0
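If you prefer to script the sweep yourself, a rough sketch of such a pipeline could look like the following; the exact set of models and labels covered by pipeline.py is an assumption here:
# Hypothetical sweep over benchmark configurations, calling train.py
# once per combination. Adjust the lists to the configurations you need.
import itertools
import subprocess

models = ["resnet18", "resnet50"]
labels = ["wind", "pressure"]

for model, label in itertools.product(models, labels):
    subprocess.run(
        ["python3", "train.py",
         "--model_name", model,
         "--labels", label,
         "--size", "224",
         "--cropped", "True",
         "--device", "0"],
        check=True,  # stop the sweep if a training run fails
    )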
Classification
To run the classification code:
cd analysis/classification
python3 train.py --model_name vgg --size 224 --cropped True --device 0
In this case, the model_name options are different:
--model_name: The architecture of the model (e.g., "resnet18", "vgg", "vit").
As above, a pipeline in the classification folder launches all the classification benchmark trainings in sequence:
cd analysis/classification
python3 pipeline.py --device 0
Results visualisation
The results are saved in the analysis/[classification,regression]/results/ folder.
After each training, a folder whose name depends on the arguments is created, and repeated runs add new version_X subfolders inside it. This is handled by the pytorch_lightning.loggers TensorBoardLogger: https://lightning.ai/docs/pytorch/stable/extensions/generated/lightning.pytorch.loggers.TensorBoardLogger.html#tensorboardlogger
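As a small illustrative sketch of how the logger versions its folders (the save_dir and run name below are assumptions matching the layout above):
from pytorch_lightning.loggers import TensorBoardLogger

# Each new run with the same name creates a fresh version_X subfolder:
# results/resnet18_224_cropped/version_0/, version_1/, ...
logger = TensorBoardLogger(save_dir="results", name="resnet18_224_cropped")
# The logger is then passed to the Trainer, e.g. pl.Trainer(logger=logger).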
To visualize the results, run:
tensorboard --logdir analysis/[classification,regression]/results/[training_name]/version_X/
Then open the local link printed in the console in a browser.