# Main Results & Postprocessing for the APEBench paper

Check if Git LFS is available:

```bash
git lfs --version
```

Clone without large files (subsets or all of the data can be downloaded later):

```bash
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:thuerey-group/apebench-paper
```

Alternatively, if you have ~250 GB of free space, you can download all the data at once:

```bash
git clone git@hf.co:thuerey-group/apebench-paper
```

## Installation

Change into the cloned directory:

```bash
cd apebench-paper
```

Activate LFS:

```bash
git lfs install
```

Conda environment

conda create -n ape python=3.12
conda activate ape
pip install -U "jax[cuda12]"
pip install git+ssh://git@github.com/Ceyron/apebench@640020c08f466ab55169b5c3327e1c096e705a64
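
Optionally, you can check that JAX picked up the CUDA backend. This quick check is an addition to the steps above and assumes a GPU machine; on a CPU-only system it will list CPU devices instead:

```python
# Optional sanity check: confirm that JAX was installed with CUDA support.
import jax

print(jax.devices())  # expect something like [CudaDevice(id=0)] on a GPU node
```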

Pre-commit setup:

```bash
pip install pre-commit
pre-commit install --install-hooks
```

## Quickstart

(not used in the paper)

```bash
apebench studies/hello_world.py
```

Postprocessing is done with the notebook `postprocessing/hello_world.ipynb`.
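
If you prefer to run that notebook non-interactively, a minimal sketch is shown below; it assumes Jupyter/nbconvert is installed in the `ape` environment (e.g. via `pip install notebook`), which the steps above do not do:

```python
# Minimal sketch: execute the quickstart notebook headlessly with nbconvert.
# Assumes Jupyter is installed in the "ape" environment.
import subprocess

subprocess.run(
    [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", "--inplace",
        "postprocessing/hello_world.ipynb",
    ],
    check=True,
)
```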

## Download a subset of the data

### Download pre-melted data

This is the easiest and least resource-intensive option.

```bash
git lfs pull -I melted/adv_varying_difficulty_nonlin_emulators
```

Or pull all melted data at once (~150 MB):

```bash
git lfs pull -I melted/*
```
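
Once pulled, the melted files can be inspected directly with pandas. The sketch below assumes the melted results are stored as CSV files inside the pulled directory; adjust the glob if the repository uses a different layout or format:

```python
# Minimal sketch: peek at the pre-melted results.
# Assumption: melted data are CSV files readable by pandas.
from pathlib import Path

import pandas as pd

melted_dir = Path("melted/adv_varying_difficulty_nonlin_emulators")
for csv_file in sorted(melted_dir.glob("*.csv")):
    df = pd.read_csv(csv_file)
    print(csv_file.name, df.shape)
    print(df.head())
```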

### Download a post-processing notebook

```bash
git lfs pull -I postprocessing/adv_varying_difficulty_nonlin_emulators/postprocessing_adv_varying_difficulty_nonlin_emulators.ipynb
```

### Download all eval data

Not recommended.

```bash
git lfs pull -I results/raw/*
```

### Download all network weights

Not recommended.

```bash
git lfs pull -I results/network_weights/*
```

### Download data needed for one study

Execute this Python script (shown here for an example study):

```python
import apebench
from studies.broad_comparison_1d import CONFIGS

raw_file_list, network_weights_list = apebench.run_study(CONFIGS, "results/")

# `git lfs pull -I` accepts a comma-separated list of include patterns
file_list_together = ",".join(str(p) for p in raw_file_list)
network_weights_list_together = ",".join(str(p) for p in network_weights_list)

with open("download.sh", "w") as f:
    f.write(f"git lfs pull -I {file_list_together}\n")
    # Uncomment to also fetch the corresponding network weights
    # f.write(f"git lfs pull -I {network_weights_list_together}\n")
```

Or directly from bash:

```bash
python -c "import apebench; from studies.broad_comparison_1d import CONFIGS; raw_file_list, network_weights_list = apebench.run_study(CONFIGS, 'results/'); file_list_together = ','.join(str(p) for p in raw_file_list);
with open('download.sh', 'w') as f:
    f.write(f'''git lfs pull -I '{file_list_together}'\n''')"
```

Then run:

```bash
bash download.sh
```
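
Optionally, you can verify that the pulled files are real data rather than leftover Git LFS pointer stubs. The sketch below checks `results/raw`, matching the include patterns used above; point it at whatever directory you pulled:

```python
# Minimal sketch: detect files that are still Git LFS pointer stubs
# (tiny text files beginning with "version https://git-lfs...").
from pathlib import Path

for path in Path("results/raw").rglob("*"):
    if path.is_file() and path.stat().st_size < 200:
        if path.read_bytes().startswith(b"version https://git-lfs"):
            print("still an LFS pointer, not downloaded:", path)
```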

## Melt and concatenate the data

(Replace the path to the Python file with the study you want to process.)

```bash
apebench studies/broad_comparison_1d.py
```

This can take anywhere from ~2 min to ~20 min depending on the study. Also consider downloading the pre-melted data if you only need the results.
