Main Results & Postprocessing for the APEBench paper
Check if git lfs is available
git lfs --version
Clone without the large files (subsets of the data, or all of it, can be downloaded later)
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:thuerey-group/apebench-paper
Alternatively, if you have ~250 GB of free space, you can download all the data at once
git clone git@hf.co:thuerey-group/apebench-paper
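If you prefer not to use git over SSH, individual files can also be fetched programmatically with huggingface_hub. This is only a minimal sketch; the repo type and the example filename may need adjusting:
from huggingface_hub import hf_hub_download

# Fetch a single file from the repository; pass repo_type (e.g. "dataset") if required
local_path = hf_hub_download(
    repo_id="thuerey-group/apebench-paper",
    filename="README.md",  # hypothetical example file
)
print(local_path)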
Installation
Change into the cloned directory
cd apebench-paper
Activate LFS
git lfs install
Conda environment
conda create -n ape python=3.12
conda activate ape
pip install -U "jax[cuda12]"
pip install git+ssh://git@github.com/Ceyron/apebench@640020c08f466ab55169b5c3327e1c096e705a64
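As an optional sanity check, you can verify that the CUDA-enabled JAX build is picked up by listing the visible devices:
import jax

# Should report one or more CUDA/GPU devices if the cuda12 wheels were installed correctly
print(jax.__version__)
print(jax.devices())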
Pre-commit setup
pip install pre-commit
pre-commit install --install-hooks
Quickstart
(not used in the paper)
apebench studies/hello_world.py
Then postprocess the results with postprocessing/hello_world.ipynb
Download a subset of the data
Download pre-melted data
This is the easiest and least resource-intensive option
git lfs pull -I melted/adv_varying_difficulty_nonlin_emulators
Or all melted data at once (~150MB)
git lfs pull -I "melted/*"
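The melted files can then be inspected directly, e.g. with pandas. The sketch below assumes long-format CSV files; the file name is a placeholder and needs to be adapted to the actual contents of the downloaded directory:
import pandas as pd

# Placeholder file name -- replace with an actual file from the melted/ subdirectory
df = pd.read_csv("melted/adv_varying_difficulty_nonlin_emulators/metrics.csv")
print(df.columns.tolist())
print(df.head())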
Download a post-processing notebook
git lfs pull -I postprocessing/adv_varying_difficulty_nonlin_emulators/postprocessing_adv_varying_difficulty_nonlin_emulators.ipynb
Download all eval data
Not recommended
git lfs pull -I "results/raw/*"
Download all network weights
Not recommended
git lfs pull -I "results/network_weights/*"
Download data needed for one study
Execute this Python script (for an example study)
import apebench
from studies.broad_comparison_1d import CONFIGS

# Collect the raw result files and network weight files that this study references
raw_file_list, network_weights_list = apebench.run_study(CONFIGS, "results/")

# git lfs pull -I expects a comma-separated list of include patterns
file_list_together = ",".join(str(p) for p in raw_file_list)
network_weights_list_together = ",".join(str(p) for p in network_weights_list)

with open("download.sh", "w") as f:
    f.write(f"git lfs pull -I {file_list_together}\n")
    # Uncomment to also fetch the corresponding network weights
    # f.write(f"git lfs pull -I {network_weights_list_together}\n")
Or run the equivalent directly from bash
python -c "import apebench
from studies.broad_comparison_1d import CONFIGS
raw_file_list, network_weights_list = apebench.run_study(CONFIGS, 'results/')
file_list_together = ','.join(str(p) for p in raw_file_list)
with open('download.sh', 'w') as f:
    f.write(f'git lfs pull -I {file_list_together}\n')"
Then
bash download.sh
Melt and concatenate the data
(replace the path with the Python file of the study you want to process)
apebench studies/broad_comparison_1d.py
This can take anywhere from ~2 to 20 minutes depending on the study. If you only need the results, consider downloading the pre-melted data instead.
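Because the melted data is in long format, it plots naturally with seaborn. The snippet below is only an illustration; the file path and the column names (time_step, mean_nRMSE, net) are assumptions about the schema and will likely need to be adapted:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Placeholder path and column names -- adapt to the melted file you produced or downloaded
df = pd.read_csv("melted/broad_comparison_1d/metrics.csv")

sns.lineplot(data=df, x="time_step", y="mean_nRMSE", hue="net", errorbar="sd")
plt.yscale("log")
plt.show()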