# Lyrebird VampNet

This repository contains recipes for training generative music models on top of the Lyrebird Audio Codec.

## Install hooks

First install the pre-commit util:

https://pre-commit.com/#install

    pip install pre-commit  # with pip
    brew install pre-commit  # on Mac

Then install the git hooks:

    pre-commit install  # check .pre-commit-config.yaml for details of hooks

Upon `git commit`, the pre-commit hooks will run automatically on the staged files (i.e. those added by `git add`).

**N.B. By default, pre-commit checks only run on staged files**

If you need to run it on all files:

    pre-commit run --all-files

## Development

### Setting everything up

Run the setup script to set up your environment via:

```bash
python env/setup.py
```

The setup script does not require any dependencies beyond Python itself. Once run, follow the instructions it prints out to create your environment file, which will be at `env/env.sh`.

Note that if this is a new machine and the data has not already been downloaded somewhere on it, the script will ask you for a directory to download the data to.

For GitHub setup, if you don't have a `.netrc` token, create one by going to your GitHub profile -> Developer settings -> Personal access tokens -> Generate new token. Copy the token and [keep it secret, keep it safe](https://www.youtube.com/watch?v=iThtELZvfPs).

When complete, run:

```bash
source env/env.sh
```

Now build and launch the Docker containers:

```bash
docker compose up -d
```

This builds and runs a Jupyter notebook and TensorBoard in the background. TensorBoard points to the path in your `TENSORBOARD_PATH` environment variable.

Now, launch your development environment via:

```bash
docker compose run dev
```

To tear down your development environment, run:

```bash
docker compose down
```

### Launching an experiment

Experiments are first _staged_ by running the `stage` command (which corresponds to the script `scripts/exp/stage.py`).
`stage` creates a directory containing a copy of all the Git-tracked files in the repository. It then launches a shell in that directory, so all commands are run on the copy rather than on the original repository code.

This is useful for rewinding to an old experiment and resuming it, for example. Even if the repository code changes, the snapshot in the experiment directory stays as it was at the original run, so it can be re-used.

Then, the experiment can be run via:

```bash
torchrun --nproc_per_node gpu \
  scripts/exp/train.py \
  --args.load=conf/args.yml
```

The full settings are in [conf/daps/train.yml](conf/daps/train.yml).

### Useful commands

#### Cleaning up after a run

Sometimes DDP runs fail to clean up after themselves on the machine. To fix this, run:

```bash
cleanup
```
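
#### Sketch of the staging idea

To make the staging step from "Launching an experiment" more concrete, here is a minimal sketch of how a snapshot of Git-tracked files could be taken. This is not the code in `scripts/exp/stage.py`. The function names `tracked_files` and `snapshot` are hypothetical and exist only for illustration, and the sketch leaves out the shell that `stage` launches.

```python
import shutil
import subprocess
from pathlib import Path


def tracked_files(repo_root: Path) -> list[str]:
    """List Git-tracked files relative to repo_root (hypothetical helper)."""
    out = subprocess.run(
        ["git", "ls-files", "-z"],  # -z: NUL-separated, safe for odd filenames
        cwd=repo_root,
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [p for p in out.split("\0") if p]


def snapshot(repo_root: Path, exp_dir: Path, files: list[str]) -> None:
    """Copy the given files into exp_dir, preserving relative paths."""
    for rel in files:
        dst = exp_dir / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(repo_root / rel, dst)  # copy2 also keeps timestamps
```

Calling `snapshot(root, exp_dir, tracked_files(root))` would copy only what Git tracks, so untracked files such as logs or checkpoints stay out of the experiment directory. Later edits to the repository do not touch the copy, which is the property that makes resuming an old experiment possible.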