# VampNet
This repository contains recipes for training generative music models on top of the Lyrebird Audio Codec.
# Setting up
Install AudioTools:
```bash
git clone https://github.com/hugofloresgarcia/audiotools.git
pip install -e ./audiotools
```
Install the LAC library:
```bash
git clone https://github.com/hugofloresgarcia/lac.git
pip install -e ./lac
```
Install VampNet:
```bash
git clone https://github.com/hugofloresgarcia/vampnet2.git
pip install -e ./vampnet2
```
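If everything installed correctly, all three packages should import cleanly. A quick sanity check (assuming the packages are importable as `audiotools`, `lac`, and `vampnet`):
```bash
# package names assumed from the install steps above
python -c "import audiotools, lac, vampnet; print('ok')"
```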
## A note on argbind
This repository relies on [argbind](https://github.com/pseeth/argbind) to manage CLIs and config files.
Config files are stored in the `conf/` folder.
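In practice, this means each script takes an `--args.load` flag pointing at a config file, and any bound argument can be overridden inline on the command line. For example, reusing flags that appear in the usage examples below (`runs/debug` is just a placeholder output path):
```bash
# load a base config with --args.load, then override bound arguments inline
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path runs/debug
```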
## Getting the Pretrained Models
Download the pretrained models from [this link](https://drive.google.com/file/d/1ZIBMJMt8QRE8MYYGjg4lH7v7BLbZneq2/view?usp=sharing). Then, extract the models to the `models/` folder.
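For example, from the repository root (assuming the archive downloaded as `models.zip`; the actual filename may differ):
```bash
# extract the checkpoints so they end up under models/
# (adjust if the archive already contains a top-level models/ directory)
unzip models.zip -d models/
```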
# How the code is structured
This code was written quickly to meet a publication deadline, so it can be messy and redundant in places. We're currently working on cleaning it up.
```
├── conf <- (conf files for training, fine-tuning, etc.)
├── demo.py <- (gradio UI for playing with vampnet)
├── env <- (environment variables)
│   └── env.sh
├── models <- (extract pretrained models here)
│   ├── spotdl
│   │   ├── c2f.pth <- (coarse2fine checkpoint)
│   │   ├── coarse.pth <- (coarse checkpoint)
│   │   └── codec.pth <- (codec checkpoint)
│   └── wavebeat.pth
├── README.md
├── scripts
│   ├── exp
│   │   ├── eval.py <- (eval script)
│   │   └── train.py <- (training/fine-tuning script)
│   └── utils
├── vampnet
│   ├── beats.py <- (beat tracking logic)
│   ├── __init__.py
│   ├── interface.py <- (high-level programmatic interface)
│   ├── mask.py
│   ├── modules
│   │   ├── activations.py
│   │   ├── __init__.py
│   │   ├── layers.py
│   │   └── transformer.py <- (architecture + sampling code)
│   ├── scheduler.py
│   └── util.py
```
# Usage
First, you'll want to set up your environment:
```bash
source ./env/env.sh
```
## Staging a Run
Staging a run makes a copy of all the git-tracked files in the codebase and saves them to a folder for reproducibility. You can then run the training script from the staged folder.
```bash
stage --name my_run --run_dir /path/to/staging/folder
```
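Once staged, you can launch training from the staged copy so the run is tied to an exact snapshot of the code. A sketch, assuming the staged files land under `<run_dir>/<name>`:
```bash
# run training from the staged snapshot rather than the live working tree
cd /path/to/staging/folder/my_run
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
```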
## Training a model
```bash
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
```
## Fine-tuning
To fine-tune a model, see the configuration files under `conf/lora/`.
You just need to provide a list of audio files or folders to fine-tune on, then launch the training job as usual.
```bash
python scripts/exp/train.py --args.load conf/lora/birds.yml --save_path /path/to/checkpoints
```
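For reference, a minimal fine-tuning config might look something like the sketch below. The YAML keys here are hypothetical; copy an existing file in `conf/lora/` and mirror its exact keys rather than relying on these:
```bash
# hypothetical sketch of a fine-tuning config; the YAML keys are illustrative,
# not confirmed. Base yours on an existing conf/lora/*.yml file.
cat > conf/lora/my_data.yml <<'EOF'
$include:
  - conf/lora/lora.yml

fine_tune: True

train/AudioLoader.sources:
  - /path/to/my/audio
EOF
```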
## Launching the Gradio Interface
```bash
python demo.py --args.load conf/interface/spotdl.yml --Interface.device cuda
```
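If you don't have a GPU available, the same flag should accept a CPU device (untested here, and much slower):
```bash
# fall back to CPU for the demo; expect slow generation
python demo.py --args.load conf/interface/spotdl.yml --Interface.device cpu
```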