# VampNet
This repository contains recipes for training generative music models on top of the Lyrebird Audio Codec.
# Setting up
Install AudioTools:
```bash
git clone https://github.com/hugofloresgarcia/audiotools.git
pip install -e ./audiotools
```
Install the LAC library:
```bash
git clone https://github.com/hugofloresgarcia/lac.git
pip install -e ./lac
```
Install VampNet:
```bash
git clone https://github.com/hugofloresgarcia/vampnet2.git
pip install -e ./vampnet2
```
## A note on argbind
This repository relies on [argbind](https://github.com/pseeth/argbind) to manage CLIs and config files.
Config files are stored in the `conf/` folder.
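As an illustration of the convention argbind uses, dotted CLI flags such as `--Interface.device cuda` address nested configuration values. The sketch below is a simplified, stdlib-only illustration of that mapping, not argbind's actual implementation:

```python
def parse_dotted_args(argv):
    """Turn ['--Interface.device', 'cuda'] into {'Interface': {'device': 'cuda'}}.

    Simplified illustration of dotted-flag parsing; argbind itself also
    handles YAML config files, scoping, and type coercion.
    """
    config = {}
    i = 0
    while i < len(argv):
        key = argv[i]
        if key.startswith("--") and i + 1 < len(argv):
            parts = key[2:].split(".")
            node = config
            for part in parts[:-1]:
                node = node.setdefault(part, {})
            node[parts[-1]] = argv[i + 1]
            i += 2
        else:
            i += 1
    return config
```

With this scheme, `--args.load conf/vampnet.yml` selects the config file, and any later dotted flag overrides a value inside it.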
## Getting the Pretrained Models
Download the pretrained models from [this link](https://drive.google.com/file/d/1ZIBMJMt8QRE8MYYGjg4lH7v7BLbZneq2/view?usp=sharing). Then, extract the models to the `models/` folder.
# How the code is structured
This code was written quickly to meet a publication deadline, so it can be messy and redundant in places. We are currently working on cleaning it up.
```
├── conf            <- (conf files for training, finetuning, etc)
├── demo.py         <- (gradio UI for playing with vampnet)
├── env             <- (environment variables)
│   └── env.sh
├── models          <- (extract pretrained models here)
│   ├── spotdl
│   │   ├── c2f.pth     <- (coarse2fine checkpoint)
│   │   ├── coarse.pth  <- (coarse checkpoint)
│   │   └── codec.pth   <- (codec checkpoint)
│   └── wavebeat.pth
├── README.md
├── scripts
│   ├── exp
│   │   ├── eval.py     <- (eval script)
│   │   └── train.py    <- (training/finetuning script)
│   └── utils
└── vampnet
    ├── beats.py        <- (beat tracking logic)
    ├── __init__.py
    ├── interface.py    <- (high-level programmatic interface)
    ├── mask.py
    ├── modules
    │   ├── activations.py
    │   ├── __init__.py
    │   ├── layers.py
    │   └── transformer.py  <- (architecture + sampling code)
    ├── scheduler.py
    └── util.py
```
# Usage
First, set up your environment:
```bash
source ./env/env.sh
```
## Staging a Run
Staging a run makes a copy of all the git-tracked files in the codebase and saves them to a folder for reproducibility. You can then run the training script from the staged folder.
```bash
stage --name my_run --run_dir /path/to/staging/folder
```
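Conceptually, staging boils down to copying each git-tracked file into a run folder while preserving relative paths. The following stdlib-only sketch illustrates the idea; the function name and signature are hypothetical, and the real `stage` command may differ:

```python
import shutil
from pathlib import Path

def stage_run(tracked_files, name, run_dir):
    """Copy a list of files into run_dir/name, preserving relative paths.

    Hypothetical sketch of what staging does; in practice the file list
    would come from something like `git ls-files`.
    """
    dest = Path(run_dir) / name
    for rel in tracked_files:
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(rel, target)  # copy file contents and metadata
    return dest
```

Because the staged folder is a frozen snapshot of the code, training runs launched from it remain reproducible even as the working tree changes.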
## Training a model
```bash
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
```
## Fine-tuning
To fine-tune a model, see the configuration files under `conf/lora/`.
You just need to provide a list of audio files and/or folders to fine-tune on, then launch the training job as usual.
```bash
python scripts/exp/train.py --args.load conf/lora/birds.yml --save_path /path/to/checkpoints
```
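A fine-tuning config might look roughly like the fragment below. The key names here are illustrative, not taken from the repository; consult the files under `conf/lora/` for the actual schema:

```yaml
# Hypothetical sketch of a fine-tuning config -- key names are illustrative.
fine_tune: true

# list of audio files / folders to fine-tune on
sources:
  - /path/to/my/audio/folder
  - /path/to/a/file.wav
```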
## Launching the Gradio Interface
```bash
python demo.py --args.load conf/interface/spotdl.yml --Interface.device cuda
```