Update README.md
musicgen-songstarter-v0.2 is [`musicgen-stereo-melody-large`](https://huggingface.co/facebook/musicgen-stereo-melody-large) fine-tuned on a dataset of melody loops from my Splice sample library. It's intended to generate song ideas that are useful for music producers. It outputs stereo audio at 32 kHz.
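If you post-process or save the waveforms yourself, keep that output spec in mind: two channels at 32 kHz. A minimal sketch using Python's stdlib `wave` module (the sine signal and filename are just stand-ins of mine, not the model's output; in the usual audiocraft workflow, `audio_write` handles saving for you):

```python
import math
import struct
import wave

SAMPLE_RATE = 32_000  # the model generates audio at 32 kHz
CHANNELS = 2          # stereo output

# Stand-in signal: one second of a 440 Hz sine, copied to both channels.
# In practice this would be the model's generated waveform.
frames = bytearray()
for n in range(SAMPLE_RATE):
    sample = int(32767 * 0.2 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    frames += struct.pack("<hh", sample, sample)  # left, right (16-bit PCM)

with wave.open("songstarter_demo.wav", "wb") as f:
    f.setnchannels(CHANNELS)
    f.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
    f.setframerate(SAMPLE_RATE)
    f.writeframes(bytes(frames))
```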
**👀 Update:** I wrote a [blogpost](https://nateraw.com/posts/training_musicgen_songstarter.html) detailing how and why I trained this model, including training details, the dataset, Weights and Biases logs, etc.
Compared to [`musicgen-songstarter-v0.1`](https://huggingface.co/nateraw/musicgen-songstarter-v0.1), this new version:
- Was trained on 3x more unique, manually curated samples that I painstakingly purchased on Splice
- Is twice the size, bumped up from a `medium` ➡️ `large` transformer LM
Use the following prompt format:
```
{tag_1}, {tag_2}, ..., {tag_n}, {key}, {bpm} bpm
```
For example:
```
hip hop, soul, piano, chords, jazz, neo jazz, G# minor, 140 bpm
```
For some example tags, [see the prompt format section of musicgen-songstarter-v0.1's readme](https://huggingface.co/nateraw/musicgen-songstarter-v0.1#prompt-format). The tags there are for the smaller v1 dataset, but should give you an idea of what the model saw.
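To avoid typos when assembling prompts by hand, the format above can be built with a small helper (a sketch; `build_prompt` is my own name, not part of the model or audiocraft):

```python
def build_prompt(tags, key, bpm):
    """Assemble a prompt in the expected format:
    {tag_1}, {tag_2}, ..., {tag_n}, {key}, {bpm} bpm
    """
    return ", ".join([*tags, key, f"{bpm} bpm"])

# Reproduces the example prompt above.
prompt = build_prompt(
    ["hip hop", "soul", "piano", "chords", "jazz", "neo jazz"],
    key="G# minor",
    bpm=140,
)
print(prompt)  # hip hop, soul, piano, chords, jazz, neo jazz, G# minor, 140 bpm
```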
## Samples
<table style="width:100%; text-align:center;">
</table>
## Training Details
For more verbose details, you can check out the [blogpost](https://nateraw.com/posts/training_musicgen_songstarter.html#training).
- **code**:
  - Repo is [here](https://github.com/nateraw/audiocraft). It's an undocumented fork of [facebookresearch/audiocraft](https://github.com/facebookresearch/audiocraft) where I rewrote the training loop with PyTorch Lightning, which worked a bit better for me.
- **data**:
  - around 1700-1800 samples I manually listened to + purchased via my personal [Splice](https://splice.com) account. About 7-8 hours of audio.
  - Given the licensing terms, I cannot share the data.
- **hardware**:
  - 8x A100 40GB instance from Lambda Labs
- **procedure**:
  - trained for 10k steps, which took about 6 hours
  - reduced segment duration at train time to 15 seconds
- **hparams/logs**:
  - See the wandb [run](https://wandb.ai/nateraw/musicgen-songstarter-v0.2/runs/63gh4l7m), which includes training metrics, logs, hardware metrics at train time, hyperparameters, and the exact command I used when I ran the training script.
## Acknowledgements
This work would not have been possible without: