beat tracker fix (#1)
- first commit (50f034f90d55446d572c869278b664ed381dd9ab)
- readme cleanup (5582d2e7b94dfab12e7dc90c1fa384f02af09b41)
- refactor (fc839a6a78647ed5e706d0e5b2eee512e8f8a8ec)
- refactor bugfixes (534a89cc9cf61b96dd2bcc82f7898cf77eb94bc8)
- remove wavenet, readability (04c5b94d12624e6a884f6e06a90c1abfb8afeae2)
- fix: sample prefix suffix (326b5bbb45c9ceeef030192c83e1d09035a7f509)
- remove library refs (b8622751c80761f8fd2ebdff5cfa6c58bfe34533)
- add save_epochs (225770674d206c769ab4300f2990634a01febd49)
- loudnorm, (dde5c212d420bfaba0a7d4100fc02c5fc53e200c)
- update reqs (d6b9d5b546a9d85fa14d74e94eebf6a17b24b8ae)
- fix sampling logic for paella (9439b644fe82f14614630b16e4cd6f1921d95f0f)
- fix random seeds for train! (79bcce65c6d7e0f60730055a9919866b48a6d070)
- Merge branch 'main' of github.com:descriptinc/lyrebird-vampnet into main (3d0828584db269031562ab23743bb0228543f2b8)
- add a coarse2fine eval script (260b46dfc267c9e2ef7807c18ad4379f6e0609f1)
- rm old eval script (cc3a37b0704fe38f4dedeff555ad598c9db5da88)
- save each metric on its own (275afd0ca5725a99558b5f65fcf697ff923d2530)
- interface (b54865d690d9f96e5ecceb3a539401c8ac86d339)
- fix volumenorm t -24 (34fcef96617aefac478da2667eb2000fc1bbecae)
- remove seq_len (920f55c77de892f058e26b98c1f8dd8d1acaafc3)
- fix transforms (d6a029bc0fa273ebf2229432f8bb1e8e1f611b5c)
- add opt for no prefix and no suffix (183d21c9728a3563625eaba80097b3a78580db11)
- update confs (b688797f68df14fb502f03049d5ab389b4a02cee)
- confs (f1ccdc1d6b867ce985af26725e0b1669be401ab8)
- interface, cleanup imputation code (4a2dc41ab59e9afdc5b2ce71a4527bd7bf08f4ba)
- interface improvements (a63cce029ca1ac0fd0c6326c935cc31f54f975b1)
- towards beat tracking in the interface (91f86380efebdb549e42686d5c025e523a33f54a)
- beat tracker bugfixes (5a0a80a25fda79285bf740b2158f303cbe577472)
- rm comment (554c010ce3fa91983548c4729de3cae18781eae5)
- c2f (e4e3c4e936cf0f127f7c021fb30f0b5e0e2052af)
- Merge branch 'main' of https://github.com/descriptinc/lyrebird-vampnet into main (b9277bd5d60c4f7e79d6a0074fa6197dfd12f771)
- upgrade to new codec ckpt (d48dcc40be3114122307de4e0a5eba44e8baee37)
- remove cancerous subprocess call (0a036acd9ed7b6d25eeaa0b54af54ce4435869e6)
- per-instrument models (4687dd93844cb3c6718085890238c44bc20faffa)
- exps (6f6fd13dfa58b2500b674821702865f7d1c85df9)
- typical filtering set to true (22a680a5c76cd779407a15e5eb52cae3a2f00ef9)
- coarse eval (57047e5fcc7fd5cc03de8c88b8fbb7cd24063f91)
- exps (9fbfaa60fffd03918045b3fc91ed1a546c63a16a)
- fix baseline cond (84d4ed6e7a531509140aba6b96dc347182a65556)
- c2f can be None (e3c7f4691a33dceac82b9d36c5654c606e6d94c2)
- use items instead of tensors (c1b9ba0d82891213ce63ebe02a20a3a7f100eb13)
- revert docker image (972000ef6b970bf834c88db486378974817683cd)
- negate sisdr (1fc975729b1babae43899a4e96d8450021841099)
- rm clmr (39847d40d8fd2d82f05442550d4f0cb6c8efca06)
- fix dockerfile (bcc33054a49ecb649755e7d202a5623ad0f40537)
- changes (ac059f495a6e84db76b20e97eb597886f0bb3bd4)
- eval, demo (3815be3a1c5368b233d430f7cf7a661481245cfc)
- gooood outputs (f3f463449cd952f0ebeee104c4a89ab9e270923a)
- better sampling defaults (03f09ee3138b5b6ee4a03bc5ef94cc9cf2b9e7c7)
- demo (128981df2a5a90b1350c2752ca6b350b7d29b10e)
- more tweaks (93b48cbe204776ce6946237d13272dd708f8f290)
- maestro interface (8544bbfee05e914106d8fbe74431dad70deadf4e)
- moving out (322cc3a20235540596bf36d06635ce8b75dc6636)
- reqs (4908bb47c041426e738c85c0d5139322f04fd947)
- interface for max (fa490b8a8ed1fb74ed956d3fbc36145d2bb6b53a)
- lora prep (1a5973b052a24dbcb9c02af05708462d41a7f83e)
- looks like it's working? (63015d58fe3e9e4b6a7813c64083400fb3890586)
- confs (910b45f2864f1e63471f6b12f0b6d7ed447e0a59)
- constructions (f4c9665b03e75ef98bc504ebba460d7761ca5b58)
- the refactor begins (5a343f4b6619bd99f5887d19e9a55870033c3611)
- basic readme stuff (99122c4357d54bd8ee98cbbbc8bb33f98d40dfb4)
- readme (7aa3063cad5ad723808e1d1f4306d22bc9771908)
- refactor masking, interface, demo (e3ca5f7b568fd1da332209583e9ff595bbb637ed)
- update readme (6fcf6a46dc9da33ade982190623c8f8c9cfb5a32)
- lac (bad2d3fe7ac6ed21963bc29f4366546c0058a5f0)
- add model ckpt path (6f55a79b36c42f37b501f063e97143d4b43aa299)
- demo cleanup, onset masks, pitch shifting (881d56d533c7fb9d7b96668ae1ad69603b257ec4)
- fix dropout bug for masks, refactor interfaces, add finetune setup script (c940f256e750010f780abcefb8a06fa0d49f9fd4)
- gamelan xeno canto (85e8a86c2b936d65aed99ba4056edf00f326a9c0)
- tiny sampling refactor (4d0cbfe60c83a9efd3aa7fc670a6c90baa832ad0)
- more demo ctrls (b61e699ad77c0d0d468e4c2a5bbca1e39d6f7308)
- confs (13b04cf217b9c4dc46055ac1e6684ae318df706b)
- efficient lora ckpts (75a71694824cf17fd51cce0705c062009a583660)
- critical sampling fix, two demoes for comparing old and new sampling (3f6f517b4fd2f14c7443d813a55af56ccbd79b3b)
- settling down on the new sampling routine (09b9691a666738aa47618e869266a5b282efff0c)
- update splits, reqs (cf172ac4841de1161ba130b62504d02d2f4f42b3)
- dariacore (b3caf82e70e770b17a1cceff1572ae173797f257)
- maestro script (c068a295a6fe239c94253716693a8cbf3b50c808)
- cleanup (d98455c3df585892baf81b4d0ea0130ab7ba51da)
- Update README.md (9da46f9203f655e935868ba0068f56608d4e3bf1)
- Update README.md (a84c25c272432ce9aa05e04c9bf1b2cda8d916c7)
- fix setup reqs (45390f991ea29f76afb711df3d7c32b923126221)
- update setup reqs (22b423a67d6f44dcb2e0bd0cbdda7710e84dead9)
- Create LICENSE (3445a716606e07f38aa0687d331c587f4eb7a9e0)
- more sampling fixes (33469209487504d1193f5be0a94b5edd3a6f9b5e)
- interface (2f3fb3279b5defd34899668c0731ba2d62b99ddb)
- add fig plots (e5dcb5f3962516056c96673794d5cc1d57f2c640)
- cleanup (e9fd215995de88f5c98bbe965f9e909d77f9c194)
- update readme (74cced76b6748ee5de93e176b16ed5be65a8199a)
- license updates! (eaa691b3ff238a4b8bb65aea182da776a1ae6515)
- update license (e251e23d39fdd4b0d375979094cb701d1fadfe51)
- pin audiotools version! (bfacd003d6729c963d62f9a410724ff6e0099ec1)
- add loralib (fd975c2086845ed0df7b0a929e26e9c62d0215ff)
- pip install from commit hangs? (04f1577f31421027dc71d0a0df846027ffb1d0ee)
- old audiotools commit can't find an html file (f373fd1cd920efeafaae9a0b9ceab8e3d72aad2f)
- update readme (b1cbc10df865c2eae3ae50fe4ba73b97fe275aa4)
- python version (c039932823574158f8772803e2d3a8a55b81c74b)
- cleaning up (03107fd63345468cfa32dacdb558e74b4cc5d6ae)
- host weights in zenodo (fff28a2ecf290953a2948bfd39c3127786ca5d61)
- pin numy (4c6c719ffde2395c8fc82e63e5f9ea360fb0c7e2)
- readme (3efca1407013ac845c232d9d1d7f41310a1780bc)
- disable wavebeat for now (d51eb6d10e0acb9586e547a5db3c417c1ab5899a)
- HF space prep (91e3ceb923bbd34a2cc78e1ccaf8088b4d9c7e87)
- example audio (5bd16c27e643e899e849a1a1d0f6793afbbdf1bd)
- demo (a004369aab5bee7125b564a124b6709811ba8bd7)
- hf space (b01c87d94ac1a839f258a5fbd1addb44640d3459)
- fix-readme (930a29c1bec390b504e3c051b5d3e0504fad50ed)
- colors (a371ce8b215a0f02e0f5590af011cc3a747cc372)
- gradio version (3f4f2c577ccc0874b83a7abbbd11adf0816356d8)
- gradio (5132796a5634c52a2dc826b093f65e70044a4809)
- LAC! (93ca7213c4745221b1b83e805ca961f5c8c6f055)
- pin to my audiotools fork for now (419ccdcace423e2eec652da5ee0f4a376d4a9be9)
- point to audiotools fork (4d1e39c1c90cb092950f60f146ef30e483b6a623)
- add wavebeat ckpt again (fed03a187e43a2578c294641222e7fa9c03614ee)
- codec (b56603390c10c740a60e2f8fd109e58dc5752e18)
- add wavebeat (e2a08a917c691abaa5474fa78249148967599f69)
- stuff (bc216144907abba7d41c39fe54b625bdb0a2d508)
- launch (fc887bfd15f1959b2249b07be6796187ad5c843f)
- queue (c0a5ff0616ac06c9869000cd7f8bc391a977886d)
- hmm (551e9043ff0549364c41f10b4dc7cc6d5773e23f)
- update gradio? (3ef5e8609989082cf9c2cf06dd672d01be64cfe5)
- add example audio (369a20d3a2f48c7709a68a83e9de923809ff8e28)
- lac :( (e32ae3e0f6f507c6d339a904ae7375d892efb1f4)
- update setup, reade (82c5a3634286fe4840e423d3cea64344cd5b5c34)
- lac again (bdcf1d8642781f084c35d01065a0fe03b7155126)
- update reqs (7aee7dbef2d44453620ff280735040b6175d0fc9)
- update presets (4b2f92ac2ceabbce0edf630f1794f747a0a1126e)
- defaults (e1ff20c6412cba7dff0b3439791b78913c70e7c3)
- defaults (a8b7c7569181ceed122c849c5e2311caa3af5834)
- update temp (9c53f253c3f412401db4466048abe01e72e25746)
- temp (4cc9b99cf680768da5c895333a2e688f2fe39679)
- presets (3391d3e9d418b7aade24dde4054d69462283f556)
- description (49febdc0a73baf337e93eb70ae63af4dedefc8c3)
- wavebeat (b1581a76bf3526acb123692280713ea3d16f5982)
- merge (519a4a4eb22905a80d918abc9fcdfabea2ee1e0a)
- merge (0cdf0ff7336806b588f81e5aa1030e07b01fc2de)
- temp float (7053a144728ade7f8d51418506da13a76fec58fa)
- Merge branch 'main' into pr/1 (35d58e62364c082ee19b239477a6136cd582df90)
@@ -1,5 +1,5 @@
 ---
-title:
+title: "VampNet: Music Generation with Masked Transformers"
 emoji: 🤖
 colorFrom: gray
 colorTo: gray
@@ -22,6 +22,7 @@ interface = Interface(
     coarse_ckpt="./models/vampnet/coarse.pth",
     coarse2fine_ckpt="./models/vampnet/c2f.pth",
     codec_ckpt="./models/vampnet/codec.pth",
+    wavebeat_ckpt="./models/wavebeat.pth",
     device="cuda" if torch.cuda.is_available() else "cpu",
 )
 
@@ -114,7 +115,7 @@ def _vamp(data, return_mask=False):
         z,
         mask=mask,
         sampling_steps=data[num_steps],
-        temperature=data[temp]*10,
+        temperature=float(data[temp]*10),
         return_mask=True,
         typical_filtering=data[typical_filtering],
         typical_mass=data[typical_mass],
@@ -186,9 +187,9 @@ with gr.Blocks() as demo:
 
     with gr.Row():
         with gr.Column():
-            gr.Markdown("# VampNet
+            gr.Markdown("# VampNet")
             gr.Markdown("""## Description:
-            This is a demo of
+            This is a demo of VampNet, a masked generative music model capable of doing music variations.
             You can control the extent and nature of variation with a set of manual controls and presets.
             Use this interface to experiment with different mask settings and explore the audio outputs.
             """)
@@ -196,8 +197,8 @@ with gr.Blocks() as demo:
             gr.Markdown("""
             ## Instructions:
             1. You can start by uploading some audio, or by loading the example audio.
-            2. Choose a preset for the vamp operation, or manually adjust the controls to customize the mask settings.
-            3. Click the "generate (vamp)!!!" button to
+            2. Choose a preset for the vamp operation, or manually adjust the controls to customize the mask settings. Click the load preset button.
+            3. Click the "generate (vamp)!!!" button to generate audio. Listen to the output audio, and the masked audio to hear the mask hints.
             4. Optionally, you can add some notes and save the result.
             5. You can also use the output as the new input and continue experimenting!
             """)
@@ -248,12 +249,15 @@ with gr.Blocks() as demo:
             "beat_mask_downbeats": False,
         },
         "slight periodic variation": {
+<<<<<<< HEAD
             "periodic_p": 5,
             "onset_mask_width": 5,
             "beat_mask_width": 0,
             "beat_mask_downbeats": False,
         },
         "moderate periodic variation": {
+=======
+>>>>>>> main
             "periodic_p": 13,
             "onset_mask_width": 5,
             "beat_mask_width": 0,
@@ -274,15 +278,9 @@ with gr.Blocks() as demo:
         "beat-driven variation": {
             "periodic_p": 0,
             "onset_mask_width": 0,
-            "beat_mask_width":
+            "beat_mask_width": 20,
             "beat_mask_downbeats": False,
         },
-        "beat-driven variation (downbeats only)": {
-            "periodic_p": 0,
-            "onset_mask_width": 0,
-            "beat_mask_width": 50,
-            "beat_mask_downbeats": True,
-        },
         "beat-driven variation (downbeats only, strong)": {
             "periodic_p": 0,
             "onset_mask_width": 0,
@@ -304,7 +302,7 @@ with gr.Blocks() as demo:
             minimum=0,
             maximum=128,
             step=1,
-            value=
+            value=13,
         )
 
 
@@ -392,7 +390,7 @@ with gr.Blocks() as demo:
             label="temperature",
             minimum=0.0,
             maximum=10.0,
-            value=
+            value=1.8
         )
 
 
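The `-248,12 +249,15` hunk adds three lines (`<<<<<<< HEAD`, `=======`, `>>>>>>> main`) that look like unresolved Git merge-conflict markers inside the presets dictionary. The following is a minimal sketch, not part of this commit, of a check that flags such lines in a source file before it is committed. The function name `find_conflict_markers` and the sample text are hypothetical, and the pattern only covers the default two-way marker style.

```python
import re

# Matches the three marker lines Git writes into a conflicted file
# (default style only; the diff3 "|||||||" marker is not covered).
_MARKER = re.compile(r"^(<{7}( .*)?|={7}|>{7}( .*)?)$")


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if _MARKER.match(stripped):
            hits.append((number, stripped))
    return hits


# Hypothetical excerpt shaped like the presets hunk above.
sample = '''"slight periodic variation": {
<<<<<<< HEAD
    "periodic_p": 5,
},
"moderate periodic variation": {
=======
>>>>>>> main
    "periodic_p": 13,
'''

for number, line in find_conflict_markers(sample):
    print(number, line)
```

Run against a file's contents, an empty result means no marker lines were found; any hit is worth resolving by hand before merging.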