Commit 120d6c2 by pseeth
1 parent: c91e8cc

beat tracker fix (#1)

- first commit (50f034f90d55446d572c869278b664ed381dd9ab)
- readme cleanup (5582d2e7b94dfab12e7dc90c1fa384f02af09b41)
- refactor (fc839a6a78647ed5e706d0e5b2eee512e8f8a8ec)
- refactor bugfixes (534a89cc9cf61b96dd2bcc82f7898cf77eb94bc8)
- remove wavenet, readability (04c5b94d12624e6a884f6e06a90c1abfb8afeae2)
- fix: sample prefix suffix (326b5bbb45c9ceeef030192c83e1d09035a7f509)
- remove library refs (b8622751c80761f8fd2ebdff5cfa6c58bfe34533)
- add save_epochs (225770674d206c769ab4300f2990634a01febd49)
- loudnorm, (dde5c212d420bfaba0a7d4100fc02c5fc53e200c)
- update reqs (d6b9d5b546a9d85fa14d74e94eebf6a17b24b8ae)
- fix sampling logic for paella (9439b644fe82f14614630b16e4cd6f1921d95f0f)
- fix random seeds for train! (79bcce65c6d7e0f60730055a9919866b48a6d070)
- Merge branch 'main' of github.com:descriptinc/lyrebird-vampnet into main (3d0828584db269031562ab23743bb0228543f2b8)
- add a coarse2fine eval script (260b46dfc267c9e2ef7807c18ad4379f6e0609f1)
- rm old eval script (cc3a37b0704fe38f4dedeff555ad598c9db5da88)
- save each metric on its own (275afd0ca5725a99558b5f65fcf697ff923d2530)
- interface (b54865d690d9f96e5ecceb3a539401c8ac86d339)
- fix volumenorm t -24 (34fcef96617aefac478da2667eb2000fc1bbecae)
- remove seq_len (920f55c77de892f058e26b98c1f8dd8d1acaafc3)
- fix transforms (d6a029bc0fa273ebf2229432f8bb1e8e1f611b5c)
- add opt for no prefix and no suffix (183d21c9728a3563625eaba80097b3a78580db11)
- update confs (b688797f68df14fb502f03049d5ab389b4a02cee)
- confs (f1ccdc1d6b867ce985af26725e0b1669be401ab8)
- interface, cleanup imputation code (4a2dc41ab59e9afdc5b2ce71a4527bd7bf08f4ba)
- interface improvements (a63cce029ca1ac0fd0c6326c935cc31f54f975b1)
- towards beat tracking in the interface (91f86380efebdb549e42686d5c025e523a33f54a)
- beat tracker bugfixes (5a0a80a25fda79285bf740b2158f303cbe577472)
- rm comment (554c010ce3fa91983548c4729de3cae18781eae5)
- c2f (e4e3c4e936cf0f127f7c021fb30f0b5e0e2052af)
- Merge branch 'main' of https://github.com/descriptinc/lyrebird-vampnet into main (b9277bd5d60c4f7e79d6a0074fa6197dfd12f771)
- upgrade to new codec ckpt (d48dcc40be3114122307de4e0a5eba44e8baee37)
- remove cancerous subprocess call (0a036acd9ed7b6d25eeaa0b54af54ce4435869e6)
- per-instrument models (4687dd93844cb3c6718085890238c44bc20faffa)
- exps (6f6fd13dfa58b2500b674821702865f7d1c85df9)
- typical filtering set to true (22a680a5c76cd779407a15e5eb52cae3a2f00ef9)
- coarse eval (57047e5fcc7fd5cc03de8c88b8fbb7cd24063f91)
- exps (9fbfaa60fffd03918045b3fc91ed1a546c63a16a)
- fix baseline cond (84d4ed6e7a531509140aba6b96dc347182a65556)
- c2f can be None (e3c7f4691a33dceac82b9d36c5654c606e6d94c2)
- use items instead of tensors (c1b9ba0d82891213ce63ebe02a20a3a7f100eb13)
- revert docker image (972000ef6b970bf834c88db486378974817683cd)
- negate sisdr (1fc975729b1babae43899a4e96d8450021841099)
- rm clmr (39847d40d8fd2d82f05442550d4f0cb6c8efca06)
- fix dockerfile (bcc33054a49ecb649755e7d202a5623ad0f40537)
- changes (ac059f495a6e84db76b20e97eb597886f0bb3bd4)
- eval, demo (3815be3a1c5368b233d430f7cf7a661481245cfc)
- gooood outputs (f3f463449cd952f0ebeee104c4a89ab9e270923a)
- better sampling defaults (03f09ee3138b5b6ee4a03bc5ef94cc9cf2b9e7c7)
- demo (128981df2a5a90b1350c2752ca6b350b7d29b10e)
- more tweaks (93b48cbe204776ce6946237d13272dd708f8f290)
- maestro interface (8544bbfee05e914106d8fbe74431dad70deadf4e)
- moving out (322cc3a20235540596bf36d06635ce8b75dc6636)
- reqs (4908bb47c041426e738c85c0d5139322f04fd947)
- interface for max (fa490b8a8ed1fb74ed956d3fbc36145d2bb6b53a)
- lora prep (1a5973b052a24dbcb9c02af05708462d41a7f83e)
- looks like it's working? (63015d58fe3e9e4b6a7813c64083400fb3890586)
- confs (910b45f2864f1e63471f6b12f0b6d7ed447e0a59)
- constructions (f4c9665b03e75ef98bc504ebba460d7761ca5b58)
- the refactor begins (5a343f4b6619bd99f5887d19e9a55870033c3611)
- basic readme stuff (99122c4357d54bd8ee98cbbbc8bb33f98d40dfb4)
- readme (7aa3063cad5ad723808e1d1f4306d22bc9771908)
- refactor masking, interface, demo (e3ca5f7b568fd1da332209583e9ff595bbb637ed)
- update readme (6fcf6a46dc9da33ade982190623c8f8c9cfb5a32)
- lac (bad2d3fe7ac6ed21963bc29f4366546c0058a5f0)
- add model ckpt path (6f55a79b36c42f37b501f063e97143d4b43aa299)
- demo cleanup, onset masks, pitch shifting (881d56d533c7fb9d7b96668ae1ad69603b257ec4)
- fix dropout bug for masks, refactor interfaces, add finetune setup script (c940f256e750010f780abcefb8a06fa0d49f9fd4)
- gamelan xeno canto (85e8a86c2b936d65aed99ba4056edf00f326a9c0)
- tiny sampling refactor (4d0cbfe60c83a9efd3aa7fc670a6c90baa832ad0)
- more demo ctrls (b61e699ad77c0d0d468e4c2a5bbca1e39d6f7308)
- confs (13b04cf217b9c4dc46055ac1e6684ae318df706b)
- efficient lora ckpts (75a71694824cf17fd51cce0705c062009a583660)
- critical sampling fix, two demoes for comparing old and new sampling (3f6f517b4fd2f14c7443d813a55af56ccbd79b3b)
- settling down on the new sampling routine (09b9691a666738aa47618e869266a5b282efff0c)
- update splits, reqs (cf172ac4841de1161ba130b62504d02d2f4f42b3)
- dariacore (b3caf82e70e770b17a1cceff1572ae173797f257)
- maestro script (c068a295a6fe239c94253716693a8cbf3b50c808)
- cleanup (d98455c3df585892baf81b4d0ea0130ab7ba51da)
- Update README.md (9da46f9203f655e935868ba0068f56608d4e3bf1)
- Update README.md (a84c25c272432ce9aa05e04c9bf1b2cda8d916c7)
- fix setup reqs (45390f991ea29f76afb711df3d7c32b923126221)
- update setup reqs (22b423a67d6f44dcb2e0bd0cbdda7710e84dead9)
- Create LICENSE (3445a716606e07f38aa0687d331c587f4eb7a9e0)
- more sampling fixes (33469209487504d1193f5be0a94b5edd3a6f9b5e)
- interface (2f3fb3279b5defd34899668c0731ba2d62b99ddb)
- add fig plots (e5dcb5f3962516056c96673794d5cc1d57f2c640)
- cleanup (e9fd215995de88f5c98bbe965f9e909d77f9c194)
- update readme (74cced76b6748ee5de93e176b16ed5be65a8199a)
- license updates! (eaa691b3ff238a4b8bb65aea182da776a1ae6515)
- update license (e251e23d39fdd4b0d375979094cb701d1fadfe51)
- pin audiotools version! (bfacd003d6729c963d62f9a410724ff6e0099ec1)
- add loralib (fd975c2086845ed0df7b0a929e26e9c62d0215ff)
- pip install from commit hangs? (04f1577f31421027dc71d0a0df846027ffb1d0ee)
- old audiotools commit can't find an html file (f373fd1cd920efeafaae9a0b9ceab8e3d72aad2f)
- update readme (b1cbc10df865c2eae3ae50fe4ba73b97fe275aa4)
- python version (c039932823574158f8772803e2d3a8a55b81c74b)
- cleaning up (03107fd63345468cfa32dacdb558e74b4cc5d6ae)
- host weights in zenodo (fff28a2ecf290953a2948bfd39c3127786ca5d61)
- pin numy (4c6c719ffde2395c8fc82e63e5f9ea360fb0c7e2)
- readme (3efca1407013ac845c232d9d1d7f41310a1780bc)
- disable wavebeat for now (d51eb6d10e0acb9586e547a5db3c417c1ab5899a)
- HF space prep (91e3ceb923bbd34a2cc78e1ccaf8088b4d9c7e87)
- example audio (5bd16c27e643e899e849a1a1d0f6793afbbdf1bd)
- demo (a004369aab5bee7125b564a124b6709811ba8bd7)
- hf space (b01c87d94ac1a839f258a5fbd1addb44640d3459)
- fix-readme (930a29c1bec390b504e3c051b5d3e0504fad50ed)
- colors (a371ce8b215a0f02e0f5590af011cc3a747cc372)
- gradio version (3f4f2c577ccc0874b83a7abbbd11adf0816356d8)
- gradio (5132796a5634c52a2dc826b093f65e70044a4809)
- LAC! (93ca7213c4745221b1b83e805ca961f5c8c6f055)
- pin to my audiotools fork for now (419ccdcace423e2eec652da5ee0f4a376d4a9be9)
- point to audiotools fork (4d1e39c1c90cb092950f60f146ef30e483b6a623)
- add wavebeat ckpt again (fed03a187e43a2578c294641222e7fa9c03614ee)
- codec (b56603390c10c740a60e2f8fd109e58dc5752e18)
- add wavebeat (e2a08a917c691abaa5474fa78249148967599f69)
- stuff (bc216144907abba7d41c39fe54b625bdb0a2d508)
- launch (fc887bfd15f1959b2249b07be6796187ad5c843f)
- queue (c0a5ff0616ac06c9869000cd7f8bc391a977886d)
- hmm (551e9043ff0549364c41f10b4dc7cc6d5773e23f)
- update gradio? (3ef5e8609989082cf9c2cf06dd672d01be64cfe5)
- add example audio (369a20d3a2f48c7709a68a83e9de923809ff8e28)
- lac :( (e32ae3e0f6f507c6d339a904ae7375d892efb1f4)
- update setup, reade (82c5a3634286fe4840e423d3cea64344cd5b5c34)
- lac again (bdcf1d8642781f084c35d01065a0fe03b7155126)
- update reqs (7aee7dbef2d44453620ff280735040b6175d0fc9)
- update presets (4b2f92ac2ceabbce0edf630f1794f747a0a1126e)
- defaults (e1ff20c6412cba7dff0b3439791b78913c70e7c3)
- defaults (a8b7c7569181ceed122c849c5e2311caa3af5834)
- update temp (9c53f253c3f412401db4466048abe01e72e25746)
- temp (4cc9b99cf680768da5c895333a2e688f2fe39679)
- presets (3391d3e9d418b7aade24dde4054d69462283f556)
- description (49febdc0a73baf337e93eb70ae63af4dedefc8c3)
- wavebeat (b1581a76bf3526acb123692280713ea3d16f5982)
- merge (519a4a4eb22905a80d918abc9fcdfabea2ee1e0a)
- merge (0cdf0ff7336806b588f81e5aa1030e07b01fc2de)
- temp float (7053a144728ade7f8d51418506da13a76fec58fa)
- Merge branch 'main' into pr/1 (35d58e62364c082ee19b239477a6136cd582df90)

Files changed (2)
  1. README.md +1 -1
  2. app.py +12 -14
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-title: 'VampNet: Music Generation with Masked Transformers'
+title: "VampNet: Music Generation with Masked Transformers"
 emoji: 🤖
 colorFrom: gray
 colorTo: gray
app.py CHANGED
@@ -22,6 +22,7 @@ interface = Interface(
     coarse_ckpt="./models/vampnet/coarse.pth",
     coarse2fine_ckpt="./models/vampnet/c2f.pth",
     codec_ckpt="./models/vampnet/codec.pth",
+    wavebeat_ckpt="./models/wavebeat.pth",
     device="cuda" if torch.cuda.is_available() else "cpu",
 )
 
@@ -114,7 +115,7 @@ def _vamp(data, return_mask=False):
     z,
     mask=mask,
     sampling_steps=data[num_steps],
-    temperature=data[temp]*10,
+    temperature=float(data[temp]*10),
     return_mask=True,
     typical_filtering=data[typical_filtering],
     typical_mass=data[typical_mass],
@@ -186,9 +187,9 @@ with gr.Blocks() as demo:
 
 with gr.Row():
     with gr.Column():
-        gr.Markdown("# VampNet Audio Vamping")
+        gr.Markdown("# VampNet")
         gr.Markdown("""## Description:
-        This is a demo of the VampNet, a generative audio model that transforms the input audio based on the chosen settings.
+        This is a demo of VampNet, a masked generative music model capable of doing music variations.
         You can control the extent and nature of variation with a set of manual controls and presets.
         Use this interface to experiment with different mask settings and explore the audio outputs.
         """)
@@ -196,8 +197,8 @@ with gr.Blocks() as demo:
 gr.Markdown("""
 ## Instructions:
 1. You can start by uploading some audio, or by loading the example audio.
-2. Choose a preset for the vamp operation, or manually adjust the controls to customize the mask settings.
-3. Click the "generate (vamp)!!!" button to apply the vamp operation. Listen to the output audio.
+2. Choose a preset for the vamp operation, or manually adjust the controls to customize the mask settings. Click the load preset button.
+3. Click the "generate (vamp)!!!" button to generate audio. Listen to the output audio, and the masked audio to hear the mask hints.
 4. Optionally, you can add some notes and save the result.
 5. You can also use the output as the new input and continue experimenting!
 """)
@@ -248,12 +249,15 @@ with gr.Blocks() as demo:
     "beat_mask_downbeats": False,
 },
 "slight periodic variation": {
+<<<<<<< HEAD
     "periodic_p": 5,
     "onset_mask_width": 5,
     "beat_mask_width": 0,
     "beat_mask_downbeats": False,
 },
 "moderate periodic variation": {
+=======
+>>>>>>> main
     "periodic_p": 13,
     "onset_mask_width": 5,
     "beat_mask_width": 0,
@@ -274,15 +278,9 @@ with gr.Blocks() as demo:
 "beat-driven variation": {
     "periodic_p": 0,
     "onset_mask_width": 0,
-    "beat_mask_width": 50,
+    "beat_mask_width": 20,
     "beat_mask_downbeats": False,
 },
-"beat-driven variation (downbeats only)": {
-    "periodic_p": 0,
-    "onset_mask_width": 0,
-    "beat_mask_width": 50,
-    "beat_mask_downbeats": True,
-},
 "beat-driven variation (downbeats only, strong)": {
     "periodic_p": 0,
     "onset_mask_width": 0,
@@ -304,7 +302,7 @@ with gr.Blocks() as demo:
     minimum=0,
     maximum=128,
     step=1,
-    value=3,
+    value=13,
 )
 
 
@@ -392,7 +390,7 @@ with gr.Blocks() as demo:
     label="temperature",
     minimum=0.0,
     maximum=10.0,
-    value=0.8
+    value=1.8
 )
 
 
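
Note on the presets hunk (new lines 252 to 260 of app.py): the added lines `<<<<<<< HEAD`, `=======`, and `>>>>>>> main` look like unresolved merge-conflict markers. If they are present in the file as shown, they would likely prevent app.py from parsing. Below is a minimal sketch of a check that flags such lines. It is not part of this commit. The function name `find_conflict_markers` and the sample text are hypothetical. It ignores leading whitespace, so it may also flag a legitimate line consisting only of seven `=` characters.

```python
import re

# Lines starting with seven '<', '=' or '>' characters, followed by a space
# or end of line, which is the shape of git's default conflict markers.
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if CONFLICT_MARKER.match(stripped):
            hits.append((number, stripped))
    return hits


# Hypothetical excerpt modelled on the presets hunk in this diff.
sample = '''"slight periodic variation": {
<<<<<<< HEAD
    "periodic_p": 5,
},
"moderate periodic variation": {
=======
>>>>>>> main
    "periodic_p": 13,
'''

for number, line in find_conflict_markers(sample):
    print(number, line)
```

Running the same function over the full text of app.py (for example, `find_conflict_markers(open("app.py").read())`) before committing would list any marker lines that still need resolving.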