Spaces: thecollabagepatch/magenta-retry

Apply for community grant: Personal project (gpu)

by thecollabagepatch - opened Sep 19

Discussion

thecollabagepatch

Owner Sep 19

edited Sep 19

I’ve been obsessed with real-time audio injection since I saw Jesse Engel's demo of MagentaRT, but Google Colab just wasn't gonna cut it for me.

So I dockerized it, built an API (HTTP + WebSocket), and wrapped it in a Hugging Face Space you can duplicate and jam with today.

The HTTP endpoints are designed to work inside an iOS app I've been iterating on. You can simply duplicate the template Space and point the app at its URL for personal use. The API returns 4-8 bar chunks using input audio as context, either single-shot or continuously.
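For a rough sense of what a 4-8 bar chunk means in seconds, here is a quick back-of-the-envelope sketch. It is not part of the Space's API: the function name `chunk_seconds` and the 4/4 default are assumptions made for illustration, and the real chunk length depends on whatever tempo and meter the request uses.

```python
def chunk_seconds(bars: int, bpm: float, beats_per_bar: int = 4) -> float:
    """Length in seconds of `bars` bars at `bpm`.

    Hypothetical helper for illustration only; assumes a fixed meter
    (4/4 by default) and a constant tempo.
    """
    if bars <= 0 or bpm <= 0 or beats_per_bar <= 0:
        raise ValueError("bars, bpm, and beats_per_bar must be positive")
    # beats in the chunk, times seconds per beat
    return bars * beats_per_bar * 60.0 / bpm


if __name__ == "__main__":
    for bars in (4, 8):
        print(f"{bars} bars at 120 BPM: {chunk_seconds(bars, 120):.1f} s")
```

At 120 BPM in 4/4 that works out to 8 seconds for 4 bars and 16 seconds for 8 bars, so slower tempos mean longer chunks for the same bar count.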

I also created a simple WebSocket route so that it can run easily inside of other UIs, like the HTML web app demo you'll find inside the Space.

There are also endpoints to switch to any MagentaRT finetune hosted on Hugging Face, and instructions for how to train and upload your own so that we can all jam with each other's models.

MagentaRT is pretty large and I don't have access to a big enough GPU locally to run it, but I would like to continue developing this open source project with the community's help.

I'll be releasing the iOS app for free once all the kinks are ironed out.

MagentaRT requires 30+ GB of GPU VRAM to reliably run real-time generation, so on Hugging Face infrastructure that means I need to be able to test on an L40S.

I'll be exploring ways to optimize it so that it can run on smaller GPUs. An L4 can't quite hit real time with it yet, but I hope to solve that.
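"Hitting real time" here just means the model produces audio at least as fast as it plays back. A minimal sketch of how that could be checked is below. The names `real_time_factor`, `keeps_up`, and `time_call` are hypothetical, and the numbers in the usage lines are made-up placeholders, not measurements from an L4 or L40S.

```python
import time
from typing import Callable


def real_time_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio produced per second of wall-clock time.

    A value of 1.0 or more means generation keeps pace with playback.
    """
    if audio_seconds <= 0 or wall_seconds <= 0:
        raise ValueError("audio_seconds and wall_seconds must be positive")
    return audio_seconds / wall_seconds


def keeps_up(audio_seconds: float, wall_seconds: float, headroom: float = 1.0) -> bool:
    """True if the real-time factor meets the required headroom."""
    return real_time_factor(audio_seconds, wall_seconds) >= headroom


def time_call(generate: Callable[[], object]) -> float:
    """Wall-clock seconds taken by one call to `generate` (any callable)."""
    start = time.perf_counter()
    generate()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder numbers: 2.0 s of audio taking 2.5 s vs 1.6 s to generate.
    print(keeps_up(2.0, 2.5))  # slower than playback
    print(keeps_up(2.0, 1.6))  # faster than playback
```

In practice you would pass the generation call to `time_call` and compare against the chunk's duration, ideally averaged over many chunks and with some headroom above 1.0 so playback doesn't underrun.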

A community GPU grant will help me continue testing these applications designed for local usage. It's important to me that music models remain open source and run locally.

Side note: this API will also find its way into a free VST I've created that already includes 3 other music model architectures (musicgen, melodyflow, and stable-audio-open-small).

https://thepatch.gumroad.com/l/gary4juce

The single-shot generation endpoints are already well designed for workflows using the DAW for audio context, and I have a feeling something truly dank will emerge if I incorporate the continuous generation endpoints into it.

I'll be iterating on this project every day, as you can see from my commits lol...

thecollabagepatch

Owner Sep 21

v1 of the iOS app is up on TestFlight. lmk if anyone wants to play with it.

thecollabagepatch

Owner Sep 25

Single-shot generations with DAW audio as context using this API are coming shortly.

thecollabagepatch

Owner Sep 27
