@fffiloni on Hugging Face: "I'm happy to announce that ✨ Image to Music v2 ✨ is ready for you to try and i…"

fffiloni

posted an update Feb 2, 2024

Post

I'm happy to announce that ✨ Image to Music v2 ✨ is ready for you to try and i hope you'll like it too ! 😌

This new version has been crafted with transparency in mind,
so you can understand the process of translating an image to a musical equivalent.

How does it works under the hood ? 🤔

First, we get a very literal caption from microsoft/kosmos-2-patch14-224; this caption is then given to a LLM Agent (currently HuggingFaceH4/zephyr-7b-beta )which task is to translate the image caption to a musical and inspirational prompt for the next step.

Once we got a nice musical text from the LLM, we can send it to the text-to-music model of your choice:
MAGNet, MusicGen, AudioLDM-2, Riffusion or Mustango

Instead of the previous version of Image to Music which used Mubert API, and could output curious and obscure combinations, we only provide open sourced models available on the hub, called via the gradio API.

Also i guess the music result should be more accurate to the atmosphere of the image input, thanks to the LLM Agent step.

Pro tip, you can adjust the inspirational prompt to match your expectations, according to the chosen model and specific behavior of each one 👌

Try it, explore different models and tell me which one is your favorite 🤗
—› fffiloni/image-to-music-v2

mvaloatto

Feb 5, 2024

✨ Tried with this portrait of mine, there is something other-worldly about it... https://x.com/mvaloatto/status/1754404850664616240

victor

Feb 5, 2024

Wow it sounds quite epic to me!

MaziyarPanahi

Feb 5, 2024

This is so cool! I can't stop playing with it!

andgly95

Feb 13, 2024

This is pretty neat

acebruck

Mar 12, 2024

hey, your previous version "Image-to-MusicGen" is or was much much better, with the use of "CLIP-Interrogator-2". 30 seconds was ok, could be longer, but it was perfect! i made very good music from my images and can you please fix it!? and again, if the track could be a bit longer than 30 seconds it would be crazy good!
thank you!

MehdiLeZ

May 24, 2024

Cool stuff @fffiloni ! Happy to see how we could connect it to MusicLang (https://huggingface.co/musiclang) to add fine-grained control over the music generation for the next version :)

Join the conversation