Update index.html

index.html CHANGED (+6 -4)
@@ -9,13 +9,13 @@
   <body>
   <div class="card">
   <h1>Riffusion-Melodiff-v1</h1>
-  <p>Riffusion-Melodiff is simple, but interesting idea, (that I have not seen anywhere else) how to
-  <p>Riffusion-Melodiff is built on a top of
+  <p><br> Riffusion-Melodiff is a simple but interesting idea (one I have not seen anywhere else) for creating cover versions of songs.</p>
+  <p><br> Riffusion-Melodiff is built on top of the
   <a href="https://huggingface.co/riffusion/riffusion-model-v1" target="_blank">Riffusion </a>
   model, which is a Stable Diffusion model fine-tuned to generate Mel spectrograms. (A spectrogram is a kind of
   visual representation of music, obtained by dividing waveforms into frequencies.) Riffusion-Melodiff contains no new model; there was no new training and no fine-tuning.
   It uses the same model as Riffusion, only in a different way.</p>
-  <p>Riffusion-Melodiff uses Img2Img pipeline from Diffusers library to modify images of Mel Spectrograms to produce
+  <p>Riffusion-Melodiff uses the Img2Img pipeline from the Diffusers library to modify images of Mel spectrograms and produce new versions of music. Just upload your audio
   in wav format (if your audio is in a different format, convert it to wav first with an online converter). Then you can run the Img2Img pipeline from the Diffusers library
   with your prompt, seed, and strength. The strength parameter decides how much the modified audio relates to the initial audio and how much it relates to the prompt.
   When strength is too low, the spectrogram stays too similar to the original and we get no new modification. When strength is too high, the spectrogram is too
@@ -80,7 +80,9 @@
   <source src="When_the_Saints_long_piano_i2i_flute.wav" type="audio/wav">
   Your browser does not support the audio element.
   </audio>
-  </p>
+  </p>
+  <p> <br> I am using the standard (free) Google Colab GPU configuration for inference, with the default number of inference steps (23) from the underlying
+  pipelines. With this setup it takes about 8 s to produce a 5 s modified sample. For a start, that is fine, I would say.</p>
   </div>
   </body>
   </html>
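
The page describes Mel spectrograms only in passing, so here is a minimal, illustrative sketch of the waveform-to-spectrogram round trip using torchaudio. This is not the converter Riffusion itself ships (which also handles amplitude scaling and encoding the spectrogram as an image); the file names and transform parameters below are placeholder assumptions.

```python
# Illustrative only: wav -> Mel spectrogram -> wav with torchaudio.
import torchaudio
import torchaudio.transforms as T

waveform, sr = torchaudio.load("input.wav")    # "input.wav" is a placeholder

n_fft, hop_length, n_mels = 2048, 512, 512
to_mel = T.MelSpectrogram(sample_rate=sr, n_fft=n_fft,
                          hop_length=hop_length, n_mels=n_mels)
mel = to_mel(waveform)                          # shape: (channels, n_mels, frames)

# Going back: estimate the linear spectrogram from the Mel bins,
# then recover the discarded phase with Griffin-Lim.
inv_mel = T.InverseMelScale(n_stft=n_fft // 2 + 1, n_mels=n_mels, sample_rate=sr)
griffin_lim = T.GriffinLim(n_fft=n_fft, hop_length=hop_length)
recovered = griffin_lim(inv_mel(mel))

torchaudio.save("roundtrip.wav", recovered, sr)
```

Griffin-Lim can only approximate the phase that the Mel transform throws away, which is one reason a round-tripped clip sounds slightly degraded even before any Img2Img modification.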
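
The modification step the page describes maps directly onto the stock Img2Img API in diffusers. Below is a hedged sketch, assuming the unmodified riffusion/riffusion-model-v1 checkpoint and a 512x512 spectrogram image like the one produced above; the prompt, file names, and seed are invented for illustration and are not from the project.

```python
# Hedged sketch of the Img2Img modification step described above.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Plain Riffusion checkpoint -- no new training or fine-tuning involved.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "riffusion/riffusion-model-v1", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("spectrogram.png").convert("RGB")  # placeholder file

result = pipe(
    prompt="solo flute, classical",   # invented example prompt
    image=init_image,
    strength=0.5,               # low: stays near the original; high: follows the prompt
    num_inference_steps=23,     # the step count the page mentions
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed for reproducibility
).images[0]
result.save("spectrogram_modified.png")
```

In practice you would sweep strength (and seeds) until the output keeps the original melody while taking on the prompted instrument, which is exactly the trade-off the page describes; the modified spectrogram is then converted back to wav.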