Dance Diffusion by Zach Evans.
Dance Diffusion is the first in a suite of generative audio tools for producers and musicians to be released by Harmonai. For more info or to get involved in the development of these tools, please visit https://harmonai.org and fill out the form on the front page.
The original codebase of this implementation can be found at https://github.com/Harmonai-org/sample-generator.
| Pipeline | Tasks | Colab |
|---|---|---|
| pipeline_dance_diffusion.py | Unconditional Audio Generation | - |
class diffusers.DanceDiffusionPipeline
( unet, scheduler )
- unet (UNet1DModel) — U-Net architecture to denoise the encoded audio.
- scheduler (SchedulerMixin) — A scheduler to be used in combination with unet to denoise the encoded audio. Can be one of IPNDMScheduler.
This model inherits from DiffusionPipeline. Check the superclass documentation for the generic methods the library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)
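For illustration, here is a minimal sketch of instantiating the pipeline from a pretrained checkpoint rather than passing unet and scheduler by hand. The checkpoint name "harmonai/maestro-150k" is one of the Harmonai checkpoints on the Hugging Face Hub; substitute any Dance Diffusion checkpoint you prefer.

```python
import torch
from diffusers import DanceDiffusionPipeline

# from_pretrained loads the UNet1DModel and scheduler together.
# "harmonai/maestro-150k" is an example checkpoint, not the only option.
pipeline = DanceDiffusionPipeline.from_pretrained("harmonai/maestro-150k")
pipeline = pipeline.to("cuda" if torch.cuda.is_available() else "cpu")
```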
__call__
batch_size: int = 1
num_inference_steps: int = 100
generator: typing.Optional[torch._C.Generator] = None
audio_length_in_s: typing.Optional[float] = None
return_dict: bool = True
- batch_size (int, optional, defaults to 1) — The number of audio samples to generate.
- num_inference_steps (int, optional, defaults to 100) — The number of denoising steps. More denoising steps usually lead to a higher-quality audio sample at the expense of slower inference.
- generator (torch.Generator, optional) — A torch generator to make generation deterministic.
- audio_length_in_s (float, optional, defaults to self.unet.config.sample_size/self.unet.config.sample_rate) — The length of the generated audio sample in seconds. Note that the output of the pipeline, i.e. sample_size, will be audio_length_in_s * self.unet.config.sample_rate.
- return_dict (bool, optional, defaults to True) — Whether or not to return an AudioPipelineOutput instead of a plain tuple.
Returns: AudioPipelineOutput or tuple — An AudioPipelineOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is a list with the generated audio.
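Putting the __call__ arguments together, here is a hedged end-to-end sketch of generating audio and writing it to disk. It assumes the pipeline was loaded as above and that output.audios is a NumPy array of shape (batch_size, channels, samples), so the array is transposed before passing it to scipy, which expects (samples, channels).

```python
from scipy.io.wavfile import write as write_wav

# Generate one 4-second sample with the default 100 denoising steps.
output = pipeline(
    batch_size=1,
    num_inference_steps=100,
    audio_length_in_s=4.0,
)

for i, audio in enumerate(output.audios):
    # pipeline.unet.config.sample_rate is the model's native sample rate;
    # the transpose converts (channels, samples) to (samples, channels).
    write_wav(f"dance_diffusion_{i}.wav", pipeline.unet.config.sample_rate, audio.transpose())
```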