Diffusers documentation

Pseudo numerical methods for diffusion models (PNDM)

You are viewing v0.18.2 version. A newer version v0.32.1 is available.

Overview

Original implementation can be found here.

PNDMScheduler

class diffusers.PNDMScheduler

( num_train_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, beta_schedule: str = 'linear', trained_betas: typing.Union[numpy.ndarray, typing.List[float], NoneType] = None, skip_prk_steps: bool = False, set_alpha_to_one: bool = False, prediction_type: str = 'epsilon', timestep_spacing: str = 'leading', steps_offset: int = 0 )

Parameters

  • num_train_timesteps (int) — number of diffusion steps used to train the model.
  • beta_start (float) — the starting beta value of inference.
  • beta_end (float) — the final beta value.
  • beta_schedule (str) — the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear, scaled_linear, or squaredcos_cap_v2.
  • trained_betas (np.ndarray, optional) — option to pass an array of betas directly to the constructor to bypass beta_start, beta_end etc.
  • skip_prk_steps (bool) — allows the scheduler to skip the Runge-Kutta steps that the original paper defines as required before the PLMS steps; defaults to False.
  • set_alpha_to_one (bool, default False) — each diffusion step uses the value of alphas product at that step and at the previous one. For the final step there is no previous alpha. When this option is True the previous alpha product is fixed to 1, otherwise it uses the value of alpha at step 0.
  • prediction_type (str, default epsilon, optional) — prediction type of the scheduler function, one of epsilon (predicting the noise of the diffusion process) or v_prediction (see section 2.4 of https://imagen.research.google/video/paper.pdf).
  • timestep_spacing (str, default "leading") — The way the timesteps should be scaled. Refer to Table 2. of Common Diffusion Noise Schedules and Sample Steps are Flawed for more information.
  • steps_offset (int, default 0) — an offset added to the inference steps. You can combine steps_offset=1 with set_alpha_to_one=False to make the last step use step 0 for the previous alpha product, as done in Stable Diffusion.
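
To illustrate how beta_start, beta_end, and beta_schedule interact, here is a minimal pure-Python sketch. It is not the library's code: the helper names make_betas and cumulative_alphas are hypothetical, and the formulas for linear and scaled_linear are assumptions based on the common definitions of these schedules (squaredcos_cap_v2 is omitted).

```python
import math


def make_betas(num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02,
               beta_schedule="linear"):
    """Hypothetical helper: build a list of betas for a named schedule."""
    n = num_train_timesteps

    def linspace(a, b):
        # Evenly spaced values from a to b, both endpoints included.
        if n == 1:
            return [a]
        return [a + (b - a) * i / (n - 1) for i in range(n)]

    if beta_schedule == "linear":
        return linspace(beta_start, beta_end)
    if beta_schedule == "scaled_linear":
        # Assumed definition: interpolate the square roots, then square.
        return [x * x for x in linspace(math.sqrt(beta_start), math.sqrt(beta_end))]
    raise NotImplementedError(f"{beta_schedule} is not covered by this sketch")


def cumulative_alphas(betas):
    """Running product of (1 - beta), often called the alphas product."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


betas = make_betas()
alphas_cumprod = cumulative_alphas(betas)

# set_alpha_to_one decides which "previous" alpha product the final step sees.
final_alpha_if_true = 1.0
final_alpha_if_false = alphas_cumprod[0]
```

Both schedules share the same endpoints; scaled_linear only changes how the values in between are distributed.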

Pseudo numerical methods for diffusion models (PNDM) proposes using more advanced ODE integration techniques, namely the Runge-Kutta method and a linear multi-step method.

ConfigMixin takes care of storing all config attributes that are passed in the scheduler’s __init__ function, such as num_train_timesteps. They can be accessed via scheduler.config.num_train_timesteps. SchedulerMixin provides general loading and saving functionality via the SchedulerMixin.save_pretrained() and SchedulerMixin.from_pretrained() functions.

For more details, see the original paper: https://arxiv.org/abs/2202.09778

scale_model_input

( sample: FloatTensor, *args, **kwargs ) → torch.FloatTensor

Parameters

  • sample (torch.FloatTensor) — input sample

Returns

torch.FloatTensor

scaled input sample

Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.

set_timesteps

( num_inference_steps: int, device: typing.Union[str, torch.device] = None )

Parameters

  • num_inference_steps (int) — the number of diffusion steps used when generating samples with a pre-trained model.

Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference.
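
The default timestep_spacing of "leading" can be pictured with the simplified sketch below. The helper leading_timesteps is hypothetical, and the formula (an integer stride plus steps_offset) is an assumption about how "leading" spacing is usually defined. The actual scheduler may further rearrange or repeat timesteps for its Runge-Kutta and linear multi-step phases, which is not shown here.

```python
def leading_timesteps(num_train_timesteps, num_inference_steps, steps_offset=0):
    """Hypothetical helper: evenly strided timesteps, largest first."""
    # Integer stride between consecutive inference timesteps.
    step_ratio = num_train_timesteps // num_inference_steps
    ascending = [i * step_ratio + steps_offset for i in range(num_inference_steps)]
    # Denoising runs from the noisiest timestep down to the cleanest.
    return ascending[::-1]


timesteps = leading_timesteps(1000, 50)
shifted = leading_timesteps(1000, 50, steps_offset=1)
```

With steps_offset=1 every timestep shifts up by one, so the final inference step lands on timestep 1 rather than 0.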

step

( model_output: FloatTensor, timestep: int, sample: FloatTensor, return_dict: bool = True ) → SchedulerOutput or tuple

Parameters

  • model_output (torch.FloatTensor) — direct output from learned diffusion model.
  • timestep (int) — current discrete timestep in the diffusion chain.
  • sample (torch.FloatTensor) — current instance of sample being created by diffusion process.
  • return_dict (bool) — whether to return a SchedulerOutput (True) or a plain tuple (False).

Returns

SchedulerOutput or tuple

SchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.

Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion process from the learned model outputs (most often the predicted noise).

This function calls step_prk() or step_plms() depending on the value of the scheduler's internal counter variable.
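
The dispatch described above might look roughly like the following sketch. The function choose_step and the exact condition are assumptions for illustration: it supposes the Runge-Kutta phase covers the first few calls (12 here, a made-up count standing for 3 steps of 4 evaluations each) unless skip_prk_steps is set, after which the linear multi-step method takes over.

```python
def choose_step(counter, num_prk_timesteps, skip_prk_steps=False):
    """Hypothetical dispatcher: name the step function for a given call index."""
    if counter < num_prk_timesteps and not skip_prk_steps:
        return "step_prk"   # warm-up phase: Runge-Kutta
    return "step_plms"      # afterwards: linear multi-step


# Which function would handle each of the first 15 calls?
schedule = [choose_step(c, num_prk_timesteps=12) for c in range(15)]
```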

step_plms

( model_output: FloatTensor, timestep: int, sample: FloatTensor, return_dict: bool = True ) → SchedulerOutput or tuple

Parameters

  • model_output (torch.FloatTensor) — direct output from learned diffusion model.
  • timestep (int) — current discrete timestep in the diffusion chain.
  • sample (torch.FloatTensor) — current instance of sample being created by diffusion process.
  • return_dict (bool) — whether to return a SchedulerOutput (True) or a plain tuple (False).

Returns

SchedulerOutput or tuple

SchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.

Step function propagating the sample with the linear multi-step method. It needs only one forward pass of the model per step and combines that output with the outputs from several previous timesteps to approximate the solution.
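
A linear multi-step update reuses earlier model outputs instead of calling the model again. The sketch below uses the classical Adams-Bashforth coefficients to combine up to four stored outputs into one estimate. The function multistep_estimate is hypothetical, scalars stand in for tensors, and the way the real scheduler handles its first few steps may differ from these low-order fallbacks.

```python
def multistep_estimate(history):
    """Combine stored model outputs (most recent last) into one estimate.

    Uses Adams-Bashforth weights of order 1 to 4, depending on how many
    past outputs are available.
    """
    e = history[-4:]
    if not e:
        raise ValueError("need at least one model output")
    if len(e) == 1:
        return e[-1]
    if len(e) == 2:
        return (3 * e[-1] - e[-2]) / 2
    if len(e) == 3:
        return (23 * e[-1] - 16 * e[-2] + 5 * e[-3]) / 12
    return (55 * e[-1] - 59 * e[-2] + 37 * e[-3] - 9 * e[-4]) / 24


# Toy history of scalar "model outputs" that grow linearly over time.
history = [0.0, 1.0, 2.0, 3.0]
estimate = multistep_estimate(history)
```

In every branch the weights sum to 1, so a constant history is returned unchanged. For a linearly growing history the estimate extrapolates half a step past the latest value.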

step_prk

( model_output: FloatTensor, timestep: int, sample: FloatTensor, return_dict: bool = True ) → SchedulerOutput or tuple

Parameters

  • model_output (torch.FloatTensor) — direct output from learned diffusion model.
  • timestep (int) — current discrete timestep in the diffusion chain.
  • sample (torch.FloatTensor) — current instance of sample being created by diffusion process.
  • return_dict (bool) — whether to return a SchedulerOutput (True) or a plain tuple (False).

Returns

SchedulerOutput or tuple

SchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.

Step function propagating the sample with the Runge-Kutta method. RK takes 4 forward passes to approximate the solution to the differential equation.
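
To make the cost of four forward passes concrete, here is a sketch of one classical fourth-order Runge-Kutta step on a toy ODE, dy/dt = -y. This is the textbook method, not the scheduler's pseudo Runge-Kutta variant, and the names rk4_step, decay, and eval_count are hypothetical. In the scheduler, each evaluation of f would correspond to one call to the diffusion model.

```python
import math


def rk4_step(f, t, y, h):
    """One classical RK4 step for dy/dt = f(t, y) with step size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


eval_count = [0]  # mutable counter of how often the right-hand side is evaluated


def decay(t, y):
    # Toy right-hand side standing in for an expensive model call.
    eval_count[0] += 1
    return -y


y1 = rk4_step(decay, 0.0, 1.0, 0.1)
exact = math.exp(-0.1)  # closed-form solution at t = 0.1
```

A single step evaluates the right-hand side four times and closely tracks the exact solution, which is why the Runge-Kutta phase is accurate but expensive compared with the multi-step phase.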