UniPCMultistepScheduler
UniPCMultistepScheduler is a training-free framework designed for fast sampling of diffusion models. It was introduced in UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models by Wenliang Zhao, Lujia Bai, Yongming Rao, Jie Zhou, Jiwen Lu.
It consists of a corrector (UniC) and a predictor (UniP) that share a unified analytical form and support arbitrary orders. UniPC is model-agnostic by design, supporting pixel-space and latent-space DPMs in both unconditional and conditional sampling. It can be applied to both noise-prediction and data-prediction models. The corrector UniC can also be applied after any off-the-shelf solver to increase the order of accuracy.
The abstract from the paper is:
Diffusion probabilistic models (DPMs) have demonstrated a very promising ability in high-resolution image synthesis. However, sampling from a pretrained DPM usually requires hundreds of model evaluations, which is computationally expensive. Despite recent progress in designing high-order solvers for DPMs, there still exists room for further speedup, especially in extremely few steps (e.g., 5~10 steps). Inspired by the predictor-corrector for ODE solvers, we develop a unified corrector (UniC) that can be applied after any existing DPM sampler to increase the order of accuracy without extra model evaluations, and derive a unified predictor (UniP) that supports arbitrary order as a byproduct. Combining UniP and UniC, we propose a unified predictor-corrector framework called UniPC for the fast sampling of DPMs, which has a unified analytical form for any order and can significantly improve the sampling quality over previous methods. We evaluate our methods through extensive experiments including both unconditional and conditional sampling using pixel-space and latent-space DPMs. Our UniPC can achieve 3.87 FID on CIFAR10 (unconditional) and 7.51 FID on ImageNet 256×256 (conditional) with only 10 function evaluations. Code is available at https://github.com/wl-zhao/UniPC.
The original codebase can be found at wl-zhao/UniPC.
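The predictor-corrector idea mentioned in the abstract can be illustrated on a toy ODE. The sketch below is not UniPC: it swaps in a plain Euler predictor and a trapezoidal corrector, and the function name `predictor_corrector` is hypothetical. What it tries to show is the property the abstract highlights, namely that the corrector reuses the evaluation made at the predicted point, so each step still costs one function evaluation.

```python
import math


def predictor_corrector(f, y0, t0, t1, steps):
    """Toy Euler predictor + trapezoidal corrector with one call to f per step.

    Illustrative only; this is not the UniP/UniC update rule.
    """
    h = (t1 - t0) / steps
    t, y = t0, y0
    slope = f(t, y)  # initial evaluation
    evals = 1
    for _ in range(steps):
        y_pred = y + h * slope                    # predictor: explicit Euler
        slope_next = f(t + h, y_pred)             # single evaluation at the predicted point
        evals += 1
        y = y + 0.5 * h * (slope + slope_next)    # corrector: trapezoidal rule
        t += h
        slope = slope_next                        # reuse it; no re-evaluation at corrected y
    return y, evals


def euler(f, y0, t0, t1, steps):
    """Plain explicit Euler, for comparison."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)
        t += h
    return y


decay = lambda t, y: -y  # dy/dt = -y, exact solution exp(-t)
y_pc, n_evals = predictor_corrector(decay, 1.0, 0.0, 1.0, steps=10)
y_euler = euler(decay, 1.0, 0.0, 1.0, steps=10)
print(abs(y_pc - math.exp(-1.0)), abs(y_euler - math.exp(-1.0)), n_evals)
```

On this toy problem the corrected trajectory ends closer to the exact value than plain Euler at almost the same evaluation count, which is the kind of gain a corrector is meant to provide.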
Tips
It is recommended to set solver_order to 2 for guided sampling, and solver_order=3 for unconditional sampling.
Dynamic thresholding from Imagen (https://huggingface.co/papers/2205.11487) is supported. For pixel-space diffusion models, you can set both predict_x0=True and thresholding=True to use dynamic thresholding. This thresholding method is unsuitable for latent-space diffusion models such as Stable Diffusion.
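As a rough illustration of what dynamic thresholding does to a predicted x0, here is a minimal pure-Python sketch. It assumes the usual description of the method: take a high percentile of the absolute values, clamp it between 1 and sample_max_value, clip the prediction to that range, and rescale. The helper names `quantile` and `dynamic_threshold` are hypothetical, and the library works on batched tensors rather than flat lists.

```python
def quantile(values, q):
    """Linearly interpolated quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def dynamic_threshold(x0, ratio=0.995, sample_max_value=1.0):
    """Clip a flat list of predicted x0 values to [-s, s] and divide by s."""
    s = quantile([abs(v) for v in x0], ratio)
    s = max(s, 1.0)                # never threshold below 1
    s = min(s, sample_max_value)   # cap the threshold
    return [max(-s, min(s, v)) / s for v in x0]


print(dynamic_threshold([0.5, -3.0, 2.0], ratio=1.0, sample_max_value=2.0))
```

Note that under these assumptions the default sample_max_value=1.0 reduces the operation to plain clipping to [-1, 1]; a larger value is needed for the percentile to have any effect.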
UniPCMultistepScheduler
class diffusers.UniPCMultistepScheduler( num_train_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, beta_schedule: str = 'linear', trained_betas: typing.Union[numpy.ndarray, typing.List[float], NoneType] = None, solver_order: int = 2, prediction_type: str = 'epsilon', thresholding: bool = False, dynamic_thresholding_ratio: float = 0.995, sample_max_value: float = 1.0, predict_x0: bool = True, solver_type: str = 'bh2', lower_order_final: bool = True, disable_corrector: typing.List[int] = [], solver_p: SchedulerMixin = None, use_karras_sigmas: typing.Optional[bool] = False, timestep_spacing: str = 'linspace', steps_offset: int = 0 )
Parameters

num_train_timesteps (int, defaults to 1000) — The number of diffusion steps to train the model.
beta_start (float, defaults to 0.0001) — The starting beta value of inference.
beta_end (float, defaults to 0.02) — The final beta value.
beta_schedule (str, defaults to "linear") — The beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear, scaled_linear, or squaredcos_cap_v2.
trained_betas (np.ndarray, optional) — Pass an array of betas directly to the constructor to bypass beta_start and beta_end.
solver_order (int, defaults to 2) — The UniPC order, which can be any positive integer. The effective order of accuracy is solver_order + 1 due to the UniC. It is recommended to use solver_order=2 for guided sampling, and solver_order=3 for unconditional sampling.
prediction_type (str, optional, defaults to epsilon) — Prediction type of the scheduler function; can be epsilon (predicts the noise of the diffusion process), sample (directly predicts the noisy sample), or v_prediction (see section 2.4 of the Imagen Video paper).
thresholding (bool, defaults to False) — Whether to use the “dynamic thresholding” method. This is unsuitable for latent-space diffusion models such as Stable Diffusion.
dynamic_thresholding_ratio (float, defaults to 0.995) — The ratio for the dynamic thresholding method. Valid only when thresholding=True.
sample_max_value (float, defaults to 1.0) — The threshold value for dynamic thresholding. Valid only when thresholding=True and predict_x0=True.
predict_x0 (bool, defaults to True) — Whether to use the updating algorithm on the predicted x0.
solver_type (str, defaults to bh2) — Solver type for UniPC. It is recommended to use bh1 for unconditional sampling when steps < 10, and bh2 otherwise.
lower_order_final (bool, defaults to True) — Whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. This can stabilize the sampling of DPMSolver for steps < 15, especially for steps <= 10.
disable_corrector (list, defaults to []) — Decides at which steps to disable the corrector, to mitigate the misalignment between epsilon_theta(x_t, c) and epsilon_theta(x_t^c, c), which can influence convergence for a large guidance scale. The corrector is usually disabled during the first few steps.
solver_p (SchedulerMixin, defaults to None) — Any other scheduler; if specified, the algorithm becomes solver_p + UniC.
use_karras_sigmas (bool, optional, defaults to False) — Whether to use Karras sigmas for step sizes in the noise schedule during the sampling process. If True, the sigmas are determined according to a sequence of noise levels {σi}.
timestep_spacing (str, defaults to "linspace") — The way the timesteps should be scaled. Refer to Table 2 of the Common Diffusion Noise Schedules and Sample Steps are Flawed for more information.
steps_offset (int, defaults to 0) — An offset added to the inference steps. You can use a combination of offset=1 and set_alpha_to_one=False to make the last step use step 0 for the previous alpha product like in Stable Diffusion.
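To make the beta-related parameters above more concrete, the following sketch builds the linear and scaled_linear schedules and the cumulative alpha product derived from them. It assumes the common DDPM-style definitions (linear interpolation of betas, or of their square roots for scaled_linear); the helper names `make_betas` and `alphas_cumprod` are hypothetical, squaredcos_cap_v2 is left out, and the library itself operates on tensors.

```python
import math


def make_betas(schedule="linear", beta_start=0.0001, beta_end=0.02,
               num_train_timesteps=1000):
    """Return a list of betas (assumes num_train_timesteps >= 2)."""
    n = num_train_timesteps

    def linspace(a, b):
        return [a + (b - a) * i / (n - 1) for i in range(n)]

    if schedule == "linear":
        return linspace(beta_start, beta_end)
    if schedule == "scaled_linear":
        # Interpolate in sqrt(beta) space, then square.
        return [b * b for b in linspace(math.sqrt(beta_start), math.sqrt(beta_end))]
    raise ValueError(f"unsupported schedule in this sketch: {schedule}")


def alphas_cumprod(betas):
    """Running product of (1 - beta_i)."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


betas = make_betas("scaled_linear")
acp = alphas_cumprod(betas)
# Signal and noise scales at the last training timestep.
alpha_T, sigma_T = math.sqrt(acp[-1]), math.sqrt(1.0 - acp[-1])
print(betas[0], betas[-1], alpha_T, sigma_T)
```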
UniPCMultistepScheduler is a training-free framework designed for the fast sampling of diffusion models.
This scheduler inherits from SchedulerMixin and ConfigMixin. Check the superclass documentation for the generic methods the library implements for all schedulers, such as loading and saving.
convert_model_output( model_output: FloatTensor, timestep: int, sample: FloatTensor ) → torch.FloatTensor
Parameters

model_output (torch.FloatTensor) — The direct output from the learned diffusion model.
timestep (int) — The current discrete timestep in the diffusion chain.
sample (torch.FloatTensor) — A current instance of a sample created by the diffusion process.
Returns
torch.FloatTensor
The converted model output.
Convert the model output to the corresponding type the UniPC algorithm needs.
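The conversion between prediction types can be sketched with scalars. The code below assumes the standard parameterization x_t = alpha_t * x0 + sigma_t * eps with alpha_t^2 + sigma_t^2 = 1 and v = alpha_t * eps - sigma_t * x0; the function names `to_x0` and `to_epsilon` are hypothetical and this is not the library's implementation, which also handles thresholding and tensors.

```python
def to_x0(model_output, sample, alpha_t, sigma_t, prediction_type="epsilon"):
    """Convert a model output into a data (x0) prediction."""
    if prediction_type == "epsilon":
        return (sample - sigma_t * model_output) / alpha_t
    if prediction_type == "sample":
        return model_output
    if prediction_type == "v_prediction":
        return alpha_t * sample - sigma_t * model_output
    raise ValueError(f"unknown prediction_type: {prediction_type}")


def to_epsilon(model_output, sample, alpha_t, sigma_t, prediction_type="epsilon"):
    """Convert a model output into a noise (epsilon) prediction."""
    if prediction_type == "epsilon":
        return model_output
    if prediction_type == "sample":
        return (sample - alpha_t * model_output) / sigma_t
    if prediction_type == "v_prediction":
        return alpha_t * model_output + sigma_t * sample
    raise ValueError(f"unknown prediction_type: {prediction_type}")


# Consistency check on made-up numbers: all three outputs describe the same point.
alpha_t, sigma_t = 0.6, 0.8
x0, eps = 2.0, -1.0
x_t = alpha_t * x0 + sigma_t * eps
v = alpha_t * eps - sigma_t * x0
print(to_x0(eps, x_t, alpha_t, sigma_t, "epsilon"),
      to_x0(v, x_t, alpha_t, sigma_t, "v_prediction"),
      to_epsilon(x0, x_t, alpha_t, sigma_t, "sample"))
```

Under these assumptions, predict_x0=True would correspond to the first helper and predict_x0=False to the second.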
multistep_uni_c_bh_update( this_model_output: FloatTensor, this_timestep: int, last_sample: FloatTensor, this_sample: FloatTensor, order: int ) → torch.FloatTensor
Parameters

this_model_output (torch.FloatTensor) — The model outputs at x_t.
this_timestep (int) — The current timestep t.
last_sample (torch.FloatTensor) — The generated sample before the last predictor, x_{t-1}.
this_sample (torch.FloatTensor) — The generated sample after the last predictor, x_{t}.
order (int) — The p of UniC-p at this step. The effective order of accuracy should be order + 1.
Returns
torch.FloatTensor
The corrected sample tensor at the current timestep.
One step for the UniC (B(h) version).
multistep_uni_p_bh_update( model_output: FloatTensor, prev_timestep: int, sample: FloatTensor, order: int ) → torch.FloatTensor
Parameters

model_output (torch.FloatTensor) — The direct output from the learned diffusion model at the current timestep.
prev_timestep (int) — The previous discrete timestep in the diffusion chain.
sample (torch.FloatTensor) — A current instance of a sample created by the diffusion process.
order (int) — The order of UniP at this timestep (corresponds to the p in UniPC-p).
Returns
torch.FloatTensor
The sample tensor at the previous timestep.
One step for the UniP (B(h) version). Alternatively, self.solver_p is used if it is specified.
scale_model_input( sample: FloatTensor, *args, **kwargs ) → torch.FloatTensor
Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.
set_timesteps( num_inference_steps: int, device: typing.Union[str, torch.device] = None )
Sets the discrete timesteps used for the diffusion chain (to be run before inference).
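As an illustration of what a "linspace" timestep spacing can look like, the sketch below spreads the inference steps evenly over the training range and returns them in descending order. It assumes an evenly spaced grid of num_inference_steps + 1 points from 0 to num_train_timesteps - 1 with the final 0 dropped; the function name `linspace_timesteps` is hypothetical, and the rounding and endpoint handling in the library may differ.

```python
def linspace_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Evenly spaced integer timesteps, highest first, excluding timestep 0."""
    last = num_train_timesteps - 1
    grid = [round(i * last / num_inference_steps)
            for i in range(num_inference_steps + 1)]
    return grid[::-1][:-1]  # reverse to descending order, then drop the trailing 0


timesteps = linspace_timesteps(10)
print(timesteps)
```

With 1000 training steps and 10 inference steps this yields ten timesteps starting at 999, so the sampler begins from the noisiest training timestep.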
step( model_output: FloatTensor, timestep: int, sample: FloatTensor, return_dict: bool = True ) → SchedulerOutput or tuple
Parameters

model_output (torch.FloatTensor) — The direct output from the learned diffusion model.
timestep (int) — The current discrete timestep in the diffusion chain.
sample (torch.FloatTensor) — A current instance of a sample created by the diffusion process.
return_dict (bool) — Whether or not to return a SchedulerOutput or tuple.
Returns
SchedulerOutput or tuple
If return_dict is True, SchedulerOutput is returned, otherwise a tuple is returned where the first element is the sample tensor.
Predict the sample from the previous timestep by reversing the SDE. This function propagates the sample with the multistep UniPC.
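How solver_order, lower_order_final, and disable_corrector might interact across a sampling run can be sketched as a simple planning loop. This is an assumption-laden illustration, not the library's code: it assumes the order is warmed up from 1 (a multistep method has no history at the start), lowered again near the end when lower_order_final=True, and that the corrector for a step is skipped when the preceding step index appears in disable_corrector. The function name `plan_steps` is hypothetical.

```python
def plan_steps(num_inference_steps, solver_order=2, lower_order_final=True,
               disable_corrector=()):
    """Return (step_index, order, use_corrector) for each step of a run."""
    plan = []
    history = 0  # number of earlier model outputs available, capped at solver_order
    for step_index in range(num_inference_steps):
        if lower_order_final:
            # Do not use a higher order than the number of remaining steps.
            order = min(solver_order, num_inference_steps - step_index)
        else:
            order = solver_order
        order = min(order, history + 1)  # warm-up while history is short
        # Assumed convention: nothing to correct at step 0, and the corrector
        # is skipped when the previous step index is listed as disabled.
        use_corrector = step_index > 0 and (step_index - 1) not in disable_corrector
        plan.append((step_index, order, use_corrector))
        if history < solver_order:
            history += 1
    return plan


for row in plan_steps(5, solver_order=3, disable_corrector=[0]):
    print(row)
```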
SchedulerOutput
class diffusers.schedulers.scheduling_utils.SchedulerOutput( prev_sample: FloatTensor )
Base class for the output of a scheduler’s step function.