Consistency Model Multistep Scheduler
Overview
Multistep and onestep scheduler (Algorithm 1) introduced alongside consistency models in the paper Consistency Models by Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. It is based on the original consistency models implementation and should generate good samples from ConsistencyModelPipeline in one or a small number of steps.
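For example, a short usage sketch of onestep and multistep sampling with ConsistencyModelPipeline; the checkpoint name and the explicit [22, 0] schedule follow the published consistency model examples and are assumptions here:

```python
import torch
from diffusers import ConsistencyModelPipeline

# Load a distilled consistency model checkpoint (name assumed for illustration).
pipe = ConsistencyModelPipeline.from_pretrained(
    "openai/diffusers-cd_imagenet64_l2", torch_dtype=torch.float16
).to("cuda")

# Onestep sampling.
image = pipe(num_inference_steps=1).images[0]

# Multistep sampling with an explicit two-step schedule.
image = pipe(num_inference_steps=None, timesteps=[22, 0]).images[0]
```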
CMStochasticIterativeScheduler
class diffusers.CMStochasticIterativeScheduler
< source >( num_train_timesteps: int = 40 sigma_min: float = 0.002 sigma_max: float = 80.0 sigma_data: float = 0.5 s_noise: float = 1.0 rho: float = 7.0 clip_denoised: bool = True )
Parameters
- num_train_timesteps (int) — The number of diffusion steps used to train the model.
- sigma_min (float) — Minimum noise magnitude in the sigma schedule. This was set to 0.002 in the original implementation.
- sigma_max (float) — Maximum noise magnitude in the sigma schedule. This was set to 80.0 in the original implementation.
- sigma_data (float) — The standard deviation of the data distribution, following the EDM paper [2]. This was set to 0.5 in the original implementation, which is also the value suggested in the EDM paper.
- s_noise (float) — The amount of additional noise to counteract loss of detail during sampling. A reasonable range is [1.000, 1.011]. This was set to 1.0 in the original implementation.
- rho (float) — The rho parameter used for calculating the Karras sigma schedule, introduced in the EDM paper [2]. This was set to 7.0 in the original implementation, which is also the value suggested in the EDM paper.
- clip_denoised (bool) — Whether to clip the denoised outputs to (-1, 1). Defaults to True.
- timesteps (List or np.ndarray or torch.Tensor, optional) — Optionally, an explicit timestep schedule can be specified. The timesteps are expected to be in increasing order.
Multistep and onestep sampling for consistency models from Song et al. 2023 [1]. This implements Algorithm 1 in the paper [1].
[1] Song, Yang, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. "Consistency Models." https://arxiv.org/pdf/2303.01469
[2] Karras, Tero, et al. "Elucidating the Design Space of Diffusion-Based Generative Models." https://arxiv.org/abs/2206.00364
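A minimal instantiation sketch; the values shown simply restate the defaults documented above and can be overridden as needed:

```python
from diffusers import CMStochasticIterativeScheduler

# All arguments below match the documented defaults.
scheduler = CMStochasticIterativeScheduler(
    num_train_timesteps=40,
    sigma_min=0.002,
    sigma_max=80.0,
    sigma_data=0.5,
    s_noise=1.0,
    rho=7.0,
    clip_denoised=True,
)
```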
ConfigMixin takes care of storing all config attributes that are passed in the scheduler's __init__ function, such as num_train_timesteps. They can be accessed via scheduler.config.num_train_timesteps.
SchedulerMixin provides general loading and saving functionality via the SchedulerMixin.save_pretrained() and
from_pretrained() functions.
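As a quick sketch of this behavior (the save directory is an arbitrary example):

```python
from diffusers import CMStochasticIterativeScheduler

scheduler = CMStochasticIterativeScheduler(num_train_timesteps=40)
print(scheduler.config.num_train_timesteps)  # 40

# Save the scheduler config to a local directory (path is illustrative)
# and reload it with from_pretrained().
scheduler.save_pretrained("./cm_scheduler")
scheduler = CMStochasticIterativeScheduler.from_pretrained("./cm_scheduler")
```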
get_scalings_for_boundary_condition
< source >( sigma ) → tuple
Gets the scalings used in the consistency model parameterization, following Appendix C of the original paper. This enforces the consistency model boundary condition.
Note that epsilon in the equations for c_skip and c_out is set to sigma_min.
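A small sketch of the Appendix C parameterization from [1], with epsilon set to sigma_min and using the default sigma_data and sigma_min values documented above:

```python
import math

sigma_data = 0.5   # default sigma_data
sigma_min = 0.002  # epsilon in the c_skip / c_out equations

def boundary_scalings(sigma):
    # c_skip and c_out as given in Appendix C of the consistency models paper [1].
    c_skip = sigma_data**2 / ((sigma - sigma_min) ** 2 + sigma_data**2)
    c_out = (sigma - sigma_min) * sigma_data / math.sqrt(sigma**2 + sigma_data**2)
    return c_skip, c_out

# At sigma == sigma_min, c_skip == 1 and c_out == 0, so the parameterized model
# reduces to the identity at the smallest noise level (the boundary condition).
print(boundary_scalings(sigma_min))  # (1.0, 0.0)
```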
scale_model_input
< source >( sample: FloatTensor timestep: typing.Union[float, torch.FloatTensor] ) → torch.FloatTensor
Scales the consistency model input by (sigma**2 + sigma_data**2) ** 0.5, following the EDM model.
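A brief usage sketch; the tensor shape and the number of inference steps are illustrative:

```python
import torch
from diffusers import CMStochasticIterativeScheduler

scheduler = CMStochasticIterativeScheduler()
scheduler.set_timesteps(num_inference_steps=2)

# Start from pure noise at the scheduler's initial noise level, then apply the
# EDM-style scaling for the first timestep before calling the model.
sample = torch.randn(1, 3, 64, 64) * scheduler.init_noise_sigma
scaled = scheduler.scale_model_input(sample, scheduler.timesteps[0])
```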
set_timesteps
< source >( num_inference_steps: typing.Optional[int] = None device: typing.Union[str, torch.device] = None timesteps: typing.Optional[typing.List[int]] = None )
Parameters
- num_inference_steps (int) — The number of diffusion steps used when generating samples with a pre-trained model.
- device (str or torch.device, optional) — The device to which the timesteps should be moved. If None, the timesteps are not moved.
- timesteps (List[int], optional) — Custom timesteps used to support arbitrary spacing between timesteps. If None, the default timestep spacing strategy of equal spacing between timesteps is used. If passed, num_inference_steps must be None.
Sets the timesteps used for the diffusion chain. Supporting function to be run before inference.
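For example (the explicit [22, 0] schedule follows the multistep example used with ConsistencyModelPipeline and is an assumption here):

```python
from diffusers import CMStochasticIterativeScheduler

scheduler = CMStochasticIterativeScheduler()

# Equal spacing: the scheduler picks the timesteps itself.
scheduler.set_timesteps(num_inference_steps=2)
print(scheduler.timesteps)

# Custom spacing: pass an explicit schedule and leave num_inference_steps unset.
scheduler.set_timesteps(timesteps=[22, 0])
print(scheduler.timesteps)
```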
sigma_to_t
< source >( sigmas: typing.Union[float, numpy.ndarray] ) → float or np.ndarray
Gets scaled timesteps from the Karras sigmas, for input to the consistency model.
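A brief sketch mapping the endpoints of the default sigma schedule to timestep values:

```python
import numpy as np
from diffusers import CMStochasticIterativeScheduler

scheduler = CMStochasticIterativeScheduler()

# Convert the smallest and largest Karras sigmas into the scaled timestep
# values that are fed to the consistency model.
sigmas = np.array([scheduler.config.sigma_min, scheduler.config.sigma_max])
print(scheduler.sigma_to_t(sigmas))
```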
step
< source >( model_output: FloatTensor timestep: typing.Union[float, torch.FloatTensor] sample: FloatTensor generator: typing.Optional[torch._C.Generator] = None return_dict: bool = True ) → ~schedulers.scheduling_utils.CMStochasticIterativeSchedulerOutput or tuple
Parameters
- model_output (torch.FloatTensor) — Direct output from the learned diffusion model.
- timestep (float) — Current timestep in the diffusion chain.
- sample (torch.FloatTensor) — Current instance of the sample being created by the diffusion process.
- generator (torch.Generator, optional) — Random number generator.
- return_dict (bool) — Option for returning a tuple rather than a CMStochasticIterativeSchedulerOutput class.
Returns
~schedulers.scheduling_utils.CMStochasticIterativeSchedulerOutput or tuple
A ~schedulers.scheduling_utils.CMStochasticIterativeSchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.
Predicts the sample at the previous timestep by running one iteration of consistency sampling (Algorithm 1 in [1]). This is the core function for propagating the diffusion process from the learned model outputs.
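A minimal sketch of the sampling loop around step(); the tiny randomly initialized UNet2DModel and the tensor shapes are assumptions chosen so the snippet stays self-contained (a trained consistency model would be used in practice):

```python
import torch
from diffusers import CMStochasticIterativeScheduler, UNet2DModel

# A tiny randomly initialized UNet stands in for a trained consistency model.
unet = UNet2DModel(
    sample_size=32,
    in_channels=3,
    out_channels=3,
    block_out_channels=(32, 64),
    down_block_types=("DownBlock2D", "AttnDownBlock2D"),
    up_block_types=("AttnUpBlock2D", "UpBlock2D"),
)

scheduler = CMStochasticIterativeScheduler()
scheduler.set_timesteps(num_inference_steps=2)

generator = torch.Generator().manual_seed(0)
sample = torch.randn(1, 3, 32, 32, generator=generator) * scheduler.init_noise_sigma

for t in scheduler.timesteps:
    scaled = scheduler.scale_model_input(sample, t)
    model_output = unet(scaled, t).sample
    # step() turns the model output into a denoised sample and, between
    # multistep iterations, re-injects noise as in Algorithm 1.
    sample = scheduler.step(model_output, t, sample, generator=generator).prev_sample
```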