Attention Processor
An attention processor is a class for applying different types of attention mechanisms.
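Attention processors are usually not called directly; they are assigned to a model's attention layers. A minimal sketch of swapping a processor into a UNet (the checkpoint name is only an example):

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.models.attention_processor import AttnProcessor

# Example checkpoint; any pipeline with a UNet works the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Replace the processor used by every attention layer in the UNet.
pipe.unet.set_attn_processor(AttnProcessor())

# Inspect which processor each attention layer is currently using.
print(pipe.unet.attn_processors)
```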
AttnProcessor
Default processor for performing attention-related computations.
AttnProcessor2_0
Processor for implementing scaled dot-product attention (enabled by default if you’re using PyTorch 2.0).
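Continuing with the pipe from the sketch above, you can opt into (or out of) the PyTorch 2.0 path explicitly; the feature check below mirrors what the library does internally:

```python
import torch.nn.functional as F
from diffusers.models.attention_processor import AttnProcessor, AttnProcessor2_0

# PyTorch 2.0 exposes F.scaled_dot_product_attention; fall back otherwise.
if hasattr(F, "scaled_dot_product_attention"):
    pipe.unet.set_attn_processor(AttnProcessor2_0())
else:
    pipe.unet.set_attn_processor(AttnProcessor())
```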
LoRAAttnProcessor
class diffusers.models.attention_processor.LoRAAttnProcessor
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None )
Parameters
- hidden_size (int, optional) — The hidden size of the attention layer.
- cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
- rank (int, defaults to 4) — The dimension of the LoRA update matrices.
- network_alpha (int, optional) — Equivalent to alpha but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism.
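A sketch of attaching LoRA processors to every attention layer of a UNet, loosely following the pattern used in the library's LoRA training examples; the checkpoint name is only illustrative:

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import LoRAAttnProcessor

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

lora_attn_procs = {}
for name in unet.attn_processors.keys():
    # Self-attention layers (attn1) have no cross-attention conditioning.
    cross_attention_dim = (
        None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    )
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    elif name.startswith("down_blocks"):
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]

    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size,
        cross_attention_dim=cross_attention_dim,
        rank=4,
    )

unet.set_attn_processor(lora_attn_procs)
```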
LoRAAttnProcessor2_0
class diffusers.models.attention_processor.LoRAAttnProcessor2_0
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None )
Parameters
- hidden_size (int) — The hidden size of the attention layer.
- cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
- rank (int, defaults to 4) — The dimension of the LoRA update matrices.
- network_alpha (int, optional) — Equivalent to alpha but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism using PyTorch 2.0’s memory-efficient scaled dot-product attention.
CustomDiffusionAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionAttnProcessor
( train_kv = True, train_q_out = True, hidden_size = None, cross_attention_dim = None, out_bias = True, dropout = 0.0 )
Parameters
- train_kv (bool, defaults to True) — Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to True) — Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) — The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) — Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) — The dropout probability to use.
Processor for implementing attention for the Custom Diffusion method.
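A hedged sketch of constructing one such processor; the hidden_size and cross_attention_dim values below are only illustrative (they match the first down block of a Stable Diffusion v1 UNet). In practice one processor is built per attention layer and assigned with unet.set_attn_processor, as in the LoRA sketch above:

```python
from diffusers.models.attention_processor import CustomDiffusionAttnProcessor

proc = CustomDiffusionAttnProcessor(
    train_kv=True,            # fine-tune the key/value projections over the text features
    train_q_out=False,        # keep the query/output projections frozen
    hidden_size=320,          # hidden size of the attention layer being replaced (illustrative)
    cross_attention_dim=768,  # channels of encoder_hidden_states, e.g. CLIP text embeddings (illustrative)
)
```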
AttnAddedKVProcessor
Processor for performing attention-related computations with extra learnable key and value matrices for the text encoder.
AttnAddedKVProcessor2_0
Processor for performing scaled dot-product attention (enabled by default if you’re using PyTorch 2.0), with extra learnable key and value matrices for the text encoder.
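These processors only apply to models whose attention blocks were built with the extra text-encoder key/value projections (i.e. added_kv_proj_dim is set). A minimal sketch of choosing between the two variants, assuming unet is such a model:

```python
import torch.nn.functional as F
from diffusers.models.attention_processor import (
    AttnAddedKVProcessor,
    AttnAddedKVProcessor2_0,
)

# AttnAddedKVProcessor2_0 requires F.scaled_dot_product_attention (PyTorch 2.0+).
if hasattr(F, "scaled_dot_product_attention"):
    unet.set_attn_processor(AttnAddedKVProcessor2_0())
else:
    unet.set_attn_processor(AttnAddedKVProcessor())
```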
LoRAAttnAddedKVProcessor
class diffusers.models.attention_processor.LoRAAttnAddedKVProcessor
( hidden_size, cross_attention_dim = None, rank = 4, network_alpha = None )
Processor for implementing the LoRA attention mechanism with extra learnable key and value matrices for the text encoder.
XFormersAttnProcessor
class diffusers.models.attention_processor.XFormersAttnProcessor
( attention_op: typing.Optional[typing.Callable] = None )
Parameters
- attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set to None, and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers.
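In practice this processor is rarely constructed by hand; enabling xFormers on a pipeline swaps it in for you. A sketch (the checkpoint name and the flash-attention operator are only examples):

```python
import torch
import xformers.ops
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Let xFormers pick the best operator (recommended: attention_op=None) ...
pipe.enable_xformers_memory_efficient_attention()

# ... or pin a specific operator explicitly.
pipe.enable_xformers_memory_efficient_attention(
    attention_op=xformers.ops.MemoryEfficientAttentionFlashAttentionOp
)
```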
LoRAXFormersAttnProcessor
class diffusers.models.attention_processor.LoRAXFormersAttnProcessor
( hidden_size, cross_attention_dim, rank = 4, attention_op: typing.Optional[typing.Callable] = None, network_alpha = None )
Parameters
- hidden_size (int, optional) — The hidden size of the attention layer.
- cross_attention_dim (int, optional) — The number of channels in the encoder_hidden_states.
- rank (int, defaults to 4) — The dimension of the LoRA update matrices.
- attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set to None, and allow xFormers to choose the best operator.
- network_alpha (int, optional) — Equivalent to alpha but its usage is specific to Kohya (A1111) style LoRAs.
Processor for implementing the LoRA attention mechanism with memory efficient attention using xFormers.
CustomDiffusionXFormersAttnProcessor
class diffusers.models.attention_processor.CustomDiffusionXFormersAttnProcessor
( train_kv = True, train_q_out = False, hidden_size = None, cross_attention_dim = None, out_bias = True, dropout = 0.0, attention_op: typing.Optional[typing.Callable] = None )
Parameters
- train_kv (bool, defaults to True) — Whether to newly train the key and value matrices corresponding to the text features.
- train_q_out (bool, defaults to False) — Whether to newly train query matrices corresponding to the latent image features.
- hidden_size (int, optional, defaults to None) — The hidden size of the attention layer.
- cross_attention_dim (int, optional, defaults to None) — The number of channels in the encoder_hidden_states.
- out_bias (bool, defaults to True) — Whether to include the bias parameter in train_q_out.
- dropout (float, optional, defaults to 0.0) — The dropout probability to use.
- attention_op (Callable, optional, defaults to None) — The base operator to use as the attention operator. It is recommended to set to None, and allow xFormers to choose the best operator.
Processor for implementing memory efficient attention using xFormers for the Custom Diffusion method.
SlicedAttnProcessor
class diffusers.models.attention_processor.SlicedAttnProcessor
( slice_size )
Processor for implementing sliced attention.
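Sliced attention is usually enabled through the pipeline helper rather than by instantiating the processor directly; the helper swaps in sliced processors under the hood. A sketch (the checkpoint name is only an example):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Compute attention in slices to trade speed for lower peak memory.
# "auto" splits the computation in half; "max" uses a slice size of 1 for maximum savings.
pipe.enable_attention_slicing("auto")

# Turn slicing off again once memory pressure is no longer a concern.
pipe.disable_attention_slicing()
```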
SlicedAttnAddedKVProcessor
class diffusers.models.attention_processor.SlicedAttnAddedKVProcessor
( slice_size )
Processor for implementing sliced attention with extra learnable key and value matrices for the text encoder.