Activation functions

Custom activation functions that support various models in 🤗 Diffusers.

GELU

( dim_in: int, dim_out: int, approximate: str = 'none', bias: bool = True )

Parameters

dim_in (int) — The number of channels in the input.
dim_out (int) — The number of channels in the output.
approximate (str, optional, defaults to "none") — If "tanh", use tanh approximation.
bias (bool, defaults to True) — Whether to use a bias in the linear layer.

GELU activation function. Set approximate="tanh" to use the tanh approximation instead of the exact form.
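
To make the difference between the two modes concrete, here is a minimal scalar sketch of the exact GELU and its tanh approximation. It uses only the Python standard library and is not the library's implementation; the function names `gelu_exact` and `gelu_tanh` are illustrative, and the linear layer implied by dim_in, dim_out and bias is left out.

```python
import math


def gelu_exact(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    # Tanh approximation of GELU.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Print both variants side by side for a few sample inputs.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
```

The two curves stay close over typical activation ranges, so the choice is mostly about matching the setting a given checkpoint was trained with.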

GEGLU

( dim_in: int, dim_out: int, bias: bool = True )

Parameters

dim_in (int) — The number of channels in the input.
dim_out (int) — The number of channels in the output.
bias (bool, defaults to True) — Whether to use a bias in the linear layer.

A variant of the gated linear unit activation function.
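
The gating idea behind this variant (commonly called GEGLU) can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: `projected` stands in for the output of a linear layer of width 2 * dim_out, the first half is treated as the value and the second half as the gate, and the names `gelu` and `geglu` are hypothetical.

```python
import math


def gelu(x: float) -> float:
    # Exact GELU on a scalar.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def geglu(projected: list[float]) -> list[float]:
    # Assumption: `projected` has even length 2 * dim_out and was produced
    # by a linear projection that is not shown here.
    half = len(projected) // 2
    values, gates = projected[:half], projected[half:]
    # Each value is scaled by the GELU of its matching gate.
    return [v * gelu(g) for v, g in zip(values, gates)]


print(geglu([1.0, -2.0, 0.0, 3.0]))
```

A gate near zero suppresses its value, while a large positive gate passes the value through scaled roughly by the gate itself.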

ApproximateGELU

( dim_in: int, dim_out: int, bias: bool = True )

Parameters

dim_in (int) — The number of channels in the input.
dim_out (int) — The number of channels in the output.
bias (bool, defaults to True) — Whether to use a bias in the linear layer.

The approximate form of the Gaussian Error Linear Unit (GELU). For more details, see section 2 of this paper.
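
As a rough illustration, one widely used approximation of GELU is x * sigmoid(1.702 * x). The sketch below assumes that this is the form meant here and compares it with the exact GELU on scalars; the function names are illustrative and the linear layer is omitted.

```python
import math


def sigmoid(z: float) -> float:
    # Logistic function; fine for the moderate inputs used below.
    return 1.0 / (1.0 + math.exp(-z))


def approximate_gelu(x: float) -> float:
    # Sigmoid-based approximation of GELU (assumed form).
    return x * sigmoid(1.702 * x)


def gelu_exact(x: float) -> float:
    # Exact GELU for comparison.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Show the approximation error for a few sample inputs.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    err = approximate_gelu(x) - gelu_exact(x)
    print(f"x={x:+.1f}  approx={approximate_gelu(x):+.6f}  error={err:+.6f}")
```

The sigmoid form is cheaper to evaluate than the error function, at the cost of a small deviation from the exact curve.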

SwiGLU

( dim_in: int, dim_out: int, bias: bool = True )

Parameters

dim_in (int) — The number of channels in the input.
dim_out (int) — The number of channels in the output.
bias (bool, defaults to True) — Whether to use a bias in the linear layer.

A variant of the gated linear unit activation function. It is similar to GEGLU but uses SiLU (Swish) instead of GELU as the gate activation.
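
A minimal scalar sketch of the SiLU-gated variant follows. It is an illustration rather than the library's implementation: `projected` is assumed to be the output of a linear layer of width 2 * dim_out, split into a value half and a gate half, and the names `silu` and `swiglu` are hypothetical.

```python
import math


def silu(x: float) -> float:
    # SiLU / Swish: x * sigmoid(x).
    return x / (1.0 + math.exp(-x))


def swiglu(projected: list[float]) -> list[float]:
    # Assumption: first half holds the values, second half the gates.
    half = len(projected) // 2
    values, gates = projected[:half], projected[half:]
    # Same gating pattern as GEGLU, with SiLU applied to the gate.
    return [v * silu(g) for v, g in zip(values, gates)]


print(swiglu([1.0, -2.0, 1.0, 3.0]))
```

Only the gate activation differs from the GELU-gated form; the split-and-multiply structure is the same.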

FP32SiLU

( )

SiLU activation function with the input upcast to torch.float32.

LinearActivation

( dim_in: int, dim_out: int, bias: bool = True, activation: str = 'silu' )