Transformers documentation
Custom Layers and Utilities
This page lists the custom layers used by the library, along with the utility functions it provides for modeling.
Most of these are only useful if you are studying the model code in the library.
PyTorch custom modules
class transformers.Conv1D
( nf, nx )
1D-convolutional layer as defined by Radford et al. for OpenAI GPT (and also used in GPT-2).
Basically works like a linear layer but the weights are transposed.
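The "linear layer with transposed weights" description can be illustrated with a minimal sketch in plain PyTorch. `TransposedLinear` is a hypothetical stand-in written for this example, not the library's class, and it assumes that `nf` is the number of output features and `nx` the number of input features, so the weight is stored with shape `(nx, nf)`.

```python
import torch
from torch import nn


class TransposedLinear(nn.Module):
    """Hypothetical stand-in for Conv1D: a linear map whose weight has shape (nx, nf)."""

    def __init__(self, nf: int, nx: int):
        super().__init__()
        self.nf = nf
        # Stored as (in_features, out_features), the transpose of nn.Linear's layout.
        self.weight = nn.Parameter(torch.randn(nx, nf) * 0.02)
        self.bias = nn.Parameter(torch.zeros(nf))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_shape = tuple(x.shape[:-1]) + (self.nf,)
        # Flatten the leading dims, compute bias + x @ weight, then restore them.
        y = torch.addmm(self.bias, x.reshape(-1, x.size(-1)), self.weight)
        return y.view(out_shape)


torch.manual_seed(0)
layer = TransposedLinear(nf=6, nx=4)

# An nn.Linear holding the transposed weight computes the same function.
linear = nn.Linear(4, 6)
with torch.no_grad():
    linear.weight.copy_(layer.weight.T)
    linear.bias.copy_(layer.bias)

x = torch.randn(2, 3, 4)
y_transposed = layer(x)
y_linear = linear(x)
```

Both outputs have shape `(2, 3, 6)` and agree up to floating-point tolerance. The only difference is the layout in which the weight is stored.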
PyTorch helper functions
transformers.apply_chunking_to_forward
( forward_fn: Callable[..., torch.Tensor], chunk_size: int, chunk_dim: int, *input_tensors ) → torch.Tensor
Parameters
- forward_fn (Callable[..., torch.Tensor]): The forward function of the model.
- chunk_size (int): The chunk size of a chunked tensor: num_chunks = len(input_tensors[0]) / chunk_size.
- chunk_dim (int): The dimension over which the input_tensors should be chunked.
- input_tensors (tuple[torch.Tensor]): The input tensors of forward_fn which will be chunked.
Returns
torch.Tensor: A tensor with the same shape that forward_fn would have produced if applied directly to input_tensors.
This function chunks the input_tensors into smaller input tensor parts of size chunk_size over the dimension chunk_dim. It then applies a layer forward_fn to each chunk independently to save memory.
If the forward_fn is independent across the chunk_dim, this function will yield the same result as directly applying forward_fn to input_tensors.
Examples:
from transformers import apply_chunking_to_forward

# rename the usual forward() method to forward_chunk()
def forward_chunk(self, hidden_states):
    hidden_states = self.decoder(hidden_states)
    return hidden_states

# implement a chunked forward method that delegates to forward_chunk()
def forward(self, hidden_states):
    return apply_chunking_to_forward(
        self.forward_chunk, self.chunk_size_lm_head, self.seq_len_dim, hidden_states
    )
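To make the chunk-then-concatenate behavior concrete, here is a minimal sketch in plain PyTorch. `chunked_apply` is a hypothetical helper written for this example and is not the library's implementation. It assumes that a non-positive `chunk_size` disables chunking and that the length of `chunk_dim` must be a multiple of `chunk_size`.

```python
import torch
from torch import nn


def chunked_apply(forward_fn, chunk_size, chunk_dim, *input_tensors):
    """Hypothetical sketch of the chunking idea; not the library function."""
    if chunk_size <= 0:
        # Assumption in this sketch: a non-positive chunk size means "no chunking".
        return forward_fn(*input_tensors)
    dim_len = input_tensors[0].shape[chunk_dim]
    if dim_len % chunk_size != 0:
        raise ValueError(
            f"dimension of length {dim_len} is not a multiple of chunk size {chunk_size}"
        )
    num_chunks = dim_len // chunk_size
    # Split every input into num_chunks pieces along chunk_dim.
    chunks_per_input = [t.chunk(num_chunks, dim=chunk_dim) for t in input_tensors]
    # Apply forward_fn to matching pieces, then stitch the outputs back together.
    outputs = [forward_fn(*pieces) for pieces in zip(*chunks_per_input)]
    return torch.cat(outputs, dim=chunk_dim)


torch.manual_seed(0)
feed_forward = nn.Linear(8, 8)          # acts on the last dim only
hidden_states = torch.randn(2, 12, 8)   # (batch, seq_len, hidden)

with torch.no_grad():
    direct = feed_forward(hidden_states)
    # Process the sequence dimension (dim 1) in 3 chunks of 4 positions each.
    chunked = chunked_apply(feed_forward, 4, 1, hidden_states)
```

Because the linear layer treats each sequence position independently, `chunked` matches `direct` up to floating-point tolerance, while the intermediate activations are only materialized for one chunk at a time.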
transformers.pytorch_utils.find_pruneable_heads_and_indices
( heads: list[int], n_heads: int, head_size: int, already_pruned_heads: set[int] ) → tuple[Set[int], torch.LongTensor]
Parameters
- heads (list[int]): List of the indices of heads to prune.
- n_heads (int): The number of heads in the model.
- head_size (int): The size of each head.
- already_pruned_heads (Set[int]): A set of already pruned heads.
Returns
tuple[Set[int], torch.LongTensor]: A tuple with the indices of heads to prune taking already_pruned_heads into account and the indices of rows/columns to keep in the layer weight.
Finds the heads and their indices taking already_pruned_heads into account.
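The bookkeeping described above can be sketched as follows. `heads_and_kept_indices` is a hypothetical helper written for this example, not the library function. It assumes that `n_heads` counts the heads currently present in the layer and that head indices always refer to the original, unpruned numbering.

```python
import torch


def heads_and_kept_indices(heads, n_heads, head_size, already_pruned_heads):
    """Hypothetical sketch: which heads to prune and which flat indices survive."""
    # Skip heads that were already removed earlier.
    heads = set(heads) - already_pruned_heads
    # One mask row per head that is still present in the layer.
    mask = torch.ones(n_heads, head_size)
    for head in heads:
        # Earlier prunings shift the remaining heads down, so adjust the row index.
        shifted = head - sum(1 for h in already_pruned_heads if h < head)
        mask[shifted] = 0
    keep = mask.view(-1).eq(1)
    index = torch.arange(keep.numel())[keep].long()
    return heads, index


# Fresh layer with 4 heads of size 2: prune heads 0 and 2.
heads_a, index_a = heads_and_kept_indices(
    [0, 2], n_heads=4, head_size=2, already_pruned_heads=set()
)

# Head 1 was removed earlier, so 3 heads remain; asking for heads 1 and 3
# only prunes head 3, which now sits in row 2.
heads_b, index_b = heads_and_kept_indices(
    [1, 3], n_heads=3, head_size=2, already_pruned_heads={1}
)
```

In the first call the kept indices are `[2, 3, 6, 7]`, the slots of heads 1 and 3. In the second call only head 3 is newly pruned and the kept indices are `[0, 1, 2, 3]`.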
transformers.prune_layer
( layer: nn.Linear | Conv1D, index: torch.LongTensor, dim: int | None = None ) → torch.nn.Linear or Conv1D
Parameters
- layer (Union[torch.nn.Linear, Conv1D]): The layer to prune.
- index (torch.LongTensor): The indices to keep in the layer.
- dim (int, optional): The dimension on which to keep the indices.
Returns
torch.nn.Linear or Conv1D: The pruned layer as a new layer with requires_grad=True.
Prune a Conv1D or linear layer to keep only entries in index.
Used to remove heads.
transformers.pytorch_utils.prune_conv1d_layer
( layer: Conv1D, index: torch.LongTensor, dim: int = 1 ) → Conv1D
Prune a Conv1D layer to keep only entries in index. A Conv1D works like a linear layer (see e.g. BERT), but its weights are transposed.
Used to remove heads.
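The default `dim = 1` follows from the transposed layout. The sketch below uses raw tensors rather than the library's classes and assumes a weight of shape `(in_features, out_features)`, so the output features sit on dimension 1. The variable names are illustrative only.

```python
import torch

torch.manual_seed(0)
nx, nf = 4, 6
weight = torch.randn(nx, nf)         # transposed layout: (in_features, out_features)
bias = torch.randn(nf)
index = torch.tensor([0, 1, 4, 5])   # output features to keep

# Output features live on dim 1 of a transposed weight, so prune along dim=1.
pruned_weight = weight.index_select(1, index)
# The bias has one entry per output feature and shrinks with it.
pruned_bias = bias[index]

x = torch.randn(3, nx)
full = x @ weight + bias
pruned = x @ pruned_weight + pruned_bias
```

Here `pruned` equals the kept columns of `full`, that is `full[:, index]`, which is the effect wanted when removing attention heads from a fused projection.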
transformers.pytorch_utils.prune_linear_layer
( layer: nn.Linear, index: torch.LongTensor, dim: int = 0 ) → torch.nn.Linear
Prune a linear layer to keep only entries in index.
Used to remove heads.
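As a rough illustration of what keeping only the entries in `index` means for an `nn.Linear`, the sketch below builds a smaller layer by hand. `keep_entries` is a hypothetical helper written for this example, not the library function. It assumes that `dim=0` selects output features (rows of the weight) and `dim=1` selects input features (columns).

```python
import torch
from torch import nn


def keep_entries(layer: nn.Linear, index: torch.Tensor, dim: int = 0) -> nn.Linear:
    """Hypothetical sketch: build a smaller nn.Linear keeping only `index` along `dim`."""
    weight = layer.weight.index_select(dim, index).detach().clone()
    out_features, in_features = weight.shape
    new_layer = nn.Linear(in_features, out_features, bias=layer.bias is not None)
    with torch.no_grad():
        new_layer.weight.copy_(weight)
        if layer.bias is not None:
            # The bias has one entry per output feature, so it only shrinks for dim=0.
            bias = layer.bias[index] if dim == 0 else layer.bias
            new_layer.bias.copy_(bias)
    return new_layer


torch.manual_seed(0)
layer = nn.Linear(8, 6)
index = torch.tensor([0, 1, 4])

pruned_out = keep_entries(layer, index, dim=0)  # keep 3 of the 6 output features
pruned_in = keep_entries(layer, index, dim=1)   # keep 3 of the 8 input features

x = torch.randn(2, 8)
y_full = layer(x)
y_pruned = pruned_out(x)
```

With `dim=0` the new weight has shape `(3, 8)` and `y_pruned` matches `y_full[:, index]`. With `dim=1` the weight has shape `(6, 3)` and the layer expects inputs with only the three kept features. The new parameters are fresh `nn.Parameter` objects, so they require gradients.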