Models¶

The base classes PreTrainedModel, TFPreTrainedModel, and FlaxPreTrainedModel implement the common methods for loading and saving a model, either from a local file or directory or from a pretrained model configuration provided by the library (downloaded from HuggingFace’s AWS S3 repository).
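As a minimal sketch of the save/load round trip for the PyTorch base class, the example below builds a tiny, randomly initialised BertModel so that nothing has to be downloaded. The configuration sizes are illustrative assumptions, not library defaults, and the sketch assumes that both transformers and torch are installed.

```python
import tempfile

import torch
from transformers import BertConfig, BertModel

# Tiny configuration chosen only to keep the example fast (assumed values).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertModel(config)

with tempfile.TemporaryDirectory() as save_dir:
    # Write the configuration and the weights into a local directory.
    model.save_pretrained(save_dir)
    # Rebuild the model from that directory.
    reloaded = BertModel.from_pretrained(save_dir)

# Compare every tensor of the original model with its reloaded counterpart.
reloaded_state = reloaded.state_dict()
weights_match = all(
    torch.equal(tensor, reloaded_state[name])
    for name, tensor in model.state_dict().items()
)
print(weights_match)
```

Passing a model identifier instead of a directory to from_pretrained would fetch a pretrained configuration and weights provided by the library; that path is left out here because it needs network access.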

PreTrainedModel and TFPreTrainedModel also implement a few methods, common to all models, to:

  • resize the input token embeddings when new tokens are added to the vocabulary

  • prune the attention heads of the model
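The two operations listed above can be sketched on a small, randomly initialised PyTorch model. The configuration values and the choice of which head to prune are assumptions made for illustration, and the sketch assumes a transformers version that exposes resize_token_embeddings and prune_heads on the model.

```python
from transformers import BertConfig, BertModel

# Tiny configuration chosen only to keep the example fast (assumed values).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertModel(config)

# Grow the input embedding matrix, as one would after adding 10 new tokens
# to the tokenizer vocabulary.
new_vocab_size = 110
model.resize_token_embeddings(new_vocab_size)
embedding_rows = model.get_input_embeddings().weight.shape[0]

# Prune head 0 of layer 0. The argument maps a layer index to the list of
# head indices to remove from that layer.
model.prune_heads({0: [0]})
remaining_heads = model.encoder.layer[0].attention.self.num_attention_heads

print(embedding_rows, remaining_heads)
```

In practice the new vocabulary size would normally come from len(tokenizer) rather than a hard-coded number.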

The other methods that are common to each model are defined in ModuleUtilsMixin (for the PyTorch models) and TFModelUtilsMixin (for the TensorFlow models). The text generation methods are defined in GenerationMixin (for the PyTorch models) and TFGenerationMixin (for the TensorFlow models).
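As a rough illustration of the kind of helper that the PyTorch utility mixin contributes, the sketch below queries the parameter count, device, and dtype of a tiny model. The configuration sizes are assumptions for illustration only.

```python
import torch
from transformers import BertConfig, BertModel

# Tiny configuration chosen only to keep the example fast (assumed values).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertModel(config)

# num_parameters() is inherited from the utility mixin.
total_params = model.num_parameters()
trainable_params = model.num_parameters(only_trainable=True)

# The same count computed by hand with plain PyTorch, for comparison.
manual_count = sum(p.numel() for p in model.parameters())

# device and dtype are convenience properties on the model.
print(total_params, trainable_params, model.device, model.dtype)
```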

PreTrainedModel¶

ModuleUtilsMixin¶

TFPreTrainedModel¶

TFModelUtilsMixin¶

FlaxPreTrainedModel¶

Generation¶
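
A minimal sketch of text generation through the generate method on a PyTorch model with a language modeling head. The model is tiny and randomly initialised, so the generated token ids are meaningless; the configuration sizes and the prompt ids are assumptions chosen for illustration.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny configuration chosen only to keep the example fast (assumed values).
config = GPT2Config(vocab_size=100, n_positions=64, n_embd=32, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config)
model.eval()

# A made-up prompt of three token ids; a real prompt would come from a tokenizer.
input_ids = torch.tensor([[1, 2, 3]])
attention_mask = torch.ones_like(input_ids)

# Greedy decoding of 5 new tokens. The default GPT-2 end-of-sequence id (50256)
# lies outside this toy vocabulary, so decoding runs for all 5 steps.
output_ids = model.generate(
    input_ids,
    attention_mask=attention_mask,
    max_new_tokens=5,
    do_sample=False,
    pad_token_id=0,
)

# The output holds the prompt followed by the newly generated ids.
print(output_ids.shape)
```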