Models
The base classes PreTrainedModel, TFPreTrainedModel, and FlaxPreTrainedModel implement the common methods for loading and saving a model, either from a local file or directory or from a pretrained model configuration provided by the library (downloaded from HuggingFace's AWS S3 repository).
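As a minimal sketch of the local save/load round trip described above, the snippet below builds a tiny, randomly initialised BertModel, writes it to a temporary directory with save_pretrained, and reads it back with from_pretrained. The configuration values are illustrative assumptions chosen to keep the model small; they do not correspond to any released checkpoint, and PyTorch is assumed to be installed.

```python
import tempfile

import torch
from transformers import BertConfig, BertModel

# Illustrative tiny configuration (assumed values, not a published checkpoint).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)
model = BertModel(config)  # randomly initialised weights

with tempfile.TemporaryDirectory() as tmp_dir:
    # Writes the configuration and the weights into the directory.
    model.save_pretrained(tmp_dir)
    # Rebuilds the model from the files written above.
    reloaded = BertModel.from_pretrained(tmp_dir)

# Compare one weight matrix to check that the round trip preserved it.
weights_match = torch.equal(
    model.embeddings.word_embeddings.weight,
    reloaded.embeddings.word_embeddings.weight,
)
print(weights_match)
```

Passing a model identifier instead of a directory to from_pretrained would download the files from the hosted repository rather than read them locally.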
PreTrainedModel and TFPreTrainedModel also implement a few methods which are common among all the models to:
- resize the input token embeddings when new tokens are added to the vocabulary
- prune the attention heads of the model.
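The first of these operations might look like the following sketch, which grows the input embedding matrix of a tiny, randomly initialised BertModel by three rows, as one would after adding three tokens to a tokenizer. The configuration values and the number of added tokens are assumptions for illustration only.

```python
from transformers import BertConfig, BertModel

# Illustrative tiny configuration (assumed values, not a published checkpoint).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)
model = BertModel(config)

old_vocab_size = model.get_input_embeddings().weight.shape[0]
num_added_tokens = 3  # hypothetical count of tokens added to the vocabulary

# Resize the input embeddings so the new token ids have a row to look up.
model.resize_token_embeddings(old_vocab_size + num_added_tokens)

new_vocab_size = model.get_input_embeddings().weight.shape[0]
print(old_vocab_size, new_vocab_size)
```

In practice the new size is usually taken from the tokenizer itself, for example len(tokenizer), so that the embedding matrix and the vocabulary stay in step.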
The other methods that are common to each model are defined in ModuleUtilsMixin (for the PyTorch models) and TFModuleUtilsMixin (for the TensorFlow models) or, for text generation, in GenerationMixin (for the PyTorch models) and TFGenerationMixin (for the TensorFlow models).
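To illustrate one method from each PyTorch mixin, the sketch below calls num_parameters (provided by ModuleUtilsMixin) and generate (provided by GenerationMixin) on a tiny, randomly initialised GPT2LMHeadModel. The configuration values and the prompt token ids are arbitrary assumptions, so the generated ids are meaningless; the point is only to show where the methods are available.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Illustrative tiny configuration (assumed values, not a published checkpoint).
config = GPT2Config(
    vocab_size=100,
    n_positions=32,
    n_embd=32,
    n_layer=2,
    n_head=4,
    bos_token_id=0,
    eos_token_id=0,
)
model = GPT2LMHeadModel(config)
model.eval()

# From ModuleUtilsMixin: total number of parameters in the model.
total_params = model.num_parameters()

# From GenerationMixin: greedy decoding from an arbitrary three-token prompt.
prompt_ids = torch.tensor([[5, 6, 7]])
with torch.no_grad():
    generated = model.generate(
        prompt_ids,
        attention_mask=torch.ones_like(prompt_ids),
        max_new_tokens=5,
        do_sample=False,
        pad_token_id=0,
    )

print(total_params, tuple(generated.shape))
```

The returned tensor starts with the prompt ids and is followed by up to max_new_tokens generated ids; decoding stops early if the end-of-sequence id is produced.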