transformers documentation

Auto Classes

Auto Classes

In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained() method. AutoClasses are here to do this job for you so that you automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.

Instantiating one of AutoConfig, AutoModel, and AutoTokenizer will directly create a class of the relevant architecture. For instance

model = AutoModel.from_pretrained('bert-base-cased')

will create a model that is an instance of BertModel.

There is one class of AutoModel for each task, and for each backend (PyTorch, TensorFlow, or Flax).

Extending the Auto Classes

Each of the auto classes has a method to be extended with your custom classes. For instance, if you have defined a custom class of model NewModel, make sure you have a NewModelConfig then you can add those to the auto classes like this:

from transformers import AutoConfig, AutoModel

AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)

You will then be able to use the auto classes like you would usually do!

If your NewModelConfig is a subclass of PretrainedConfig, make sure its model_type attribute is set to the same key you use when registering the config (here "new-model").

Likewise, if your NewModel is a subclass of PreTrainedModel, make sure its config_class attribute is set to the same class you use when registering the model (here NewModelConfig).

AutoConfig

class transformers.AutoConfig < >

( )

This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_pretrained < >

( pretrained_model_name_or_path **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model configuration hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing a configuration file saved using the [~PretrainedConfig.save_pretrained] method, or the [~PreTrainedModel.save_pretrained] method, e.g., ./my_model_directory/.
    • A path or url to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.
  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download the model weights and configuration files and override the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • return_unused_kwargs (bool, optional, defaults to False) — If False, then this function returns just the final configuration object.

    If True, then this functions returns a Tuple(config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.

  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs(additional keyword arguments, optional) — The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the configuration classes of the library from a pretrained model configuration.

The configuration class to instantiate is selected based on the model_type property of the config object that is loaded, or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

Examples:

>>> from transformers import AutoConfig

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')

>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained('dbmdz/bert-base-german-cased')

>>> # If configuration file is in a directory (e.g., was saved using *save_pretrained('./test/saved_model/')*).
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/')

>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/my_configuration.json')

>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False, return_unused_kwargs=True)
>>> config.output_attentions
True
>>> config.unused_kwargs
{'foo': False}
register < >

( model_type config )

Parameters

  • model_type (str) — The model type like “bert” or “gpt”.
  • config (PretrainedConfig) — The config to register.

Register a new configuration for this class.

AutoTokenizer

class transformers.AutoTokenizer < >

( )

This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_pretrained < >

( pretrained_model_name_or_path *inputs **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the [~PreTrainedTokenizer.save_pretrained] method, e.g., ./my_model_directory/.
    • A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g.: ./my_model_directory/vocab.txt. (Not applicable to all derived classes)
  • inputs (additional positional arguments, optional) — Will be passed along to the Tokenizer init() method.
  • config ([PretrainedConfig], optional) — The configuration object used to dertermine the tokenizer class to instantiate.
  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download the model weights and configuration files and override the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • subfolder (str, optional) — In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.
  • use_fast (bool, optional, defaults to True) — Whether or not to try to load the fast version of the tokenizer.
  • tokenizer_type (str, optional) — Tokenizer type to be loaded.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Will be passed to the Tokenizer init() method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the init() for more details.

Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.

The tokenizer class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

Examples:

>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('dbmdz/bert-base-german-cased')

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/saved_model/')*)
>>> tokenizer = AutoTokenizer.from_pretrained('./test/bert_saved_model/')
register < >

( config_class slow_tokenizer_class = None fast_tokenizer_class = None )

Parameters

  • config_class (PretrainedConfig) — The configuration corresponding to the model to register.
  • slow_tokenizer_class (PretrainedTokenizer, optional) — The slow tokenizer to register.
  • slow_tokenizer_class (PretrainedTokenizerFast, optional) — The fast tokenizer to register.

Register a new tokenizer in this mapping.

AutoFeatureExtractor

class transformers.AutoFeatureExtractor < >

( )

This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_pretrained < >

( pretrained_model_name_or_path **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — This can be either:

    • a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • a path to a directory containing a feature extractor file saved using the [~feature_extraction_utils.FeatureExtractionMixin.save_pretrained] method, e.g., ./my_model_directory/.
    • a path or url to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
  • force_download (bool, optional, defaults to False) — Whether or not to force to (re-)download the feature extractor files and override the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received file. Attempts to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • return_unused_kwargs (bool, optional, defaults to False) — If False, then this function returns just the final feature extractor object. If True, then this functions returns a Tuple(feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
  • kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the feature extractor classes of the library from a pretrained model vocabulary.

The feature extractor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

Passing use_auth_token=True is required when you want to use a private model.

Examples:

>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained('facebook/wav2vec2-base-960h')

>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> feature_extractor = AutoFeatureExtractor.from_pretrained('./test/saved_model/')

AutoProcessor

class transformers.AutoProcessor < >

( )

This is a generic processor class that will be instantiated as one of the processor classes of the library when created with the AutoProcessor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_pretrained < >

( pretrained_model_name_or_path **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — This can be either:

    • a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • a path to a directory containing a processor files saved using the save_pretrained() method, e.g., ./my_model_directory/.
  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
  • force_download (bool, optional, defaults to False) — Whether or not to force to (re-)download the feature extractor files and override the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received file. Attempts to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
  • revision (str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • return_unused_kwargs (bool, optional, defaults to False) — If False, then this function returns just the final feature extractor object. If True, then this functions returns a Tuple(feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
  • kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.

Instantiate one of the processor classes of the library from a pretrained model vocabulary.

The processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible):

Passing use_auth_token=True is required when you want to use a private model.

Examples:

>>> from transformers import AutoProcessor

>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained('facebook/wav2vec2-base-960h')

>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> processor = AutoProcessor.from_pretrained('./test/saved_model/')

AutoModel

class transformers.AutoModel < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModel.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModel
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModel.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained('bert-base-cased')

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained('bert-base-cased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModel.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForPreTraining

class transformers.AutoModelForPreTraining < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForPreTraining.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModelForPreTraining.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-cased')

>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-cased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForPreTraining.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForCausalLM

class transformers.AutoModelForCausalLM < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForCausalLM.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModelForCausalLM.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained('bert-base-cased')

>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained('bert-base-cased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForCausalLM.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForMaskedLM

class transformers.AutoModelForMaskedLM < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForMaskedLM.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModelForMaskedLM.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-cased')

>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-cased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMaskedLM.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForSeq2SeqLM

class transformers.AutoModelForSeq2SeqLM < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForSeq2SeqLM.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('t5-base')
>>> model = AutoModelForSeq2SeqLM.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base')

>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/t5_tf_model_config.json')
>>> model = AutoModelForSeq2SeqLM.from_pretrained('./tf_model/t5_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForSequenceClassification

class transformers.AutoModelForSequenceClassification < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForSequenceClassification.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModelForSequenceClassification.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-cased')

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-cased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForSequenceClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForMultipleChoice

class transformers.AutoModelForMultipleChoice < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForMultipleChoice.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModelForMultipleChoice.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-cased')

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-cased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMultipleChoice.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForNextSentencePrediction

class transformers.AutoModelForNextSentencePrediction < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForNextSentencePrediction.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModelForNextSentencePrediction.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
  • kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:

    • If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s init method (we assume all relevant updates to the configuration have already been done)
    • If a configuration is not provided, kwargs will be first passed to the configuration class initialization function ([~PretrainedConfig.from_pretrained]). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s init function.

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Examples:

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-cased')

>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-cased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForNextSentencePrediction.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForTokenClassification

class transformers.AutoModelForTokenClassification < >

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config < >

( **kwargs )

Parameters

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use [~AutoModelForTokenClassification.from_pretrained] to load the model weights.

Examples:

>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-cased')
>>> model = AutoModelForTokenClassification.from_config(config)
from_pretrained < >

( *model_args **kwargs )

Parameters

  • pretrained_model_name_or_path (str or os.PathLike) — Can be either:

    • A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    • A path to a directory containing model weights saved using [~PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
    • A path or url to a tensorflow index checkpoint file (e.g, ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint in a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
  • model_args (additional positional arguments, optional) — Will be passed along to the underlying model init() method.
  • config ([PretrainedConfig], optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • The model is a model provided by the library (loaded with the model id string of a pretrained model).
    • The model was saved using [~PreTrainedModel.save_pretrained] and is reloaded by supplying the save directory.
    • The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
  • state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from saved weights file.

    This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [~PreTrainedModel.save_pretrained] and [~PreTrainedModel.from_pretrained] is not a simpler option.

  • cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
  • from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
  • force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
  • resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
  • proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {‘http’: ‘foo.bar:3128’, ‘http://hostname’: ‘foo.bar:4012’}. The proxies are used on each request.
  • output_loading_info(bool, optional, defaults to False) — Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.
  • local_files_only(bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
  • revision(str, optional, defaults to “main”) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
  • trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This o