
In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained method.

AutoClasses are here to do this job for you so that you automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary:

Instantiating one of AutoModel, AutoConfig and AutoTokenizer will directly create an instance of the relevant architecture (e.g. model = AutoModel.from_pretrained('bert-base-cased') will create an instance of BertModel).


class transformers.AutoConfig[source]

AutoConfig is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.

The from_pretrained() method takes care of returning the correct configuration class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.
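The selection rule described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's implementation: `CONFIG_BY_MODEL_TYPE`, `NAME_PATTERNS` and `pick_config_class` are hypothetical names, the configuration is modeled as a plain dict, and classes are represented by their names as strings.

```python
# Hypothetical stand-ins: a real implementation would map to classes, not strings.
CONFIG_BY_MODEL_TYPE = {
    "distilbert": "DistilBertConfig",
    "roberta": "RobertaConfig",
    "bert": "BertConfig",
}

# Fallback patterns, checked in order; more specific names come first.
NAME_PATTERNS = [
    ("distilbert", "DistilBertConfig"),
    ("roberta", "RobertaConfig"),
    ("bert", "BertConfig"),
]


def pick_config_class(config_dict, name_or_path):
    """Prefer an explicit model_type; otherwise match on the name or path."""
    model_type = config_dict.get("model_type")
    if model_type is not None:
        if model_type not in CONFIG_BY_MODEL_TYPE:
            raise ValueError(f"Unknown model_type: {model_type!r}")
        return CONFIG_BY_MODEL_TYPE[model_type]
    for pattern, class_name in NAME_PATTERNS:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Cannot infer a configuration class from {name_or_path!r}")


# model_type takes precedence even when the directory name suggests otherwise.
from_type = pick_config_class({"model_type": "roberta"}, "./my_bert_dir/")
# Without model_type, the name or path decides.
from_name = pick_config_class({}, "distilbert-base-uncased")
print(from_type, from_name)
```

Checking model_type first means a renamed directory does not change which class is picked; the name is consulted only when the configuration gives no explicit answer.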

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs)[source]

Instantiates one of the configuration classes of the library from a pre-trained model configuration.

The configuration class to instantiate is selected based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

  • pretrained_model_name_or_path (string) –

    Is either:
    • a string with the shortcut name of a pre-trained model configuration to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model configuration that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing a configuration file saved using the save_pretrained() method, e.g.: ./my_model_directory/.

    • a path or url to a saved configuration JSON file, e.g.: ./my_model_directory/configuration.json.

  • cache_dir (string, optional, defaults to None) – Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download (boolean, optional, defaults to False) – Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download (boolean, optional, defaults to False) – Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies (Dict[str, str], optional, defaults to None) – A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': '', 'http://hostname': ''}. The proxies are used on each request. See the requests documentation for usage.

  • return_unused_kwargs (boolean, optional, defaults to False) –

    • If False, then this function returns just the final configuration object.

    • If True, then this function returns a tuple (config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes, i.e. the part of kwargs which has not been used to update config and is otherwise ignored.

  • kwargs (Dict[str, any], optional, defaults to {}) – Key/value pairs with which to update the configuration object after loading.

    • The values in kwargs of any keys which are configuration attributes will be used to override the loaded values.

    • Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
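The interaction between kwargs and return_unused_kwargs can be sketched as follows. This is a minimal model of the documented behavior, assuming the configuration is a plain dict; `apply_overrides` is a hypothetical helper, not a library function.

```python
def apply_overrides(loaded, return_unused_kwargs=False, **kwargs):
    """Update a loaded configuration (a plain dict here) from kwargs."""
    config = dict(loaded)
    unused = {}
    for key, value in kwargs.items():
        if key in config:
            config[key] = value  # known attribute: override the loaded value
        else:
            unused[key] = value  # unknown key: kept out of the configuration
    if return_unused_kwargs:
        return config, unused
    return config


loaded = {"hidden_size": 768, "output_attentions": False}

# Default: only the updated configuration comes back; 'foo' is dropped.
config_only = apply_overrides(loaded, output_attentions=True, foo=False)

# With return_unused_kwargs=True the leftover keys are returned as well.
config, unused = apply_overrides(
    loaded, output_attentions=True, foo=False, return_unused_kwargs=True
)
print(config, unused)
```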


from transformers import AutoConfig

config = AutoConfig.from_pretrained('bert-base-uncased')  # Download configuration from S3 and cache.
config = AutoConfig.from_pretrained('./test/bert_saved_model/')  # E.g. config (or model) was saved using `save_pretrained('./test/bert_saved_model/')`
config = AutoConfig.from_pretrained('./test/bert_saved_model/my_configuration.json')
# Keys that are configuration attributes override the loaded values; other keys are ignored.
config = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
assert config.output_attentions == True
config, unused_kwargs = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True,
                                                   foo=False, return_unused_kwargs=True)
assert config.output_attentions == True
assert unused_kwargs == {'foo': False}


class transformers.AutoTokenizer[source]

AutoTokenizer is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained(pretrained_model_name_or_path) class method.

The from_pretrained() method takes care of returning the correct tokenizer class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The tokenizer class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

  • contains t5: T5Tokenizer (T5 model)

  • contains distilbert: DistilBertTokenizer (DistilBert model)

  • contains albert: AlbertTokenizer (ALBERT model)

  • contains camembert: CamembertTokenizer (CamemBERT model)

  • contains xlm-roberta: XLMRobertaTokenizer (XLM-RoBERTa model)

  • contains roberta: RobertaTokenizer (RoBERTa model)

  • contains bert: BertTokenizer (Bert model)

  • contains openai-gpt: OpenAIGPTTokenizer (OpenAI GPT model)

  • contains gpt2: GPT2Tokenizer (OpenAI GPT-2 model)

  • contains transfo-xl: TransfoXLTokenizer (Transformer-XL model)

  • contains xlnet: XLNetTokenizer (XLNet model)

  • contains xlm: XLMTokenizer (XLM model)

  • contains ctrl: CTRLTokenizer (Salesforce CTRL model)

  • contains electra: ElectraTokenizer (Google ELECTRA model)

This class cannot be instantiated using __init__() (throws an error).
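The order of the patterns listed above matters because several names contain one another (for example, xlm-roberta contains both roberta and xlm, and distilbert contains bert). The sketch below reproduces the listed order with class names as strings; `TOKENIZER_PATTERNS` and `first_match` are hypothetical names used only for illustration.

```python
# Patterns in the order listed above (class names kept as strings).
TOKENIZER_PATTERNS = [
    ("t5", "T5Tokenizer"),
    ("distilbert", "DistilBertTokenizer"),
    ("albert", "AlbertTokenizer"),
    ("camembert", "CamembertTokenizer"),
    ("xlm-roberta", "XLMRobertaTokenizer"),
    ("roberta", "RobertaTokenizer"),
    ("bert", "BertTokenizer"),
    ("openai-gpt", "OpenAIGPTTokenizer"),
    ("gpt2", "GPT2Tokenizer"),
    ("transfo-xl", "TransfoXLTokenizer"),
    ("xlnet", "XLNetTokenizer"),
    ("xlm", "XLMTokenizer"),
    ("ctrl", "CTRLTokenizer"),
    ("electra", "ElectraTokenizer"),
]


def first_match(name_or_path, patterns=TOKENIZER_PATTERNS):
    """Return the class name of the first pattern found in the string."""
    for pattern, class_name in patterns:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"No tokenizer pattern matches {name_or_path!r}")


in_order = first_match("xlm-roberta-base")
# Reversing the list lets the shorter 'xlm' pattern shadow 'xlm-roberta'.
reversed_order = first_match("xlm-roberta-base", list(reversed(TOKENIZER_PATTERNS)))
print(in_order, reversed_order)
```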

classmethod from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)[source]

Instantiate one of the tokenizer classes of the library from a pre-trained model vocabulary.

The tokenizer class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string, in the order listed in the class description above.

pretrained_model_name_or_path: either:

  • a string with the shortcut name of a predefined tokenizer to load from cache or download, e.g.: bert-base-uncased.

  • a string with the identifier name of a predefined tokenizer that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

  • a path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g.: ./my_model_directory/.

  • (not applicable to all derived classes) a path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (e.g. Bert, XLNet), e.g.: ./my_model_directory/vocab.txt.

cache_dir: (optional) string:

Path to a directory in which downloaded predefined tokenizer vocabulary files should be cached if the standard cache should not be used.

force_download: (optional) boolean, default False:

Force to (re-)download the vocabulary files and override the cached versions if they exist.

resume_download: (optional) boolean, default False:

Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

proxies: (optional) dict, default None:

A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': '', 'http://hostname': ''}. The proxies are used on each request.

use_fast: (optional) boolean, default False:

Indicate if transformers should try to load the fast version of the tokenizer (True) or use the Python one (False).

inputs: (optional) positional arguments: will be passed to the Tokenizer __init__ method.

kwargs: (optional) keyword arguments: will be passed to the Tokenizer __init__ method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the doc string of PreTrainedTokenizer for details.


from transformers import AutoTokenizer

# Download vocabulary from S3 and cache.
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Download vocabulary from S3 (user-uploaded) and cache.
tokenizer = AutoTokenizer.from_pretrained('dbmdz/bert-base-german-cased')

# If vocabulary files are in a directory (e.g. tokenizer was saved using `save_pretrained('./test/bert_saved_model/')`)
tokenizer = AutoTokenizer.from_pretrained('./test/bert_saved_model/')


class transformers.AutoModel[source]

AutoModel is a generic model class that will be instantiated as one of the base model classes of the library when created with the AutoModel.from_pretrained(pretrained_model_name_or_path) or the AutoModel.from_config(config) class methods.

This class cannot be instantiated using __init__() (throws an error).
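The factory-only pattern described here can be sketched with toy classes. Everything in the block is hypothetical (`ToyConfig`, `ToyModel`, `AutoToy`), and the choice of EnvironmentError as the exception type is an assumption; the text above only states that an error is thrown.

```python
class ToyConfig:
    """Hypothetical configuration object."""

    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


class ToyModel:
    """Hypothetical concrete model built from a ToyConfig."""

    def __init__(self, config):
        self.config = config


class AutoToy:
    """Factory-only class: use the classmethod, never the constructor."""

    def __init__(self):
        # Assumption: the exception type is chosen for illustration only.
        raise EnvironmentError(
            "AutoToy cannot be instantiated directly; "
            "use AutoToy.from_config(config) instead."
        )

    @classmethod
    def from_config(cls, config):
        # Returns an instance of a concrete class, never of AutoToy itself.
        if isinstance(config, ToyConfig):
            return ToyModel(config)
        raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")


model = AutoToy.from_config(ToyConfig())

try:
    AutoToy()
except EnvironmentError as err:
    direct_error = str(err)
print(type(model).__name__, "|", direct_error)
```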

classmethod from_config(config)[source]

Instantiates one of the base model classes of the library from a configuration.


config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

  • isInstance of distilbert configuration class: DistilBertModel (DistilBERT model)

  • isInstance of roberta configuration class: RobertaModel (RoBERTa model)

  • isInstance of bert configuration class: BertModel (Bert model)

  • isInstance of openai-gpt configuration class: OpenAIGPTModel (OpenAI GPT model)

  • isInstance of gpt2 configuration class: GPT2Model (OpenAI GPT-2 model)

  • isInstance of ctrl configuration class: CTRLModel (Salesforce CTRL model)

  • isInstance of transfo-xl configuration class: TransfoXLModel (Transformer-XL model)

  • isInstance of xlnet configuration class: XLNetModel (XLNet model)

  • isInstance of xlm configuration class: XLMModel (XLM model)

  • isInstance of flaubert configuration class: FlaubertModel (Flaubert model)

  • isInstance of electra configuration class: ElectraModel (Electra model)
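When dispatch relies on isinstance checks, the order of the checks matters whenever one configuration class derives from another. The sketch below uses stub classes and assumes, for illustration, that the RoBERTa configuration subclasses the BERT configuration; `MODEL_ORDER` and `model_name_for` are hypothetical names.

```python
class PretrainedConfigStub:
    """Stand-in for a configuration base class."""


class BertConfigStub(PretrainedConfigStub):
    pass


class RobertaConfigStub(BertConfigStub):
    # Assumption for this sketch: the RoBERTa config derives from the BERT config.
    pass


class DistilBertConfigStub(PretrainedConfigStub):
    pass


# Checked top to bottom, mirroring part of the list above.
MODEL_ORDER = [
    (DistilBertConfigStub, "DistilBertModel"),
    (RobertaConfigStub, "RobertaModel"),
    (BertConfigStub, "BertModel"),
]


def model_name_for(config, order=MODEL_ORDER):
    """Return the model name of the first configuration class that matches."""
    for config_class, model_name in order:
        if isinstance(config, config_class):
            return model_name
    raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")


roberta = RobertaConfigStub()
correct = model_name_for(roberta)

# If the parent class is checked first, it shadows the subclass.
wrong_order = [MODEL_ORDER[2], MODEL_ORDER[1], MODEL_ORDER[0]]
shadowed = model_name_for(roberta, wrong_order)
print(correct, shadowed)
```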


from transformers import AutoModel, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModel.from_config(config)  # Builds the model from the configuration only; no pretrained weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]

Instantiates one of the base model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The base model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string, with patterns checked in a fixed order.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
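To make the effect of the two modes concrete, here is a toy dropout layer in plain Python. It is a sketch of the general technique (inverted dropout gated by a training flag), not the PyTorch module; `ToyDropout` and its methods are hypothetical.

```python
import random


class ToyDropout:
    """Toy inverted dropout: active in training mode, identity in eval mode."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # eval mode: inputs pass through unchanged
        keep = 1.0 - self.p
        # Training mode: zero each value with probability p, rescale the rest.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5)
inputs = [1.0] * 10

eval_out = layer.eval()(inputs)    # deterministic, identical to the inputs
train_out = layer.train()(inputs)  # each entry is either 0.0 or 2.0
print(eval_out)
print(train_out)
```

Because evaluation mode is the default after loading, inference is deterministic out of the box; forgetting to switch back to training mode before fine-tuning would silently disable dropout.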

  • pretrained_model_name_or_path – either:


    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': '', 'http://hostname': ''}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
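The missing and unexpected keys mentioned for output_loading_info can be understood as a set comparison between the parameter names a model expects and the names found in a state dictionary. The sketch below shows that comparison only; `compare_keys`, the parameter names and the result key names are illustrative assumptions, and error messages are not modeled.

```python
def compare_keys(expected_keys, state_dict):
    """Report which expected keys are absent and which provided keys are extra."""
    expected = set(expected_keys)
    found = set(state_dict)
    return {
        "missing_keys": sorted(expected - found),     # expected, not provided
        "unexpected_keys": sorted(found - expected),  # provided, not expected
    }


# Hypothetical parameter names for a model with a classification head.
expected = ["embeddings.weight", "encoder.layer.0.weight", "classifier.weight"]

# Hypothetical checkpoint saved from a model with a pooler but no classifier.
checkpoint = {
    "embeddings.weight": [0.1, 0.2],
    "encoder.layer.0.weight": [0.3, 0.4],
    "pooler.weight": [0.5, 0.6],
}

info = compare_keys(expected, checkpoint)
print(info)
```

A non-empty list of missing keys typically means some weights keep their fresh initialization, which is expected when adding a new task head but worth checking otherwise.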


from transformers import AutoConfig, AutoModel

model = AutoModel.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModel.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModel.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModel.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)


class transformers.AutoModelForPreTraining[source]

AutoModelForPreTraining is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the AutoModelForPreTraining.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).
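The task-specific Auto classes differ from AutoModel only in which concrete class they return for a given architecture. The lookup below sketches that idea for BERT. The table and `resolve` are hypothetical, and the class names on the right are assumptions about the BERT classes each Auto class maps to, since this section does not list them.

```python
# (auto class, model_type) -> concrete class name. BERT entries only;
# the right-hand names are assumed, not taken from this section.
TASK_CLASSES = {
    ("AutoModel", "bert"): "BertModel",
    ("AutoModelForPreTraining", "bert"): "BertForPreTraining",
    ("AutoModelWithLMHead", "bert"): "BertForMaskedLM",
    ("AutoModelForSequenceClassification", "bert"): "BertForSequenceClassification",
    ("AutoModelForQuestionAnswering", "bert"): "BertForQuestionAnswering",
}


def resolve(auto_class, model_type):
    """Look up the concrete class name for an Auto class and architecture."""
    try:
        return TASK_CLASSES[(auto_class, model_type)]
    except KeyError:
        raise ValueError(
            f"{auto_class} has no class registered for model_type {model_type!r}"
        ) from None


base = resolve("AutoModel", "bert")
pretraining = resolve("AutoModelForPreTraining", "bert")
print(base, pretraining)
```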

classmethod from_config(config)[source]

Instantiates one of the base model classes of the library from a configuration.


config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class.


from transformers import AutoModelForPreTraining, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForPreTraining.from_config(config)  # Builds the model from the configuration only; no pretrained weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]

Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string, with patterns checked in a fixed order.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

  • pretrained_model_name_or_path – either:


    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': '', 'http://hostname': ''}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.


from transformers import AutoConfig, AutoModelForPreTraining

model = AutoModelForPreTraining.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForPreTraining.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModelForPreTraining.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)


class transformers.AutoModelWithLMHead[source]

AutoModelWithLMHead is a generic model class that will be instantiated as one of the language modeling model classes of the library when created with the AutoModelWithLMHead.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]

Instantiates one of the base model classes of the library from a configuration.


config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class.


from transformers import AutoModelWithLMHead, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelWithLMHead.from_config(config)  # Builds the model from the configuration only; no pretrained weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]

Instantiates one of the language modeling model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string, with patterns checked in a fixed order.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

  • pretrained_model_name_or_path – either:


    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': '', 'http://hostname': ''}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.


from transformers import AutoConfig, AutoModelWithLMHead

model = AutoModelWithLMHead.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelWithLMHead.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelWithLMHead.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModelWithLMHead.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)


class transformers.AutoModelForSequenceClassification[source]

AutoModelForSequenceClassification is a generic model class that will be instantiated as one of the sequence classification model classes of the library when created with the AutoModelForSequenceClassification.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]

Instantiates one of the base model classes of the library from a configuration.


config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class.


from transformers import AutoModelForSequenceClassification, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForSequenceClassification.from_config(config)  # Builds the model from the configuration only; no pretrained weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]

Instantiates one of the sequence classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string, with patterns checked in a fixed order.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

  • pretrained_model_name_or_path – either:


    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': '', 'http://hostname': ''}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.


from transformers import AutoConfig, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForSequenceClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModelForSequenceClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)


class transformers.AutoModelForQuestionAnswering[source]

AutoModelForQuestionAnswering is a generic model class that will be instantiated as one of the question answering model classes of the library when created with the AutoModelForQuestionAnswering.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]

Instantiates one of the base model classes of the library from a configuration.


config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class.


from transformers import AutoModelForQuestionAnswering, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForQuestionAnswering.from_config(config)  # Builds the model from the configuration only; no pretrained weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]

Instantiates one of the question answering model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string, with patterns checked in a fixed order.

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

  • pretrained_model_name_or_path – either:


    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {‘http’: ‘’, ‘http://hostname’: ‘’}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
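To make the output_loading_info description more concrete, the following sketch shows one way "missing" and "unexpected" keys could be computed by comparing parameter names. It is an illustration only: the compare_keys helper, the parameter names and the dictionary keys are assumptions made for this example, not the library's code.

```python
def compare_keys(expected_keys, loaded_keys):
    """Compare the names a model expects with the names found in a checkpoint."""
    expected, loaded = set(expected_keys), set(loaded_keys)
    return {
        # Parameters the model defines but the checkpoint does not provide.
        "missing_keys": sorted(expected - loaded),
        # Entries in the checkpoint that the model has no parameter for.
        "unexpected_keys": sorted(loaded - expected),
    }


# Hypothetical parameter names, chosen only for illustration.
model_parameter_names = ["encoder.weight", "encoder.bias", "qa_outputs.weight"]
checkpoint_names = ["encoder.weight", "encoder.bias", "pooler.weight"]

info = compare_keys(model_parameter_names, checkpoint_names)
```

A non-empty list of missing keys typically means some parameters keep their freshly initialized values, which is worth checking before using the model.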


from transformers import AutoConfig, AutoModelForQuestionAnswering

model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForQuestionAnswering.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')  # A path to a saved configuration JSON file is accepted
model = AutoModelForQuestionAnswering.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)


class transformers.AutoModelForTokenClassification[source]

AutoModelForTokenClassification is a generic model class that will be instantiated as one of the token classification model classes of the library when created with the AutoModelForTokenClassification.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]

Instantiates one of the token classification model classes of the library from a configuration.


config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

  • isInstance of distilbert configuration class: DistilBertForTokenClassification (DistilBERT model)

  • isInstance of xlm configuration class: XLMForTokenClassification (XLM model)

  • isInstance of xlm roberta configuration class: XLMRobertaForTokenClassification (XLMRoberta model)

  • isInstance of bert configuration class: BertForTokenClassification (Bert model)

  • isInstance of albert configuration class: AlbertForTokenClassification (ALBERT model)

  • isInstance of xlnet configuration class: XLNetForTokenClassification (XLNet model)

  • isInstance of camembert configuration class: CamembertForTokenClassification (Camembert model)

  • isInstance of roberta configuration class: RobertaForTokenClassification (Roberta model)

  • isInstance of electra configuration class: ElectraForTokenClassification (Electra model)
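The isinstance-based selection listed above can be sketched as an ordered table that is scanned until the first match. The sketch below is illustrative only: ToyBertConfig, ToyRobertaConfig, DISPATCH_TABLE and select_model_name are hypothetical names, and the inheritance between the two toy configuration classes is an assumption made to show why the order of the checks can matter.

```python
class ToyBertConfig:
    """Stand-in for a base configuration class."""


class ToyRobertaConfig(ToyBertConfig):
    """Stand-in for a configuration class that inherits from another one."""


# The subclass is listed first: an instance of ToyRobertaConfig is also an
# instance of ToyBertConfig, so the more specific check has to come earlier.
DISPATCH_TABLE = [
    (ToyRobertaConfig, "RobertaForTokenClassification"),
    (ToyBertConfig, "BertForTokenClassification"),
]


def select_model_name(config):
    """Return the model class name paired with the first matching config class."""
    for config_class, model_name in DISPATCH_TABLE:
        if isinstance(config, config_class):
            return model_name
    raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")


roberta_choice = select_model_name(ToyRobertaConfig())
bert_choice = select_model_name(ToyBertConfig())
```

If the two table entries were swapped, the toy RoBERTa configuration would be matched by the base-class check and resolve to the wrong model name.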


from transformers import AutoModelForTokenClassification, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForTokenClassification.from_config(config)  # Build the model from the configuration alone; no pretrained weights are loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]

Instantiates one of the token classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, first set it back to training mode with model.train().
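The effect of evaluation mode on dropout can be illustrated without PyTorch. The sketch below is a pure-Python stand-in, not the PyTorch module: ToyDropout is a hypothetical class whose eval() and train() methods only mirror the method names mentioned above.

```python
import random


class ToyDropout:
    """Pure-Python stand-in for a dropout layer with train/eval modes."""

    def __init__(self, p=0.5, seed=0):
        self.p = p                      # probability of zeroing a value
        self.training = True            # layers start in training mode here
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: dropout is deactivated, inputs pass through.
            return list(values)
        # Training mode: zero each value with probability p, rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5)
inputs = [1.0, 2.0, 3.0, 4.0]
eval_output = layer.eval()(inputs)    # identical to the inputs
train_output = layer.train()(inputs)  # some values zeroed, others scaled
```

This is why a model loaded for inference gives deterministic outputs, while fine-tuning requires switching back to training mode first.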

  • pretrained_model_name_or_path


    • a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.

    • a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.

    • a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.

  • model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.

  • config

    (optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

    • the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or

    • the model was saved using save_pretrained() and is reloaded by supplying the save directory, or

    • the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.

  • state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.

  • cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.

  • force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.

  • proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {‘http’: ‘’, ‘http://hostname’: ‘’}. The proxies are used on each request.

  • output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.

  • kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
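One of the conditions for automatic configuration loading listed above is that a file named config.json is found in the supplied directory. A minimal sketch of such a check, using only the standard library, might look as follows. The load_config_dict helper and the example configuration values are assumptions for illustration; only the file name comes from the text above.

```python
import json
import os
import tempfile

CONFIG_FILE_NAME = "config.json"  # file name taken from the parameter description


def load_config_dict(directory):
    """Return the parsed config.json from a directory, or None if it is absent."""
    config_path = os.path.join(directory, CONFIG_FILE_NAME)
    if not os.path.isfile(config_path):
        return None
    with open(config_path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as save_dir:
    # Before anything is saved, there is no configuration to load automatically.
    found_before = load_config_dict(save_dir)

    # Write a hypothetical configuration file, then look it up again.
    with open(os.path.join(save_dir, CONFIG_FILE_NAME), "w", encoding="utf-8") as handle:
        json.dump({"model_type": "bert", "num_labels": 9}, handle)
    found_after = load_config_dict(save_dir)
```

When no such file is found, passing an explicit config argument is the way to tell the loader which configuration to use.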


from transformers import AutoConfig, AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForTokenClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')  # A path to a saved configuration JSON file is accepted
model = AutoModelForTokenClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)