AutoModels

In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained method.

AutoClasses do this job for you: they automatically retrieve the relevant model given the name or path of the pretrained weights, config, or vocabulary.

Instantiating one of AutoModel, AutoConfig and AutoTokenizer directly creates an instance of the class for the relevant architecture (e.g. model = AutoModel.from_pretrained('bert-base-cased') creates an instance of BertModel).

`AutoConfig`

class transformers.AutoConfig

AutoConfig is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.

The from_pretrained() method takes care of returning the correct configuration class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs)

Instantiates one of the configuration classes of the library from a pre-trained model configuration.

The configuration class to instantiate is selected based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

contains t5: T5Config (T5 model)

contains distilbert: DistilBertConfig (DistilBERT model)

contains albert: AlbertConfig (ALBERT model)

contains camembert: CamembertConfig (CamemBERT model)

contains xlm-roberta: XLMRobertaConfig (XLM-RoBERTa model)

contains roberta: RobertaConfig (RoBERTa model)

contains reformer: ReformerConfig (Reformer model)

contains bert: BertConfig (Bert model)

contains openai-gpt: OpenAIGPTConfig (OpenAI GPT model)

contains gpt2: GPT2Config (OpenAI GPT-2 model)

contains transfo-xl: TransfoXLConfig (Transformer-XL model)

contains xlnet: XLNetConfig (XLNet model)

contains xlm: XLMConfig (XLM model)

contains ctrl: CTRLConfig (CTRL model)

contains flaubert: FlaubertConfig (Flaubert model)

contains electra: ElectraConfig (ELECTRA model)
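The selection rule described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `PATTERN_TO_CONFIG` and `resolve_config_class_name` are hypothetical names, the table simply copies the order listed above, and the sketch assumes that a `model_type` value is spelled the same way as its pattern. It shows why the order matters: `xlm-roberta` has to be tested before `roberta`, and `distilbert` before `bert`.

```python
from typing import Optional

# Patterns in the order listed above, mapped to configuration class names.
# Hypothetical table for illustration; the library keeps its own mapping.
PATTERN_TO_CONFIG = [
    ("t5", "T5Config"),
    ("distilbert", "DistilBertConfig"),
    ("albert", "AlbertConfig"),
    ("camembert", "CamembertConfig"),
    ("xlm-roberta", "XLMRobertaConfig"),
    ("roberta", "RobertaConfig"),
    ("reformer", "ReformerConfig"),
    ("bert", "BertConfig"),
    ("openai-gpt", "OpenAIGPTConfig"),
    ("gpt2", "GPT2Config"),
    ("transfo-xl", "TransfoXLConfig"),
    ("xlnet", "XLNetConfig"),
    ("xlm", "XLMConfig"),
    ("ctrl", "CTRLConfig"),
    ("flaubert", "FlaubertConfig"),
    ("electra", "ElectraConfig"),
]


def resolve_config_class_name(name_or_path: str, model_type: Optional[str] = None) -> str:
    """Prefer an explicit model_type; otherwise use the first pattern found in the name.

    If model_type is given but unknown to the table, the function falls
    through to pattern matching on the name.
    """
    if model_type is not None:
        for pattern, class_name in PATTERN_TO_CONFIG:
            if pattern == model_type:
                return class_name
    for pattern, class_name in PATTERN_TO_CONFIG:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Unrecognized model identifier: {name_or_path!r}")


# 'xlm-roberta-base' also contains 'roberta' and 'xlm'; the earlier pattern wins.
xlmr = resolve_config_class_name("xlm-roberta-base")
# A directory name carries no hint, so the explicit model_type decides.
local = resolve_config_class_name("./my_model_directory/", model_type="gpt2")
```

Because plain substring matching is sensitive to ordering and to names that embed another pattern, relying on `model_type` when it is available is the more robust path.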

Parameters

pretrained_model_name_or_path (string) –
Is either:
- a string with the shortcut name of a pre-trained model configuration to load from cache or download, e.g.: bert-base-uncased.
- a string with the identifier name of a pre-trained model configuration that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.
- a path to a directory containing a configuration file saved using the save_pretrained() method, e.g.: ./my_model_directory/.
- a path or url to a saved configuration JSON file, e.g.: ./my_model_directory/configuration.json.
cache_dir (string, optional, defaults to None) – Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download (boolean, optional, defaults to False) – Force to (re-)download the model weights and configuration files and override the cached versions if they exist.
resume_download (boolean, optional, defaults to False) – Do not delete incompletely received file. Attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional, defaults to None) – A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. See the requests documentation for usage.
return_unused_kwargs (boolean, optional, defaults to False) –
- If False, then this function returns just the final configuration object.
- If True, then this function returns a tuple (config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes, i.e. the part of kwargs which has not been used to update config and is otherwise ignored.
kwargs (Dict[str, any], optional, defaults to {}) – Key/value pairs with which to update the configuration object after loading.
- The values in kwargs of any keys which are configuration attributes will be used to override the loaded values.
- Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.

Examples:

from transformers import AutoConfig

config = AutoConfig.from_pretrained('bert-base-uncased')  # Download configuration from S3 and cache.
config = AutoConfig.from_pretrained('./test/bert_saved_model/')  # E.g. config (or model) was saved using `save_pretrained('./test/bert_saved_model/')`
config = AutoConfig.from_pretrained('./test/bert_saved_model/my_configuration.json')
# Keys that are configuration attributes override the loaded values.
config = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
assert config.output_attentions == True
# With return_unused_kwargs=True, keys that are not configuration attributes are returned separately.
config, unused_kwargs = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True,
                                                   foo=False, return_unused_kwargs=True)
assert config.output_attentions == True
assert unused_kwargs == {'foo': False}
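The split between used and unused keyword arguments can be illustrated without the library. The sketch below is an assumption-laden stand-in, not the actual implementation: `ToyConfig` and `apply_kwargs` are hypothetical names, and the toy configuration has only two attributes. It follows the rule stated in the parameter description: keys that are configuration attributes override the loaded values, and all other keys are collected separately.

```python
from typing import Any, Dict, Tuple


class ToyConfig:
    """Hypothetical stand-in for a loaded configuration object."""

    def __init__(self) -> None:
        self.output_attentions = False
        self.hidden_size = 768


def apply_kwargs(config: ToyConfig, **kwargs: Any) -> Tuple[ToyConfig, Dict[str, Any]]:
    """Override attributes the config already has; collect every other key."""
    unused: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # known attribute: override the loaded value
        else:
            unused[key] = value  # unknown key: hand it back to the caller
    return config, unused


config, unused_kwargs = apply_kwargs(ToyConfig(), output_attentions=True, foo=False)
```

Here `foo` is not an attribute of the toy configuration, so it ends up in `unused_kwargs` while `output_attentions` is updated in place.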

`AutoTokenizer`

class transformers.AutoTokenizer

AutoTokenizer is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained(pretrained_model_name_or_path) class method.

The from_pretrained() method takes care of returning the correct tokenizer class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The tokenizer class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains t5: T5Tokenizer (T5 model)

contains distilbert: DistilBertTokenizer (DistilBert model)

contains albert: AlbertTokenizer (ALBERT model)

contains camembert: CamembertTokenizer (CamemBERT model)

contains xlm-roberta: XLMRobertaTokenizer (XLM-RoBERTa model)

contains roberta: RobertaTokenizer (RoBERTa model)

contains bert: BertTokenizer (Bert model)

contains openai-gpt: OpenAIGPTTokenizer (OpenAI GPT model)

contains gpt2: GPT2Tokenizer (OpenAI GPT-2 model)

contains transfo-xl: TransfoXLTokenizer (Transformer-XL model)

contains xlnet: XLNetTokenizer (XLNet model)

contains xlm: XLMTokenizer (XLM model)

contains ctrl: CTRLTokenizer (Salesforce CTRL model)

contains electra: ElectraTokenizer (Google ELECTRA model)

This class cannot be instantiated using __init__() (throws an error).

classmethod from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)

Instantiate one of the tokenizer classes of the library from a pre-trained model vocabulary.

The tokenizer class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains t5: T5Tokenizer (T5 model)

contains distilbert: DistilBertTokenizer (DistilBert model)

contains albert: AlbertTokenizer (ALBERT model)

contains camembert: CamembertTokenizer (CamemBERT model)

contains xlm-roberta: XLMRobertaTokenizer (XLM-RoBERTa model)

contains roberta: RobertaTokenizer (RoBERTa model)

contains bert-base-japanese: BertJapaneseTokenizer (Bert model)

contains bert: BertTokenizer (Bert model)

contains openai-gpt: OpenAIGPTTokenizer (OpenAI GPT model)

contains gpt2: GPT2Tokenizer (OpenAI GPT-2 model)

contains transfo-xl: TransfoXLTokenizer (Transformer-XL model)

contains xlnet: XLNetTokenizer (XLNet model)

contains xlm: XLMTokenizer (XLM model)

contains ctrl: CTRLTokenizer (Salesforce CTRL model)

contains electra: ElectraTokenizer (Google ELECTRA model)

Parameters

pretrained_model_name_or_path –
Either:
- a string with the shortcut name of a predefined tokenizer to load from cache or download, e.g.: bert-base-uncased.
- a string with the identifier name of a predefined tokenizer that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.
- a path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g.: ./my_model_directory/.
- (not applicable to all derived classes) a path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (e.g. Bert, XLNet), e.g.: ./my_model_directory/vocab.txt.
cache_dir – (optional) string: Path to a directory in which downloaded predefined tokenizer vocabulary files should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force to (re-)download the vocabulary files and override the cached versions if they exist.
resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
use_fast – (optional) boolean, default False: Indicate if transformers should try to load the fast version of the tokenizer (True) or use the Python one (False).
inputs – (optional) positional arguments: will be passed to the Tokenizer __init__ method.
kwargs – (optional) keyword arguments: will be passed to the Tokenizer __init__ method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the doc string of PreTrainedTokenizer for details.

Examples:

# Download vocabulary from S3 and cache.
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Download vocabulary from S3 (user-uploaded) and cache.
tokenizer = AutoTokenizer.from_pretrained('dbmdz/bert-base-german-cased')

# If vocabulary files are in a directory (e.g. tokenizer was saved using `save_pretrained('./test/bert_saved_model/')`)
tokenizer = AutoTokenizer.from_pretrained('./test/bert_saved_model/')

`AutoModel`

class transformers.AutoModel

AutoModel is a generic model class that will be instantiated as one of the base model classes of the library when created with the AutoModel.from_pretrained(pretrained_model_name_or_path) or the AutoModel.from_config(config) class methods.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the base model classes of the library from a configuration.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

isInstance of distilbert configuration class: DistilBertModel (DistilBERT model)
isInstance of roberta configuration class: RobertaModel (RoBERTa model)
isInstance of bert configuration class: BertModel (Bert model)
isInstance of openai-gpt configuration class: OpenAIGPTModel (OpenAI GPT model)
isInstance of gpt2 configuration class: GPT2Model (OpenAI GPT-2 model)
isInstance of ctrl configuration class: CTRLModel (Salesforce CTRL model)
isInstance of transfo-xl configuration class: TransfoXLModel (Transformer-XL model)
isInstance of xlnet configuration class: XLNetModel (XLNet model)
isInstance of xlm configuration class: XLMModel (XLM model)
isInstance of flaubert configuration class: FlaubertModel (Flaubert model)
isInstance of electra configuration class: ElectraModel (Electra model)
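The type-based selection above can be pictured as an ordered chain of isinstance checks. The sketch below uses hypothetical stand-in classes (every name ending in `Sketch`) rather than the library's own, and it assumes for illustration that the RoBERTa configuration derives from the BERT one. Under that assumption the more specific class has to be tested first, which matches the order of the list.

```python
class PretrainedConfigSketch:
    """Hypothetical base class standing in for PretrainedConfig."""


class BertConfigSketch(PretrainedConfigSketch):
    pass


class RobertaConfigSketch(BertConfigSketch):
    """Assumption for illustration: RoBERTa's config subclasses BERT's."""


class DistilBertConfigSketch(PretrainedConfigSketch):
    pass


# Ordered from most specific to least specific, following the list above.
CONFIG_TO_MODEL = [
    (DistilBertConfigSketch, "DistilBertModel"),
    (RobertaConfigSketch, "RobertaModel"),
    (BertConfigSketch, "BertModel"),
]


def model_class_name_for(config: PretrainedConfigSketch) -> str:
    """Return the model class name of the first matching configuration class."""
    for config_class, model_name in CONFIG_TO_MODEL:
        if isinstance(config, config_class):
            return model_name
    raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")


roberta_choice = model_class_name_for(RobertaConfigSketch())
bert_choice = model_class_name_for(BertConfigSketch())
```

If the BERT entry came first in this sketch, a RoBERTa configuration would also satisfy the BERT check and be routed to the wrong model class.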

Examples:

from transformers import AutoModel, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModel.from_config(config)  # Builds a BertModel from the configuration; no pretrained weights are loaded

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the base model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The base model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains t5: T5Model (T5 model)

contains distilbert: DistilBertModel (DistilBERT model)

contains albert: AlbertModel (ALBERT model)

contains camembert: CamembertModel (CamemBERT model)

contains xlm-roberta: XLMRobertaModel (XLM-RoBERTa model)

contains roberta: RobertaModel (RoBERTa model)

contains bert: BertModel (Bert model)

contains openai-gpt: OpenAIGPTModel (OpenAI GPT model)

contains gpt2: GPT2Model (OpenAI GPT-2 model)

contains transfo-xl: TransfoXLModel (Transformer-XL model)

contains xlnet: XLNetModel (XLNet model)

contains xlm: XLMModel (XLM model)

contains ctrl: CTRLModel (Salesforce CTRL model)

contains flaubert: FlaubertModel (Flaubert model)

contains electra: ElectraModel (Electra model)

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
either:
- a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.
- a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.
- a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.
- a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.
config –
(optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
- the model was saved using save_pretrained() and is reloaded by supplying the save directory, or
- the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.
resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
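To make the "missing keys" and "unexpected keys" reported by output_loading_info concrete, here is a minimal sketch of the idea using plain sets. It is not the library's code: `loading_info`, the dictionary layout, and the parameter names in the example are hypothetical, and the real dictionary also carries error messages as described above.

```python
from typing import Dict, Iterable, List


def loading_info(expected_keys: Iterable[str], loaded_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Compare the parameter names a model expects with those found in a checkpoint."""
    expected, loaded = set(expected_keys), set(loaded_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - loaded),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(loaded - expected),
    }


# Hypothetical parameter names, chosen only for the example.
info = loading_info(
    expected_keys=["embeddings.weight", "encoder.weight", "pooler.weight"],
    loaded_keys=["embeddings.weight", "encoder.weight", "cls.predictions.bias"],
)
```

A non-empty list on either side is a hint that the checkpoint and the selected model class do not fully line up, for example when a base model is loaded from a checkpoint that was saved with a task head.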

Examples:

from transformers import AutoConfig, AutoModel

model = AutoModel.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModel.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModel.from_pretrained('bert-base-uncased', output_attentions=True)  # Keyword arguments update the configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModel.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

`AutoModelForPreTraining`

class transformers.AutoModelForPreTraining

AutoModelForPreTraining is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the AutoModelForPreTraining.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a configuration.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

isInstance of distilbert configuration class: DistilBertForMaskedLM (DistilBERT model)
isInstance of roberta configuration class: RobertaForMaskedLM (RoBERTa model)
isInstance of bert configuration class: BertForPreTraining (Bert model)
isInstance of openai-gpt configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
isInstance of gpt2 configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
isInstance of ctrl configuration class: CTRLLMHeadModel (Salesforce CTRL model)
isInstance of transfo-xl configuration class: TransfoXLLMHeadModel (Transformer-XL model)
isInstance of xlnet configuration class: XLNetLMHeadModel (XLNet model)
isInstance of xlm configuration class: XLMWithLMHeadModel (XLM model)
isInstance of flaubert configuration class: FlaubertWithLMHeadModel (Flaubert model)
isInstance of electra configuration class: ElectraForPreTraining (Electra model)

Examples:

from transformers import AutoModelForPreTraining, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForPreTraining.from_config(config)  # Builds a BertForPreTraining from the configuration; no pretrained weights are loaded

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains t5: T5ModelWithLMHead (T5 model)

contains distilbert: DistilBertForMaskedLM (DistilBERT model)

contains albert: AlbertForMaskedLM (ALBERT model)

contains camembert: CamembertForMaskedLM (CamemBERT model)

contains xlm-roberta: XLMRobertaForMaskedLM (XLM-RoBERTa model)

contains roberta: RobertaForMaskedLM (RoBERTa model)

contains bert: BertForPreTraining (Bert model)

contains openai-gpt: OpenAIGPTLMHeadModel (OpenAI GPT model)

contains gpt2: GPT2LMHeadModel (OpenAI GPT-2 model)

contains transfo-xl: TransfoXLLMHeadModel (Transformer-XL model)

contains xlnet: XLNetLMHeadModel (XLNet model)

contains xlm: XLMWithLMHeadModel (XLM model)

contains ctrl: CTRLLMHeadModel (Salesforce CTRL model)

contains flaubert: FlaubertWithLMHeadModel (Flaubert model)

contains electra: ElectraForPreTraining (Electra model)

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
Either:
- a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.
- a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.
- a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.
- a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.
config –
(optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
- the model was saved using save_pretrained() and is reloaded by supplying the save directory, or
- the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.
resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.

Examples:

from transformers import AutoConfig, AutoModelForPreTraining

model = AutoModelForPreTraining.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForPreTraining.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)  # Keyword arguments update the configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModelForPreTraining.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

`AutoModelWithLMHead`

class transformers.AutoModelWithLMHead

AutoModelWithLMHead is a generic model class that will be instantiated as one of the language modeling model classes of the library when created with the AutoModelWithLMHead.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the language modeling model classes of the library from a configuration.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

isInstance of distilbert configuration class: DistilBertForMaskedLM (DistilBERT model)
isInstance of roberta configuration class: RobertaForMaskedLM (RoBERTa model)
isInstance of bert configuration class: BertForMaskedLM (Bert model)
isInstance of openai-gpt configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
isInstance of gpt2 configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
isInstance of ctrl configuration class: CTRLLMHeadModel (Salesforce CTRL model)
isInstance of transfo-xl configuration class: TransfoXLLMHeadModel (Transformer-XL model)
isInstance of xlnet configuration class: XLNetLMHeadModel (XLNet model)
isInstance of xlm configuration class: XLMWithLMHeadModel (XLM model)
isInstance of flaubert configuration class: FlaubertWithLMHeadModel (Flaubert model)
isInstance of electra configuration class: ElectraForMaskedLM (Electra model)

Examples:

from transformers import AutoModelWithLMHead, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelWithLMHead.from_config(config)  # Builds a BertForMaskedLM from the configuration; no pretrained weights are loaded

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the language modeling model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains t5: T5ModelWithLMHead (T5 model)

contains distilbert: DistilBertForMaskedLM (DistilBERT model)

contains albert: AlbertForMaskedLM (ALBERT model)

contains camembert: CamembertForMaskedLM (CamemBERT model)

contains xlm-roberta: XLMRobertaForMaskedLM (XLM-RoBERTa model)

contains roberta: RobertaForMaskedLM (RoBERTa model)

contains bert: BertForMaskedLM (Bert model)

contains openai-gpt: OpenAIGPTLMHeadModel (OpenAI GPT model)

contains gpt2: GPT2LMHeadModel (OpenAI GPT-2 model)

contains transfo-xl: TransfoXLLMHeadModel (Transformer-XL model)

contains xlnet: XLNetLMHeadModel (XLNet model)

contains xlm: XLMWithLMHeadModel (XLM model)

contains ctrl: CTRLLMHeadModel (Salesforce CTRL model)

contains flaubert: FlaubertWithLMHeadModel (Flaubert model)

contains electra: ElectraForMaskedLM (Electra model)

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
Either:
- a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.
- a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.
- a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.
- a path or url to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.
config –
(optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
- the model was saved using save_pretrained() and is reloaded by supplying the save directory, or
- the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force to (re-)download the model weights and configuration files and override the cached versions if they exist.
resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.

Examples:

from transformers import AutoConfig, AutoModelWithLMHead

model = AutoModelWithLMHead.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelWithLMHead.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelWithLMHead.from_pretrained('bert-base-uncased', output_attentions=True)  # Keyword arguments update the configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
model = AutoModelWithLMHead.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

`AutoModelForSequenceClassification`

class transformers.AutoModelForSequenceClassification

AutoModelForSequenceClassification is a generic model class that will be instantiated as one of the sequence classification model classes of the library when created with the AutoModelForSequenceClassification.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the sequence classification model classes of the library from a configuration.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

isInstance of distilbert configuration class: DistilBertForSequenceClassification (DistilBERT model)
isInstance of albert configuration class: AlbertForSequenceClassification (ALBERT model)
isInstance of camembert configuration class: CamembertForSequenceClassification (CamemBERT model)
isInstance of xlm-roberta configuration class: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
isInstance of roberta configuration class: RobertaForSequenceClassification (RoBERTa model)
isInstance of bert configuration class: BertForSequenceClassification (Bert model)
isInstance of xlnet configuration class: XLNetForSequenceClassification (XLNet model)
isInstance of xlm configuration class: XLMForSequenceClassification (XLM model)
isInstance of flaubert configuration class: FlaubertForSequenceClassification (Flaubert model)

Examples:

from transformers import AutoModelForSequenceClassification, BertConfig

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForSequenceClassification.from_config(config)  # Builds a BertForSequenceClassification from the configuration; no pretrained weights are loaded

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiates one of the sequence classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains distilbert: DistilBertForSequenceClassification (DistilBERT model)

contains albert: AlbertForSequenceClassification (ALBERT model)

contains camembert: CamembertForSequenceClassification (CamemBERT model)

contains xlm-roberta: XLMRobertaForSequenceClassification (XLM-RoBERTa model)

contains roberta: RobertaForSequenceClassification (RoBERTa model)

contains bert: BertForSequenceClassification (Bert model)

contains xlnet: XLNetForSequenceClassification (XLNet model)

contains flaubert: FlaubertForSequenceClassification (Flaubert model)
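The fallback described above amounts to a first-match substring scan. A minimal sketch, assuming the patterns are tried exactly in the listed order; `class_name_for` is a hypothetical helper that returns class names as strings and is not part of the library.

```python
# Patterns in the order listed above, paired with the class name to select.
PATTERNS = [
    ("distilbert", "DistilBertForSequenceClassification"),
    ("albert", "AlbertForSequenceClassification"),
    ("camembert", "CamembertForSequenceClassification"),
    ("xlm-roberta", "XLMRobertaForSequenceClassification"),
    ("roberta", "RobertaForSequenceClassification"),
    ("bert", "BertForSequenceClassification"),
    ("xlnet", "XLNetForSequenceClassification"),
    ("flaubert", "FlaubertForSequenceClassification"),
]


def class_name_for(name_or_path):
    """Return the class name paired with the first pattern found."""
    for pattern, class_name in PATTERNS:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"No known pattern found in {name_or_path!r}")
```

Because `xlm-roberta` is tested before `roberta`, and `roberta` before `bert`, a name such as `xlm-roberta-base` resolves to the XLM-RoBERTa class even though it also contains the two shorter patterns.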

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
either:
- a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.
- a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.
- a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.
- a path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.
config –
(optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
- the model was saved using save_pretrained() and is reloaded by supplying the save directory, or
- the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download – (optional) boolean, default False: Do not delete an incompletely received file. Attempt to resume the download if such a file exists.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
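One way to picture how the same kwargs can serve both the configuration and the model: keys that name an existing configuration attribute override it, and the rest are forwarded to the model constructor. The sketch below uses hypothetical stand-ins (`ToyConfig`, `split_kwargs`); the exact rule the library applies is an assumption here, not something this page spells out.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self):
        self.num_labels = 2
        self.output_attentions = False


def split_kwargs(config, kwargs):
    """Apply kwargs that match config attributes; return the remainder."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # Known configuration attribute: override it in place.
            setattr(config, key, value)
        else:
            # Unknown to the configuration: keep it for the model constructor.
            remaining[key] = value
    return remaining


config = ToyConfig()
model_kwargs = split_kwargs(config, {"num_labels": 3, "foo": "bar"})
```

After the call, `config.num_labels` has been overridden and `model_kwargs` holds only the key the configuration did not recognise.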

Examples:

from transformers import AutoModelForSequenceClassification, BertConfig
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForSequenceClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = BertConfig.from_json_file('./tf_model/bert_tf_model_config.json')
model = AutoModelForSequenceClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

`AutoModelForQuestionAnswering`¶

class transformers.AutoModelForQuestionAnswering[source]¶

AutoModelForQuestionAnswering is a generic model class that will be instantiated as one of the question answering model classes of the library when created with the AutoModelForQuestionAnswering.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the question answering model classes of the library from a configuration.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

isInstance of distilbert configuration class: DistilBertForQuestionAnswering (DistilBERT model)
isInstance of albert configuration class: AlbertForQuestionAnswering (ALBERT model)
isInstance of bert configuration class: BertForQuestionAnswering (Bert model)
isInstance of xlnet configuration class: XLNetForQuestionAnswering (XLNet model)
isInstance of xlm configuration class: XLMForQuestionAnswering (XLM model)
isInstance of flaubert configuration class: FlaubertForQuestionAnswering (Flaubert model)

Examples:

from transformers import AutoModelForQuestionAnswering, BertConfig
config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForQuestionAnswering.from_config(config)  # Build the model from the configuration alone; pretrained weights are not loaded.

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the question answering model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains distilbert: DistilBertForQuestionAnswering (DistilBERT model)

contains albert: AlbertForQuestionAnswering (ALBERT model)

contains bert: BertForQuestionAnswering (Bert model)

contains xlnet: XLNetForQuestionAnswering (XLNet model)

contains xlm: XLMForQuestionAnswering (XLM model)

contains flaubert: FlaubertForQuestionAnswering (Flaubert model)
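The two-stage resolution described above (model_type first, name pattern as a fallback) can be sketched as follows. `resolve_class_name` is a hypothetical helper working on a plain dict, and the sketch assumes that each model_type value is the same string as the corresponding pattern.

```python
# A subset of the patterns listed above, in order, with the class to select.
PATTERNS = [
    ("distilbert", "DistilBertForQuestionAnswering"),
    ("albert", "AlbertForQuestionAnswering"),
    ("bert", "BertForQuestionAnswering"),
    ("xlnet", "XLNetForQuestionAnswering"),
    ("xlm", "XLMForQuestionAnswering"),
]

# Assumption for this sketch: model_type values equal the pattern strings.
MODEL_TYPE_TO_CLASS = dict(PATTERNS)


def resolve_class_name(config_dict, name_or_path):
    """Prefer the config's model_type; otherwise scan the name for patterns."""
    model_type = config_dict.get("model_type")
    if model_type is not None:
        return MODEL_TYPE_TO_CLASS[model_type]
    for pattern, class_name in PATTERNS:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Could not resolve a model class for {name_or_path!r}")
```

With an explicit model_type, the directory name no longer matters, which is why a model saved under an arbitrary path can still be resolved.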

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
either:
- a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.
- a string with the identifier name of a pre-trained model that was user-uploaded to our S3, e.g.: dbmdz/bert-base-german-cased.
- a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.
- a path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.
config –
(optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
- the model was saved using save_pretrained() and is reloaded by supplying the save directory, or
- the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.
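The kind of report that output_loading_info refers to can be illustrated by comparing the parameter names a model expects with those found in a checkpoint. This is a simplified sketch: `loading_info` and the example key names are hypothetical, and the dictionary layout is an assumption rather than the library's documented return value.

```python
def loading_info(expected_keys, loaded_keys):
    """Compare expected parameter names with the names found in a checkpoint."""
    expected, loaded = set(expected_keys), set(loaded_keys)
    return {
        # Parameters the model needs but the checkpoint does not provide.
        "missing_keys": sorted(expected - loaded),
        # Parameters present in the checkpoint that the model does not use.
        "unexpected_keys": sorted(loaded - expected),
        # No load errors are simulated in this sketch.
        "error_msgs": [],
    }


info = loading_info(
    expected_keys=["encoder.weight", "qa_outputs.weight", "qa_outputs.bias"],
    loaded_keys=["encoder.weight", "cls.predictions.bias"],
)
```

Missing keys typically point at a freshly initialised task head, while unexpected keys point at weights from a head the target model does not have.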

Examples:

from transformers import AutoModelForQuestionAnswering, BertConfig
model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForQuestionAnswering.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = BertConfig.from_json_file('./tf_model/bert_tf_model_config.json')
model = AutoModelForQuestionAnswering.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

`AutoModelForTokenClassification`¶

class transformers.AutoModelForTokenClassification[source]¶

AutoModelForTokenClassification is a generic model class that will be instantiated as one of the token classification model classes of the library when created with the AutoModelForTokenClassification.from_pretrained(pretrained_model_name_or_path) class method.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the token classification model classes of the library from a configuration.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

isInstance of distilbert configuration class: DistilBertForTokenClassification (DistilBERT model)
isInstance of xlm configuration class: XLMForTokenClassification (XLM model)
isInstance of xlm roberta configuration class: XLMRobertaForTokenClassification (XLM-RoBERTa model)
isInstance of bert configuration class: BertForTokenClassification (Bert model)
isInstance of albert configuration class: AlbertForTokenClassification (ALBERT model)
isInstance of xlnet configuration class: XLNetForTokenClassification (XLNet model)
isInstance of camembert configuration class: CamembertForTokenClassification (CamemBERT model)
isInstance of roberta configuration class: RobertaForTokenClassification (RoBERTa model)
isInstance of electra configuration class: ElectraForTokenClassification (Electra model)

Examples:

config = BertConfig.from_pretrained('bert-base-uncased')    # Download configuration from S3 and cache.
model = AutoModelForTokenClassification.from_config(config)  # E.g. model was saved using `save_pretrained('./test/saved_model/')`

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the token classification model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The model class to instantiate is selected as the first pattern matching in the pretrained_model_name_or_path string (in the following order):

contains distilbert: DistilBertForTokenClassification (DistilBERT model)

contains xlm: XLMForTokenClassification (XLM model)

contains xlm-roberta: XLMRobertaForTokenClassification (XLM-RoBERTa model)

contains camembert: CamembertForTokenClassification (Camembert model)

contains bert: BertForTokenClassification (Bert model)

contains xlnet: XLNetForTokenClassification (XLNet model)

contains roberta: RobertaForTokenClassification (Roberta model)

contains electra: ElectraForTokenClassification (Electra model)
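Because the fallback takes the first matching pattern, the order of the list matters: a later pattern that contains an earlier one as a substring can never be reached by a literal first-match scan. The checker below is a hypothetical helper, not part of the library, applied to the order exactly as printed above.

```python
def shadowed_patterns(patterns):
    """Return (later, earlier) pairs where `earlier` is a substring of `later`.

    In a first-match scan, any string containing `later` also contains
    `earlier`, so the earlier entry always wins.
    """
    shadowed = []
    for index, later in enumerate(patterns):
        for earlier in patterns[:index]:
            if earlier in later:
                shadowed.append((later, earlier))
                break
    return shadowed


# Patterns in the order printed above.
TOKEN_CLASSIFICATION_ORDER = [
    "distilbert", "xlm", "xlm-roberta", "camembert",
    "bert", "xlnet", "roberta", "electra",
]

conflicts = shadowed_patterns(TOKEN_CLASSIFICATION_ORDER)
```

Read literally, the printed order lets `xlm` match before `xlm-roberta` and `bert` match before `roberta`. Relying on the configuration's model_type, which is consulted first, avoids depending on this ordering.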

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
Either:
- a string with the shortcut name of a pre-trained model to load from cache or download, e.g.: bert-base-uncased.
- a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.
- a path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method.
config –
(optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
- the model was saved using save_pretrained() and is reloaded by supplying the save directory, or
- the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict – (optional) dict: an optional state dictionary for the model to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force a (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.

Examples:

from transformers import AutoModelForTokenClassification, BertConfig
model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased')    # Download model and configuration from S3 and cache.
model = AutoModelForTokenClassification.from_pretrained('./test/bert_model/')  # E.g. model was saved using `save_pretrained('./test/bert_model/')`
model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True
# Loading from a TF checkpoint file instead of a PyTorch model (slower)
config = BertConfig.from_json_file('./tf_model/bert_tf_model_config.json')
model = AutoModelForTokenClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
