Auto Classes

In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained() method. AutoClasses are here to do this job for you so that you automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.

Creating an object through AutoConfig, AutoModel, or AutoTokenizer directly returns an instance of the class for the relevant architecture. For instance

model = AutoModel.from_pretrained('bert-base-cased')

will create a model that is an instance of BertModel.

There is one class of AutoModel for each task, and for each backend (PyTorch or TensorFlow).
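As a rough illustration of the "one auto class per task" idea, the sketch below keeps one lookup table per task and resolves the same model type to a different class in each. The table names and the `resolve` helper are hypothetical, and plain strings stand in for the real classes so the snippet runs without the library; the entries mirror mappings listed further down this page.

```python
# Hypothetical, simplified tables: one per auto class (i.e., per task).
# Values are class names as strings, so the sketch runs without the library.
BASE_MODELS = {"bert": "BertModel", "gpt2": "GPT2Model", "t5": "T5Model"}
PRETRAINING_MODELS = {
    "bert": "BertForPreTraining",
    "gpt2": "GPT2LMHeadModel",
    "t5": "T5ForConditionalGeneration",
}


def resolve(table, model_type):
    """Return the class name registered for model_type in the given task table."""
    try:
        return table[model_type]
    except KeyError:
        raise ValueError(f"Unrecognized model type: {model_type!r}") from None


print(resolve(BASE_MODELS, "bert"))         # BertModel
print(resolve(PRETRAINING_MODELS, "bert"))  # BertForPreTraining
```

The same checkpoint name can therefore yield different architectures depending on which auto class is asked to load it.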

AutoConfig

class transformers.AutoConfig

This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).
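The "factory only" pattern described above can be sketched as follows. `AutoThing` is a hypothetical stand-in, not a library class, and the exception type used here is an assumption made for illustration.

```python
class AutoThing:
    """Hypothetical stand-in for an auto class: usable only through a factory."""

    def __init__(self):
        # Direct construction is rejected on purpose.
        raise EnvironmentError(
            "AutoThing is designed to be instantiated using "
            "`AutoThing.from_pretrained(name_or_path)`."
        )

    @classmethod
    def from_pretrained(cls, name_or_path):
        # A real auto class would select and build a concrete class here;
        # this sketch just returns a small descriptive dict.
        return {"loaded_from": name_or_path}


try:
    AutoThing()
except EnvironmentError as err:
    print("direct construction failed:", err)

thing = AutoThing.from_pretrained("bert-base-uncased")
print(thing)
```

Because the class method never calls `__init__()`, the factory keeps working while direct construction always fails.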

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs)

Instantiate one of the configuration classes of the library from a pretrained model configuration.

The configuration class to instantiate is selected based on the model_type property of the config object that is loaded, or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

speech_to_text – Speech2TextConfig (Speech2Text model)

wav2vec2 – Wav2Vec2Config (Wav2Vec2 model)

m2m_100 – M2M100Config (M2M100 model)

convbert – ConvBertConfig (ConvBERT model)

led – LEDConfig (LED model)

blenderbot-small – BlenderbotSmallConfig (BlenderbotSmall model)

retribert – RetriBertConfig (RetriBERT model)

ibert – IBertConfig (I-BERT model)

mt5 – MT5Config (mT5 model)

t5 – T5Config (T5 model)

mobilebert – MobileBertConfig (MobileBERT model)

distilbert – DistilBertConfig (DistilBERT model)

albert – AlbertConfig (ALBERT model)

bert-generation – BertGenerationConfig (Bert Generation model)

camembert – CamembertConfig (CamemBERT model)

xlm-roberta – XLMRobertaConfig (XLM-RoBERTa model)

pegasus – PegasusConfig (Pegasus model)

marian – MarianConfig (Marian model)

mbart – MBartConfig (mBART model)

mpnet – MPNetConfig (MPNet model)

bart – BartConfig (BART model)

blenderbot – BlenderbotConfig (Blenderbot model)

reformer – ReformerConfig (Reformer model)

longformer – LongformerConfig (Longformer model)

roberta – RobertaConfig (RoBERTa model)

deberta-v2 – DebertaV2Config (DeBERTa-v2 model)

deberta – DebertaConfig (DeBERTa model)

flaubert – FlaubertConfig (FlauBERT model)

fsmt – FSMTConfig (FairSeq Machine-Translation model)

squeezebert – SqueezeBertConfig (SqueezeBERT model)

bert – BertConfig (BERT model)

openai-gpt – OpenAIGPTConfig (OpenAI GPT model)

gpt2 – GPT2Config (OpenAI GPT-2 model)

transfo-xl – TransfoXLConfig (Transformer-XL model)

xlnet – XLNetConfig (XLNet model)

xlm-prophetnet – XLMProphetNetConfig (XLMProphetNet model)

prophetnet – ProphetNetConfig (ProphetNet model)

xlm – XLMConfig (XLM model)

ctrl – CTRLConfig (CTRL model)

electra – ElectraConfig (ELECTRA model)

encoder-decoder – EncoderDecoderConfig (Encoder decoder model)

funnel – FunnelConfig (Funnel Transformer model)

lxmert – LxmertConfig (LXMERT model)

dpr – DPRConfig (DPR model)

layoutlm – LayoutLMConfig (LayoutLM model)

rag – RagConfig (RAG model)

tapas – TapasConfig (TAPAS model)
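The selection rule above (use model_type when present, otherwise pattern-match the name or path) can be sketched in a few lines. This is a simplified illustration rather than the library's implementation: `PATTERNS`, `BY_MODEL_TYPE` and `resolve_config_class` are hypothetical names, only a subset of the list is included, and class names are plain strings. The entries keep the same relative order as the list above, which matters for the fallback: a specific pattern such as xlm-roberta has to be tried before roberta, and roberta before bert.

```python
# A subset of the list above, in the same relative order.
# More specific patterns come before the substrings they contain.
PATTERNS = [
    ("xlm-roberta", "XLMRobertaConfig"),
    ("roberta", "RobertaConfig"),
    ("deberta-v2", "DebertaV2Config"),
    ("deberta", "DebertaConfig"),
    ("bert", "BertConfig"),
    ("gpt2", "GPT2Config"),
]
BY_MODEL_TYPE = dict(PATTERNS)


def resolve_config_class(config_dict, name_or_path):
    """Pick a config class name from model_type, else from the name or path."""
    model_type = config_dict.get("model_type")
    if model_type is not None:
        return BY_MODEL_TYPE[model_type]
    # Fallback: first pattern found in the name or path wins.
    for pattern, class_name in PATTERNS:
        if pattern in str(name_or_path):
            return class_name
    raise ValueError(f"Could not infer a configuration class for {name_or_path!r}")


print(resolve_config_class({"model_type": "bert"}, "./my_model_directory/"))
print(resolve_config_class({}, "xlm-roberta-large"))
print(resolve_config_class({}, "dbmdz/bert-base-german-cased"))
```

If bert were tried first, every name containing roberta or deberta would be misclassified, which is why the order of the fallback list is significant.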

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model configuration hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing a configuration file saved using the PretrainedConfig.save_pretrained() method or the PreTrainedModel.save_pretrained() method, e.g., ./my_model_directory/.
- A path or url to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) –
If False, then this function returns just the final configuration object.

If True, then this function returns a Tuple(config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.
kwargs (additional keyword arguments, optional) – The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
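The interaction between kwargs and return_unused_kwargs can be pictured with a small stand-alone sketch. `ToyConfig` and `apply_kwargs` are hypothetical names introduced for illustration, and the attribute check is a simplified assumption about how keys are matched against configuration attributes.

```python
class ToyConfig:
    """Hypothetical config with two attributes, standing in for a real one."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def apply_kwargs(config, return_unused_kwargs=False, **kwargs):
    """Override known attributes; optionally hand back the leftover kwargs."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key is a configuration attribute
        else:
            unused[key] = value          # key is unknown to the configuration
    if return_unused_kwargs:
        return config, unused
    return config


config, unused_kwargs = apply_kwargs(
    ToyConfig(), return_unused_kwargs=True, output_attentions=True, foo=False
)
print(config.output_attentions)  # True
print(unused_kwargs)             # {'foo': False}
```

With return_unused_kwargs left at False, the same call returns only the configuration and the unknown keys are dropped.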

Examples:

>>> from transformers import AutoConfig

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')

>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained('dbmdz/bert-base-german-cased')

>>> # If configuration file is in a directory (e.g., was saved using `save_pretrained('./test/bert_saved_model/')`).
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/')

>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/my_configuration.json')

>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False, return_unused_kwargs=True)
>>> config.output_attentions
True
>>> unused_kwargs
{'foo': False}

AutoTokenizer

class transformers.AutoTokenizer

This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)

Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.

The tokenizer class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

speech_to_text – Speech2TextTokenizer (Speech2Text model)

wav2vec2 – Wav2Vec2CTCTokenizer (Wav2Vec2 model)

m2m_100 – M2M100Tokenizer (M2M100 model)

convbert – ConvBertTokenizer (ConvBERT model)

led – LEDTokenizer (LED model)

blenderbot-small – BlenderbotSmallTokenizer (BlenderbotSmall model)

retribert – RetriBertTokenizer (RetriBERT model)

ibert – RobertaTokenizer (I-BERT model)

mt5 – T5Tokenizer (mT5 model)

t5 – T5Tokenizer (T5 model)

mobilebert – MobileBertTokenizer (MobileBERT model)

distilbert – DistilBertTokenizer (DistilBERT model)

albert – AlbertTokenizer (ALBERT model)

bert-generation – BertGenerationTokenizer (Bert Generation model)

camembert – CamembertTokenizer (CamemBERT model)

xlm-roberta – XLMRobertaTokenizer (XLM-RoBERTa model)

pegasus – PegasusTokenizer (Pegasus model)

marian – MarianTokenizer (Marian model)

mbart – MBartTokenizer (mBART model)

mpnet – MPNetTokenizer (MPNet model)

bart – BartTokenizer (BART model)

blenderbot – BlenderbotTokenizer (Blenderbot model)

reformer – ReformerTokenizer (Reformer model)

longformer – LongformerTokenizer (Longformer model)

roberta – RobertaTokenizer (RoBERTa model)

deberta-v2 – DebertaV2Tokenizer (DeBERTa-v2 model)

deberta – DebertaTokenizer (DeBERTa model)

flaubert – FlaubertTokenizer (FlauBERT model)

fsmt – FSMTTokenizer (FairSeq Machine-Translation model)

squeezebert – SqueezeBertTokenizer (SqueezeBERT model)

bert – BertTokenizer (BERT model)

openai-gpt – OpenAIGPTTokenizer (OpenAI GPT model)

gpt2 – GPT2Tokenizer (OpenAI GPT-2 model)

transfo-xl – TransfoXLTokenizer (Transformer-XL model)

xlnet – XLNetTokenizer (XLNet model)

xlm-prophetnet – XLMProphetNetTokenizer (XLMProphetNet model)

prophetnet – ProphetNetTokenizer (ProphetNet model)

xlm – XLMTokenizer (XLM model)

ctrl – CTRLTokenizer (CTRL model)

electra – ElectraTokenizer (ELECTRA model)

funnel – FunnelTokenizer (Funnel Transformer model)

lxmert – LxmertTokenizer (LXMERT model)

dpr – DPRQuestionEncoderTokenizer (DPR model)

layoutlm – LayoutLMTokenizer (LayoutLM model)

rag – RagTokenizer (RAG model)

tapas – TapasTokenizer (TAPAS model)

Parameters

pretrained_model_name_or_path (str or os.PathLike):

Can be either:

A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.

A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g., ./my_model_directory/.

A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g.: ./my_model_directory/vocab.txt. (Not applicable to all derived classes)

inputs (additional positional arguments, optional):

Will be passed along to the Tokenizer __init__() method.

config (PretrainedConfig, optional):

The configuration object used to determine the tokenizer class to instantiate.

cache_dir (str or os.PathLike, optional):

Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (bool, optional, defaults to False):

Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

resume_download (bool, optional, defaults to False):

Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.

proxies (Dict[str, str], optional):

A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

revision (str, optional, defaults to "main"):

The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

subfolder (str, optional):

In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.

use_fast (bool, optional, defaults to True):

Whether or not to try to load the fast version of the tokenizer.

kwargs (additional keyword arguments, optional):

Will be passed to the Tokenizer __init__() method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the __init__() for more details.
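One way to picture the use_fast option described above is a table that stores a slow and an optional fast class per model type, falling back to the slow one when no fast version exists. The table, the `toy-slow-only` entry and `pick_tokenizer_class` are hypothetical and only illustrate the "try the fast version" behaviour; they are not the library's internals.

```python
# Hypothetical table: model type -> (slow class name, fast class name or None).
# The "toy-slow-only" entry is invented to show the fallback path.
TOKENIZERS = {
    "bert": ("BertTokenizer", "BertTokenizerFast"),
    "toy-slow-only": ("ToySlowTokenizer", None),
}


def pick_tokenizer_class(model_type, use_fast=True):
    """Prefer the fast class when requested and available, else the slow one."""
    slow, fast = TOKENIZERS[model_type]
    if use_fast and fast is not None:
        return fast
    return slow


print(pick_tokenizer_class("bert"))                  # BertTokenizerFast
print(pick_tokenizer_class("bert", use_fast=False))  # BertTokenizer
print(pick_tokenizer_class("toy-slow-only"))         # ToySlowTokenizer
```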

Examples:

>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('dbmdz/bert-base-german-cased')

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using `save_pretrained('./test/bert_saved_model/')`)
>>> tokenizer = AutoTokenizer.from_pretrained('./test/bert_saved_model/')

AutoModel

class transformers.AutoModel

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the base model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

Speech2TextConfig configuration class: Speech2TextModel (Speech2Text model)
Wav2Vec2Config configuration class: Wav2Vec2Model (Wav2Vec2 model)
M2M100Config configuration class: M2M100Model (M2M100 model)
ConvBertConfig configuration class: ConvBertModel (ConvBERT model)
LEDConfig configuration class: LEDModel (LED model)
BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmall model)
RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
MT5Config configuration class: MT5Model (mT5 model)
T5Config configuration class: T5Model (T5 model)
PegasusConfig configuration class: PegasusModel (Pegasus model)
MarianConfig configuration class: MarianModel (Marian model)
MBartConfig configuration class: MBartModel (mBART model)
BlenderbotConfig configuration class: BlenderbotModel (Blenderbot model)
DistilBertConfig configuration class: DistilBertModel (DistilBERT model)
AlbertConfig configuration class: AlbertModel (ALBERT model)
CamembertConfig configuration class: CamembertModel (CamemBERT model)
XLMRobertaConfig configuration class: XLMRobertaModel (XLM-RoBERTa model)
BartConfig configuration class: BartModel (BART model)
LongformerConfig configuration class: LongformerModel (Longformer model)
RobertaConfig configuration class: RobertaModel (RoBERTa model)
LayoutLMConfig configuration class: LayoutLMModel (LayoutLM model)
SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBERT model)
BertConfig configuration class: BertModel (BERT model)
OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAI GPT model)
GPT2Config configuration class: GPT2Model (OpenAI GPT-2 model)
MobileBertConfig configuration class: MobileBertModel (MobileBERT model)
TransfoXLConfig configuration class: TransfoXLModel (Transformer-XL model)
XLNetConfig configuration class: XLNetModel (XLNet model)
FlaubertConfig configuration class: FlaubertModel (FlauBERT model)
FSMTConfig configuration class: FSMTModel (FairSeq Machine-Translation model)
XLMConfig configuration class: XLMModel (XLM model)
CTRLConfig configuration class: CTRLModel (CTRL model)
ElectraConfig configuration class: ElectraModel (ELECTRA model)
ReformerConfig configuration class: ReformerModel (Reformer model)
FunnelConfig configuration class: FunnelModel (Funnel Transformer model)
LxmertConfig configuration class: LxmertModel (LXMERT model)
BertGenerationConfig configuration class: BertGenerationEncoder (Bert Generation model)
DebertaConfig configuration class: DebertaModel (DeBERTa model)
DebertaV2Config configuration class: DebertaV2Model (DeBERTa-v2 model)
DPRConfig configuration class: DPRQuestionEncoder (DPR model)
XLMProphetNetConfig configuration class: XLMProphetNetModel (XLMProphetNet model)
ProphetNetConfig configuration class: ProphetNetModel (ProphetNet model)
MPNetConfig configuration class: MPNetModel (MPNet model)
TapasConfig configuration class: TapasModel (TAPAS model)
IBertConfig configuration class: IBertModel (I-BERT model)
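The dispatch on the configuration class, and the fact that from_config() builds a model without loading any weights, can be sketched with toy classes. Everything named `Toy...` below, as well as `MODEL_FOR_CONFIG` and the free function `from_config`, is hypothetical; the exact-type lookup is a simplifying assumption.

```python
class ToyBertConfig:
    hidden_size = 4


class ToyGPT2Config:
    hidden_size = 8


class ToyModel:
    """Base for the toy models: creates fresh (untrained) weights from a config."""

    def __init__(self, config):
        self.config = config
        # Placeholder for newly initialized parameters; nothing is read from disk.
        self.weights = [0.0] * config.hidden_size


class ToyBertModel(ToyModel):
    pass


class ToyGPT2Model(ToyModel):
    pass


# Configuration class -> model class, mirroring the shape of the list above.
MODEL_FOR_CONFIG = {ToyBertConfig: ToyBertModel, ToyGPT2Config: ToyGPT2Model}


def from_config(config):
    """Build the model registered for this configuration class."""
    model_class = MODEL_FOR_CONFIG.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_class(config)


model = from_config(ToyBertConfig())
print(type(model).__name__, len(model.weights))  # ToyBertModel 4
```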

Examples:

>>> from transformers import AutoConfig, AutoModel
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModel.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

speech_to_text – Speech2TextModel (Speech2Text model)

wav2vec2 – Wav2Vec2Model (Wav2Vec2 model)

m2m_100 – M2M100Model (M2M100 model)

convbert – ConvBertModel (ConvBERT model)

led – LEDModel (LED model)

blenderbot-small – BlenderbotSmallModel (BlenderbotSmall model)

retribert – RetriBertModel (RetriBERT model)

ibert – IBertModel (I-BERT model)

mt5 – MT5Model (mT5 model)

t5 – T5Model (T5 model)

mobilebert – MobileBertModel (MobileBERT model)

distilbert – DistilBertModel (DistilBERT model)

albert – AlbertModel (ALBERT model)

bert-generation – BertGenerationEncoder (Bert Generation model)

camembert – CamembertModel (CamemBERT model)

xlm-roberta – XLMRobertaModel (XLM-RoBERTa model)

pegasus – PegasusModel (Pegasus model)

marian – MarianModel (Marian model)

mbart – MBartModel (mBART model)

mpnet – MPNetModel (MPNet model)

bart – BartModel (BART model)

blenderbot – BlenderbotModel (Blenderbot model)

reformer – ReformerModel (Reformer model)

longformer – LongformerModel (Longformer model)

roberta – RobertaModel (RoBERTa model)

deberta-v2 – DebertaV2Model (DeBERTa-v2 model)

deberta – DebertaModel (DeBERTa model)

flaubert – FlaubertModel (FlauBERT model)

fsmt – FSMTModel (FairSeq Machine-Translation model)

squeezebert – SqueezeBertModel (SqueezeBERT model)

bert – BertModel (BERT model)

openai-gpt – OpenAIGPTModel (OpenAI GPT model)

gpt2 – GPT2Model (OpenAI GPT-2 model)

transfo-xl – TransfoXLModel (Transformer-XL model)

xlnet – XLNetModel (XLNet model)

xlm-prophetnet – XLMProphetNetModel (XLMProphetNet model)

prophetnet – ProphetNetModel (ProphetNet model)

xlm – XLMModel (XLM model)

ctrl – CTRLModel (CTRL model)

electra – ElectraModel (ELECTRA model)

funnel – FunnelModel (Funnel Transformer model)

lxmert – LxmertModel (LXMERT model)

dpr – DPRQuestionEncoder (DPR model)

layoutlm – LayoutLMModel (LayoutLM model)

tapas – TapasModel (TAPAS model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
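The two kwargs paths described above can be sketched with toy classes. `ToyConfig`, `ToyModel` and `load_model` are hypothetical names; the sketch only shows how keyword arguments are routed, not how a model is actually loaded.

```python
class ToyConfig:
    """Hypothetical configuration with a single known attribute."""

    def __init__(self):
        self.output_attentions = False


class ToyModel:
    """Hypothetical model that records the extra keyword arguments it receives."""

    def __init__(self, config, **init_kwargs):
        self.config = config
        self.init_kwargs = init_kwargs


def load_model(config=None, **kwargs):
    if config is None:
        # No config supplied: load one, let kwargs override its attributes,
        # and forward only the leftovers to the model constructor.
        config = ToyConfig()
        model_kwargs = {}
        for key, value in kwargs.items():
            if hasattr(config, key):
                setattr(config, key, value)
            else:
                model_kwargs[key] = value
    else:
        # Config supplied: it is used untouched and every kwarg goes to the model.
        model_kwargs = kwargs
    return ToyModel(config, **model_kwargs)


auto = load_model(output_attentions=True, extra=1)
print(auto.config.output_attentions, auto.init_kwargs)    # True {'extra': 1}

given = load_model(config=ToyConfig(), output_attentions=True)
print(given.config.output_attentions, given.init_kwargs)  # False {'output_attentions': True}
```

The second call shows why configuration updates should be applied to the config object beforehand when one is passed explicitly.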

Examples:

>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModel.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForPreTraining

class transformers.AutoModelForPreTraining

This is a generic model class that will be instantiated as one of the model classes of the library—with the architecture used for pretraining this model—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the model classes of the library—with the architecture used for pretraining this model—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
T5Config configuration class: T5ForConditionalGeneration (T5 model)
DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
AlbertConfig configuration class: AlbertForPreTraining (ALBERT model)
CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
BartConfig configuration class: BartForConditionalGeneration (BART model)
FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
BertConfig configuration class: BertForPreTraining (BERT model)
OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
MobileBertConfig configuration class: MobileBertForPreTraining (MobileBERT model)
TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
ElectraConfig configuration class: ElectraForPreTraining (ELECTRA model)
LxmertConfig configuration class: LxmertForPreTraining (LXMERT model)
FunnelConfig configuration class: FunnelForPreTraining (Funnel Transformer model)
MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)

Examples:

>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForPreTraining.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiate one of the model classes of the library—with the architecture used for pretraining this model—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

retribert – RetriBertModel (RetriBERT model)

ibert – IBertForMaskedLM (I-BERT model)

t5 – T5ForConditionalGeneration (T5 model)

mobilebert – MobileBertForPreTraining (MobileBERT model)

distilbert – DistilBertForMaskedLM (DistilBERT model)

albert – AlbertForPreTraining (ALBERT model)

camembert – CamembertForMaskedLM (CamemBERT model)

xlm-roberta – XLMRobertaForMaskedLM (XLM-RoBERTa model)

mpnet – MPNetForMaskedLM (MPNet model)

bart – BartForConditionalGeneration (BART model)

longformer – LongformerForMaskedLM (Longformer model)

roberta – RobertaForMaskedLM (RoBERTa model)

deberta-v2 – DebertaV2ForMaskedLM (DeBERTa-v2 model)

deberta – DebertaForMaskedLM (DeBERTa model)

flaubert – FlaubertWithLMHeadModel (FlauBERT model)

fsmt – FSMTForConditionalGeneration (FairSeq Machine-Translation model)

squeezebert – SqueezeBertForMaskedLM (SqueezeBERT model)

bert – BertForPreTraining (BERT model)

openai-gpt – OpenAIGPTLMHeadModel (OpenAI GPT model)

gpt2 – GPT2LMHeadModel (OpenAI GPT-2 model)

transfo-xl – TransfoXLLMHeadModel (Transformer-XL model)

xlnet – XLNetLMHeadModel (XLNet model)

xlm – XLMWithLMHeadModel (XLM model)

ctrl – CTRLLMHeadModel (CTRL model)

electra – ElectraForPreTraining (ELECTRA model)

funnel – FunnelForPreTraining (Funnel Transformer model)

lxmert – LxmertForPreTraining (LXMERT model)

layoutlm – LayoutLMForMaskedLM (LayoutLM model)

tapas – TapasForMaskedLM (TAPAS model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
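
The second case above can be pictured as a simple split of the keyword arguments. The following is a minimal sketch of that idea, not the library's implementation: ToyConfig and split_kwargs are hypothetical names introduced only for illustration.

```python
class ToyConfig:
    """Stand-in for a configuration class (hypothetical, not a transformers class)."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, kwargs):
    """Apply kwargs that name configuration attributes and return the leftovers."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key matches a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Anything else would be forwarded to the model's __init__.
            unused[key] = value
    return config, unused


config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "my_flag": 1}
)
print(config.output_attentions)  # True
print(model_kwargs)              # {'my_flag': 1}
```

Here output_attentions updates the configuration, while my_flag (an invented key) is left over for the model constructor.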

Examples:

>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForPreTraining.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForCausalLM

class transformers.AutoModelForCausalLM

This is a generic model class that will be instantiated as one of the model classes of the library—with a causal language modeling head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the model classes of the library—with a causal language modeling head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

CamembertConfig configuration class: CamembertForCausalLM (CamemBERT model)
XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLM-RoBERTa model)
RobertaConfig configuration class: RobertaForCausalLM (RoBERTa model)
BertConfig configuration class: BertLMHeadModel (BERT model)
OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
ReformerConfig configuration class: ReformerModelWithLMHead (Reformer model)
BertGenerationConfig configuration class: BertGenerationDecoder (Bert Generation model)
XLMProphetNetConfig configuration class: XLMProphetNetForCausalLM (XLMProphetNet model)
ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNet model)
BartConfig configuration class: BartForCausalLM (BART model)
MBartConfig configuration class: MBartForCausalLM (mBART model)
PegasusConfig configuration class: PegasusForCausalLM (Pegasus model)
MarianConfig configuration class: MarianForCausalLM (Marian model)
BlenderbotConfig configuration class: BlenderbotForCausalLM (Blenderbot model)
BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmall model)

Examples:

>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('gpt2')
>>> model = AutoModelForCausalLM.from_config(config)
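
The selection by configuration class listed above can be thought of as a dictionary lookup keyed on the type of the configuration object. The sketch below illustrates that idea with stand-in classes. Every name in it (the Toy classes, TOY_CAUSAL_LM_MAPPING, toy_from_config) is hypothetical and is not part of the library.

```python
# Stand-in classes for illustration only; they are not transformers classes.
class ToyGPT2Config:
    pass


class ToyBertConfig:
    pass


class ToyUnsupportedConfig:
    pass


class ToyGPT2LMHeadModel:
    def __init__(self, config):
        # Only the configuration is stored; no pretrained weights are loaded.
        self.config = config


class ToyBertLMHeadModel:
    def __init__(self, config):
        self.config = config


# Configuration class -> model class, in the shape of the list above.
TOY_CAUSAL_LM_MAPPING = {
    ToyGPT2Config: ToyGPT2LMHeadModel,
    ToyBertConfig: ToyBertLMHeadModel,
}


def toy_from_config(config):
    """Pick the model class from the configuration's class and build it."""
    model_class = TOY_CAUSAL_LM_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__}"
        )
    return model_class(config)


model = toy_from_config(ToyGPT2Config())
print(type(model).__name__)  # ToyGPT2LMHeadModel
```

A configuration class that is not in the mapping raises an error in this sketch, which is one reasonable way to signal that a task head is not available for an architecture.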

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiate one of the model classes of the library—with a causal language modeling head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

blenderbot-small – BlenderbotSmallForCausalLM (BlenderbotSmall model)

bert-generation – BertGenerationDecoder (Bert Generation model)

camembert – CamembertForCausalLM (CamemBERT model)

xlm-roberta – XLMRobertaForCausalLM (XLM-RoBERTa model)

pegasus – PegasusForCausalLM (Pegasus model)

marian – MarianForCausalLM (Marian model)

mbart – MBartForCausalLM (mBART model)

bart – BartForCausalLM (BART model)

blenderbot – BlenderbotForCausalLM (Blenderbot model)

reformer – ReformerModelWithLMHead (Reformer model)

roberta – RobertaForCausalLM (RoBERTa model)

bert – BertLMHeadModel (BERT model)

openai-gpt – OpenAIGPTLMHeadModel (OpenAI GPT model)

gpt2 – GPT2LMHeadModel (OpenAI GPT-2 model)

transfo-xl – TransfoXLLMHeadModel (Transformer-XL model)

xlnet – XLNetLMHeadModel (XLNet model)

xlm-prophetnet – XLMProphetNetForCausalLM (XLMProphetNet model)

prophetnet – ProphetNetForCausalLM (ProphetNet model)

xlm – XLMWithLMHeadModel (XLM model)

ctrl – CTRLLMHeadModel (CTRL model)
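
The order of the patterns above matters for the fallback: a name such as xlm-roberta-base also contains roberta and bert, so the more specific pattern has to be tried first. The sketch below illustrates a two-stage selection (model_type first, then substring matching on the name) over a subset of the list. It is an assumption-laden illustration, not the library's code, and select_causal_lm_class is a hypothetical helper.

```python
# (pattern, class name) pairs: a subset of the list above, in the same order.
# More specific patterns come first, e.g. "xlm-roberta" before "roberta"
# before "bert".
CAUSAL_LM_PATTERNS = [
    ("blenderbot-small", "BlenderbotSmallForCausalLM"),
    ("bert-generation", "BertGenerationDecoder"),
    ("camembert", "CamembertForCausalLM"),
    ("xlm-roberta", "XLMRobertaForCausalLM"),
    ("mbart", "MBartForCausalLM"),
    ("bart", "BartForCausalLM"),
    ("blenderbot", "BlenderbotForCausalLM"),
    ("roberta", "RobertaForCausalLM"),
    ("bert", "BertLMHeadModel"),
    ("openai-gpt", "OpenAIGPTLMHeadModel"),
    ("gpt2", "GPT2LMHeadModel"),
]


def select_causal_lm_class(name_or_path, model_type=None):
    """Return a class name using model_type if known, else the first matching pattern."""
    by_type = dict(CAUSAL_LM_PATTERNS)
    if model_type in by_type:
        # Stage 1: the configuration declares its model type explicitly.
        return by_type[model_type]
    for pattern, class_name in CAUSAL_LM_PATTERNS:
        # Stage 2: fall back to the first pattern contained in the name or path.
        if pattern in str(name_or_path):
            return class_name
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")


print(select_causal_lm_class("xlm-roberta-base"))                          # XLMRobertaForCausalLM
print(select_causal_lm_class("dbmdz/bert-base-german-cased"))              # BertLMHeadModel
print(select_causal_lm_class("./my_model_directory/", model_type="gpt2"))  # GPT2LMHeadModel
```

With the entries in a different order (for example bert first), xlm-roberta-base would be matched by the wrong pattern, which is why the list is not alphabetical.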

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to resume downloading from an incompletely received file. If such a file exists, the download will attempt to resume from it.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained('gpt2')

>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained('gpt2', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/gpt2_tf_model_config.json')
>>> model = AutoModelForCausalLM.from_pretrained('./tf_model/gpt2_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForMaskedLM

class transformers.AutoModelForMaskedLM

This is a generic model class that will be instantiated as one of the model classes of the library—with a masked language modeling head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the model classes of the library—with a masked language modeling head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

Wav2Vec2Config configuration class: Wav2Vec2ForMaskedLM (Wav2Vec2 model)
ConvBertConfig configuration class: ConvBertForMaskedLM (ConvBERT model)
LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
AlbertConfig configuration class: AlbertForMaskedLM (ALBERT model)
BartConfig configuration class: BartForConditionalGeneration (BART model)
MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
BertConfig configuration class: BertForMaskedLM (BERT model)
MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBERT model)
FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
ElectraConfig configuration class: ElectraForMaskedLM (ELECTRA model)
ReformerConfig configuration class: ReformerForMaskedLM (Reformer model)
FunnelConfig configuration class: FunnelForMaskedLM (Funnel Transformer model)
MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
IBertConfig configuration class: IBertForMaskedLM (I-BERT model)

Examples:

>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForMaskedLM.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiate one of the model classes of the library—with a masked language modeling head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

wav2vec2 – Wav2Vec2ForMaskedLM (Wav2Vec2 model)

convbert – ConvBertForMaskedLM (ConvBERT model)

ibert – IBertForMaskedLM (I-BERT model)

mobilebert – MobileBertForMaskedLM (MobileBERT model)

distilbert – DistilBertForMaskedLM (DistilBERT model)

albert – AlbertForMaskedLM (ALBERT model)

camembert – CamembertForMaskedLM (CamemBERT model)

xlm-roberta – XLMRobertaForMaskedLM (XLM-RoBERTa model)

mbart – MBartForConditionalGeneration (mBART model)

mpnet – MPNetForMaskedLM (MPNet model)

bart – BartForConditionalGeneration (BART model)

reformer – ReformerForMaskedLM (Reformer model)

longformer – LongformerForMaskedLM (Longformer model)

roberta – RobertaForMaskedLM (RoBERTa model)

deberta-v2 – DebertaV2ForMaskedLM (DeBERTa-v2 model)

deberta – DebertaForMaskedLM (DeBERTa model)

flaubert – FlaubertWithLMHeadModel (FlauBERT model)

squeezebert – SqueezeBertForMaskedLM (SqueezeBERT model)

bert – BertForMaskedLM (BERT model)

xlm – XLMWithLMHeadModel (XLM model)

electra – ElectraForMaskedLM (ELECTRA model)

funnel – FunnelForMaskedLM (Funnel Transformer model)

layoutlm – LayoutLMForMaskedLM (LayoutLM model)

tapas – TapasForMaskedLM (TAPAS model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to resume downloading from an incompletely received file. If such a file exists, the download will attempt to resume from it.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
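
The dictionary returned when output_loading_info is set reports how the checkpoint's parameter names line up with the ones the model expects. As a rough sketch of what "missing" and "unexpected" mean, the comparison can be written with plain sets. The function name, the dictionary keys, and the parameter names below are assumptions made for illustration, not the library's code.

```python
def toy_loading_info(expected_keys, checkpoint_keys):
    """Compare parameter names the model expects with those found in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
        # Left empty here; real loading could also collect error messages.
        "error_msgs": [],
    }


info = toy_loading_info(
    expected_keys=["bert.embeddings.weight", "cls.predictions.bias"],
    checkpoint_keys=["bert.embeddings.weight", "classifier.weight"],
)
print(info["missing_keys"])     # ['cls.predictions.bias']
print(info["unexpected_keys"])  # ['classifier.weight']
```

Missing keys typically correspond to a task head that the checkpoint was not trained with, so they are worth checking before using the model for inference.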

Examples:

>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMaskedLM.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForSeq2SeqLM

class transformers.AutoModelForSeq2SeqLM

This is a generic model class that will be instantiated as one of the model classes of the library—with a sequence-to-sequence language modeling head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the model classes of the library—with a sequence-to-sequence language modeling head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

M2M100Config configuration class: M2M100ForConditionalGeneration (M2M100 model)
LEDConfig configuration class: LEDForConditionalGeneration (LED model)
BlenderbotSmallConfig configuration class: BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
MT5Config configuration class: MT5ForConditionalGeneration (mT5 model)
T5Config configuration class: T5ForConditionalGeneration (T5 model)
PegasusConfig configuration class: PegasusForConditionalGeneration (Pegasus model)
MarianConfig configuration class: MarianMTModel (Marian model)
MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (Blenderbot model)
BartConfig configuration class: BartForConditionalGeneration (BART model)
FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
EncoderDecoderConfig configuration class: EncoderDecoderModel (Encoder decoder model)
XLMProphetNetConfig configuration class: XLMProphetNetForConditionalGeneration (XLMProphetNet model)
ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNet model)

Examples:

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('t5-base')
>>> model = AutoModelForSeq2SeqLM.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)

Instantiate one of the model classes of the library—with a sequence-to-sequence language modeling head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

m2m_100 – M2M100ForConditionalGeneration (M2M100 model)

led – LEDForConditionalGeneration (LED model)

blenderbot-small – BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)

mt5 – MT5ForConditionalGeneration (mT5 model)

t5 – T5ForConditionalGeneration (T5 model)

pegasus – PegasusForConditionalGeneration (Pegasus model)

marian – MarianMTModel (Marian model)

mbart – MBartForConditionalGeneration (mBART model)

bart – BartForConditionalGeneration (BART model)

blenderbot – BlenderbotForConditionalGeneration (Blenderbot model)

fsmt – FSMTForConditionalGeneration (FairSeq Machine-Translation model)

xlm-prophetnet – XLMProphetNetForConditionalGeneration (XLMProphetNet model)

prophetnet – ProphetNetForConditionalGeneration (ProphetNet model)

encoder-decoder – EncoderDecoderModel (Encoder decoder model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to resume downloading from an incompletely received file. If such a file exists, the download will attempt to resume from it.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
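
The proxies dictionary above mixes a protocol key ('http') with an endpoint key ('http://hostname'). The sketch below shows one way such a dictionary can be resolved for a given URL, assuming the endpoint-specific entry takes precedence over the protocol-wide one, as the requests library does. pick_proxy is a hypothetical helper written for illustration.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Return the proxy for url, preferring a scheme://host key over a scheme key."""
    parts = urlparse(url)
    for key in (f"{parts.scheme}://{parts.hostname}", parts.scheme):
        if key in proxies:
            return proxies[key]
    return None  # no proxy configured for this URL


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}

print(pick_proxy("http://hostname/model.bin", proxies))       # foo.bar:4012
print(pick_proxy("http://other.example/model.bin", proxies))  # foo.bar:3128
print(pick_proxy("https://huggingface.co/gpt2", proxies))     # None
```

Note that in this sketch an 'http' entry does not apply to https URLs, so a separate 'https' key would be needed to proxy requests made over https.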

Examples:

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base')

>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/t5_tf_model_config.json')
>>> model = AutoModelForSeq2SeqLM.from_pretrained('./tf_model/t5_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForSequenceClassification

class transformers.AutoModelForSequenceClassification

This is a generic model class that will be instantiated as one of the model classes of the library—with a sequence classification head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)

Instantiates one of the model classes of the library—with a sequence classification head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: ConvBertForSequenceClassification (ConvBERT model)
LEDConfig configuration class: LEDForSequenceClassification (LED model)
DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBERT model)
AlbertConfig configuration class: AlbertForSequenceClassification (ALBERT model)
CamembertConfig configuration class: CamembertForSequenceClassification (CamemBERT model)
XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
MBartConfig configuration class: MBartForSequenceClassification (mBART model)
BartConfig configuration class: BartForSequenceClassification (BART model)
LongformerConfig configuration class: LongformerForSequenceClassification (Longformer model)
RobertaConfig configuration class: RobertaForSequenceClassification (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBERT model)
LayoutLMConfig configuration class: LayoutLMForSequenceClassification (LayoutLM model)
BertConfig configuration class: BertForSequenceClassification (BERT model)
XLNetConfig configuration class: XLNetForSequenceClassification (XLNet model)
MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBERT model)
FlaubertConfig configuration class: FlaubertForSequenceClassification (FlauBERT model)
XLMConfig configuration class: XLMForSequenceClassification (XLM model)
ElectraConfig configuration class: ElectraForSequenceClassification (ELECTRA model)
FunnelConfig configuration class: FunnelForSequenceClassification (Funnel Transformer model)
DebertaConfig configuration class: DebertaForSequenceClassification (DeBERTa model)
DebertaV2Config configuration class: DebertaV2ForSequenceClassification (DeBERTa-v2 model)
GPT2Config configuration class: GPT2ForSequenceClassification (OpenAI GPT-2 model)
OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAI GPT model)
ReformerConfig configuration class: ReformerForSequenceClassification (Reformer model)
CTRLConfig configuration class: CTRLForSequenceClassification (CTRL model)
TransfoXLConfig configuration class: TransfoXLForSequenceClassification (Transformer-XL model)
MPNetConfig configuration class: MPNetForSequenceClassification (MPNet model)
TapasConfig configuration class: TapasForSequenceClassification (TAPAS model)
IBertConfig configuration class: IBertForSequenceClassification (I-BERT model)
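
The selection rule above can be pictured as a table keyed by configuration class. The following is a minimal sketch using hypothetical Toy* stand-in classes and an exact-type lookup; it is not the library's implementation, which may differ in detail.

```python
# Hypothetical stand-ins for real configuration and model classes.
class ToyBertConfig:
    pass


class ToyAlbertConfig:
    pass


class ToyBertForSequenceClassification:
    def __init__(self, config):
        self.config = config


class ToyAlbertForSequenceClassification:
    def __init__(self, config):
        self.config = config


# Assumed table: configuration class -> model class with the matching head.
TOY_MAPPING = {
    ToyBertConfig: ToyBertForSequenceClassification,
    ToyAlbertConfig: ToyAlbertForSequenceClassification,
}


def toy_from_config(config):
    """Build a model whose class is chosen from the type of `config`."""
    model_class = TOY_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__}"
        )
    # Only the configuration is used here; no pretrained weights are loaded.
    return model_class(config)


model = toy_from_config(ToyAlbertConfig())
print(type(model).__name__)
```

As in the note above, nothing in this path touches saved weights: the model is built from the configuration alone.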

Examples:

>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForSequenceClassification.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a sequence classification head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – ConvBertForSequenceClassification (ConvBERT model)

led – LEDForSequenceClassification (LED model)

ibert – IBertForSequenceClassification (I-BERT model)

mobilebert – MobileBertForSequenceClassification (MobileBERT model)

distilbert – DistilBertForSequenceClassification (DistilBERT model)

albert – AlbertForSequenceClassification (ALBERT model)

camembert – CamembertForSequenceClassification (CamemBERT model)

xlm-roberta – XLMRobertaForSequenceClassification (XLM-RoBERTa model)

mbart – MBartForSequenceClassification (mBART model)

mpnet – MPNetForSequenceClassification (MPNet model)

bart – BartForSequenceClassification (BART model)

reformer – ReformerForSequenceClassification (Reformer model)

longformer – LongformerForSequenceClassification (Longformer model)

roberta – RobertaForSequenceClassification (RoBERTa model)

deberta-v2 – DebertaV2ForSequenceClassification (DeBERTa-v2 model)

deberta – DebertaForSequenceClassification (DeBERTa model)

flaubert – FlaubertForSequenceClassification (FlauBERT model)

squeezebert – SqueezeBertForSequenceClassification (SqueezeBERT model)

bert – BertForSequenceClassification (BERT model)

openai-gpt – OpenAIGPTForSequenceClassification (OpenAI GPT model)

gpt2 – GPT2ForSequenceClassification (OpenAI GPT-2 model)

transfo-xl – TransfoXLForSequenceClassification (Transformer-XL model)

xlnet – XLNetForSequenceClassification (XLNet model)

xlm – XLMForSequenceClassification (XLM model)

ctrl – CTRLForSequenceClassification (CTRL model)

electra – ElectraForSequenceClassification (ELECTRA model)

funnel – FunnelForSequenceClassification (Funnel Transformer model)

layoutlm – LayoutLMForSequenceClassification (LayoutLM model)

tapas – TapasForSequenceClassification (TAPAS model)
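
The two-step selection described above (model_type first, name pattern as a fallback) can be sketched as follows. The table is a shortened, hypothetical copy of the list above and the function name toy_resolve is an assumption made for illustration. Note that the order of the list matters for the fallback: a specific pattern such as xlm-roberta must be tried before a pattern it contains, such as roberta.

```python
# Hypothetical, shortened pattern table: (pattern, class name).
# More specific patterns are listed before the patterns they contain.
TOY_PATTERNS = [
    ("mobilebert", "MobileBertForSequenceClassification"),
    ("distilbert", "DistilBertForSequenceClassification"),
    ("xlm-roberta", "XLMRobertaForSequenceClassification"),
    ("roberta", "RobertaForSequenceClassification"),
    ("deberta-v2", "DebertaV2ForSequenceClassification"),
    ("deberta", "DebertaForSequenceClassification"),
    ("bert", "BertForSequenceClassification"),
    ("xlm", "XLMForSequenceClassification"),
]


def toy_resolve(name_or_path, model_type=None):
    """Pick a class name from model_type, else from the first matching pattern."""
    if model_type is not None:
        # Preferred path: the configuration states its model type explicitly.
        for pattern, class_name in TOY_PATTERNS:
            if pattern == model_type:
                return class_name
        raise ValueError(f"Unknown model_type: {model_type!r}")
    # Fallback path: substring matching on the name, in table order.
    for pattern, class_name in TOY_PATTERNS:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"No pattern matches {name_or_path!r}")


print(toy_resolve("xlm-roberta-base"))
print(toy_resolve("./my_model_directory/", model_type="bert"))
```

If roberta were listed before xlm-roberta in this sketch, the name xlm-roberta-base would resolve to the wrong class.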

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
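
The second case above, where no config is passed, can be sketched as a split of the keyword arguments. This is a simplified illustration: ToyConfig, toy_split_kwargs and my_custom_flag are hypothetical names, and the library's own handling may differ.

```python
class ToyConfig:
    """Hypothetical configuration with two known attributes."""

    def __init__(self):
        self.output_attentions = False
        self.num_labels = 2


def toy_split_kwargs(config, **kwargs):
    """Apply kwargs that name config attributes; return the leftovers."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the loaded value
        else:
            unused[key] = value  # would be passed to the model's __init__
    return config, unused


config, model_kwargs = toy_split_kwargs(
    ToyConfig(), output_attentions=True, my_custom_flag=1
)
print(config.output_attentions)  # True
print(model_kwargs)  # {'my_custom_flag': 1}
```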

Examples:

>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForSequenceClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForMultipleChoice¶

class transformers.AutoModelForMultipleChoice[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a multiple choice classification head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a multiple choice classification head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: ConvBertForMultipleChoice (ConvBERT model)
CamembertConfig configuration class: CamembertForMultipleChoice (CamemBERT model)
ElectraConfig configuration class: ElectraForMultipleChoice (ELECTRA model)
XLMRobertaConfig configuration class: XLMRobertaForMultipleChoice (XLM-RoBERTa model)
LongformerConfig configuration class: LongformerForMultipleChoice (Longformer model)
RobertaConfig configuration class: RobertaForMultipleChoice (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForMultipleChoice (SqueezeBERT model)
BertConfig configuration class: BertForMultipleChoice (BERT model)
DistilBertConfig configuration class: DistilBertForMultipleChoice (DistilBERT model)
MobileBertConfig configuration class: MobileBertForMultipleChoice (MobileBERT model)
XLNetConfig configuration class: XLNetForMultipleChoice (XLNet model)
AlbertConfig configuration class: AlbertForMultipleChoice (ALBERT model)
XLMConfig configuration class: XLMForMultipleChoice (XLM model)
FlaubertConfig configuration class: FlaubertForMultipleChoice (FlauBERT model)
FunnelConfig configuration class: FunnelForMultipleChoice (Funnel Transformer model)
MPNetConfig configuration class: MPNetForMultipleChoice (MPNet model)
IBertConfig configuration class: IBertForMultipleChoice (I-BERT model)

Examples:

>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForMultipleChoice.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a multiple choice classification head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – ConvBertForMultipleChoice (ConvBERT model)

ibert – IBertForMultipleChoice (I-BERT model)

mobilebert – MobileBertForMultipleChoice (MobileBERT model)

distilbert – DistilBertForMultipleChoice (DistilBERT model)

albert – AlbertForMultipleChoice (ALBERT model)

camembert – CamembertForMultipleChoice (CamemBERT model)

xlm-roberta – XLMRobertaForMultipleChoice (XLM-RoBERTa model)

mpnet – MPNetForMultipleChoice (MPNet model)

longformer – LongformerForMultipleChoice (Longformer model)

roberta – RobertaForMultipleChoice (RoBERTa model)

flaubert – FlaubertForMultipleChoice (FlauBERT model)

squeezebert – SqueezeBertForMultipleChoice (SqueezeBERT model)

bert – BertForMultipleChoice (BERT model)

xlnet – XLNetForMultipleChoice (XLNet model)

xlm – XLMForMultipleChoice (XLM model)

electra – ElectraForMultipleChoice (ELECTRA model)

funnel – FunnelForMultipleChoice (Funnel Transformer model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
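
The missing and unexpected keys mentioned for output_loading_info can be understood as two set differences between the parameter names a model expects and those found in a checkpoint. The sketch below assumes hypothetical parameter names and a hypothetical helper toy_loading_info; it only illustrates the idea and does not reproduce the library's loading code.

```python
def toy_loading_info(model_keys, checkpoint_keys):
    """Compare parameter names expected by a model with those in a checkpoint."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


# Hypothetical names: an encoder shared by both sides, a new classification
# head in the model, and a pretraining head in the checkpoint.
info = toy_loading_info(
    model_keys=["encoder.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["encoder.weight", "lm_head.weight"],
)
print(info["missing_keys"])  # ['classifier.bias', 'classifier.weight']
print(info["unexpected_keys"])  # ['lm_head.weight']
```

Under this reading, missing keys typically correspond to a task head that still needs to be trained.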

Examples:

>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMultipleChoice.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForNextSentencePrediction¶

class transformers.AutoModelForNextSentencePrediction[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a next sentence prediction head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a next sentence prediction head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

BertConfig configuration class: BertForNextSentencePrediction (BERT model)
MobileBertConfig configuration class: MobileBertForNextSentencePrediction (MobileBERT model)

Examples:

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForNextSentencePrediction.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a next sentence prediction head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

mobilebert – MobileBertForNextSentencePrediction (MobileBERT model)

bert – BertForNextSentencePrediction (BERT model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
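
The three accepted forms of pretrained_model_name_or_path listed above can be told apart with a few simple checks. The helper below, toy_classify_source, is a hypothetical and deliberately simplified sketch; the checks and their order are assumptions, not the library's actual resolution logic.

```python
import os
import tempfile


def toy_classify_source(name_or_path):
    """Guess which of the three documented forms a value takes."""
    name_or_path = os.fspath(name_or_path)  # accepts str or os.PathLike
    if os.path.isdir(name_or_path):
        return "local directory"
    if name_or_path.endswith(".index"):
        return "TensorFlow index checkpoint"
    # Anything else is treated as a model id hosted on huggingface.co.
    return "model id"


with tempfile.TemporaryDirectory() as saved_dir:
    print(toy_classify_source(saved_dir))
print(toy_classify_source("./tf_model/model.ckpt.index"))
print(toy_classify_source("dbmdz/bert-base-german-cased"))
```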

Examples:

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForNextSentencePrediction.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForTokenClassification¶

class transformers.AutoModelForTokenClassification[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a token classification head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a token classification head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: ConvBertForTokenClassification (ConvBERT model)
LayoutLMConfig configuration class: LayoutLMForTokenClassification (LayoutLM model)
DistilBertConfig configuration class: DistilBertForTokenClassification (DistilBERT model)
CamembertConfig configuration class: CamembertForTokenClassification (CamemBERT model)
FlaubertConfig configuration class: FlaubertForTokenClassification (FlauBERT model)
XLMConfig configuration class: XLMForTokenClassification (XLM model)
XLMRobertaConfig configuration class: XLMRobertaForTokenClassification (XLM-RoBERTa model)
LongformerConfig configuration class: LongformerForTokenClassification (Longformer model)
RobertaConfig configuration class: RobertaForTokenClassification (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForTokenClassification (SqueezeBERT model)
BertConfig configuration class: BertForTokenClassification (BERT model)
MobileBertConfig configuration class: MobileBertForTokenClassification (MobileBERT model)
XLNetConfig configuration class: XLNetForTokenClassification (XLNet model)
AlbertConfig configuration class: AlbertForTokenClassification (ALBERT model)
ElectraConfig configuration class: ElectraForTokenClassification (ELECTRA model)
FunnelConfig configuration class: FunnelForTokenClassification (Funnel Transformer model)
MPNetConfig configuration class: MPNetForTokenClassification (MPNet model)
DebertaConfig configuration class: DebertaForTokenClassification (DeBERTa model)
DebertaV2Config configuration class: DebertaV2ForTokenClassification (DeBERTa-v2 model)
IBertConfig configuration class: IBertForTokenClassification (I-BERT model)

Examples:

>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForTokenClassification.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a token classification head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – ConvBertForTokenClassification (ConvBERT model)

ibert – IBertForTokenClassification (I-BERT model)

mobilebert – MobileBertForTokenClassification (MobileBERT model)

distilbert – DistilBertForTokenClassification (DistilBERT model)

albert – AlbertForTokenClassification (ALBERT model)

camembert – CamembertForTokenClassification (CamemBERT model)

xlm-roberta – XLMRobertaForTokenClassification (XLM-RoBERTa model)

mpnet – MPNetForTokenClassification (MPNet model)

longformer – LongformerForTokenClassification (Longformer model)

roberta – RobertaForTokenClassification (RoBERTa model)

deberta-v2 – DebertaV2ForTokenClassification (DeBERTa-v2 model)

deberta – DebertaForTokenClassification (DeBERTa model)

flaubert – FlaubertForTokenClassification (FlauBERT model)

squeezebert – SqueezeBertForTokenClassification (SqueezeBERT model)

bert – BertForTokenClassification (BERT model)

xlnet – XLNetForTokenClassification (XLNet model)

xlm – XLMForTokenClassification (XLM model)

electra – ElectraForTokenClassification (ELECTRA model)

funnel – FunnelForTokenClassification (Funnel Transformer model)

layoutlm – LayoutLMForTokenClassification (LayoutLM model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
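
To illustrate what the evaluation/training switch changes, here is a pure-Python sketch of a dropout-like layer. ToyDropout is a hypothetical class that only mimics the eval()/train() behavior described above; it is not the PyTorch module.

```python
import random


class ToyDropout:
    """Hypothetical layer mimicking the train/eval switch of a dropout module."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: inputs pass through unchanged
        # Training mode: zero each value with probability p, rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if random.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]

layer.train()
print(layer([1.0, 2.0, 3.0]))  # random: each entry is either 0.0 or doubled
```

In evaluation mode the output is deterministic, which is why a freshly loaded model gives reproducible predictions until it is switched back to training mode.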

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done)
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
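
The second case above can be pictured with a small sketch. It is not the library's code: split_kwargs is a hypothetical helper and the configuration is modelled as a plain dictionary, purely to show how keys that match configuration attributes are consumed while the rest are left for the model's __init__.

```python
from typing import Any, Dict, Tuple


def split_kwargs(
    config: Dict[str, Any], kwargs: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: override known config attributes, return the rest."""
    updated = dict(config)
    unused: Dict[str, Any] = {}
    for key, value in kwargs.items():
        if key in updated:
            updated[key] = value  # key matches a configuration attribute
        else:
            unused[key] = value  # would be forwarded to the model's __init__
    return updated, unused


# Toy configuration; the attribute names are only examples.
config = {"output_attentions": False, "num_labels": 2}
new_config, model_kwargs = split_kwargs(
    config, {"output_attentions": True, "my_flag": 1}
)
print(new_config)
print(model_kwargs)
```

Here output_attentions overrides the configuration value, while my_flag (a made-up key) is not a configuration attribute and would be handed on to the model.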

Examples:

>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForTokenClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForQuestionAnswering¶

class transformers.AutoModelForQuestionAnswering[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a question answering head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a question answering head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: ConvBertForQuestionAnswering (ConvBERT model)
LEDConfig configuration class: LEDForQuestionAnswering (LED model)
DistilBertConfig configuration class: DistilBertForQuestionAnswering (DistilBERT model)
AlbertConfig configuration class: AlbertForQuestionAnswering (ALBERT model)
CamembertConfig configuration class: CamembertForQuestionAnswering (CamemBERT model)
BartConfig configuration class: BartForQuestionAnswering (BART model)
MBartConfig configuration class: MBartForQuestionAnswering (mBART model)
LongformerConfig configuration class: LongformerForQuestionAnswering (Longformer model)
XLMRobertaConfig configuration class: XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
RobertaConfig configuration class: RobertaForQuestionAnswering (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForQuestionAnswering (SqueezeBERT model)
BertConfig configuration class: BertForQuestionAnswering (BERT model)
XLNetConfig configuration class: XLNetForQuestionAnsweringSimple (XLNet model)
FlaubertConfig configuration class: FlaubertForQuestionAnsweringSimple (FlauBERT model)
MobileBertConfig configuration class: MobileBertForQuestionAnswering (MobileBERT model)
XLMConfig configuration class: XLMForQuestionAnsweringSimple (XLM model)
ElectraConfig configuration class: ElectraForQuestionAnswering (ELECTRA model)
ReformerConfig configuration class: ReformerForQuestionAnswering (Reformer model)
FunnelConfig configuration class: FunnelForQuestionAnswering (Funnel Transformer model)
LxmertConfig configuration class: LxmertForQuestionAnswering (LXMERT model)
MPNetConfig configuration class: MPNetForQuestionAnswering (MPNet model)
DebertaConfig configuration class: DebertaForQuestionAnswering (DeBERTa model)
DebertaV2Config configuration class: DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
IBertConfig configuration class: IBertForQuestionAnswering (I-BERT model)

Examples:

>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForQuestionAnswering.from_config(config)
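
The selection by configuration class described above amounts to a lookup keyed on the type of the config object. The following is a minimal sketch under that assumption; the Toy* classes and QA_MODEL_MAPPING are hypothetical stand-ins, not the library's own objects.

```python
class ToyBertConfig:
    """Stand-in for a configuration class such as BertConfig."""


class ToyXLNetConfig:
    """Stand-in for a configuration class such as XLNetConfig."""


class ToyBertForQuestionAnswering:
    def __init__(self, config):
        self.config = config


class ToyXLNetForQuestionAnsweringSimple:
    def __init__(self, config):
        self.config = config


# Hypothetical registry: configuration class -> model class with a QA head.
QA_MODEL_MAPPING = {
    ToyBertConfig: ToyBertForQuestionAnswering,
    ToyXLNetConfig: ToyXLNetForQuestionAnsweringSimple,
}


def from_config(config):
    """Pick the model class registered for the exact type of `config`."""
    try:
        model_class = QA_MODEL_MAPPING[type(config)]
    except KeyError:
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__}"
        ) from None
    return model_class(config)


model = from_config(ToyXLNetConfig())
print(type(model).__name__)  # ToyXLNetForQuestionAnsweringSimple
```

A configuration class that is not registered raises an error in this sketch, which is one plausible way to handle an unsupported architecture.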

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a question answering head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – ConvBertForQuestionAnswering (ConvBERT model)

led – LEDForQuestionAnswering (LED model)

ibert – IBertForQuestionAnswering (I-BERT model)

mobilebert – MobileBertForQuestionAnswering (MobileBERT model)

distilbert – DistilBertForQuestionAnswering (DistilBERT model)

albert – AlbertForQuestionAnswering (ALBERT model)

camembert – CamembertForQuestionAnswering (CamemBERT model)

xlm-roberta – XLMRobertaForQuestionAnswering (XLM-RoBERTa model)

mbart – MBartForQuestionAnswering (mBART model)

mpnet – MPNetForQuestionAnswering (MPNet model)

bart – BartForQuestionAnswering (BART model)

reformer – ReformerForQuestionAnswering (Reformer model)

longformer – LongformerForQuestionAnswering (Longformer model)

roberta – RobertaForQuestionAnswering (RoBERTa model)

deberta-v2 – DebertaV2ForQuestionAnswering (DeBERTa-v2 model)

deberta – DebertaForQuestionAnswering (DeBERTa model)

flaubert – FlaubertForQuestionAnsweringSimple (FlauBERT model)

squeezebert – SqueezeBertForQuestionAnswering (SqueezeBERT model)

bert – BertForQuestionAnswering (BERT model)

xlnet – XLNetForQuestionAnsweringSimple (XLNet model)

xlm – XLMForQuestionAnsweringSimple (XLM model)

electra – ElectraForQuestionAnswering (ELECTRA model)

funnel – FunnelForQuestionAnswering (Funnel Transformer model)

lxmert – LxmertForQuestionAnswering (LXMERT model)
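
The order of the list above matters for the name-based fallback if matching is done by substring, since a name such as distilbert-base-uncased also contains bert. The sketch below assumes first-match-wins substring matching over an ordered subset of the list; resolve_class_name and QA_CLASSES are hypothetical names, and class names are kept as strings so the example runs without the library.

```python
from typing import Optional

# Ordered subset of the list above; more specific patterns come first.
QA_CLASSES = [
    ("distilbert", "DistilBertForQuestionAnswering"),
    ("xlm-roberta", "XLMRobertaForQuestionAnswering"),
    ("roberta", "RobertaForQuestionAnswering"),
    ("deberta-v2", "DebertaV2ForQuestionAnswering"),
    ("deberta", "DebertaForQuestionAnswering"),
    ("bert", "BertForQuestionAnswering"),
    ("xlm", "XLMForQuestionAnsweringSimple"),
]


def resolve_class_name(name_or_path: str, model_type: Optional[str] = None) -> str:
    """Prefer an exact model_type match; otherwise fall back to the name."""
    if model_type is not None:
        for pattern, class_name in QA_CLASSES:
            if pattern == model_type:
                return class_name
        raise ValueError(f"Unknown model_type: {model_type}")
    for pattern, class_name in QA_CLASSES:
        if pattern in name_or_path:  # first matching pattern wins
            return class_name
    raise ValueError(f"Cannot infer a model class from: {name_or_path}")


print(resolve_class_name("distilbert-base-uncased"))
print(resolve_class_name("xlm-roberta-base"))
print(resolve_class_name("my-finetuned-checkpoint", model_type="bert"))
```

With model_type available, the name of the checkpoint plays no role, which is why a config carrying model_type is the more reliable route.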

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done)
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
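
As a rough illustration of the three forms accepted by pretrained_model_name_or_path, the sketch below classifies an argument as a local directory, a TensorFlow index checkpoint, or a model id. describe_source is a hypothetical helper and a deliberate simplification; it does not reproduce how the library resolves the argument.

```python
import os
import tempfile


def describe_source(name_or_path: str) -> str:
    """Hypothetical helper: classify the argument into the three documented forms."""
    if os.path.isdir(name_or_path):
        return "local directory"
    if name_or_path.endswith(".index"):
        return "TensorFlow index checkpoint (needs from_tf=True and config)"
    return "model id on huggingface.co"


# A temporary directory stands in for a directory written by save_pretrained().
with tempfile.TemporaryDirectory() as save_dir:
    local_kind = describe_source(save_dir)

print(local_kind)
print(describe_source("./tf_model/model.ckpt.index"))
print(describe_source("dbmdz/bert-base-german-cased"))
```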

Examples:

>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForQuestionAnswering.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)

AutoModelForTableQuestionAnswering¶

class transformers.AutoModelForTableQuestionAnswering[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a table question answering head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a table question answering head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

TapasConfig configuration class: TapasForQuestionAnswering (TAPAS model)

Examples:

>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('google/tapas-base-finetuned-wtq')
>>> model = AutoModelForTableQuestionAnswering.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a table question answering head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

tapas – TapasForQuestionAnswering (TAPAS model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path (str or os.PathLike) –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done)
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
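
The "missing keys" and "unexpected keys" mentioned for output_loading_info can be understood as set differences between the parameter names a model expects and the names found in a state dictionary. The sketch below is only an illustration of those two terms; loading_info, the dictionary key names, and the parameter names are assumptions made for the example.

```python
from typing import Dict, Iterable, List


def loading_info(
    expected_keys: Iterable[str], state_dict_keys: Iterable[str]
) -> Dict[str, List[str]]:
    """Hypothetical helper: compare expected parameter names with provided ones."""
    expected, provided = set(expected_keys), set(state_dict_keys)
    return {
        # Parameters the model needs but the state dictionary lacks.
        "missing_keys": sorted(expected - provided),
        # Entries in the state dictionary that the model has no use for.
        "unexpected_keys": sorted(provided - expected),
    }


info = loading_info(
    expected_keys=["encoder.weight", "qa_outputs.weight", "qa_outputs.bias"],
    state_dict_keys=["encoder.weight", "pooler.weight"],
)
print(info)
```

Missing keys of this kind typically correspond to a task head that was not part of the saved weights and therefore still needs training.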

Examples:

>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained('google/tapas-base-finetuned-wtq')

>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained('google/tapas-base-finetuned-wtq', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/tapas_tf_checkpoint.json')
>>> model = AutoModelForTableQuestionAnswering.from_pretrained('./tf_model/tapas_tf_checkpoint.ckpt.index', from_tf=True, config=config)

TFAutoModel¶

class transformers.TFAutoModel[source]¶

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config, **kwargs)[source]¶

Instantiates one of the base model classes of the library from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: TFConvBertModel (ConvBERT model)
LEDConfig configuration class: TFLEDModel (LED model)
LxmertConfig configuration class: TFLxmertModel (LXMERT model)
MT5Config configuration class: TFMT5Model (mT5 model)
T5Config configuration class: TFT5Model (T5 model)
DistilBertConfig configuration class: TFDistilBertModel (DistilBERT model)
AlbertConfig configuration class: TFAlbertModel (ALBERT model)
BartConfig configuration class: TFBartModel (BART model)
CamembertConfig configuration class: TFCamembertModel (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaModel (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerModel (Longformer model)
RobertaConfig configuration class: TFRobertaModel (RoBERTa model)
BertConfig configuration class: TFBertModel (BERT model)
OpenAIGPTConfig configuration class: TFOpenAIGPTModel (OpenAI GPT model)
GPT2Config configuration class: TFGPT2Model (OpenAI GPT-2 model)
MobileBertConfig configuration class: TFMobileBertModel (MobileBERT model)
TransfoXLConfig configuration class: TFTransfoXLModel (Transformer-XL model)
XLNetConfig configuration class: TFXLNetModel (XLNet model)
FlaubertConfig configuration class: TFFlaubertModel (FlauBERT model)
XLMConfig configuration class: TFXLMModel (XLM model)
CTRLConfig configuration class: TFCTRLModel (CTRL model)
ElectraConfig configuration class: TFElectraModel (ELECTRA model)
FunnelConfig configuration class: TFFunnelModel (Funnel Transformer model)
DPRConfig configuration class: TFDPRQuestionEncoder (DPR model)
MPNetConfig configuration class: TFMPNetModel (MPNet model)
MBartConfig configuration class: TFMBartModel (mBART model)
MarianConfig configuration class: TFMarianModel (Marian model)
PegasusConfig configuration class: TFPegasusModel (Pegasus model)
BlenderbotConfig configuration class: TFBlenderbotModel (Blenderbot model)
BlenderbotSmallConfig configuration class: TFBlenderbotSmallModel (BlenderbotSmall model)

Examples:

>>> from transformers import AutoConfig, TFAutoModel
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModel.from_config(config)
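
The note above, that from_config() builds the architecture without loading weights, can be illustrated with a framework-agnostic toy. ToyModel and its file names (other than config.json) are hypothetical and not part of the library; the sketch only contrasts a freshly initialized model with one reloaded from a save directory.

```python
import json
import os
import random
import tempfile


class ToyModel:
    """Hypothetical stand-in for a model class; not part of transformers."""

    def __init__(self, config):
        self.config = config
        # Fresh models start from randomly initialized weights.
        self.weights = [random.random() for _ in range(config["hidden_size"])]

    @classmethod
    def from_config(cls, config):
        # Architecture only: nothing is read from disk.
        return cls(config)

    def save_pretrained(self, directory):
        with open(os.path.join(directory, "config.json"), "w") as f:
            json.dump(self.config, f)
        with open(os.path.join(directory, "weights.json"), "w") as f:
            json.dump(self.weights, f)

    @classmethod
    def from_pretrained(cls, directory):
        # The configuration is found automatically as config.json in the directory.
        with open(os.path.join(directory, "config.json")) as f:
            model = cls(json.load(f))
        with open(os.path.join(directory, "weights.json")) as f:
            model.weights = json.load(f)
        return model


config = {"hidden_size": 4}
trained = ToyModel.from_config(config)
with tempfile.TemporaryDirectory() as save_dir:
    trained.save_pretrained(save_dir)
    reloaded = ToyModel.from_pretrained(save_dir)
fresh = ToyModel.from_config(config)

print(reloaded.weights == trained.weights)  # True
print(fresh.weights == trained.weights)
```

The reloaded model reproduces the saved weights, whereas a second from_config() call with the same configuration yields new random values.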

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – TFConvBertModel (ConvBERT model)

led – TFLEDModel (LED model)

blenderbot-small – TFBlenderbotSmallModel (BlenderbotSmall model)

mt5 – TFMT5Model (mT5 model)

t5 – TFT5Model (T5 model)

mobilebert – TFMobileBertModel (MobileBERT model)

distilbert – TFDistilBertModel (DistilBERT model)

albert – TFAlbertModel (ALBERT model)

camembert – TFCamembertModel (CamemBERT model)

xlm-roberta – TFXLMRobertaModel (XLM-RoBERTa model)

pegasus – TFPegasusModel (Pegasus model)

marian – TFMarianModel (Marian model)

mbart – TFMBartModel (mBART model)

mpnet – TFMPNetModel (MPNet model)

bart – TFBartModel (BART model)

blenderbot – TFBlenderbotModel (Blenderbot model)

longformer – TFLongformerModel (Longformer model)

roberta – TFRobertaModel (RoBERTa model)

flaubert – TFFlaubertModel (FlauBERT model)

bert – TFBertModel (BERT model)

openai-gpt – TFOpenAIGPTModel (OpenAI GPT model)

gpt2 – TFGPT2Model (OpenAI GPT-2 model)

transfo-xl – TFTransfoXLModel (Transformer-XL model)

xlnet – TFXLNetModel (XLNet model)

xlm – TFXLMModel (XLM model)

ctrl – TFCTRLModel (CTRL model)

electra – TFElectraModel (ELECTRA model)

funnel – TFFunnelModel (Funnel Transformer model)

lxmert – TFLxmertModel (LXMERT model)

dpr – TFDPRQuestionEncoder (DPR model)

Unlike the PyTorch classes, TensorFlow models have no model.eval() or model.train() methods. Training-specific behavior such as dropout is controlled by the training argument passed when calling the model.

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done)
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
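
The interplay of the cache with force_download and local_files_only can be sketched with an in-memory dictionary. This is a simplified model, not the library's download code: fetch and fake_download are hypothetical, and the behavior when both flags are set at once is an assumption of the sketch.

```python
from typing import Callable, Dict, List


def fetch(
    name: str,
    cache: Dict[str, bytes],
    download: Callable[[str], bytes],
    force_download: bool = False,
    local_files_only: bool = False,
) -> bytes:
    """Hypothetical helper: return cached files unless a (re-)download is forced."""
    if name in cache and not force_download:
        return cache[name]  # cache hit, no network access
    if local_files_only:
        # Assumption of this sketch: never touch the network in this mode.
        raise FileNotFoundError(f"{name}: download needed but local_files_only=True")
    cache[name] = download(name)  # overrides any cached version
    return cache[name]


calls: List[str] = []


def fake_download(name: str) -> bytes:
    """Stand-in for a network request; records each call."""
    calls.append(name)
    return f"weights for {name}".encode()


cache: Dict[str, bytes] = {}
fetch("bert-base-uncased", cache, fake_download)  # downloads
fetch("bert-base-uncased", cache, fake_download)  # served from the cache
fetch("bert-base-uncased", cache, fake_download, force_download=True)  # downloads again
print(len(calls))  # 2
```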

Examples:

>>> from transformers import AutoConfig, TFAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModel.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModel.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForPreTraining¶

class transformers.TFAutoModelForPreTraining[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with the architecture used for pretraining this model—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with the architecture used for pretraining this model—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

LxmertConfig configuration class: TFLxmertForPreTraining (LXMERT model)
T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
AlbertConfig configuration class: TFAlbertForPreTraining (ALBERT model)
BartConfig configuration class: TFBartForConditionalGeneration (BART model)
CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
BertConfig configuration class: TFBertForPreTraining (BERT model)
OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
MobileBertConfig configuration class: TFMobileBertForPreTraining (MobileBERT model)
TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
ElectraConfig configuration class: TFElectraForPreTraining (ELECTRA model)
FunnelConfig configuration class: TFFunnelForPreTraining (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)

Examples:

>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForPreTraining.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with the architecture used for pretraining this model—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

t5 – TFT5ForConditionalGeneration (T5 model)

mobilebert – TFMobileBertForPreTraining (MobileBERT model)

distilbert – TFDistilBertForMaskedLM (DistilBERT model)

albert – TFAlbertForPreTraining (ALBERT model)

camembert – TFCamembertForMaskedLM (CamemBERT model)

xlm-roberta – TFXLMRobertaForMaskedLM (XLM-RoBERTa model)

mpnet – TFMPNetForMaskedLM (MPNet model)

bart – TFBartForConditionalGeneration (BART model)

roberta – TFRobertaForMaskedLM (RoBERTa model)

flaubert – TFFlaubertWithLMHeadModel (FlauBERT model)

bert – TFBertForPreTraining (BERT model)

openai-gpt – TFOpenAIGPTLMHeadModel (OpenAI GPT model)

gpt2 – TFGPT2LMHeadModel (OpenAI GPT-2 model)

transfo-xl – TFTransfoXLLMHeadModel (Transformer-XL model)

xlnet – TFXLNetLMHeadModel (XLNet model)

xlm – TFXLMWithLMHeadModel (XLM model)

ctrl – TFCTRLLMHeadModel (CTRL model)

electra – TFElectraForPreTraining (ELECTRA model)

funnel – TFFunnelForPreTraining (Funnel Transformer model)

lxmert – TFLxmertForPreTraining (LXMERT model)
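
The selection rule above can be sketched in plain Python. This is an illustration only, not the library's implementation: `PATTERN_TO_CLASS_NAME` and `resolve_class_name` are hypothetical names, the table holds only a few of the entries listed above, and the substring-matching order is an assumption.

```python
from typing import Optional

# Hypothetical subset of the table above. Order matters for the fallback,
# because 'bert' is a substring of 'mobilebert', 'distilbert', and so on.
PATTERN_TO_CLASS_NAME = {
    "mobilebert": "TFMobileBertForPreTraining",
    "distilbert": "TFDistilBertForMaskedLM",
    "albert": "TFAlbertForPreTraining",
    "roberta": "TFRobertaForMaskedLM",
    "bert": "TFBertForPreTraining",
    "gpt2": "TFGPT2LMHeadModel",
}


def resolve_class_name(name_or_path: str, model_type: Optional[str] = None) -> str:
    """Return a model class name for a checkpoint (illustrative only)."""
    # 1. Prefer the explicit model_type from the configuration, if there is one.
    if model_type is not None:
        if model_type not in PATTERN_TO_CLASS_NAME:
            raise ValueError(f"Unrecognized model_type: {model_type!r}")
        return PATTERN_TO_CLASS_NAME[model_type]
    # 2. Otherwise fall back to the first pattern found in the name or path.
    for pattern, class_name in PATTERN_TO_CLASS_NAME.items():
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")
```

Listing the more specific patterns first keeps a name such as distilbert-base-uncased from being matched by the shorter bert pattern.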

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
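
The two kwargs behaviours described above can be illustrated with a small, self-contained sketch. `ToyConfig`, `ToyModel`, and `load` are hypothetical stand-ins, not library classes. The sketch only mirrors the documented rule that keys matching configuration attributes override them and the remaining keys go to the model's __init__.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


class ToyModel:
    """Hypothetical stand-in for a model class."""

    def __init__(self, config, **model_kwargs):
        self.config = config
        self.model_kwargs = model_kwargs


def load(config=None, **kwargs):
    """Split kwargs between the configuration and the model (illustrative only)."""
    if config is not None:
        # A config was supplied: pass every kwarg straight to the model.
        return ToyModel(config, **kwargs)
    # No config supplied: build one, then let matching kwargs override it.
    config = ToyConfig()
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            remaining[key] = value
    return ToyModel(config, **remaining)
```

With this sketch, `load(output_attentions=True)` updates the configuration, while `load(config=ToyConfig(), output_attentions=True)` leaves the supplied configuration untouched and forwards the key to the model.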

Examples:

>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForPreTraining.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForCausalLM¶

class transformers.TFAutoModelForCausalLM[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a causal language modeling head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).
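
A factory-only class of this kind can be sketched as follows. `ToyAutoModel` and `ToyConcreteModel` are hypothetical names, and the exception type and message are assumptions chosen for illustration rather than a copy of the library's code.

```python
class ToyConcreteModel:
    """Hypothetical concrete model returned by the factory."""

    def __init__(self, config):
        self.config = config


class ToyAutoModel:
    """Hypothetical factory-only class (illustrative only)."""

    def __init__(self):
        # Direct construction is rejected; callers must use a factory method.
        raise EnvironmentError(
            "ToyAutoModel is designed to be instantiated using "
            "ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        # The factory returns a concrete model, never a ToyAutoModel instance.
        return ToyConcreteModel(config)
```

Calling `ToyAutoModel()` raises, while `ToyAutoModel.from_config(...)` returns an instance of the concrete class.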

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a causal language modeling head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

BertConfig configuration class: TFBertLMHeadModel (BERT model)
OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
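
Selecting a model class from the configuration class amounts to a lookup keyed on the configuration's type. The sketch below uses hypothetical toy classes (`ToyBertConfig`, `ToyGPT2Config`, `ToyBertLMHead`, `ToyGPT2LMHead`) and a hypothetical `from_config` function. It does not reproduce the library's mapping, and it creates no pretrained weights.

```python
class ToyBertConfig:
    """Hypothetical configuration class."""


class ToyGPT2Config:
    """Hypothetical configuration class."""


class ToyBertLMHead:
    """Hypothetical model class built from a configuration only."""

    def __init__(self, config):
        self.config = config


class ToyGPT2LMHead:
    """Hypothetical model class built from a configuration only."""

    def __init__(self, config):
        self.config = config


# Hypothetical mapping from configuration class to model class.
CONFIG_TO_MODEL = {
    ToyBertConfig: ToyBertLMHead,
    ToyGPT2Config: ToyGPT2LMHead,
}


def from_config(config):
    """Instantiate the model class registered for type(config)."""
    model_class = CONFIG_TO_MODEL.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")
    return model_class(config)
```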

Examples:

>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('gpt2')
>>> model = TFAutoModelForCausalLM.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a causal language modeling head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

bert – TFBertLMHeadModel (BERT model)

openai-gpt – TFOpenAIGPTLMHeadModel (OpenAI GPT model)

gpt2 – TFGPT2LMHeadModel (OpenAI GPT-2 model)

transfo-xl – TFTransfoXLLMHeadModel (Transformer-XL model)

xlnet – TFXLNetLMHeadModel (XLNet model)

xlm – TFXLMWithLMHeadModel (XLM model)

ctrl – TFCTRLLMHeadModel (CTRL model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
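
The report requested with output_loading_info compares the weights a model expects with the weights found in a checkpoint. A minimal sketch of such a comparison follows. The `loading_info` helper and the dictionary key names are assumptions made for illustration, and the sketch leaves out error messages.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Report which weight names are missing or unexpected (illustrative only)."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
    }
```

Missing keys usually correspond to layers that will keep their fresh initialization, such as a newly added head, so they are worth checking before fine-tuning.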

Examples:

>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained('gpt2')

>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained('gpt2', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/gpt2_pt_model_config.json')
>>> model = TFAutoModelForCausalLM.from_pretrained('./pt_model/gpt2_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForMaskedLM¶

class transformers.TFAutoModelForMaskedLM[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a masked language modeling head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a masked language modeling head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: TFConvBertForMaskedLM (ConvBERT model)
DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
AlbertConfig configuration class: TFAlbertForMaskedLM (ALBERT model)
CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForMaskedLM (Longformer model)
RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
BertConfig configuration class: TFBertForMaskedLM (BERT model)
MobileBertConfig configuration class: TFMobileBertForMaskedLM (MobileBERT model)
FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
ElectraConfig configuration class: TFElectraForMaskedLM (ELECTRA model)
FunnelConfig configuration class: TFFunnelForMaskedLM (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)

Examples:

>>> from transformers import AutoConfig, TFAutoModelForMaskedLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForMaskedLM.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a masked language modeling head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – TFConvBertForMaskedLM (ConvBERT model)

mobilebert – TFMobileBertForMaskedLM (MobileBERT model)

distilbert – TFDistilBertForMaskedLM (DistilBERT model)

albert – TFAlbertForMaskedLM (ALBERT model)

camembert – TFCamembertForMaskedLM (CamemBERT model)

xlm-roberta – TFXLMRobertaForMaskedLM (XLM-RoBERTa model)

mpnet – TFMPNetForMaskedLM (MPNet model)

longformer – TFLongformerForMaskedLM (Longformer model)

roberta – TFRobertaForMaskedLM (RoBERTa model)

flaubert – TFFlaubertWithLMHeadModel (FlauBERT model)

bert – TFBertForMaskedLM (BERT model)

xlm – TFXLMWithLMHeadModel (XLM model)

electra – TFElectraForMaskedLM (ELECTRA model)

funnel – TFFunnelForMaskedLM (Funnel Transformer model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
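
The interaction between the cache and the force_download and local_files_only flags can be pictured as a small decision function. This is a sketch under stated assumptions, not the library's logic: `plan_fetch` is a hypothetical helper, and giving local_files_only priority over force_download is an assumption.

```python
def plan_fetch(is_cached, force_download=False, local_files_only=False):
    """Decide how to obtain a file (illustrative only)."""
    if local_files_only:
        # Never touch the network; a cache miss is an error.
        if not is_cached:
            raise FileNotFoundError("File is not cached and downloads are disabled.")
        return "use_cache"
    if force_download or not is_cached:
        # Either the caller asked for a fresh copy or nothing is cached yet.
        return "download"
    return "use_cache"
```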

Examples:

>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForMaskedLM.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForSeq2SeqLM¶

class transformers.TFAutoModelForSeq2SeqLM[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a sequence-to-sequence language modeling head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config, **kwargs)[source]¶

Instantiates one of the model classes of the library—with a sequence-to-sequence language modeling head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

LEDConfig configuration class: TFLEDForConditionalGeneration (LED model)
MT5Config configuration class: TFMT5ForConditionalGeneration (mT5 model)
T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
MarianConfig configuration class: TFMarianMTModel (Marian model)
MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
BlenderbotSmallConfig configuration class: TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
BartConfig configuration class: TFBartForConditionalGeneration (BART model)

Examples:

>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('t5-base')
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a sequence-to-sequence language modeling head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

led – TFLEDForConditionalGeneration (LED model)

mt5 – TFMT5ForConditionalGeneration (mT5 model)

t5 – TFT5ForConditionalGeneration (T5 model)

marian – TFMarianMTModel (Marian model)

mbart – TFMBartForConditionalGeneration (mBART model)

pegasus – TFPegasusForConditionalGeneration (Pegasus model)

blenderbot – TFBlenderbotForConditionalGeneration (Blenderbot model)

blenderbot-small – TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)

bart – TFBartForConditionalGeneration (BART model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train()

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
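
The three accepted forms of pretrained_model_name_or_path (a model id, a directory, or a path or URL to a single weights file) can be told apart with a few checks. The `classify_source` helper below is hypothetical, and its order of checks is an assumption. It only illustrates the distinction and does not show how the library resolves the argument.

```python
import os


def classify_source(name_or_path):
    """Classify a pretrained_model_name_or_path value (illustrative only)."""
    if os.path.isdir(name_or_path):
        # e.g. a directory written by save_pretrained()
        return "directory"
    if os.path.isfile(name_or_path) or name_or_path.startswith(("http://", "https://")):
        # e.g. a single weights file on disk or behind a URL
        return "file_or_url"
    # Anything else is treated as a model id hosted on huggingface.co.
    return "model_id"
```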

Examples:

>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('t5-base')

>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('t5-base', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/t5_pt_model_config.json')
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('./pt_model/t5_pytorch_model.bin', from_pt=True, config=config)

TFAutoModelForSequenceClassification¶

class transformers.TFAutoModelForSequenceClassification[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a sequence classification head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a sequence classification head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: TFConvBertForSequenceClassification (ConvBERT model)
DistilBertConfig configuration class: TFDistilBertForSequenceClassification (DistilBERT model)
AlbertConfig configuration class: TFAlbertForSequenceClassification (ALBERT model)
CamembertConfig configuration class: TFCamembertForSequenceClassification (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForSequenceClassification (Longformer model)
RobertaConfig configuration class: TFRobertaForSequenceClassification (RoBERTa model)
BertConfig configuration class: TFBertForSequenceClassification (BERT model)
XLNetConfig configuration class: TFXLNetForSequenceClassification (XLNet model)
MobileBertConfig configuration class: TFMobileBertForSequenceClassification (MobileBERT model)
FlaubertConfig configuration class: TFFlaubertForSequenceClassification (FlauBERT model)
XLMConfig configuration class: TFXLMForSequenceClassification (XLM model)
ElectraConfig configuration class: TFElectraForSequenceClassification (ELECTRA model)
FunnelConfig configuration class: TFFunnelForSequenceClassification (Funnel Transformer model)
GPT2Config configuration class: TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
MPNetConfig configuration class: TFMPNetForSequenceClassification (MPNet model)
OpenAIGPTConfig configuration class: TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
TransfoXLConfig configuration class: TFTransfoXLForSequenceClassification (Transformer-XL model)
CTRLConfig configuration class: TFCTRLForSequenceClassification (CTRL model)

Examples:

>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForSequenceClassification.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a sequence classification head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – TFConvBertForSequenceClassification (ConvBERT model)

mobilebert – TFMobileBertForSequenceClassification (MobileBERT model)

distilbert – TFDistilBertForSequenceClassification (DistilBERT model)

albert – TFAlbertForSequenceClassification (ALBERT model)

camembert – TFCamembertForSequenceClassification (CamemBERT model)

xlm-roberta – TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)

mpnet – TFMPNetForSequenceClassification (MPNet model)

longformer – TFLongformerForSequenceClassification (Longformer model)

roberta – TFRobertaForSequenceClassification (RoBERTa model)

flaubert – TFFlaubertForSequenceClassification (FlauBERT model)

bert – TFBertForSequenceClassification (BERT model)

openai-gpt – TFOpenAIGPTForSequenceClassification (OpenAI GPT model)

gpt2 – TFGPT2ForSequenceClassification (OpenAI GPT-2 model)

transfo-xl – TFTransfoXLForSequenceClassification (Transformer-XL model)

xlnet – TFXLNetForSequenceClassification (XLNet model)

xlm – TFXLMForSequenceClassification (XLM model)

ctrl – TFCTRLForSequenceClassification (CTRL model)

electra – TFElectraForSequenceClassification (ELECTRA model)

funnel – TFFunnelForSequenceClassification (Funnel Transformer model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForSequenceClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
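
The kwargs behavior described in the parameter list above (keys that name configuration attributes override them, the rest go to the model) can be pictured with a small stand-alone sketch. It is an illustration under stated assumptions rather than the library's implementation: `ToyConfig` and `split_kwargs` are hypothetical names, and using `hasattr` to decide what counts as a configuration attribute is an assumption.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, hidden_size=768):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; return the remaining ones."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the configuration attribute
        else:
            remaining[key] = value  # would be passed on to the model's __init__
    return remaining


config = ToyConfig()
unused = split_kwargs(config, {"output_attentions": True, "my_option": 3})
print(config.output_attentions)  # True
print(unused)  # {'my_option': 3}
```

Here `my_option` is a made-up key that matches no configuration attribute, so it is the one left over for the model constructor.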

TFAutoModelForMultipleChoice¶

class transformers.TFAutoModelForMultipleChoice[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a multiple choice classification head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a multiple choice classification head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: TFConvBertForMultipleChoice (ConvBERT model)
CamembertConfig configuration class: TFCamembertForMultipleChoice (CamemBERT model)
XLMConfig configuration class: TFXLMForMultipleChoice (XLM model)
XLMRobertaConfig configuration class: TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForMultipleChoice (Longformer model)
RobertaConfig configuration class: TFRobertaForMultipleChoice (RoBERTa model)
BertConfig configuration class: TFBertForMultipleChoice (BERT model)
DistilBertConfig configuration class: TFDistilBertForMultipleChoice (DistilBERT model)
MobileBertConfig configuration class: TFMobileBertForMultipleChoice (MobileBERT model)
XLNetConfig configuration class: TFXLNetForMultipleChoice (XLNet model)
FlaubertConfig configuration class: TFFlaubertForMultipleChoice (FlauBERT model)
AlbertConfig configuration class: TFAlbertForMultipleChoice (ALBERT model)
ElectraConfig configuration class: TFElectraForMultipleChoice (ELECTRA model)
FunnelConfig configuration class: TFFunnelForMultipleChoice (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForMultipleChoice (MPNet model)

Examples:

>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForMultipleChoice.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a multiple choice classification head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – TFConvBertForMultipleChoice (ConvBERT model)

mobilebert – TFMobileBertForMultipleChoice (MobileBERT model)

distilbert – TFDistilBertForMultipleChoice (DistilBERT model)

albert – TFAlbertForMultipleChoice (ALBERT model)

camembert – TFCamembertForMultipleChoice (CamemBERT model)

xlm-roberta – TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)

mpnet – TFMPNetForMultipleChoice (MPNet model)

longformer – TFLongformerForMultipleChoice (Longformer model)

roberta – TFRobertaForMultipleChoice (RoBERTa model)

flaubert – TFFlaubertForMultipleChoice (FlauBERT model)

bert – TFBertForMultipleChoice (BERT model)

xlnet – TFXLNetForMultipleChoice (XLNet model)

xlm – TFXLMForMultipleChoice (XLM model)

electra – TFElectraForMultipleChoice (ELECTRA model)

funnel – TFFunnelForMultipleChoice (Funnel Transformer model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForMultipleChoice.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
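
The selection rule described above (use model_type when it is available, otherwise fall back to pattern matching on the name or path) can be sketched as follows. This is a minimal illustration, not the library's code: `TOY_PATTERNS` is a hypothetical, shortened list, `guess_class_name` is a hypothetical helper, and plain substring matching is an assumption. The class names are returned as strings only, so nothing is imported from transformers.

```python
# Hypothetical, shortened pattern list. Order matters in this sketch because
# 'bert' is a substring of both 'distilbert' and 'roberta', and 'roberta' is a
# substring of 'xlm-roberta'.
TOY_PATTERNS = [
    ("distilbert", "TFDistilBertForMultipleChoice"),
    ("xlm-roberta", "TFXLMRobertaForMultipleChoice"),
    ("roberta", "TFRobertaForMultipleChoice"),
    ("bert", "TFBertForMultipleChoice"),
]


def guess_class_name(name_or_path, model_type=None):
    """Prefer an explicit model_type; otherwise match patterns against the name."""
    if model_type is not None:
        for pattern, class_name in TOY_PATTERNS:
            if pattern == model_type:
                return class_name
    else:
        for pattern, class_name in TOY_PATTERNS:
            if pattern in name_or_path:
                return class_name
    raise ValueError(f"Could not guess a model class for {name_or_path!r}")


print(guess_class_name("xlm-roberta-base"))  # TFXLMRobertaForMultipleChoice
print(guess_class_name("./my_model_directory/", model_type="bert"))  # TFBertForMultipleChoice
```

The second call shows why model_type is the more reliable signal: a local directory name such as `./my_model_directory/` carries no recognizable pattern at all.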

TFAutoModelForTokenClassification¶

class transformers.TFAutoModelForTokenClassification[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a token classification head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a token classification head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: TFConvBertForTokenClassification (ConvBERT model)
DistilBertConfig configuration class: TFDistilBertForTokenClassification (DistilBERT model)
AlbertConfig configuration class: TFAlbertForTokenClassification (ALBERT model)
CamembertConfig configuration class: TFCamembertForTokenClassification (CamemBERT model)
FlaubertConfig configuration class: TFFlaubertForTokenClassification (FlauBERT model)
XLMConfig configuration class: TFXLMForTokenClassification (XLM model)
XLMRobertaConfig configuration class: TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForTokenClassification (Longformer model)
RobertaConfig configuration class: TFRobertaForTokenClassification (RoBERTa model)
BertConfig configuration class: TFBertForTokenClassification (BERT model)
MobileBertConfig configuration class: TFMobileBertForTokenClassification (MobileBERT model)
XLNetConfig configuration class: TFXLNetForTokenClassification (XLNet model)
ElectraConfig configuration class: TFElectraForTokenClassification (ELECTRA model)
FunnelConfig configuration class: TFFunnelForTokenClassification (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForTokenClassification (MPNet model)

Examples:

>>> from transformers import AutoConfig, TFAutoModelForTokenClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForTokenClassification.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a token classification head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – TFConvBertForTokenClassification (ConvBERT model)

mobilebert – TFMobileBertForTokenClassification (MobileBERT model)

distilbert – TFDistilBertForTokenClassification (DistilBERT model)

albert – TFAlbertForTokenClassification (ALBERT model)

camembert – TFCamembertForTokenClassification (CamemBERT model)

xlm-roberta – TFXLMRobertaForTokenClassification (XLM-RoBERTa model)

mpnet – TFMPNetForTokenClassification (MPNet model)

longformer – TFLongformerForTokenClassification (Longformer model)

roberta – TFRobertaForTokenClassification (RoBERTa model)

flaubert – TFFlaubertForTokenClassification (FlauBERT model)

bert – TFBertForTokenClassification (BERT model)

xlnet – TFXLNetForTokenClassification (XLNet model)

xlm – TFXLMForTokenClassification (XLM model)

electra – TFElectraForTokenClassification (ELECTRA model)

funnel – TFFunnelForTokenClassification (Funnel Transformer model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForTokenClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
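
The output_loading_info parameter above is described as returning missing keys, unexpected keys and error messages. The sketch below shows one way such a report could be computed from two sets of weight names. It is an assumption-laden illustration, not the library's implementation: `toy_loading_info`, the dictionary key names, and the weight names are all hypothetical.

```python
def toy_loading_info(expected_keys, checkpoint_keys):
    """Compare the weights a model expects with those found in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        "missing_keys": sorted(expected - found),  # expected, but not in the checkpoint
        "unexpected_keys": sorted(found - expected),  # in the checkpoint, but unused
        "error_msgs": [],  # assumed: filled only when loading a weight fails
    }


# Hypothetical weight names: a token classification model loaded from a
# checkpoint that was saved with a different head.
info = toy_loading_info(
    expected_keys=["bert/embeddings", "bert/encoder", "classifier"],
    checkpoint_keys=["bert/embeddings", "bert/encoder", "cls/predictions"],
)
print(info["missing_keys"])  # ['classifier']
print(info["unexpected_keys"])  # ['cls/predictions']
```

In this toy case the missing `classifier` entry corresponds to a head that would start from fresh initialization, which is the kind of situation such a report helps to spot.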

TFAutoModelForQuestionAnswering¶

class transformers.TFAutoModelForQuestionAnswering[source]¶

This is a generic model class that will be instantiated as one of the model classes of the library—with a question answering head—when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the model classes of the library—with a question answering head—from a configuration.

Note

Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

ConvBertConfig configuration class: TFConvBertForQuestionAnswering (ConvBERT model)
DistilBertConfig configuration class: TFDistilBertForQuestionAnswering (DistilBERT model)
AlbertConfig configuration class: TFAlbertForQuestionAnswering (ALBERT model)
CamembertConfig configuration class: TFCamembertForQuestionAnswering (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForQuestionAnswering (Longformer model)
RobertaConfig configuration class: TFRobertaForQuestionAnswering (RoBERTa model)
BertConfig configuration class: TFBertForQuestionAnswering (BERT model)
XLNetConfig configuration class: TFXLNetForQuestionAnsweringSimple (XLNet model)
MobileBertConfig configuration class: TFMobileBertForQuestionAnswering (MobileBERT model)
FlaubertConfig configuration class: TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
XLMConfig configuration class: TFXLMForQuestionAnsweringSimple (XLM model)
ElectraConfig configuration class: TFElectraForQuestionAnswering (ELECTRA model)
FunnelConfig configuration class: TFFunnelForQuestionAnswering (Funnel Transformer model)
MPNetConfig configuration class: TFMPNetForQuestionAnswering (MPNet model)

Examples:

>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForQuestionAnswering.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiate one of the model classes of the library—with a question answering head—from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:

convbert – TFConvBertForQuestionAnswering (ConvBERT model)

mobilebert – TFMobileBertForQuestionAnswering (MobileBERT model)

distilbert – TFDistilBertForQuestionAnswering (DistilBERT model)

albert – TFAlbertForQuestionAnswering (ALBERT model)

camembert – TFCamembertForQuestionAnswering (CamemBERT model)

xlm-roberta – TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)

mpnet – TFMPNetForQuestionAnswering (MPNet model)

longformer – TFLongformerForQuestionAnswering (Longformer model)

roberta – TFRobertaForQuestionAnswering (RoBERTa model)

flaubert – TFFlaubertForQuestionAnsweringSimple (FlauBERT model)

bert – TFBertForQuestionAnswering (BERT model)

xlnet – TFXLNetForQuestionAnsweringSimple (XLNet model)

xlm – TFXLMForQuestionAnsweringSimple (XLM model)

electra – TFElectraForQuestionAnswering (ELECTRA model)

funnel – TFFunnelForQuestionAnswering (Funnel Transformer model)

The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) – Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.

Examples:

>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)

FlaxAutoModel¶

class transformers.FlaxAutoModel[source]¶

FlaxAutoModel is a generic model class that will be instantiated as one of the base model classes of the library when created with the FlaxAutoModel.from_pretrained(pretrained_model_name_or_path) or the FlaxAutoModel.from_config(config) class methods.

This class cannot be instantiated using __init__() (throws an error).

classmethod from_config(config)[source]¶

Instantiates one of the base model classes of the library from a configuration.

Parameters

config (PretrainedConfig) –

The model class to instantiate is selected based on the configuration class:

isInstance of roberta configuration class: FlaxRobertaModel (RoBERTa model)
isInstance of bert configuration class: FlaxBertModel (Bert model)
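The order of the isinstance checks matters whenever one configuration class derives from another. The sketch below uses hypothetical toy classes and assumes the RoBERTa configuration subclasses the BERT configuration, which may not hold in every library version; it is an illustration of the selection rule, not the library's code.

```python
class ToyBertConfig:
    """Hypothetical stand-in for a BERT configuration class."""


class ToyRobertaConfig(ToyBertConfig):
    """Hypothetical stand-in; assumed here to subclass the BERT configuration."""


# Ordered (configuration class, model class name) pairs.
# The more specific class is listed first so it is not shadowed by its parent.
MODEL_MAPPING = [
    (ToyRobertaConfig, "FlaxRobertaModel"),
    (ToyBertConfig, "FlaxBertModel"),
]


def select_model_class(config):
    """Return the name of the first model class whose config class matches."""
    for config_class, model_class_name in MODEL_MAPPING:
        if isinstance(config, config_class):
            return model_class_name
    raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")


print(select_model_class(ToyRobertaConfig()))  # FlaxRobertaModel
print(select_model_class(ToyBertConfig()))     # FlaxBertModel
```

If the two entries were swapped, a RoBERTa configuration would match the BERT entry first under this assumption.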

Examples:

from transformers import BertConfig, FlaxAutoModel

# Download configuration from huggingface.co and cache.
config = BertConfig.from_pretrained('bert-base-uncased')
# Build the model from the configuration alone (no checkpoint weights are loaded).
model = FlaxAutoModel.from_config(config)

classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶

Instantiates one of the base model classes of the library from a pre-trained model configuration.

The from_pretrained() method takes care of returning the correct model class instance based on the model_type property of the config object, or when it’s missing, falling back to using pattern matching on the pretrained_model_name_or_path string.

The base model class to instantiate is selected by the first pattern that matches the pretrained_model_name_or_path string (patterns are tried in the following order):

contains roberta: FlaxRobertaModel (RoBERTa model)

contains bert: FlaxBertModel (Bert model)
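The resolution described above (model_type first, then ordered substring matching) can be sketched without the library. resolve_model_class and both lookup tables are hypothetical names for illustration; the actual implementation may differ.

```python
# Lookup by the config's model_type property (preferred when available).
MODEL_TYPES = {"roberta": "FlaxRobertaModel", "bert": "FlaxBertModel"}

# Ordered (substring, model class name) pairs used as a fallback.
# "roberta" comes before "bert" because the string "roberta" contains "bert".
NAME_PATTERNS = [
    ("roberta", "FlaxRobertaModel"),
    ("bert", "FlaxBertModel"),
]


def resolve_model_class(name_or_path, model_type=None):
    """Prefer model_type; otherwise return the first pattern found in the name."""
    if model_type in MODEL_TYPES:
        return MODEL_TYPES[model_type]
    for pattern, model_class_name in NAME_PATTERNS:
        if pattern in name_or_path:
            return model_class_name
    raise ValueError(f"Unrecognized model identifier: {name_or_path!r}")


print(resolve_model_class("roberta-base"))                             # FlaxRobertaModel
print(resolve_model_class("bert-base-uncased"))                        # FlaxBertModel
print(resolve_model_class("./my_model_directory/", model_type="bert"))  # FlaxBertModel
```

A directory name that contains none of the patterns can still be resolved as long as its configuration carries a model_type.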

The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Parameters

pretrained_model_name_or_path –
either:
- a string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- a path to a directory containing model weights saved using save_pretrained(), e.g.: ./my_model_directory/.
- a path or URL to a PyTorch checkpoint file (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument.
model_args – (optional) Sequence of positional arguments: All remaining positional arguments will be passed to the underlying model’s __init__ method
config –
(optional) instance of a class derived from PretrainedConfig: Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- the model is a model provided by the library (loaded with the shortcut-name string of a pretrained model), or
- the model was saved using save_pretrained() and is reloaded by supplying the save directory.
- the model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir – (optional) string: Path to a directory in which a downloaded pre-trained model configuration should be cached if the standard cache should not be used.
force_download – (optional) boolean, default False: Force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download – (optional) boolean, default False: Do not delete incompletely received files. Attempt to resume the download if such a file exists.
proxies – (optional) dict, default None: A dictionary of proxy servers to use by protocol or endpoint, e.g.: {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info – (optional) boolean: Set to True to also return a dictionary containing missing keys, unexpected keys and error messages.
kwargs – (optional) Remaining dictionary of keyword arguments: These arguments will be passed to the configuration and the model.

Examples:

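from transformers import FlaxAutoModel
model = FlaxAutoModel.from_pretrained('bert-base-uncased')    # Download model and configuration from huggingface.co and cache.
model = FlaxAutoModel.from_pretrained('./test/saved_model/')  # E.g. model was saved using `save_pretrained('./test/saved_model/')`
model = FlaxAutoModel.from_pretrained('bert-base-uncased', output_attentions=True)  # Update configuration during loading
assert model.config.output_attentions == True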