Auto Classes
In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained() method. AutoClasses do this job for you: they automatically retrieve the relevant model given the name or path of the pretrained weights, config, or vocabulary.
Instantiating one of AutoConfig, AutoModel, and AutoTokenizer will directly create an instance of the class for the relevant architecture. For instance,
model = AutoModel.from_pretrained('bert-base-cased')
will create a model that is an instance of BertModel.
There is one AutoModel class for each task, and for each backend (PyTorch or TensorFlow).
AutoConfig
class transformers.AutoConfig
This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_pretrained(pretrained_model_name_or_path, **kwargs)
Instantiate one of the configuration classes of the library from a pretrained model configuration.
The configuration class to instantiate is selected based on the model_type property of the config object that is loaded or, when it’s missing, by falling back to pattern matching on pretrained_model_name_or_path:
- retribert – RetriBertConfig (RetriBERT model)
- mt5 – MT5Config (mT5 model)
- t5 – T5Config (T5 model)
- mobilebert – MobileBertConfig (MobileBERT model)
- distilbert – DistilBertConfig (DistilBERT model)
- albert – AlbertConfig (ALBERT model)
- bert-generation – BertGenerationConfig (Bert Generation model)
- camembert – CamembertConfig (CamemBERT model)
- xlm-roberta – XLMRobertaConfig (XLM-RoBERTa model)
- pegasus – PegasusConfig (Pegasus model)
- marian – MarianConfig (Marian model)
- mbart – MBartConfig (mBART model)
- bart – BartConfig (BART model)
- blenderbot – BlenderbotConfig (Blenderbot model)
- reformer – ReformerConfig (Reformer model)
- longformer – LongformerConfig (Longformer model)
- roberta – RobertaConfig (RoBERTa model)
- deberta – DebertaConfig (DeBERTa model)
- flaubert – FlaubertConfig (FlauBERT model)
- fsmt – FSMTConfig (FairSeq Machine-Translation model)
- squeezebert – SqueezeBertConfig (SqueezeBERT model)
- bert – BertConfig (BERT model)
- openai-gpt – OpenAIGPTConfig (OpenAI GPT model)
- gpt2 – GPT2Config (OpenAI GPT-2 model)
- transfo-xl – TransfoXLConfig (Transformer-XL model)
- xlnet – XLNetConfig (XLNet model)
- xlm-prophetnet – XLMProphetNetConfig (XLMProphetNet model)
- prophetnet – ProphetNetConfig (ProphetNet model)
- xlm – XLMConfig (XLM model)
- ctrl – CTRLConfig (CTRL model)
- electra – ElectraConfig (ELECTRA model)
- encoder-decoder – EncoderDecoderConfig (Encoder decoder model)
- funnel – FunnelConfig (Funnel Transformer model)
- lxmert – LxmertConfig (LXMERT model)
- dpr – DPRConfig (DPR model)
- layoutlm – LayoutLMConfig (LayoutLM model)
- rag – RagConfig (RAG model)
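The two-step selection described above (model_type first, pattern matching on the name as a fallback) can be sketched without the library. This is a minimal illustration, not the library's implementation: the `CONFIG_CLASS_NAMES` table and the `resolve_config_class_name` helper are hypothetical, the table holds only four of the listed entries, and it assumes the fallback tries keys in listed order so that a key such as "distilbert" is tested before its substring "bert".

```python
from collections import OrderedDict

# Hypothetical, trimmed-down table. Order matters for the fallback: keys that
# contain another key as a substring ("distilbert", "roberta") come before
# that shorter key ("bert").
CONFIG_CLASS_NAMES = OrderedDict(
    [
        ("distilbert", "DistilBertConfig"),
        ("roberta", "RobertaConfig"),
        ("bert", "BertConfig"),
        ("gpt2", "GPT2Config"),
    ]
)


def resolve_config_class_name(config_dict, pretrained_model_name_or_path):
    """Pick a configuration class name for an already loaded config dictionary."""
    # Preferred route: an explicit model_type stored in the configuration.
    model_type = config_dict.get("model_type")
    if model_type in CONFIG_CLASS_NAMES:
        return CONFIG_CLASS_NAMES[model_type]
    # Fallback: the first key that appears in the name or path wins.
    for pattern, class_name in CONFIG_CLASS_NAMES.items():
        if pattern in pretrained_model_name_or_path:
            return class_name
    raise ValueError(
        f"Could not infer a configuration class for {pretrained_model_name_or_path!r}"
    )


print(resolve_config_class_name({"model_type": "gpt2"}, "./my_model_directory/"))  # GPT2Config
print(resolve_config_class_name({}, "distilbert-base-uncased"))  # DistilBertConfig
```

With the order reversed, "distilbert-base-uncased" would match "bert" first, which is why a fallback of this kind is sensitive to the ordering of the table.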
Parameters
- pretrained_model_name_or_path (str) – Can be either:
  - A string, the model id of a pretrained model configuration hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing a configuration file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.
- cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) – If False, then this function returns just the final configuration object. If True, then this function returns a Tuple(config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.
- kwargs (additional keyword arguments, optional) – The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
Examples:
>>> from transformers import AutoConfig
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained('dbmdz/bert-base-german-cased')
>>> # If configuration file is in a directory (e.g., was saved using `save_pretrained('./test/bert_saved_model/')`).
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/')
>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained('./test/bert_saved_model/my_configuration.json')
>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained('bert-base-uncased', output_attentions=True, foo=False, return_unused_kwargs=True)
>>> config.output_attentions
True
>>> unused_kwargs
{'foo': False}
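The interaction between kwargs and return_unused_kwargs can be illustrated with a library-free sketch. It assumes a plain dict standing in for a configuration object, and the `apply_config_overrides` helper is hypothetical; it only mirrors the documented rule that keys matching configuration attributes override the loaded values while the rest are reported back as unused.

```python
def apply_config_overrides(config, return_unused_kwargs=False, **kwargs):
    """Override known attributes of `config` (a plain dict standing in for a
    configuration object) and optionally report the keys that were not used."""
    updated = dict(config)  # work on a copy so the loaded values stay intact
    unused_kwargs = {}
    for key, value in kwargs.items():
        if key in updated:
            # The key is a configuration attribute: override the loaded value.
            updated[key] = value
        else:
            # Not a configuration attribute: set aside instead of applying.
            unused_kwargs[key] = value
    if return_unused_kwargs:
        return updated, unused_kwargs
    return updated


loaded = {"hidden_size": 768, "output_attentions": False}
config, unused_kwargs = apply_config_overrides(
    loaded, return_unused_kwargs=True, output_attentions=True, foo=False
)
print(config["output_attentions"])  # True
print(unused_kwargs)  # {'foo': False}
```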
AutoTokenizer
class transformers.AutoTokenizer
This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.
The tokenizer class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible) or, when it’s missing, by falling back to pattern matching on pretrained_model_name_or_path:
- retribert – RetriBertTokenizer (RetriBERT model)
- mt5 – T5Tokenizer (mT5 model)
- t5 – T5Tokenizer (T5 model)
- mobilebert – MobileBertTokenizer (MobileBERT model)
- distilbert – DistilBertTokenizer (DistilBERT model)
- albert – AlbertTokenizer (ALBERT model)
- bert-generation – BertGenerationTokenizer (Bert Generation model)
- camembert – CamembertTokenizer (CamemBERT model)
- xlm-roberta – XLMRobertaTokenizer (XLM-RoBERTa model)
- pegasus – PegasusTokenizer (Pegasus model)
- marian – MarianTokenizer (Marian model)
- mbart – MBartTokenizer (mBART model)
- bart – BartTokenizer (BART model)
- blenderbot – BlenderbotSmallTokenizer (Blenderbot model)
- reformer – ReformerTokenizer (Reformer model)
- longformer – LongformerTokenizer (Longformer model)
- roberta – RobertaTokenizer (RoBERTa model)
- deberta – DebertaTokenizer (DeBERTa model)
- flaubert – FlaubertTokenizer (FlauBERT model)
- fsmt – FSMTTokenizer (FairSeq Machine-Translation model)
- squeezebert – SqueezeBertTokenizer (SqueezeBERT model)
- bert – BertTokenizer (BERT model)
- openai-gpt – OpenAIGPTTokenizer (OpenAI GPT model)
- gpt2 – GPT2Tokenizer (OpenAI GPT-2 model)
- transfo-xl – TransfoXLTokenizer (Transformer-XL model)
- xlnet – XLNetTokenizer (XLNet model)
- xlm-prophetnet – XLMProphetNetTokenizer (XLMProphetNet model)
- prophetnet – ProphetNetTokenizer (ProphetNet model)
- xlm – XLMTokenizer (XLM model)
- ctrl – CTRLTokenizer (CTRL model)
- electra – ElectraTokenizer (ELECTRA model)
- funnel – FunnelTokenizer (Funnel Transformer model)
- lxmert – LxmertTokenizer (LXMERT model)
- dpr – DPRQuestionEncoderTokenizer (DPR model)
- layoutlm – LayoutLMTokenizer (LayoutLM model)
- rag – RagTokenizer (RAG model)
Parameters
- pretrained_model_name_or_path (str) – Can be either:
  - A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g., ./my_model_directory/vocab.txt. (Not applicable to all derived classes)
- inputs (additional positional arguments, optional) – Will be passed along to the Tokenizer __init__() method.
- config (PretrainedConfig, optional) – The configuration object used to determine the tokenizer class to instantiate.
- cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- subfolder (str, optional) – In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g., for facebook/rag-token-base), specify it here.
- use_fast (bool, optional, defaults to True) – Whether or not to try to load the fast version of the tokenizer.
- kwargs (additional keyword arguments, optional) – Will be passed to the Tokenizer __init__() method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the __init__() for more details.
Examples:
>>> from transformers import AutoTokenizer
>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained('dbmdz/bert-base-german-cased')
>>> # If vocabulary files are in a directory (e.g., tokenizer was saved using `save_pretrained('./test/bert_saved_model/')`).
>>> tokenizer = AutoTokenizer.from_pretrained('./test/bert_saved_model/')
AutoModel
class transformers.AutoModel
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the base model classes of the library from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Parameters
- config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
  - RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
  - DistilBertConfig configuration class: DistilBertModel (DistilBERT model)
  - AlbertConfig configuration class: AlbertModel (ALBERT model)
  - CamembertConfig configuration class: CamembertModel (CamemBERT model)
  - XLMRobertaConfig configuration class: XLMRobertaModel (XLM-RoBERTa model)
  - BartConfig configuration class: BartModel (BART model)
  - LongformerConfig configuration class: LongformerModel (Longformer model)
  - RobertaConfig configuration class: RobertaModel (RoBERTa model)
  - LayoutLMConfig configuration class: LayoutLMModel (LayoutLM model)
  - SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBERT model)
  - BertConfig configuration class: BertModel (BERT model)
  - OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAI GPT model)
  - GPT2Config configuration class: GPT2Model (OpenAI GPT-2 model)
  - MobileBertConfig configuration class: MobileBertModel (MobileBERT model)
  - TransfoXLConfig configuration class: TransfoXLModel (Transformer-XL model)
  - XLNetConfig configuration class: XLNetModel (XLNet model)
  - FlaubertConfig configuration class: FlaubertModel (FlauBERT model)
  - FSMTConfig configuration class: FSMTModel (FairSeq Machine-Translation model)
  - CTRLConfig configuration class: CTRLModel (CTRL model)
  - ElectraConfig configuration class: ElectraModel (ELECTRA model)
  - ReformerConfig configuration class: ReformerModel (Reformer model)
  - FunnelConfig configuration class: FunnelModel (Funnel Transformer model)
  - LxmertConfig configuration class: LxmertModel (LXMERT model)
  - BertGenerationConfig configuration class: BertGenerationEncoder (Bert Generation model)
  - DebertaConfig configuration class: DebertaModel (DeBERTa model)
  - DPRConfig configuration class: DPRQuestionEncoder (DPR model)
  - XLMProphetNetConfig configuration class: XLMProphetNetModel (XLMProphetNet model)
  - ProphetNetConfig configuration class: ProphetNetModel (ProphetNet model)
Examples:
>>> from transformers import AutoConfig, AutoModel
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModel.from_config(config)
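Selecting a model class from the class of the configuration object can be sketched as a lookup keyed on the configuration type. This is a hedged, library-free illustration: all `Toy...` classes, the `TOY_MODEL_MAPPING` table, and the `model_from_config` helper are hypothetical, and the library's actual lookup may differ (for instance in how it treats subclasses).

```python
class ToyBertConfig:
    """Hypothetical configuration class."""

    hidden_size = 768


class ToyGPT2Config:
    """Hypothetical configuration class."""

    hidden_size = 768


class ToyBertModel:
    def __init__(self, config):
        # Only the configuration is stored; no pretrained weights are loaded.
        self.config = config


class ToyGPT2Model:
    def __init__(self, config):
        self.config = config


# Assumed table from configuration class to model class.
TOY_MODEL_MAPPING = {
    ToyBertConfig: ToyBertModel,
    ToyGPT2Config: ToyGPT2Model,
}


def model_from_config(config):
    """Build the model whose class is registered for the exact type of `config`."""
    model_class = TOY_MODEL_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_class(config)


model = model_from_config(ToyBertConfig())
print(type(model).__name__)  # prints: ToyBertModel
```

As in the note above, a model built this way carries the configuration but not trained weights; loading weights is the job of from_pretrained().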
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible) or, when it’s missing, by falling back to pattern matching on pretrained_model_name_or_path:
- retribert – RetriBertModel (RetriBERT model)
- mt5 – MT5Model (mT5 model)
- t5 – T5Model (T5 model)
- mobilebert – MobileBertModel (MobileBERT model)
- distilbert – DistilBertModel (DistilBERT model)
- albert – AlbertModel (ALBERT model)
- bert-generation – BertGenerationEncoder (Bert Generation model)
- camembert – CamembertModel (CamemBERT model)
- xlm-roberta – XLMRobertaModel (XLM-RoBERTa model)
- bart – BartModel (BART model)
- reformer – ReformerModel (Reformer model)
- longformer – LongformerModel (Longformer model)
- roberta – RobertaModel (RoBERTa model)
- deberta – DebertaModel (DeBERTa model)
- flaubert – FlaubertModel (FlauBERT model)
- fsmt – FSMTModel (FairSeq Machine-Translation model)
- squeezebert – SqueezeBertModel (SqueezeBERT model)
- bert – BertModel (BERT model)
- openai-gpt – OpenAIGPTModel (OpenAI GPT model)
- gpt2 – GPT2Model (OpenAI GPT-2 model)
- transfo-xl – TransfoXLModel (Transformer-XL model)
- xlnet – XLNetModel (XLNet model)
- xlm-prophetnet – XLMProphetNetModel (XLMProphetNet model)
- prophetnet – ProphetNetModel (ProphetNet model)
- xlm – XLMModel (XLM model)
- ctrl – CTRLModel (CTRL model)
- electra – ElectraModel (ELECTRA model)
- funnel – FunnelModel (Funnel Transformer model)
- lxmert – LxmertModel (LXMERT model)
- dpr – DPRQuestionEncoder (DPR model)
- layoutlm – LayoutLMModel (LayoutLM model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
- pretrained_model_name_or_path – Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModel
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModel.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
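The two kwargs behaviors described in the parameter list (configuration supplied versus configuration loaded automatically) can be sketched as follows. This is a library-free illustration under stated assumptions: a plain dict stands in for the configuration object, and the `route_kwargs` helper is hypothetical rather than a function of the library.

```python
def route_kwargs(loaded_config_attributes, config=None, **kwargs):
    """Return (config, model_init_kwargs) following the documented kwargs rules.

    `loaded_config_attributes` is a plain dict standing in for the configuration
    that would be loaded automatically when no `config` is supplied.
    """
    if config is not None:
        # A configuration was provided: it is used unchanged, and every keyword
        # argument is forwarded to the model's __init__.
        return config, dict(kwargs)

    # No configuration provided: start from the automatically loaded one.
    config = dict(loaded_config_attributes)
    model_init_kwargs = {}
    for key, value in kwargs.items():
        if key in config:
            # Keys that are configuration attributes override the loaded value.
            config[key] = value
        else:
            # Remaining keys are passed on to the model's __init__.
            model_init_kwargs[key] = value
    return config, model_init_kwargs


loaded = {"output_attentions": False, "hidden_size": 768}

config, init_kwargs = route_kwargs(loaded, output_attentions=True, extra_flag=1)
print(config["output_attentions"], init_kwargs)  # True {'extra_flag': 1}

own_config = {"output_attentions": False}
config, init_kwargs = route_kwargs(loaded, config=own_config, output_attentions=True)
print(config["output_attentions"], init_kwargs)  # False {'output_attentions': True}
```

The second call shows why a supplied configuration should already contain every update you need: the keyword argument no longer touches it.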
AutoModelForPreTraining
class transformers.AutoModelForPreTraining
This is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library (with the architecture used for pretraining this model) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Parameters
- config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
  - LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
  - RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
  - T5Config configuration class: T5ForConditionalGeneration (T5 model)
  - DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
  - AlbertConfig configuration class: AlbertForPreTraining (ALBERT model)
  - CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
  - XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
  - BartConfig configuration class: BartForConditionalGeneration (BART model)
  - FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
  - LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
  - RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
  - SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
  - BertConfig configuration class: BertForPreTraining (BERT model)
  - OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
  - GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
  - MobileBertConfig configuration class: MobileBertForPreTraining (MobileBERT model)
  - TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
  - XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
  - FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
  - XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
  - CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
  - ElectraConfig configuration class: ElectraForPreTraining (ELECTRA model)
  - LxmertConfig configuration class: LxmertForPreTraining (LXMERT model)
  - FunnelConfig configuration class: FunnelForPreTraining (Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForPreTraining.from_config(config)
classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with the architecture used for pretraining this model) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible) or, when it’s missing, by falling back to pattern matching on pretrained_model_name_or_path:
- retribert – RetriBertModel (RetriBERT model)
- t5 – T5ForConditionalGeneration (T5 model)
- mobilebert – MobileBertForPreTraining (MobileBERT model)
- distilbert – DistilBertForMaskedLM (DistilBERT model)
- albert – AlbertForPreTraining (ALBERT model)
- camembert – CamembertForMaskedLM (CamemBERT model)
- xlm-roberta – XLMRobertaForMaskedLM (XLM-RoBERTa model)
- bart – BartForConditionalGeneration (BART model)
- longformer – LongformerForMaskedLM (Longformer model)
- roberta – RobertaForMaskedLM (RoBERTa model)
- flaubert – FlaubertWithLMHeadModel (FlauBERT model)
- fsmt – FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- squeezebert – SqueezeBertForMaskedLM (SqueezeBERT model)
- bert – BertForPreTraining (BERT model)
- openai-gpt – OpenAIGPTLMHeadModel (OpenAI GPT model)
- gpt2 – GPT2LMHeadModel (OpenAI GPT-2 model)
- transfo-xl – TransfoXLLMHeadModel (Transformer-XL model)
- xlnet – XLNetLMHeadModel (XLNet model)
- xlm – XLMWithLMHeadModel (XLM model)
- ctrl – CTRLLMHeadModel (CTRL model)
- electra – ElectraForPreTraining (ELECTRA model)
- funnel – FunnelForPreTraining (Funnel Transformer model)
- lxmert – LxmertForPreTraining (LXMERT model)
- layoutlm – LayoutLMForMaskedLM (LayoutLM model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Parameters
- pretrained_model_name_or_path – Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
  - A path or url to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
- config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
- force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id: since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForPreTraining.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
AutoModelForCausalLM
class transformers.AutoModelForCausalLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
classmethod from_config(config)
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Parameters
- config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
  - CamembertConfig configuration class: CamembertForCausalLM (CamemBERT model)
  - XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLM-RoBERTa model)
  - RobertaConfig configuration class: RobertaForCausalLM (RoBERTa model)
  - BertConfig configuration class: BertLMHeadModel (BERT model)
  - OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
  - GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
  - TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
  - XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
  - XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
  - CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
  - ReformerConfig configuration class: ReformerModelWithLMHead (Reformer model)
  - BertGenerationConfig configuration class: BertGenerationDecoder (Bert Generation model)
  - XLMProphetNetConfig configuration class: XLMProphetNetForCausalLM (XLMProphetNet model)
  - ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNet model)
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('gpt2')
>>> model = AutoModelForCausalLM.from_config(config)
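To make the dispatch described for from_config() concrete, here is a minimal pure-Python sketch. It is not the library's implementation: every class name ending in Sketch, the CAUSAL_LM_MAPPING dictionary, and the choice of exception types are assumptions made for illustration. It only shows the idea of mapping a configuration class to a model class and refusing direct instantiation.

```python
# Hypothetical stand-ins; none of these are the real transformers classes.
class GPT2ConfigSketch:
    model_type = "gpt2"


class BertConfigSketch:
    model_type = "bert"


class GPT2LMHeadSketch:
    def __init__(self, config):
        # Only the configuration is stored; no pretrained weights are loaded.
        self.config = config


class BertLMHeadSketch:
    def __init__(self, config):
        self.config = config


# Configuration class -> model class, in the spirit of the table above.
CAUSAL_LM_MAPPING = {
    GPT2ConfigSketch: GPT2LMHeadSketch,
    BertConfigSketch: BertLMHeadSketch,
}


class AutoCausalLMSketch:
    def __init__(self):
        # Direct instantiation is refused; callers must use a class method.
        # The exception type is an assumption of this sketch.
        raise EnvironmentError(
            "AutoCausalLMSketch must be created with from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        model_class = CAUSAL_LM_MAPPING.get(type(config))
        if model_class is None:
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__}."
            )
        return model_class(config)


model = AutoCausalLMSketch.from_config(GPT2ConfigSketch())
print(type(model).__name__)  # GPT2LMHeadSketch
```

Keying the lookup on the exact configuration type keeps the sketch short; a real mapping may also need to handle configuration subclasses.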
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a causal language modeling head—from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
bert-generation – BertGenerationDecoder (Bert Generation model)
camembert – CamembertForCausalLM (CamemBERT model)
xlm-roberta – XLMRobertaForCausalLM (XLM-RoBERTa model)
reformer – ReformerModelWithLMHead (Reformer model)
roberta – RobertaForCausalLM (RoBERTa model)
bert – BertLMHeadModel (BERT model)
openai-gpt – OpenAIGPTLMHeadModel (OpenAI GPT model)
gpt2 – GPT2LMHeadModel (OpenAI GPT-2 model)
transfo-xl – TransfoXLLMHeadModel (Transformer-XL model)
xlnet – XLNetLMHeadModel (XLNet model)
xlm-prophetnet – XLMProphetNetForCausalLM (XLMProphetNet model)
prophetnet – ProphetNetForCausalLM (ProphetNet model)
xlm – XLMWithLMHeadModel (XLM model)
ctrl – CTRLLMHeadModel (CTRL model)
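The ordering of the list above matters once the fallback is used, because several patterns are substrings of others (for example, bert is contained in roberta and camembert). The following sketch illustrates one way such a lookup could work. It assumes that "pattern matching" means a first-match substring test in list order; the resolve_class_name helper and the CAUSAL_LM_PATTERNS table are hypothetical names, and class names are kept as plain strings so that nothing needs to be imported.

```python
# Ordered (pattern, class name) pairs, copied from the start of the list above.
CAUSAL_LM_PATTERNS = [
    ("bert-generation", "BertGenerationDecoder"),
    ("camembert", "CamembertForCausalLM"),
    ("xlm-roberta", "XLMRobertaForCausalLM"),
    ("reformer", "ReformerModelWithLMHead"),
    ("roberta", "RobertaForCausalLM"),
    ("bert", "BertLMHeadModel"),
    ("openai-gpt", "OpenAIGPTLMHeadModel"),
    ("gpt2", "GPT2LMHeadModel"),
]


def resolve_class_name(name_or_path, model_type=None):
    """Hypothetical helper: use model_type if known, else match the path."""
    by_type = dict(CAUSAL_LM_PATTERNS)
    if model_type is not None and model_type in by_type:
        return by_type[model_type]
    # Fallback: the first pattern found inside the name or path wins,
    # so more specific patterns must come before their substrings.
    for pattern, class_name in CAUSAL_LM_PATTERNS:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"Could not infer a model class from {name_or_path!r}.")


print(resolve_class_name("xlm-roberta-base"))                  # XLMRobertaForCausalLM
print(resolve_class_name("roberta-base"))                      # RobertaForCausalLM
print(resolve_class_name("./my_model_directory/", "gpt2"))     # GPT2LMHeadModel
```

If bert were listed before roberta, the name roberta-base would resolve to the BERT class, which is why the more specific patterns come first.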
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path – Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained('gpt2')
>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained('gpt2', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/gpt2_tf_model_config.json')
>>> model = AutoModelForCausalLM.from_pretrained('./tf_model/gpt2_tf_checkpoint.ckpt.index', from_tf=True, config=config)
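The two kwargs behaviours described in the parameter list can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: ConfigSketch, split_kwargs, prepare, and the custom_flag key are all hypothetical, and ConfigSketch() stands in for a configuration that would otherwise be loaded automatically.

```python
class ConfigSketch:
    """Hypothetical configuration object with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, kwargs):
    """Override matching config attributes; return the keys that did not match."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            remaining[key] = value
    return remaining


def prepare(kwargs, config=None):
    """Return (config, kwargs that would reach the model's __init__)."""
    if config is not None:
        # A config was supplied by the caller: it is left untouched and
        # every keyword argument goes straight to the model.
        return config, dict(kwargs)
    config = ConfigSketch()  # stands in for an automatically loaded config
    return config, split_kwargs(config, kwargs)


config, model_kwargs = prepare({"output_attentions": True, "custom_flag": 1})
print(config.output_attentions, model_kwargs)  # True {'custom_flag': 1}
```

In the first branch the caller is trusted to have already set everything on the configuration, which is why output_attentions=True would not change a config that is passed in explicitly.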
AutoModelForMaskedLM¶
-
class
transformers.
AutoModelForMaskedLM
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the
from_pretrained()
class method or the from_config()
class method. This class cannot be instantiated directly using
__init__()
(throws an error).
-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a masked language modeling head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
AlbertConfig configuration class: AlbertForMaskedLM (ALBERT model)
BartConfig configuration class: BartForConditionalGeneration (BART model)
CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
BertConfig configuration class: BertForMaskedLM (BERT model)
MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBERT model)
FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
ElectraConfig configuration class: ElectraForMaskedLM (ELECTRA model)
ReformerConfig configuration class: ReformerForMaskedLM (Reformer model)
FunnelConfig configuration class: FunnelForMaskedLM (Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForMaskedLM.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a masked language modeling head—from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
mobilebert – MobileBertForMaskedLM (MobileBERT model)
distilbert – DistilBertForMaskedLM (DistilBERT model)
albert – AlbertForMaskedLM (ALBERT model)
camembert – CamembertForMaskedLM (CamemBERT model)
xlm-roberta – XLMRobertaForMaskedLM (XLM-RoBERTa model)
bart – BartForConditionalGeneration (BART model)
reformer – ReformerForMaskedLM (Reformer model)
longformer – LongformerForMaskedLM (Longformer model)
roberta – RobertaForMaskedLM (RoBERTa model)
flaubert – FlaubertWithLMHeadModel (FlauBERT model)
squeezebert – SqueezeBertForMaskedLM (SqueezeBERT model)
bert – BertForMaskedLM (BERT model)
xlm – XLMWithLMHeadModel (XLM model)
electra – ElectraForMaskedLM (ELECTRA model)
funnel – FunnelForMaskedLM (Funnel Transformer model)
layoutlm – LayoutLMForMaskedLM (LayoutLM model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path – Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMaskedLM.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
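The output_loading_info parameter is described as returning missing keys, unexpected keys, and error messages. As a rough illustration of what "missing" and "unexpected" mean, the sketch below compares two sets of parameter names. The loading_info helper, the dictionary key names, and the parameter names are assumptions made for this example and should not be read as the library's exact output format.

```python
def loading_info(model_keys, checkpoint_keys):
    """Hypothetical helper comparing the names a model expects with the names a checkpoint has."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
        # Placeholder: this sketch performs no loading, so it records no errors.
        "error_msgs": [],
    }


info = loading_info(
    model_keys=["encoder.weight", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"], info["unexpected_keys"])
# ['lm_head.weight'] ['pooler.weight']
```

A head that is absent from the checkpoint would show up under missing keys, which is a common situation when a base checkpoint is loaded into a class with a task-specific head.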
AutoModelForSeq2SeqLM¶
-
class
transformers.
AutoModelForSeq2SeqLM
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the
from_pretrained()
class method or the from_config()
class method. This class cannot be instantiated directly using
__init__()
(throws an error).
-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a sequence-to-sequence language modeling head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
MT5Config configuration class: MT5ForConditionalGeneration (mT5 model)
T5Config configuration class: T5ForConditionalGeneration (T5 model)
PegasusConfig configuration class: PegasusForConditionalGeneration (Pegasus model)
MarianConfig configuration class: MarianMTModel (Marian model)
MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (Blenderbot model)
BartConfig configuration class: BartForConditionalGeneration (BART model)
FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
EncoderDecoderConfig configuration class: EncoderDecoderModel (Encoder decoder model)
XLMProphetNetConfig configuration class: XLMProphetNetForConditionalGeneration (XLMProphetNet model)
ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNet model)
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('t5-base')
>>> model = AutoModelForSeq2SeqLM.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a sequence-to-sequence language modeling head—from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
mt5 – MT5ForConditionalGeneration (mT5 model)
t5 – T5ForConditionalGeneration (T5 model)
pegasus – PegasusForConditionalGeneration (Pegasus model)
marian – MarianMTModel (Marian model)
mbart – MBartForConditionalGeneration (mBART model)
bart – BartForConditionalGeneration (BART model)
blenderbot – BlenderbotForConditionalGeneration (Blenderbot model)
fsmt – FSMTForConditionalGeneration (FairSeq Machine-Translation model)
xlm-prophetnet – XLMProphetNetForConditionalGeneration (XLMProphetNet model)
prophetnet – ProphetNetForConditionalGeneration (ProphetNet model)
encoder-decoder – EncoderDecoderModel (Encoder decoder model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path – Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base')
>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained('t5-base', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/t5_tf_model_config.json')
>>> model = AutoModelForSeq2SeqLM.from_pretrained('./tf_model/t5_tf_checkpoint.ckpt.index', from_tf=True, config=config)
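The pretrained_model_name_or_path argument accepts three kinds of strings: a model id, a local directory, or a TensorFlow index checkpoint. The sketch below shows one possible way to tell them apart before calling from_pretrained(). The describe_source helper and its return labels are hypothetical, and the checks (a .index suffix, then an existing directory) are simplifying assumptions rather than the library's actual resolution logic.

```python
import os


def describe_source(name_or_path):
    """Hypothetical helper: guess which documented form a string takes."""
    if name_or_path.endswith(".index"):
        # TensorFlow index checkpoint: needs from_tf=True and an explicit config.
        return "tf_checkpoint"
    if os.path.isdir(name_or_path):
        # Directory written by save_pretrained().
        return "local_directory"
    # Anything else is treated as a model id on huggingface.co.
    return "model_id"


print(describe_source("./tf_model/model.ckpt.index"))  # tf_checkpoint
print(describe_source(os.getcwd()))                    # local_directory
```

A string such as dbmdz/bert-base-german-cased falls through to the model id case as long as no local directory of that name exists, which is also why a local directory with the same name as a model id can change what gets loaded.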
AutoModelForSequenceClassification¶
-
class
transformers.
AutoModelForSequenceClassification
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the
from_pretrained()
class method or the from_config()
class method. This class cannot be instantiated directly using
__init__()
(throws an error).
-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a sequence classification head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) – The model class to instantiate is selected based on the configuration class:
DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBERT model)
AlbertConfig configuration class: AlbertForSequenceClassification (ALBERT model)
CamembertConfig configuration class: CamembertForSequenceClassification (CamemBERT model)
XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
BartConfig configuration class: BartForSequenceClassification (BART model)
LongformerConfig configuration class: LongformerForSequenceClassification (Longformer model)
RobertaConfig configuration class: RobertaForSequenceClassification (RoBERTa model)
SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBERT model)
BertConfig configuration class: BertForSequenceClassification (BERT model)
XLNetConfig configuration class: XLNetForSequenceClassification (XLNet model)
MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBERT model)
FlaubertConfig configuration class: FlaubertForSequenceClassification (FlauBERT model)
XLMConfig configuration class: XLMForSequenceClassification (XLM model)
ElectraConfig configuration class: ElectraForSequenceClassification (ELECTRA model)
FunnelConfig configuration class: FunnelForSequenceClassification (Funnel Transformer model)
DebertaConfig configuration class: DebertaForSequenceClassification (DeBERTa model)
GPT2Config configuration class: GPT2ForSequenceClassification (OpenAI GPT-2 model)
OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAI GPT model)
ReformerConfig configuration class: ReformerForSequenceClassification (Reformer model)
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForSequenceClassification.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a sequence classification head—from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
mobilebert – MobileBertForSequenceClassification (MobileBERT model)
distilbert – DistilBertForSequenceClassification (DistilBERT model)
albert – AlbertForSequenceClassification (ALBERT model)
camembert – CamembertForSequenceClassification (CamemBERT model)
xlm-roberta – XLMRobertaForSequenceClassification (XLM-RoBERTa model)
bart – BartForSequenceClassification (BART model)
reformer – ReformerForSequenceClassification (Reformer model)
longformer – LongformerForSequenceClassification (Longformer model)
roberta – RobertaForSequenceClassification (RoBERTa model)
deberta – DebertaForSequenceClassification (DeBERTa model)
flaubert – FlaubertForSequenceClassification (FlauBERT model)
squeezebert – SqueezeBertForSequenceClassification (SqueezeBERT model)
bert – BertForSequenceClassification (BERT model)
openai-gpt – OpenAIGPTForSequenceClassification (OpenAI GPT model)
gpt2 – GPT2ForSequenceClassification (OpenAI GPT-2 model)
xlnet – XLNetForSequenceClassification (XLNet model)
xlm – XLMForSequenceClassification (XLM model)
electra – ElectraForSequenceClassification (ELECTRA model)
funnel – FunnelForSequenceClassification (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path – Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) – Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) – A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) – Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForSequenceClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
AutoModelForMultipleChoice¶
-
class
transformers.
AutoModelForMultipleChoice
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice classification head) when created with the
from_pretrained()
class method or thefrom_config()
class method.This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a multiple choice classification head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
CamembertConfig
configuration class:CamembertForMultipleChoice
(CamemBERT model)ElectraConfig
configuration class:ElectraForMultipleChoice
(ELECTRA model)XLMRobertaConfig
configuration class:XLMRobertaForMultipleChoice
(XLM-RoBERTa model)LongformerConfig
configuration class:LongformerForMultipleChoice
(Longformer model)RobertaConfig
configuration class:RobertaForMultipleChoice
(RoBERTa model)SqueezeBertConfig
configuration class:SqueezeBertForMultipleChoice
(SqueezeBERT model)BertConfig
configuration class:BertForMultipleChoice
(BERT model)DistilBertConfig
configuration class:DistilBertForMultipleChoice
(DistilBERT model)MobileBertConfig
configuration class:MobileBertForMultipleChoice
(MobileBERT model)XLNetConfig
configuration class:XLNetForMultipleChoice
(XLNet model)AlbertConfig
configuration class:AlbertForMultipleChoice
(ALBERT model)XLMConfig
configuration class:XLMForMultipleChoice
(XLM model)FlaubertConfig
configuration class:FlaubertForMultipleChoice
(FlauBERT model)FunnelConfig
configuration class:FunnelForMultipleChoice
(Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForMultipleChoice.from_config(config)
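The selection by configuration class described above can be pictured as a lookup from the type of the config object to a model class. The following is a minimal sketch of that idea under stated assumptions, not the library's actual implementation: every `Toy...` class, `TOY_MULTIPLE_CHOICE_MAPPING`, and `toy_from_config` are hypothetical stand-ins introduced only for illustration.

```python
# Hypothetical stand-ins for configuration classes (not the real transformers classes).
class ToyBertConfig:
    model_type = "bert"


class ToyRobertaConfig:
    model_type = "roberta"


# Hypothetical stand-ins for model classes with a multiple choice head.
class ToyBertForMultipleChoice:
    def __init__(self, config):
        self.config = config  # weights would be randomly initialized here, not loaded


class ToyRobertaForMultipleChoice:
    def __init__(self, config):
        self.config = config


# Assumed lookup table: configuration class -> model class.
TOY_MULTIPLE_CHOICE_MAPPING = {
    ToyBertConfig: ToyBertForMultipleChoice,
    ToyRobertaConfig: ToyRobertaForMultipleChoice,
}


def toy_from_config(config):
    """Pick the model class from the exact type of the config, then instantiate it."""
    model_class = TOY_MULTIPLE_CHOICE_MAPPING.get(type(config))
    if model_class is None:
        supported = ", ".join(c.__name__ for c in TOY_MULTIPLE_CHOICE_MAPPING)
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__}; "
            f"expected one of: {supported}"
        )
    return model_class(config)


model = toy_from_config(ToyRobertaConfig())
print(type(model).__name__)
```

As in the note above, nothing in this sketch loads weights: the configuration only determines which class is built and how it is parameterized.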
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a multiple choice classification head—from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:mobilebert –
MobileBertForMultipleChoice
(MobileBERT model)distilbert –
DistilBertForMultipleChoice
(DistilBERT model)albert –
AlbertForMultipleChoice
(ALBERT model)camembert –
CamembertForMultipleChoice
(CamemBERT model)xlm-roberta –
XLMRobertaForMultipleChoice
(XLM-RoBERTa model)longformer –
LongformerForMultipleChoice
(Longformer model)roberta –
RobertaForMultipleChoice
(RoBERTa model)flaubert –
FlaubertForMultipleChoice
(FlauBERT model)squeezebert –
SqueezeBertForMultipleChoice
(SqueezeBERT model)bert –
BertForMultipleChoice
(BERT model)xlnet –
XLNetForMultipleChoice
(XLNet model)xlm –
XLMForMultipleChoice
(XLM model)electra –
ElectraForMultipleChoice
(ELECTRA model)funnel –
FunnelForMultipleChoice
(Funnel Transformer model)
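The fallback on name matching is sensitive to the order in which patterns are tried, because some identifiers contain others (`roberta` contains `bert`, and `xlm-roberta` contains both `xlm` and `roberta`). The following is a minimal sketch of that idea, assuming plain substring matching; it is not the library's actual implementation. `TOY_NAME_PATTERNS` and `toy_guess_model_class` are hypothetical names, and only a subset of the architectures listed above is included.

```python
# Assumed ordering: more specific patterns come before the patterns they contain.
TOY_NAME_PATTERNS = [
    ("mobilebert", "MobileBertForMultipleChoice"),
    ("distilbert", "DistilBertForMultipleChoice"),
    ("xlm-roberta", "XLMRobertaForMultipleChoice"),
    ("roberta", "RobertaForMultipleChoice"),
    ("bert", "BertForMultipleChoice"),
    ("xlm", "XLMForMultipleChoice"),
]


def toy_guess_model_class(pretrained_model_name_or_path):
    """Return the class name of the first pattern found in the name or path."""
    for pattern, class_name in TOY_NAME_PATTERNS:
        if pattern in pretrained_model_name_or_path:
            return class_name
    raise ValueError(
        f"Could not guess an architecture from {pretrained_model_name_or_path!r}"
    )


for name in ("bert-base-uncased", "roberta-base", "xlm-roberta-base"):
    print(name, "->", toy_guess_model_class(name))
```

If `bert` were tried first in this sketch, every name in the loop would resolve to the BERT class, which is why the specific patterns lead the list.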
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g., ./my_model_directory/.
A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.output_loading_info (
bool
, optional, defaults toFalse
) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForMultipleChoice.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
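The handling of kwargs when no configuration is provided (matching keys override configuration attributes, remaining keys go to the model's `__init__`) can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `ToyConfig`, `toy_split_kwargs`, and the `hypothetical_flag` keyword are invented for the example.

```python
class ToyConfig:
    """Hypothetical configuration object with two attributes."""

    def __init__(self, output_attentions=False, num_labels=2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def toy_split_kwargs(config, **kwargs):
    """Apply kwargs that name config attributes; collect the rest for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key corresponds to a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Unknown to the configuration: forward to the model's __init__.
            model_kwargs[key] = value
    return config, model_kwargs


config, model_kwargs = toy_split_kwargs(
    ToyConfig(), output_attentions=True, hypothetical_flag=1
)
print(config.output_attentions, model_kwargs)
```

When a configuration is passed explicitly, the description above says this splitting step is skipped and all kwargs go straight to the model's `__init__`.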
AutoModelForNextSentencePrediction¶
-
class
transformers.
AutoModelForNextSentencePrediction
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the
from_pretrained()
class method or thefrom_config()
class method.This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
BertConfig
configuration class:BertForNextSentencePrediction
(BERT model)MobileBertConfig
configuration class:MobileBertForNextSentencePrediction
(MobileBERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForNextSentencePrediction.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:mobilebert –
MobileBertForNextSentencePrediction
(MobileBERT model)bert –
BertForNextSentencePrediction
(BERT model)
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g., ./my_model_directory/.
A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.output_loading_info (
bool
, optional, defaults toFalse
) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForNextSentencePrediction.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
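The output_loading_info parameter described above returns a dictionary of missing keys, unexpected keys and error messages. One way to picture what those lists mean is as set differences between the parameter names the model expects and the names found in the checkpoint. The sketch below assumes exactly that; `toy_loading_info`, the dictionary key names, and the parameter names are hypothetical and chosen for illustration only.

```python
def toy_loading_info(expected_keys, checkpoint_keys):
    """Compare the names a model expects with the names a checkpoint provides."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint (left at their initial values).
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
        # Assumed empty here; this sketch does not simulate shape mismatches.
        "error_msgs": [],
    }


# Hypothetical parameter names: a task head is missing, a pretraining head is unused.
info = toy_loading_info(
    expected_keys=["encoder.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["encoder.weight", "pretraining_head.weight"],
)
print(info["missing_keys"])
print(info["unexpected_keys"])
```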
AutoModelForTokenClassification¶
-
class
transformers.
AutoModelForTokenClassification
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the
from_pretrained()
class method or thefrom_config()
class method.This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a token classification head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
LayoutLMConfig
configuration class:LayoutLMForTokenClassification
(LayoutLM model)DistilBertConfig
configuration class:DistilBertForTokenClassification
(DistilBERT model)CamembertConfig
configuration class:CamembertForTokenClassification
(CamemBERT model)FlaubertConfig
configuration class:FlaubertForTokenClassification
(FlauBERT model)XLMConfig
configuration class:XLMForTokenClassification
(XLM model)XLMRobertaConfig
configuration class:XLMRobertaForTokenClassification
(XLM-RoBERTa model)LongformerConfig
configuration class:LongformerForTokenClassification
(Longformer model)RobertaConfig
configuration class:RobertaForTokenClassification
(RoBERTa model)SqueezeBertConfig
configuration class:SqueezeBertForTokenClassification
(SqueezeBERT model)BertConfig
configuration class:BertForTokenClassification
(BERT model)MobileBertConfig
configuration class:MobileBertForTokenClassification
(MobileBERT model)XLNetConfig
configuration class:XLNetForTokenClassification
(XLNet model)AlbertConfig
configuration class:AlbertForTokenClassification
(ALBERT model)ElectraConfig
configuration class:ElectraForTokenClassification
(ELECTRA model)FunnelConfig
configuration class:FunnelForTokenClassification
(Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForTokenClassification.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a token classification head—from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:mobilebert –
MobileBertForTokenClassification
(MobileBERT model)distilbert –
DistilBertForTokenClassification
(DistilBERT model)albert –
AlbertForTokenClassification
(ALBERT model)camembert –
CamembertForTokenClassification
(CamemBERT model)xlm-roberta –
XLMRobertaForTokenClassification
(XLM-RoBERTa model)longformer –
LongformerForTokenClassification
(Longformer model)roberta –
RobertaForTokenClassification
(RoBERTa model)flaubert –
FlaubertForTokenClassification
(FlauBERT model)squeezebert –
SqueezeBertForTokenClassification
(SqueezeBERT model)bert –
BertForTokenClassification
(BERT model)xlnet –
XLNetForTokenClassification
(XLNet model)xlm –
XLMForTokenClassification
(XLM model)electra –
ElectraForTokenClassification
(ELECTRA model)funnel –
FunnelForTokenClassification
(Funnel Transformer model)layoutlm –
LayoutLMForTokenClassification
(LayoutLM model)
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g., ./my_model_directory/.
A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.output_loading_info (
bool
, optional, defaults toFalse
) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForTokenClassification.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
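The parameters above note that a configuration can be loaded automatically when a local directory contains a file named config.json, and that the model class is then chosen from its model_type property. The sketch below shows, using only the standard library, how such a file could be inspected by hand. `toy_read_model_type` is a hypothetical helper, not a function of the library, and the config file written here is a made-up minimal example.

```python
import json
import tempfile
from pathlib import Path


def toy_read_model_type(model_dir):
    """Return the model_type stored in <model_dir>/config.json, or None if unavailable."""
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return None  # no configuration file, so nothing can be loaded automatically
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle).get("model_type")


# Build a throwaway directory holding a minimal, made-up config.json.
with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "config.json").write_text(
        json.dumps({"model_type": "bert"}), encoding="utf-8"
    )
    print(toy_read_model_type(tmp_dir))
```

A directory whose config.json lacks model_type would return None here, which corresponds to the case where the documentation says name-based pattern matching is used instead.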
AutoModelForQuestionAnswering¶
-
class
transformers.
AutoModelForQuestionAnswering
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the
from_pretrained()
class method or thefrom_config()
class method.This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a question answering head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
DistilBertConfig
configuration class:DistilBertForQuestionAnswering
(DistilBERT model)AlbertConfig
configuration class:AlbertForQuestionAnswering
(ALBERT model)CamembertConfig
configuration class:CamembertForQuestionAnswering
(CamemBERT model)BartConfig
configuration class:BartForQuestionAnswering
(BART model)LongformerConfig
configuration class:LongformerForQuestionAnswering
(Longformer model)XLMRobertaConfig
configuration class:XLMRobertaForQuestionAnswering
(XLM-RoBERTa model)RobertaConfig
configuration class:RobertaForQuestionAnswering
(RoBERTa model)SqueezeBertConfig
configuration class:SqueezeBertForQuestionAnswering
(SqueezeBERT model)BertConfig
configuration class:BertForQuestionAnswering
(BERT model)XLNetConfig
configuration class:XLNetForQuestionAnsweringSimple
(XLNet model)FlaubertConfig
configuration class:FlaubertForQuestionAnsweringSimple
(FlauBERT model)MobileBertConfig
configuration class:MobileBertForQuestionAnswering
(MobileBERT model)XLMConfig
configuration class:XLMForQuestionAnsweringSimple
(XLM model)ElectraConfig
configuration class:ElectraForQuestionAnswering
(ELECTRA model)ReformerConfig
configuration class:ReformerForQuestionAnswering
(Reformer model)FunnelConfig
configuration class:FunnelForQuestionAnswering
(Funnel Transformer model)LxmertConfig
configuration class:LxmertForQuestionAnswering
(LXMERT model)
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = AutoModelForQuestionAnswering.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a question answering head—from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:mobilebert –
MobileBertForQuestionAnswering
(MobileBERT model)distilbert –
DistilBertForQuestionAnswering
(DistilBERT model)albert –
AlbertForQuestionAnswering
(ALBERT model)camembert –
CamembertForQuestionAnswering
(CamemBERT model)xlm-roberta –
XLMRobertaForQuestionAnswering
(XLM-RoBERTa model)bart –
BartForQuestionAnswering
(BART model)reformer –
ReformerForQuestionAnswering
(Reformer model)longformer –
LongformerForQuestionAnswering
(Longformer model)roberta –
RobertaForQuestionAnswering
(RoBERTa model)flaubert –
FlaubertForQuestionAnsweringSimple
(FlauBERT model)squeezebert –
SqueezeBertForQuestionAnswering
(SqueezeBERT model)bert –
BertForQuestionAnswering
(BERT model)xlnet –
XLNetForQuestionAnsweringSimple
(XLNet model)xlm –
XLMForQuestionAnsweringSimple
(XLM model)electra –
ElectraForQuestionAnswering
(ELECTRA model)funnel –
FunnelForQuestionAnswering
(Funnel Transformer model)lxmert –
LxmertForQuestionAnswering
(LXMERT model)
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g., ./my_model_directory/.
A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.output_loading_info (
bool
, optional, defaults toFalse
) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
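The kwargs handling described above can be illustrated with a minimal sketch. It is not the library's implementation: `ToyConfig` and `split_kwargs` are hypothetical names introduced only for this illustration, and the sketch covers just the case where no `config` is supplied.

```python
class ToyConfig:
    """Stand-in for a configuration class (hypothetical, not the library's)."""

    def __init__(self, hidden_size=768, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; return the rest for the model."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the loaded attribute
        else:
            unused[key] = value  # would be forwarded to the model's __init__
    return config, unused


config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "my_custom_flag": 1}
)
print(config.output_attentions)  # True
print(model_kwargs)              # {'my_custom_flag': 1}
```

When a `config` object is passed explicitly, the text says this split is skipped and every keyword argument goes straight to the model's `__init__`.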
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_json_file('./tf_model/bert_tf_model_config.json')
>>> model = AutoModelForQuestionAnswering.from_pretrained('./tf_model/bert_tf_checkpoint.ckpt.index', from_tf=True, config=config)
TFAutoModel¶
-
class
transformers.
TFAutoModel
[source]¶ This is a generic model class that will be instantiated as one of the base model classes of the library when created with the
from_pretrained()
class method or the from_config()
class method.
This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the base model classes of the library from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
LxmertConfig
configuration class:TFLxmertModel
(LXMERT model)MT5Config
configuration class:TFMT5Model
(mT5 model)DistilBertConfig
configuration class:TFDistilBertModel
(DistilBERT model)AlbertConfig
configuration class:TFAlbertModel
(ALBERT model)BartConfig
configuration class:TFBartModel
(BART model)CamembertConfig
configuration class:TFCamembertModel
(CamemBERT model)XLMRobertaConfig
configuration class:TFXLMRobertaModel
(XLM-RoBERTa model)LongformerConfig
configuration class:TFLongformerModel
(Longformer model)RobertaConfig
configuration class:TFRobertaModel
(RoBERTa model)BertConfig
configuration class:TFBertModel
(BERT model)OpenAIGPTConfig
configuration class:TFOpenAIGPTModel
(OpenAI GPT model)GPT2Config
configuration class:TFGPT2Model
(OpenAI GPT-2 model)MobileBertConfig
configuration class:TFMobileBertModel
(MobileBERT model)TransfoXLConfig
configuration class:TFTransfoXLModel
(Transformer-XL model)XLNetConfig
configuration class:TFXLNetModel
(XLNet model)FlaubertConfig
configuration class:TFFlaubertModel
(FlauBERT model)XLMConfig
configuration class:TFXLMModel
(XLM model)CTRLConfig
configuration class:TFCTRLModel
(CTRL model)ElectraConfig
configuration class:TFElectraModel
(ELECTRA model)FunnelConfig
configuration class:TFFunnelModel
(Funnel Transformer model)DPRConfig
configuration class:TFDPRQuestionEncoder
(DPR model)
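As a rough illustration of selecting a model class from the configuration class, the sketch below dispatches on `isinstance` over a small mapping. All `Toy*` classes and the `MODEL_MAPPING` name are hypothetical stand-ins; the real library's lookup may differ in detail.

```python
class ToyBertConfig:
    """Stand-in for a BERT configuration class."""


class ToyGPT2Config:
    """Stand-in for a GPT-2 configuration class."""


class ToyModel:
    def __init__(self, config):
        self.config = config  # weights would be freshly initialized, not loaded


class ToyTFBertModel(ToyModel):
    pass


class ToyTFGPT2Model(ToyModel):
    pass


# Configuration class -> model class, mirroring the shape of the list above.
MODEL_MAPPING = {
    ToyBertConfig: ToyTFBertModel,
    ToyGPT2Config: ToyTFGPT2Model,
}


def from_config(config):
    """Return an instance of the model class registered for this config type."""
    for config_class, model_class in MODEL_MAPPING.items():
        if isinstance(config, config_class):
            return model_class(config)
    raise ValueError(f"Unrecognized configuration class {type(config).__name__}")


model = from_config(ToyGPT2Config())
print(type(model).__name__)  # ToyTFGPT2Model
```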
Examples:
>>> from transformers import AutoConfig, TFAutoModel
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModel.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:mt5 –
TFMT5Model
(mT5 model)t5 –
TFT5Model
(T5 model)mobilebert –
TFMobileBertModel
(MobileBERT model)distilbert –
TFDistilBertModel
(DistilBERT model)albert –
TFAlbertModel
(ALBERT model)camembert –
TFCamembertModel
(CamemBERT model)xlm-roberta –
TFXLMRobertaModel
(XLM-RoBERTa model)bart –
TFBartModel
(BART model)longformer –
TFLongformerModel
(Longformer model)roberta –
TFRobertaModel
(RoBERTa model)flaubert –
TFFlaubertModel
(FlauBERT model)bert –
TFBertModel
(BERT model)openai-gpt –
TFOpenAIGPTModel
(OpenAI GPT model)gpt2 –
TFGPT2Model
(OpenAI GPT-2 model)transfo-xl –
TFTransfoXLModel
(Transformer-XL model)xlnet –
TFXLNetModel
(XLNet model)xlm –
TFXLMModel
(XLM model)ctrl –
TFCTRLModel
(CTRL model)electra –
TFElectraModel
(ELECTRA model)funnel –
TFFunnelModel
(Funnel Transformer model)lxmert –
TFLxmertModel
(LXMERT model)dpr –
TFDPRQuestionEncoder
(DPR model)
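The selection rule above (use `model_type` when present, otherwise fall back to matching on the name) might look like the following sketch. Treating "pattern matching" as a substring test is an assumption, and `PATTERNS` and `resolve_model_type` are hypothetical names. The patterns are a subset of the list above, kept in the same relative order so that more specific names such as `distilbert` are tried before `bert`.

```python
# Subset of the documented patterns, in the order they appear in the list.
PATTERNS = ["distilbert", "xlm-roberta", "roberta", "bert", "gpt2"]


def resolve_model_type(name_or_path, config_dict):
    """Prefer the explicit model_type; otherwise match patterns against the name."""
    model_type = config_dict.get("model_type")
    if model_type is not None:
        return model_type
    for pattern in PATTERNS:
        if pattern in name_or_path:
            return pattern
    raise ValueError(f"Could not infer a model type from {name_or_path!r}")


print(resolve_model_type("my-checkpoint", {"model_type": "gpt2"}))  # gpt2
print(resolve_model_type("distilbert-base-uncased", {}))            # distilbert
print(resolve_model_type("xlm-roberta-base", {}))                   # xlm-roberta
```

With a naive substring test, ordering matters: if `bert` were checked first, `distilbert-base-uncased` would resolve to the wrong architecture.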
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g.,./my_model_directory/
.
A path or URL to a PyTorch state_dict save file (e.g.,
./pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as the config
argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.
output_loading_info (
bool
, optional, defaults to False
) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
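The three accepted forms of `pretrained_model_name_or_path` listed in the parameters above can be told apart with a few filesystem checks. The sketch below is only one plausible way to do it; `describe_source` is a hypothetical helper, not a library function, and it treats anything that is not an existing path or a URL as a model id.

```python
import os
import tempfile


def describe_source(name_or_path):
    """Classify the argument into one of the three documented forms (sketch only)."""
    if os.path.isdir(name_or_path):
        return "directory"
    if os.path.isfile(name_or_path) or name_or_path.startswith(("http://", "https://")):
        return "weights file or url"
    return "model id"


with tempfile.TemporaryDirectory() as save_dir:
    weights_path = os.path.join(save_dir, "pytorch_model.bin")
    open(weights_path, "wb").close()  # empty placeholder file
    print(describe_source(save_dir))      # directory
    print(describe_source(weights_path))  # weights file or url

print(describe_source("bert-base-uncased"))  # model id
```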
Examples:
>>> from transformers import AutoConfig, TFAutoModel
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModel.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModel.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
TFAutoModelForPreTraining¶
-
class
transformers.
TFAutoModelForPreTraining
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with the architecture used for pretraining this model) when created with the
from_pretrained()
class method or the from_config()
class method.
This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with the architecture used for pretraining this model—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
LxmertConfig
configuration class:TFLxmertForPreTraining
(LXMERT model)T5Config
configuration class:TFT5ForConditionalGeneration
(T5 model)DistilBertConfig
configuration class:TFDistilBertForMaskedLM
(DistilBERT model)AlbertConfig
configuration class:TFAlbertForPreTraining
(ALBERT model)BartConfig
configuration class:TFBartForConditionalGeneration
(BART model)CamembertConfig
configuration class:TFCamembertForMaskedLM
(CamemBERT model)XLMRobertaConfig
configuration class:TFXLMRobertaForMaskedLM
(XLM-RoBERTa model)RobertaConfig
configuration class:TFRobertaForMaskedLM
(RoBERTa model)BertConfig
configuration class:TFBertForPreTraining
(BERT model)OpenAIGPTConfig
configuration class:TFOpenAIGPTLMHeadModel
(OpenAI GPT model)GPT2Config
configuration class:TFGPT2LMHeadModel
(OpenAI GPT-2 model)MobileBertConfig
configuration class:TFMobileBertForPreTraining
(MobileBERT model)TransfoXLConfig
configuration class:TFTransfoXLLMHeadModel
(Transformer-XL model)XLNetConfig
configuration class:TFXLNetLMHeadModel
(XLNet model)FlaubertConfig
configuration class:TFFlaubertWithLMHeadModel
(FlauBERT model)XLMConfig
configuration class:TFXLMWithLMHeadModel
(XLM model)CTRLConfig
configuration class:TFCTRLLMHeadModel
(CTRL model)ElectraConfig
configuration class:TFElectraForPreTraining
(ELECTRA model)FunnelConfig
configuration class:TFFunnelForPreTraining
(Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForPreTraining.from_config(config)
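The class description notes that the auto class cannot be instantiated directly and must be created through its class methods. A minimal sketch of that factory pattern follows; `ToyAutoModel` and `ToyModel` are hypothetical, and the exception type used here is an assumption rather than a statement about what the library raises.

```python
class ToyModel:
    def __init__(self, config):
        self.config = config
        self.weights_loaded = False  # from_config never loads pretrained weights


class ToyAutoModel:
    """Factory-only class: construction goes through class methods."""

    def __init__(self):
        # Assumed exception type, chosen for illustration only.
        raise EnvironmentError(
            "ToyAutoModel is meant to be created with ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        # Builds a concrete model without ever calling ToyAutoModel.__init__.
        return ToyModel(config)


model = ToyAutoModel.from_config({"hidden_size": 768})
print(type(model).__name__, model.weights_loaded)  # ToyModel False

try:
    ToyAutoModel()
except EnvironmentError as err:
    print("direct instantiation failed:", err)
```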
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with the architecture used for pretraining this model—from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:t5 –
TFT5ForConditionalGeneration
(T5 model)mobilebert –
TFMobileBertForPreTraining
(MobileBERT model)distilbert –
TFDistilBertForMaskedLM
(DistilBERT model)albert –
TFAlbertForPreTraining
(ALBERT model)camembert –
TFCamembertForMaskedLM
(CamemBERT model)xlm-roberta –
TFXLMRobertaForMaskedLM
(XLM-RoBERTa model)bart –
TFBartForConditionalGeneration
(BART model)roberta –
TFRobertaForMaskedLM
(RoBERTa model)flaubert –
TFFlaubertWithLMHeadModel
(FlauBERT model)bert –
TFBertForPreTraining
(BERT model)openai-gpt –
TFOpenAIGPTLMHeadModel
(OpenAI GPT model)gpt2 –
TFGPT2LMHeadModel
(OpenAI GPT-2 model)transfo-xl –
TFTransfoXLLMHeadModel
(Transformer-XL model)xlnet –
TFXLNetLMHeadModel
(XLNet model)xlm –
TFXLMWithLMHeadModel
(XLM model)ctrl –
TFCTRLLMHeadModel
(CTRL model)electra –
TFElectraForPreTraining
(ELECTRA model)funnel –
TFFunnelForPreTraining
(Funnel Transformer model)lxmert –
TFLxmertForPreTraining
(LXMERT model)
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g.,./my_model_directory/
.
A path or URL to a PyTorch state_dict save file (e.g.,
./pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as the config
argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.
output_loading_info (
bool
, optional, defaults to False
) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
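The `output_loading_info` option in the parameters above returns a dictionary of missing and unexpected keys. The idea can be sketched as a set comparison between the parameter names a model expects and those found in a checkpoint. The function name, the dictionary keys, and the parameter names below are illustrative assumptions, and error messages are left out.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare the names a model expects with the names a checkpoint provides."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        "missing_keys": sorted(expected - found),     # expected but not in checkpoint
        "unexpected_keys": sorted(found - expected),  # in checkpoint but not expected
    }


info = loading_info(
    expected_keys=["encoder.weight", "classifier.weight"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info)
# {'missing_keys': ['classifier.weight'], 'unexpected_keys': ['pooler.weight']}
```

A non-empty list of missing keys is typical when a task-specific head is added on top of a checkpoint that was saved without one.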
Examples:
>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForPreTraining.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
TFAutoModelForCausalLM¶
-
class
transformers.
TFAutoModelForCausalLM
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the
from_pretrained()
class method or the from_config()
class method.
This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a causal language modeling head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
BertConfig
configuration class:TFBertLMHeadModel
(BERT model)OpenAIGPTConfig
configuration class:TFOpenAIGPTLMHeadModel
(OpenAI GPT model)GPT2Config
configuration class:TFGPT2LMHeadModel
(OpenAI GPT-2 model)TransfoXLConfig
configuration class:TFTransfoXLLMHeadModel
(Transformer-XL model)XLNetConfig
configuration class:TFXLNetLMHeadModel
(XLNet model)XLMConfig
configuration class:TFXLMWithLMHeadModel
(XLM model)CTRLConfig
configuration class:TFCTRLLMHeadModel
(CTRL model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('gpt2')
>>> model = TFAutoModelForCausalLM.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a causal language modeling head—from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:bert –
TFBertLMHeadModel
(BERT model)openai-gpt –
TFOpenAIGPTLMHeadModel
(OpenAI GPT model)gpt2 –
TFGPT2LMHeadModel
(OpenAI GPT-2 model)transfo-xl –
TFTransfoXLLMHeadModel
(Transformer-XL model)xlnet –
TFXLNetLMHeadModel
(XLNet model)xlm –
TFXLMWithLMHeadModel
(XLM model)ctrl –
TFCTRLLMHeadModel
(CTRL model)
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g.,./my_model_directory/
.
A path or URL to a PyTorch state_dict save file (e.g.,
./pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as the config
argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.
output_loading_info (
bool
, optional, defaults to False
) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
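The `force_download` and `local_files_only` flags described in the parameters above both affect whether a cached file is reused. The sketch below shows one plausible way they could combine. `plan_fetch` is a hypothetical helper, and giving `local_files_only` precedence when both flags are set is an assumption of this sketch, not documented behavior.

```python
def plan_fetch(is_cached, force_download=False, local_files_only=False):
    """Decide whether to reuse the cache or download (illustrative logic only)."""
    if local_files_only:
        # Assumption: never touch the network when this flag is set.
        if is_cached:
            return "use cache"
        raise FileNotFoundError("file is not cached and downloads are disabled")
    if force_download or not is_cached:
        return "download"
    return "use cache"


print(plan_fetch(is_cached=True))                         # use cache
print(plan_fetch(is_cached=True, force_download=True))    # download
print(plan_fetch(is_cached=False))                        # download
print(plan_fetch(is_cached=True, local_files_only=True))  # use cache
```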
Examples:
>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained('gpt2')
>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained('gpt2', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_json_file('./pt_model/gpt2_pt_model_config.json')
>>> model = TFAutoModelForCausalLM.from_pretrained('./pt_model/gpt2_pytorch_model.bin', from_pt=True, config=config)
TFAutoModelForMaskedLM¶
-
class
transformers.
TFAutoModelForMaskedLM
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the
from_pretrained()
class method or the from_config()
class method.
This class cannot be instantiated directly using
__init__()
(throws an error).-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a masked language modeling head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use
from_pretrained()
to load the model weights.- Parameters
config (
PretrainedConfig
) –The model class to instantiate is selected based on the configuration class:
DistilBertConfig
configuration class:TFDistilBertForMaskedLM
(DistilBERT model)AlbertConfig
configuration class:TFAlbertForMaskedLM
(ALBERT model)CamembertConfig
configuration class:TFCamembertForMaskedLM
(CamemBERT model)XLMRobertaConfig
configuration class:TFXLMRobertaForMaskedLM
(XLM-RoBERTa model)LongformerConfig
configuration class:TFLongformerForMaskedLM
(Longformer model)RobertaConfig
configuration class:TFRobertaForMaskedLM
(RoBERTa model)BertConfig
configuration class:TFBertForMaskedLM
(BERT model)MobileBertConfig
configuration class:TFMobileBertForMaskedLM
(MobileBERT model)FlaubertConfig
configuration class:TFFlaubertWithLMHeadModel
(FlauBERT model)XLMConfig
configuration class:TFXLMWithLMHeadModel
(XLM model)ElectraConfig
configuration class:TFElectraForMaskedLM
(ELECTRA model)FunnelConfig
configuration class:TFFunnelForMaskedLM
(Funnel Transformer model)
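The same configuration class resolves to a different model class under each TensorFlow auto class documented on this page. The sketch below collects the `BertConfig` and `GPT2Config` entries from those lists into plain dictionaries of names. `TASK_MAPPINGS` and `supported_classes` are hypothetical names for this illustration, and the strings are labels only, not imported classes.

```python
# Entries copied from the per-class lists on this page (names only).
TASK_MAPPINGS = {
    "TFAutoModel": {
        "BertConfig": "TFBertModel",
        "GPT2Config": "TFGPT2Model",
    },
    "TFAutoModelForPreTraining": {
        "BertConfig": "TFBertForPreTraining",
        "GPT2Config": "TFGPT2LMHeadModel",
    },
    "TFAutoModelForCausalLM": {
        "BertConfig": "TFBertLMHeadModel",
        "GPT2Config": "TFGPT2LMHeadModel",
    },
    "TFAutoModelForMaskedLM": {
        "BertConfig": "TFBertForMaskedLM",
    },
}


def supported_classes(config_name):
    """Return {auto class: model class} for every auto class listing this config."""
    return {
        auto_class: mapping[config_name]
        for auto_class, mapping in TASK_MAPPINGS.items()
        if config_name in mapping
    }


print(supported_classes("GPT2Config"))
# GPT2Config has no entry under TFAutoModelForMaskedLM, so that key is absent.
```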
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForMaskedLM.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a masked language modeling head—from a pretrained model.
The model class to instantiate is selected based on the
model_type
property of the config object (either passed as an argument or loaded frompretrained_model_name_or_path
if possible), or when it’s missing, by falling back to using pattern matching onpretrained_model_name_or_path
:mobilebert –
TFMobileBertForMaskedLM
(MobileBERT model)distilbert –
TFDistilBertForMaskedLM
(DistilBERT model)albert –
TFAlbertForMaskedLM
(ALBERT model)camembert –
TFCamembertForMaskedLM
(CamemBERT model)xlm-roberta –
TFXLMRobertaForMaskedLM
(XLM-RoBERTa model)longformer –
TFLongformerForMaskedLM
(Longformer model)roberta –
TFRobertaForMaskedLM
(RoBERTa model)flaubert –
TFFlaubertWithLMHeadModel
(FlauBERT model)bert –
TFBertForMaskedLM
(BERT model)xlm –
TFXLMWithLMHeadModel
(XLM model)electra –
TFElectraForMaskedLM
(ELECTRA model)funnel –
TFFunnelForMaskedLM
(Funnel Transformer model)
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g.,./my_model_directory/
.
A path or URL to a PyTorch state_dict save file (e.g.,
./pt_model/pytorch_model.bin
). In this case, from_pt
should be set to True
and a configuration object should be provided as the config
argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model
__init__()
method.config (
PretrainedConfig
, optional) –Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using
save_pretrained()
and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as
pretrained_model_name_or_path
and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using
save_pretrained()
andfrom_pretrained()
is not a simpler option.cache_dir (
str
, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.from_tf (
bool
, optional, defaults toFalse
) – Load the model weights from a TensorFlow checkpoint save file (see docstring ofpretrained_model_name_or_path
argument).force_download (
bool
, optional, defaults toFalse
) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.resume_download (
bool
, optional, defaults toFalse
) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.proxies (
Dict[str, str], `optional
) – A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}
. The proxies are used on each request.output_loading_info (
bool
, optional, defaults toFalse
) – Whether ot not to also return a dictionary containing missing keys, unexpected keys and error messages.local_files_only (
bool
, optional, defaults toFalse
) – Whether or not to only look at local files (e.g., not try downloading the model).revision (
str
, optional, defaults to"main"
) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevision
can be any identifier allowed by git.kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it being loaded) and initiate the model (e.g.,
output_attentions=True
). Behaves differently depending on whether aconfig
is provided or automatically loaded:If a configuration is provided with
config
,**kwargs
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done)If a configuration is not provided,
kwargs
will be first passed to the configuration class initialization function (from_pretrained()
). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig exposes from_pretrained(), which also accepts a path to a JSON configuration file
>>> config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForMaskedLM.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
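The way kwargs are split between the configuration and the model, as described for the kwargs parameter above, can be illustrated with a small stand-alone sketch. It does not use the library: ToyConfig and split_kwargs are hypothetical names introduced here, and the logic only approximates the documented behaviour (matching keys override configuration attributes, the rest are left for the model's __init__).

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class (not the library's)."""

    def __init__(self, hidden_size=768, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def split_kwargs(config, **kwargs):
    """Override matching config attributes; return the keys that did not match."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key corresponds to a config attribute
        else:
            unused[key] = value  # left over for the model's __init__
    return config, unused


config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, name="my_model")
print(config.output_attentions)  # True
print(model_kwargs)              # {'name': 'my_model'}
```

When a config object is passed explicitly, the documented behaviour skips this step and forwards every keyword argument to the model.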
TFAutoModelForSeq2SeqLM¶
- class transformers.TFAutoModelForSeq2SeqLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
- classmethod from_config(config)
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) –
The model class to instantiate is selected based on the configuration class:
    MT5Config configuration class: TFMT5ForConditionalGeneration (mT5 model)
    T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
    MarianConfig configuration class: TFMarianMTModel (Marian model)
    MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
    PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
    BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
    BartConfig configuration class: TFBartForConditionalGeneration (BART model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('t5-base')
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)
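Selecting a model class from the configuration class, as from_config() is described as doing above, can be sketched as a lookup over an ordered mapping. The classes below are toy stand-ins, not the library's, and the subclass relationship between the two toy configurations is an assumption made for illustration. The point of the sketch is that more specific configuration classes have to be listed before their parents when the lookup uses isinstance.

```python
from collections import OrderedDict


class ToyT5Config:
    """Hypothetical parent configuration."""


class ToyMT5Config(ToyT5Config):
    """Hypothetical configuration that subclasses the one above."""


class ToyT5Model:
    def __init__(self, config):
        self.config = config  # weights would be freshly initialized, not loaded


class ToyMT5Model(ToyT5Model):
    pass


# The subclass entry comes first so it is not captured by its parent's entry.
TOY_MAPPING = OrderedDict([
    (ToyMT5Config, ToyMT5Model),
    (ToyT5Config, ToyT5Model),
])


def toy_from_config(config):
    """Return an instance of the first model class whose config class matches."""
    for config_class, model_class in TOY_MAPPING.items():
        if isinstance(config, config_class):
            return model_class(config)
    raise ValueError(f"Unrecognized configuration class {type(config).__name__}")


print(type(toy_from_config(ToyMT5Config())).__name__)  # ToyMT5Model
print(type(toy_from_config(ToyT5Config())).__name__)   # ToyT5Model
```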
- classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    MT5Config configuration class: TFMT5ForConditionalGeneration (mT5 model)
    T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
    MarianConfig configuration class: TFMarianMTModel (Marian model)
    MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
    PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
    BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
    BartConfig configuration class: TFBartForConditionalGeneration (BART model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path –
Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('t5-base')
>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('t5-base', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig exposes from_pretrained(), which also accepts a path to a JSON configuration file
>>> config = AutoConfig.from_pretrained('./pt_model/t5_pt_model_config.json')
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained('./pt_model/t5_pytorch_model.bin', from_pt=True, config=config)
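The three forms that pretrained_model_name_or_path can take (model id, directory, checkpoint file or URL) can be told apart with a few standard-library checks. The sketch below is only an illustration of that distinction, not the library's resolution logic. describe_source and split_model_id are hypothetical helpers introduced here.

```python
import os
import tempfile


def describe_source(name_or_path):
    """Classify the argument as a local directory, local file, URL, or model id."""
    if os.path.isdir(name_or_path):
        return "directory"
    if os.path.isfile(name_or_path):
        return "file"
    if name_or_path.startswith(("http://", "https://")):
        return "url"
    return "model id"


def split_model_id(model_id):
    """Split a model id into (namespace, name); namespace is None at root level."""
    namespace, _, name = model_id.rpartition("/")
    return (namespace or None, name)


with tempfile.TemporaryDirectory() as save_dir:
    print(describe_source(save_dir))  # directory

print(split_model_id("bert-base-uncased"))             # (None, 'bert-base-uncased')
print(split_model_id("dbmdz/bert-base-german-cased"))  # ('dbmdz', 'bert-base-german-cased')
```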
TFAutoModelForSequenceClassification¶
- class transformers.TFAutoModelForSequenceClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
- classmethod from_config(config)
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) –
The model class to instantiate is selected based on the configuration class:
    DistilBertConfig configuration class: TFDistilBertForSequenceClassification (DistilBERT model)
    AlbertConfig configuration class: TFAlbertForSequenceClassification (ALBERT model)
    CamembertConfig configuration class: TFCamembertForSequenceClassification (CamemBERT model)
    XLMRobertaConfig configuration class: TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
    LongformerConfig configuration class: TFLongformerForSequenceClassification (Longformer model)
    RobertaConfig configuration class: TFRobertaForSequenceClassification (RoBERTa model)
    BertConfig configuration class: TFBertForSequenceClassification (BERT model)
    XLNetConfig configuration class: TFXLNetForSequenceClassification (XLNet model)
    MobileBertConfig configuration class: TFMobileBertForSequenceClassification (MobileBERT model)
    FlaubertConfig configuration class: TFFlaubertForSequenceClassification (FlauBERT model)
    XLMConfig configuration class: TFXLMForSequenceClassification (XLM model)
    ElectraConfig configuration class: TFElectraForSequenceClassification (ELECTRA model)
    FunnelConfig configuration class: TFFunnelForSequenceClassification (Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForSequenceClassification.from_config(config)
- classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    mobilebert – TFMobileBertForSequenceClassification (MobileBERT model)
    distilbert – TFDistilBertForSequenceClassification (DistilBERT model)
    albert – TFAlbertForSequenceClassification (ALBERT model)
    camembert – TFCamembertForSequenceClassification (CamemBERT model)
    xlm-roberta – TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
    longformer – TFLongformerForSequenceClassification (Longformer model)
    roberta – TFRobertaForSequenceClassification (RoBERTa model)
    flaubert – TFFlaubertForSequenceClassification (FlauBERT model)
    bert – TFBertForSequenceClassification (BERT model)
    xlnet – TFXLNetForSequenceClassification (XLNet model)
    xlm – TFXLMForSequenceClassification (XLM model)
    electra – TFElectraForSequenceClassification (ELECTRA model)
    funnel – TFFunnelForSequenceClassification (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path –
Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig exposes from_pretrained(), which also accepts a path to a JSON configuration file
>>> config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForSequenceClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
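The pattern-matching fallback on pretrained_model_name_or_path, described for from_pretrained() above, depends on the order in which patterns are tried, since several patterns are substrings of others ("bert" appears in "roberta", "distilbert" and "xlm-roberta"). The sketch below assumes simple substring matching over a subset of the documented list, in the documented order. TOY_PATTERNS and guess_class_name are hypothetical names, and the class names are returned as plain strings so that the example does not need the library.

```python
# Subset of the documented patterns, kept in the documented order.
TOY_PATTERNS = [
    ("distilbert", "TFDistilBertForSequenceClassification"),
    ("xlm-roberta", "TFXLMRobertaForSequenceClassification"),
    ("roberta", "TFRobertaForSequenceClassification"),
    ("bert", "TFBertForSequenceClassification"),
    ("xlm", "TFXLMForSequenceClassification"),
]


def guess_class_name(name_or_path):
    """Return the class name of the first pattern contained in the name."""
    for pattern, class_name in TOY_PATTERNS:
        if pattern in name_or_path:
            return class_name
    raise ValueError(f"No pattern matches {name_or_path!r}")


print(guess_class_name("xlm-roberta-base"))   # TFXLMRobertaForSequenceClassification
print(guess_class_name("roberta-base"))       # TFRobertaForSequenceClassification
print(guess_class_name("bert-base-uncased"))  # TFBertForSequenceClassification
```

If "bert" were tried first, every name in this example would resolve to the BERT class, which is why the longer patterns come earlier in the list.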
TFAutoModelForMultipleChoice¶
- class transformers.TFAutoModelForMultipleChoice
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
- classmethod from_config(config)
Instantiates one of the model classes of the library (with a multiple choice classification head) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) –
The model class to instantiate is selected based on the configuration class:
    CamembertConfig configuration class: TFCamembertForMultipleChoice (CamemBERT model)
    XLMConfig configuration class: TFXLMForMultipleChoice (XLM model)
    XLMRobertaConfig configuration class: TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
    LongformerConfig configuration class: TFLongformerForMultipleChoice (Longformer model)
    RobertaConfig configuration class: TFRobertaForMultipleChoice (RoBERTa model)
    BertConfig configuration class: TFBertForMultipleChoice (BERT model)
    DistilBertConfig configuration class: TFDistilBertForMultipleChoice (DistilBERT model)
    MobileBertConfig configuration class: TFMobileBertForMultipleChoice (MobileBERT model)
    XLNetConfig configuration class: TFXLNetForMultipleChoice (XLNet model)
    FlaubertConfig configuration class: TFFlaubertForMultipleChoice (FlauBERT model)
    AlbertConfig configuration class: TFAlbertForMultipleChoice (ALBERT model)
    ElectraConfig configuration class: TFElectraForMultipleChoice (ELECTRA model)
    FunnelConfig configuration class: TFFunnelForMultipleChoice (Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForMultipleChoice.from_config(config)
- classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with a multiple choice classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    mobilebert – TFMobileBertForMultipleChoice (MobileBERT model)
    distilbert – TFDistilBertForMultipleChoice (DistilBERT model)
    albert – TFAlbertForMultipleChoice (ALBERT model)
    camembert – TFCamembertForMultipleChoice (CamemBERT model)
    xlm-roberta – TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
    longformer – TFLongformerForMultipleChoice (Longformer model)
    roberta – TFRobertaForMultipleChoice (RoBERTa model)
    flaubert – TFFlaubertForMultipleChoice (FlauBERT model)
    bert – TFBertForMultipleChoice (BERT model)
    xlnet – TFXLNetForMultipleChoice (XLNet model)
    xlm – TFXLMForMultipleChoice (XLM model)
    electra – TFElectraForMultipleChoice (ELECTRA model)
    funnel – TFFunnelForMultipleChoice (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path –
Can be either:
    A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
    A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
    A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
    The model is a model provided by the library (loaded with the model id string of a pretrained model).
    The model was saved using save_pretrained() and is reloaded by supplying the save directory.
    The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
    If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
    If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice
>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained('bert-base-uncased')
>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> # AutoConfig exposes from_pretrained(), which also accepts a path to a JSON configuration file
>>> config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForMultipleChoice.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
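The output_loading_info parameter above is described as returning a dictionary of missing keys, unexpected keys and error messages. One plausible way to read "missing" and "unexpected" is as the two set differences between the weight names a model expects and the names found in a checkpoint. The sketch below illustrates that reading only. loading_info, the dictionary keys and the weight names are hypothetical, not the library's.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare the weight names a model expects with those found in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        "missing_keys": sorted(expected - found),     # expected but not in checkpoint
        "unexpected_keys": sorted(found - expected),  # in checkpoint but not expected
    }


info = loading_info(
    expected_keys=["encoder.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"])     # ['classifier.bias', 'classifier.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```

In this toy case the missing names are a task head absent from the checkpoint, the kind of mismatch such a report is useful for spotting.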
TFAutoModelForTokenClassification¶
- class transformers.TFAutoModelForTokenClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
- classmethod from_config(config)
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) –
The model class to instantiate is selected based on the configuration class:
    DistilBertConfig configuration class: TFDistilBertForTokenClassification (DistilBERT model)
    AlbertConfig configuration class: TFAlbertForTokenClassification (ALBERT model)
    CamembertConfig configuration class: TFCamembertForTokenClassification (CamemBERT model)
    FlaubertConfig configuration class: TFFlaubertForTokenClassification (FlauBERT model)
    XLMConfig configuration class: TFXLMForTokenClassification (XLM model)
    XLMRobertaConfig configuration class: TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
    LongformerConfig configuration class: TFLongformerForTokenClassification (Longformer model)
    RobertaConfig configuration class: TFRobertaForTokenClassification (RoBERTa model)
    BertConfig configuration class: TFBertForTokenClassification (BERT model)
    MobileBertConfig configuration class: TFMobileBertForTokenClassification (MobileBERT model)
    XLNetConfig configuration class: TFXLNetForTokenClassification (XLNet model)
    ElectraConfig configuration class: TFElectraForTokenClassification (ELECTRA model)
    FunnelConfig configuration class: TFFunnelForTokenClassification (Funnel Transformer model)
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForTokenClassification.from_config(config)
- classmethod from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
    mobilebert – TFMobileBertForTokenClassification (MobileBERT model)
    distilbert – TFDistilBertForTokenClassification (DistilBERT model)
    albert – TFAlbertForTokenClassification (ALBERT model)
    camembert – TFCamembertForTokenClassification (CamemBERT model)
    xlm-roberta – TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
    longformer – TFLongformerForTokenClassification (Longformer model)
    roberta – TFRobertaForTokenClassification (RoBERTa model)
    flaubert – TFFlaubertForTokenClassification (FlauBERT model)
    bert – TFBertForTokenClassification (BERT model)
    xlnet – TFXLNetForTokenClassification (XLNet model)
    xlm – TFXLMForTokenClassification (XLM model)
    electra – TFElectraForTokenClassification (ELECTRA model)
    funnel – TFFunnelForTokenClassification (Funnel Transformer model)
The model is set in evaluation mode by default using
model.eval()
(so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode withmodel.train()
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like
bert-base-uncased
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
.A path to a directory containing model weights saved using
save_pretrained()
, e.g.,./my_model_directory/
.A path or url to a PyTorch state_dict save file (e.g,
./pt_model/pytorch_model.bin
). In this case,from_pt
should be set toTrue
and a configuration object should be provided asconfig
argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
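The second case of the kwargs handling can be pictured as splitting the keyword arguments into two groups. The sketch below uses a hypothetical ToyConfig class and a hypothetical split_kwargs helper to show the idea. It is an illustration under those assumptions, not the library's code.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def split_kwargs(config, **kwargs):
    """Apply kwargs that match config attributes and return the rest."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the loaded value
        else:
            unused[key] = value  # left over for the model's __init__
    return config, unused


config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, foo="bar")
print(config.output_attentions)  # True
print(model_kwargs)  # {'foo': 'bar'}
```

Here output_attentions matches a configuration attribute and overrides it, while foo does not and would be forwarded to the model constructor.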
Examples:
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForTokenClassification.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
TFAutoModelForQuestionAnswering¶
-
class
transformers.
TFAutoModelForQuestionAnswering
[source]¶ This is a generic model class that will be instantiated as one of the model classes of the library, with a question answering head, when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
-
classmethod
from_config
(config)[source]¶ Instantiates one of the model classes of the library—with a question answering head—from a configuration.
Note
Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
- Parameters
config (PretrainedConfig) –
The model class to instantiate is selected based on the configuration class:
DistilBertConfig configuration class: TFDistilBertForQuestionAnswering (DistilBERT model)
AlbertConfig configuration class: TFAlbertForQuestionAnswering (ALBERT model)
CamembertConfig configuration class: TFCamembertForQuestionAnswering (CamemBERT model)
XLMRobertaConfig configuration class: TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
LongformerConfig configuration class: TFLongformerForQuestionAnswering (Longformer model)
RobertaConfig configuration class: TFRobertaForQuestionAnswering (RoBERTa model)
BertConfig configuration class: TFBertForQuestionAnswering (BERT model)
XLNetConfig configuration class: TFXLNetForQuestionAnsweringSimple (XLNet model)
MobileBertConfig configuration class: TFMobileBertForQuestionAnswering (MobileBERT model)
FlaubertConfig configuration class: TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
XLMConfig configuration class: TFXLMForQuestionAnsweringSimple (XLM model)
ElectraConfig configuration class: TFElectraForQuestionAnswering (ELECTRA model)
FunnelConfig configuration class: TFFunnelForQuestionAnswering (Funnel Transformer model)
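Selecting a model class from a configuration class amounts to a lookup keyed on the type of the configuration object. The sketch below assumes an exact-type lookup and uses hypothetical stand-in classes with the same names as two of the entries above. The lookup strategy inside the library may differ, so treat this as an illustration only.

```python
class BertConfig:
    """Hypothetical stand-in, not the library's BertConfig."""


class XLNetConfig:
    """Hypothetical stand-in, not the library's XLNetConfig."""


# Abbreviated, hypothetical mapping from configuration class to model class name.
CONFIG_TO_MODEL_NAME = {
    BertConfig: "TFBertForQuestionAnswering",
    XLNetConfig: "TFXLNetForQuestionAnsweringSimple",
}


def model_name_for(config):
    """Look up the model class name for the exact type of config."""
    try:
        return CONFIG_TO_MODEL_NAME[type(config)]
    except KeyError:
        raise ValueError("Unrecognized configuration class: " + type(config).__name__)


print(model_name_for(BertConfig()))
print(model_name_for(XLNetConfig()))
```

A configuration object whose class is not in the mapping raises an error in this sketch rather than falling back to a default.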
Examples:
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained('bert-base-uncased')
>>> model = TFAutoModelForQuestionAnswering.from_config(config)
-
classmethod
from_pretrained
(pretrained_model_name_or_path, *model_args, **kwargs)[source]¶ Instantiate one of the model classes of the library—with a question answering head—from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
mobilebert – TFMobileBertForQuestionAnswering (MobileBERT model)
distilbert – TFDistilBertForQuestionAnswering (DistilBERT model)
albert – TFAlbertForQuestionAnswering (ALBERT model)
camembert – TFCamembertForQuestionAnswering (CamemBERT model)
xlm-roberta – TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
longformer – TFLongformerForQuestionAnswering (Longformer model)
roberta – TFRobertaForQuestionAnswering (RoBERTa model)
flaubert – TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
bert – TFBertForQuestionAnswering (BERT model)
xlnet – TFXLNetForQuestionAnsweringSimple (XLNet model)
xlm – TFXLMForQuestionAnsweringSimple (XLM model)
electra – TFElectraForQuestionAnswering (ELECTRA model)
funnel – TFFunnelForQuestionAnswering (Funnel Transformer model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters
pretrained_model_name_or_path –
Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
A path or url to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) – Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) –
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
The model is a model provided by the library (loaded with the model id string of a pretrained model).
The model was saved using save_pretrained() and is reloaded by supplying the save directory.
The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) –
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) – Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) – Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) – Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) – Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) –
Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Examples:
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')

>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('bert-base-uncased', output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained('./pt_model/bert_pt_model_config.json')
>>> model = TFAutoModelForQuestionAnswering.from_pretrained('./pt_model/bert_pytorch_model.bin', from_pt=True, config=config)
-
classmethod