Mixins & serialization methods
Mixins
The huggingface_hub library offers a range of mixins that can be used as parent classes for your objects, in order to provide simple upload and download functions.
PyTorch
class huggingface_hub.ModelHubMixin

A generic base Model Hub mixin. Define your own mixin for anything by inheriting from this class and overriding _from_pretrained and _save_pretrained to define custom logic for saving and loading your classes. See huggingface_hub.PyTorchModelHubMixin for an example.
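As a rough illustration, a custom subclass might look like the following minimal sketch. The class, the file name pytorch_model.bin and the use of hf_hub_download are illustrative assumptions, not requirements of the library; the exact arguments forwarded to _from_pretrained can vary between versions, so the sketch simply absorbs them.

```python
import os

import torch
from huggingface_hub import ModelHubMixin, hf_hub_download


class MyModel(torch.nn.Module, ModelHubMixin):
    """Toy PyTorch model using ModelHubMixin directly (illustrative only)."""

    def __init__(self, hidden_size: int = 16):
        super().__init__()
        self.layer = torch.nn.Linear(hidden_size, 1)

    def _save_pretrained(self, save_directory):
        # Called by save_pretrained(): write whatever files you need.
        torch.save(self.state_dict(), os.path.join(save_directory, "pytorch_model.bin"))

    @classmethod
    def _from_pretrained(cls, model_id, *args, **kwargs):
        # Called by from_pretrained(): the remaining arguments mirror the
        # parameters documented below (revision, cache_dir, force_download, ...).
        # This sketch only handles Hub repo ids, not local directories.
        weights = hf_hub_download(repo_id=model_id, filename="pytorch_model.bin")
        model = cls()
        model.load_state_dict(torch.load(weights, map_location="cpu"))
        return model
```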
from_pretrained

( pretrained_model_name_or_path: str, force_download: bool = False, resume_download: bool = False, proxies: typing.Dict = None, use_auth_token: typing.Optional[str] = None, cache_dir: typing.Optional[str] = None, local_files_only: bool = False, **model_kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A model id with a revision appended after @, e.g. dbmdz/bert-base-german-cased@main. The revision is the specific model version to use. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be a branch name, a tag name, or a commit id, so revision can be any identifier allowed by git.
  - A path to a directory containing model weights saved using save_pretrained, e.g. ./my_model_directory/.
  - None, if you are providing both the configuration and the state dictionary (with the keyword arguments config and state_dict, respectively).
- force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False) — Whether to attempt to resume the download of an incompletely received file instead of deleting it.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- cache_dir (Union[str, os.PathLike], optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- local_files_only (bool, optional, defaults to False) — Whether to only look at local files (i.e. do not try to download the model).
- model_kwargs (Dict, optional) — Additional keyword arguments passed to the model during initialization.
Instantiate a pretrained PyTorch model from a pretrained model configuration hosted on the Hugging Face Hub. The model is set in evaluation mode by default using model.eval() (dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Passing use_auth_token=True is required when you want to use a private model.
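A minimal usage sketch with a mixin-based class such as PyTorchModelHubMixin; the model definition and repository id below are hypothetical:

```python
import torch
from huggingface_hub import PyTorchModelHubMixin


class MyModel(torch.nn.Module, PyTorchModelHubMixin):
    def __init__(self, config=None):
        super().__init__()
        # If the repo contains a config.json, it may be forwarded here
        # as the `config` keyword argument.
        self.layer = torch.nn.Linear(4, 2)


# Hypothetical repository id; use_auth_token=True is needed for private repos.
model = MyModel.from_pretrained("username/my-awesome-model", use_auth_token=True)
model.eval()  # already the default after loading
```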
push_to_hub

( repo_path_or_name: typing.Optional[str] = None, repo_url: typing.Optional[str] = None, commit_message: typing.Optional[str] = 'Add model', organization: typing.Optional[str] = None, private: typing.Optional[bool] = None, api_endpoint: typing.Optional[str] = None, use_auth_token: typing.Union[bool, str, NoneType] = None, git_user: typing.Optional[str] = None, git_email: typing.Optional[str] = None, config: typing.Optional[dict] = None, skip_lfs_files: bool = False )
Parameters

- repo_path_or_name (str, optional) — Can either be a repository name for your model or tokenizer in the Hub or a path to a local folder (in which case the repository will have the name of that local folder). If not specified, will default to the name given by repo_url, and a local directory with that name will be created.
- repo_url (str, optional) — Specify this in case you want to push to an existing repository in the Hub. If unspecified, a new repository will be created in your namespace (unless you specify an organization) with repo_name.
- commit_message (str, optional) — Message to commit while pushing. Will default to "add config", "add tokenizer" or "add model" depending on the type of the class.
- organization (str, optional) — Organization in which you want to push your model or tokenizer (you must be a member of this organization).
- private (bool, optional) — Whether the repository created should be private.
- api_endpoint (str, optional) — The API endpoint to use when pushing the model to the Hub.
- use_auth_token (bool or str, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface). Will default to True if repo_url is not specified.
- git_user (str, optional) — Will override the git config user.name for committing and pushing files to the Hub.
- git_email (str, optional) — Will override the git config user.email for committing and pushing files to the Hub.
- config (dict, optional) — Configuration object to be saved alongside the model weights.
- skip_lfs_files (bool, optional, defaults to False) — Whether to skip git-LFS files or not.
Upload model checkpoint or tokenizer files to the Hub while synchronizing a local clone of the repo in repo_path_or_name.
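A minimal sketch, assuming model is an instance of a mixin-based class such as the one above and that the repository name is one you want to create under your namespace:

```python
# Pushes the current weights and creates or updates the repository;
# the repository name here is hypothetical.
model.push_to_hub(
    repo_path_or_name="my-awesome-model",
    commit_message="Add model",
    use_auth_token=True,
)
```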
save_pretrained

( save_directory: str, config: typing.Optional[dict] = None, push_to_hub: bool = False, **kwargs )
Parameters

- save_directory (str) — Directory in which to save the weights.
- config (dict, optional) — Configuration object (must be a dict) to save alongside the weights.
- push_to_hub (bool, optional, defaults to False) — Set to True to also push your weights to the Hugging Face Hub.
- kwargs (Dict, optional) — Additional keyword arguments passed to push_to_hub.
Save weights in a local directory.
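For example, assuming model is a mixin-based instance as above, a local save might look like this; the directory name and config contents are arbitrary:

```python
# Writes the weights (and an optional config) to the given directory.
model.save_pretrained("./my_model_directory/", config={"num_classes": 2})

# Setting push_to_hub=True additionally forwards the remaining kwargs
# to push_to_hub (documented above).
```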
Keras
huggingface_hub.from_pretrained_keras

( *args, **kwargs )
Parameters

- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A model id with a revision appended after @, e.g. dbmdz/bert-base-german-cased@main. The revision is the specific model version to use. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be a branch name, a tag name, or a commit id, so revision can be any identifier allowed by git.
  - A path to a directory containing model weights saved using save_pretrained, e.g. ./my_model_directory/.
  - None, if you are providing both the configuration and the state dictionary (with the keyword arguments config and state_dict, respectively).
- force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (bool, optional, defaults to False) — Whether to attempt to resume the download of an incompletely received file instead of deleting it.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- cache_dir (Union[str, os.PathLike], optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- local_files_only (bool, optional, defaults to False) — Whether to only look at local files (i.e. do not try to download the model).
- model_kwargs (Dict, optional) — Additional keyword arguments passed to the model during initialization.
Instantiate a pretrained Keras model from the Hub. The model is expected to be in SavedModel format.

Passing use_auth_token=True is required when you want to use a private model.
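A usage sketch; the repository id below is only an example of a repo that stores a Keras model in SavedModel format:

```python
from huggingface_hub import from_pretrained_keras

# Downloads the repository and reloads the SavedModel; pass
# use_auth_token=True for a private repository.
model = from_pretrained_keras("keras-io/mobile-vit-xxs")
```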
huggingface_hub.push_to_hub_keras

( model, repo_path_or_name: typing.Optional[str] = None, repo_url: typing.Optional[str] = None, log_dir: typing.Optional[str] = None, commit_message: typing.Optional[str] = 'Add model', organization: typing.Optional[str] = None, private: typing.Optional[bool] = None, api_endpoint: typing.Optional[str] = None, use_auth_token: typing.Union[bool, str, NoneType] = True, git_user: typing.Optional[str] = None, git_email: typing.Optional[str] = None, config: typing.Optional[dict] = None, include_optimizer: typing.Optional[bool] = False, tags: typing.Union[list, str, NoneType] = None, plot_model: typing.Optional[bool] = True, **model_save_kwargs )
Parameters

- model (Keras.Model) — The Keras model you’d like to push to the Hub. The model must be compiled and built.
- repo_path_or_name (str, optional) — Can either be a repository name for your model or tokenizer in the Hub or a path to a local folder (in which case the repository will have the name of that local folder). If not specified, will default to the name given by repo_url, and a local directory with that name will be created.
- repo_url (str, optional) — Specify this in case you want to push to an existing repository in the Hub. If unspecified, a new repository will be created in your namespace (unless you specify an organization) with repo_name.
- log_dir (str, optional) — TensorBoard logging directory to be pushed. The Hub automatically hosts and displays a TensorBoard instance if log files are included in the repository.
- commit_message (str, optional, defaults to "Add model") — Message to commit while pushing.
- organization (str, optional) — Organization in which you want to push your model or tokenizer (you must be a member of this organization).
- private (bool, optional) — Whether the repository created should be private.
- api_endpoint (str, optional) — The API endpoint to use when pushing the model to the Hub.
- use_auth_token (bool or str, optional, defaults to True) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- git_user (str, optional) — Will override the git config user.name for committing and pushing files to the Hub.
- git_email (str, optional) — Will override the git config user.email for committing and pushing files to the Hub.
- config (dict, optional) — Configuration object to be saved alongside the model weights.
- include_optimizer (bool, optional, defaults to False) — Whether or not to include the optimizer during serialization.
- tags (Union[list, str], optional) — List of tags that are related to the model, or a string of a single tag. See example tags here.
- plot_model (bool, optional, defaults to True) — Setting this to True will plot the model and put it in the model card. Requires graphviz and pydot to be installed.
- model_save_kwargs (dict, optional) — Additional keyword arguments passed to tf.keras.models.save_model().
Upload model checkpoint or tokenizer files to the Hub while synchronizing a local clone of the repo in repo_path_or_name.
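A minimal sketch, assuming a small compiled Keras model and a hypothetical repository name:

```python
import tensorflow as tf
from huggingface_hub import push_to_hub_keras

# push_to_hub_keras expects a compiled and built model.
model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

push_to_hub_keras(
    model,
    repo_path_or_name="my-keras-model",  # hypothetical repository name
    commit_message="Add model",
    tags=["regression"],
    include_optimizer=False,
)
```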
huggingface_hub.save_pretrained_keras

( model, save_directory: str, config: typing.Union[typing.Dict[str, typing.Any], NoneType] = None, include_optimizer: typing.Optional[bool] = False, plot_model: typing.Optional[bool] = True, tags: typing.Union[list, str, NoneType] = None, **model_save_kwargs )
Parameters

- model (Keras.Model) — The Keras model you’d like to save. The model must be compiled and built.
- save_directory (str) — Directory in which to save the Keras model.
- config (dict, optional) — Configuration object to be saved alongside the model weights.
- include_optimizer (bool, optional, defaults to False) — Whether or not to include the optimizer in the serialization.
- plot_model (bool, optional, defaults to True) — Setting this to True will plot the model and put it in the model card. Requires graphviz and pydot to be installed.
- tags (Union[str, list], optional) — List of tags that are related to the model, or a string of a single tag. See example tags here.
- model_save_kwargs (dict, optional) — Additional keyword arguments passed to tf.keras.models.save_model().
Saves a Keras model to save_directory in SavedModel format. Use this if you’re using the Functional or Sequential APIs.
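A minimal sketch of a local save; the model definition and directory name are arbitrary:

```python
import tensorflow as tf
from huggingface_hub import save_pretrained_keras

model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# Serializes the model to ./my_keras_model/ in SavedModel format.
save_pretrained_keras(model, "./my_keras_model/", include_optimizer=False)
```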
class huggingface_hub.KerasModelHubMixin

Mixin to provide model Hub upload/download capabilities to Keras models. Inherit from this class to obtain the following internal methods:
- _from_pretrained, to load a model from the Hub or from local files.
- _save_pretrained, to save a model in the SavedModel format.
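A sketch of a subclassed Keras model using the mixin; the class and layer choices are illustrative:

```python
import tensorflow as tf
from huggingface_hub import KerasModelHubMixin


class MyKerasModel(tf.keras.Model, KerasModelHubMixin):
    def __init__(self, **kwargs):
        super().__init__()
        self.dense = tf.keras.layers.Dense(2)

    def call(self, inputs):
        return self.dense(inputs)


# Via the mixin, instances gain save_pretrained / push_to_hub and the
# class gains from_pretrained.
```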
Fastai
huggingface_hub.from_pretrained_fastai

( repo_id: str, revision: typing.Optional[str] = None )
Parameters

- repo_id (str) — The location of the pickled fastai.Learner. It can be either of the two:
  - Hosted on the Hugging Face Hub. E.g.: 'espejelomar/fatai-pet-breeds-classification' or 'distilgpt2'. You can add a revision by appending @ at the end of repo_id, e.g.: dbmdz/bert-base-german-cased@main. Revision is the specific model version to use. Since we use a git-based system for storing models and other artifacts on the Hugging Face Hub, it can be a branch name, a tag name, or a commit id.
  - Hosted locally. repo_id would be a directory containing the pickle and a pyproject.toml indicating the fastai and fastcore versions used to build the fastai.Learner. E.g.: ./my_model_directory/.
- revision (str, optional) — Revision at which the repo’s files are downloaded. See the documentation of snapshot_download.
Load a pretrained fastai model from the Hub or from a local directory.
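A usage sketch; the repository id is hypothetical, and a local directory path works the same way:

```python
from huggingface_hub import from_pretrained_fastai

# Loads the pickled fastai.Learner from the Hub or, if given a local path
# such as "./my_model_directory/", from disk.
learner = from_pretrained_fastai("username/my-fastai-learner")
```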
huggingface_hub.push_to_hub_fastai

( learner, repo_id: str, commit_message: typing.Optional[str] = 'Add model', private: typing.Optional[bool] = None, token: typing.Optional[str] = None, config: typing.Optional[dict] = None, **kwargs )
Parameters
- learner (Learner) — The *fastai.Learner’ you’d like to push to the Hub.
- repo_id (str) — The repository id for your model in Hub in the format of “namespace/repo_name”. The namespace can be your individual account or an organization to which you have write access (for example, ‘stanfordnlp/stanza-de’).
-
commit_message (str`, optional*) — Message to commit while pushing. Will default to
"add model"
. - private (bool, optional) — Whether or not the repository created should be private.
-
token (str, optional) —
The Hugging Face account token to use as HTTP bearer authorization for remote files. If
None
, the token will be asked by a prompt. - config (dict, optional) — Configuration object to be saved alongside the model weights.
Upload learner checkpoint files to the Hub while synchronizing a local clone of the repo in repo_id.
Keyword Args:
- api_endpoint (str, optional) — The API endpoint to use when pushing the model to the Hub.
- git_user (str, optional) — Will override the git config user.name for committing and pushing files to the Hub.
- git_email (str, optional) — Will override the git config user.email for committing and pushing files to the Hub.
Raises the following error:
- ValueError if the user is not logged in to the Hugging Face Hub.
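A minimal sketch, assuming an existing fastai.Learner; the repository ids here are hypothetical:

```python
from huggingface_hub import from_pretrained_fastai, push_to_hub_fastai

# Reuse an existing Learner loaded from the Hub (see from_pretrained_fastai above).
learner = from_pretrained_fastai("username/my-fastai-learner")

push_to_hub_fastai(
    learner=learner,
    repo_id="username/my-new-fastai-learner",
    commit_message="Add model",
)
```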