Hub Python Library documentation

Hugging Face Hub API

You are viewing v0.8.1 version. A newer version v0.22.2 is available.

Below is the documentation for the HfApi class, which serves as a Python wrapper for the Hugging Face Hub’s API.

All methods of the HfApi class are also accessible directly from the package’s root. Both approaches are detailed below.

The following approach uses the method from the root of the package:

from huggingface_hub import list_models

models = list_models()

The following approach uses the HfApi class:

from huggingface_hub import HfApi

hf_api = HfApi()
models = hf_api.list_models()

Using the HfApi class directly lets you set an endpoint other than that of the Hugging Face Hub.
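As a minimal sketch of how a custom endpoint might be chosen before constructing the client: the helper `resolve_endpoint` and the environment variable name `MY_HUB_ENDPOINT` are hypothetical and not part of huggingface_hub. Only the `endpoint` argument of `HfApi` comes from this page.

```python
import os

DEFAULT_ENDPOINT = "https://huggingface.co"  # address of the public Hub


def resolve_endpoint(environ=None, variable="MY_HUB_ENDPOINT"):
    """Return the endpoint stored in `environ`, or the public Hub if unset.

    `MY_HUB_ENDPOINT` is a made-up variable name used for illustration only.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(variable, "").strip()
    # Drop a trailing slash so the value can be joined with API paths.
    return value.rstrip("/") if value else DEFAULT_ENDPOINT


# Usage (requires huggingface_hub to be installed):
#   from huggingface_hub import HfApi
#   api = HfApi(endpoint=resolve_endpoint())
```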

class huggingface_hub.HfApi

( endpoint = None )

create_commit

( repo_id: str operations: typing.Iterable[typing.Union[huggingface_hub._commit_api.CommitOperationAdd, huggingface_hub._commit_api.CommitOperationDelete]] commit_message: str commit_description: typing.Optional[str] = None token: typing.Optional[str] = None repo_type: typing.Optional[str] = None revision: typing.Optional[str] = None create_pr: typing.Optional[bool] = None num_threads: int = 5 ) str or None

Parameters

  • repo_id (str) — The repository in which the commit will be created, for example: "username/custom_transformers"
  • operations (Iterable of CommitOperation) — An iterable of operations to include in the commit, either:

    • CommitOperationAdd to upload a file
    • CommitOperationDelete to delete a file
  • commit_message (str) — The summary (first line) of the commit that will be created.
  • commit_description (str, optional) — The description of the commit that will be created
  • token (str, optional) — Authentication token, obtained with HfApi.login method. Will default to the stored token.
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
  • revision (str, optional) — The git revision to commit from. Defaults to the head of the "main" branch.
  • create_pr (boolean, optional) — Whether or not to create a Pull Request from revision with that commit. Defaults to False. If set to True, this function will return the URL to the newly created Pull Request on the Hub.
  • num_threads (int, optional) — Number of concurrent threads for uploading files. Defaults to 5. Setting it to 2 means at most 2 files will be uploaded concurrently.

Returns

str or None

If create_pr is True, returns the URL to the newly created Pull Request on the Hub. Otherwise returns None.

Creates a commit in the given repo, deleting & uploading files as needed.
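One way to prepare the `operations` argument is to decide first which paths to add and which to delete. The sketch below does only that planning step in plain Python. `plan_commit` is a hypothetical helper, and it assumes every local file is re-uploaded and every remote path missing locally is deleted. The commented usage assumes `CommitOperationAdd` and `CommitOperationDelete` are importable from the package root.

```python
def plan_commit(local_files, remote_paths):
    """Decide what a commit should add and delete.

    local_files: dict mapping path_in_repo -> local file path
    remote_paths: iterable of paths currently present in the repo
    Returns (to_add, to_delete), where to_add is a sorted list of
    (path_in_repo, local_path) pairs and to_delete a sorted list of paths.
    """
    to_add = sorted(local_files.items())
    # Anything on the Hub that no longer exists locally is scheduled for deletion.
    to_delete = sorted(set(remote_paths) - set(local_files))
    return to_add, to_delete


# Turning the plan into a commit (requires huggingface_hub and a valid token):
#   from huggingface_hub import HfApi, CommitOperationAdd, CommitOperationDelete
#   to_add, to_delete = plan_commit({"weights.bin": "./out/weights.bin"}, ["old.bin"])
#   operations = [
#       CommitOperationAdd(path_in_repo=p, path_or_fileobj=src) for p, src in to_add
#   ] + [CommitOperationDelete(path_in_repo=p) for p in to_delete]
#   HfApi().create_commit(
#       repo_id="username/custom_transformers",
#       operations=operations,
#       commit_message="Sync local checkpoint",
#   )
```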

create_repo

( repo_id: str = None token: typing.Optional[str] = None organization: typing.Optional[str] = None private: typing.Optional[bool] = None repo_type: typing.Optional[str] = None exist_ok: typing.Optional[bool] = False space_sdk: typing.Optional[str] = None name: typing.Optional[str] = None ) str

Parameters

  • repo_id (str) — A namespace (user or an organization) and a repo name separated by a /.

    Version added: 0.5

  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • private (bool, optional) — Whether the model repo should be private.
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
  • exist_ok (bool, optional, defaults to False) — If True, do not raise an error if repo already exists.
  • space_sdk (str, optional) — Choice of SDK to use if repo_type is “space”. Can be “streamlit”, “gradio”, or “static”.

Returns

str

URL to the newly created repo.

Create an empty repo on the HuggingFace Hub.
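A common pattern is an idempotent "create if missing" call built on `exist_ok`. The helper below is a hedged sketch: `ensure_repo` is a hypothetical name, and `api` is expected to be an `HfApi` instance (any object exposing a compatible `create_repo` works, which keeps the sketch testable offline).

```python
def ensure_repo(api, repo_id, private=False, repo_type=None):
    """Create `repo_id` if needed and return the URL reported by the client."""
    return api.create_repo(
        repo_id=repo_id,
        private=private,
        repo_type=repo_type,
        exist_ok=True,  # do not raise if the repo already exists
    )


# Usage (requires huggingface_hub and a stored or explicit token):
#   from huggingface_hub import HfApi
#   url = ensure_repo(HfApi(), "username/custom_transformers", private=True)
```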

dataset_info

( repo_id: str revision: typing.Optional[str] = None token: typing.Optional[str] = None timeout: typing.Optional[float] = None ) DatasetInfo

Parameters

  • repo_id (str) — A namespace (user or an organization) and a repo name separated by a /.
  • revision (str, optional) — The revision of the dataset repository from which to get the information.
  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • timeout (float, optional) — Timeout, in seconds, for the request to the Hub.

Returns

DatasetInfo

The dataset repository information.

Get info on one specific dataset on huggingface.co.

Dataset can be private if you pass an acceptable token.

Raises the following errors:

  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
  • RevisionNotFoundError If the revision to download from cannot be found.

delete_file

( path_in_repo: str repo_id: str token: typing.Optional[str] = None repo_type: typing.Optional[str] = None revision: typing.Optional[str] = None commit_message: typing.Optional[str] = None commit_description: typing.Optional[str] = None create_pr: typing.Optional[bool] = None )

Parameters

  • path_in_repo (str) — Relative filepath in the repo, for example: "checkpoints/1fec34a/weights.bin"
  • repo_id (str) — The repository from which the file will be deleted, for example: "username/custom_transformers"
  • token (str, optional) — Authentication token, obtained with HfApi.login method. Will default to the stored token.
  • repo_type (str, optional) — Set to "dataset" or "space" if the file is in a dataset or space, None or "model" if in a model. Default is None.
  • revision (str, optional) — The git revision to commit from. Defaults to the head of the "main" branch.
  • commit_message (str, optional) — The summary / title / first line of the generated commit. Defaults to f"Delete {path_in_repo} with huggingface_hub".
  • commit_description (str, optional) — The description of the generated commit.
  • create_pr (boolean, optional) — Whether or not to create a Pull Request from revision with the changes. Defaults to False.

Deletes a file in the given repo.

Raises the following errors:

delete_repo

( repo_id: str = None token: typing.Optional[str] = None repo_type: typing.Optional[str] = None )

Parameters

  • repo_id (str) — A namespace (user or an organization) and a repo name separated by a /.

    Version added: 0.5

  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model.

Delete a repo from the HuggingFace Hub. CAUTION: this is irreversible.

Raises the following errors:

  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

get_dataset_tags

( )

Gets all valid dataset tags as a nested namespace object.

get_full_repo_name

( model_id: str organization: typing.Optional[str] = None token: typing.Optional[str] = None ) str

Parameters

  • model_id (str) — The name of the model.
  • organization (str, optional) — If passed, the repository name will be in the organization namespace instead of the user namespace.
  • token (str, optional) — The Hugging Face authentication token

Returns

str

The repository name in the user’s namespace ({username}/{model_id}) if no organization is passed, and under the organization namespace ({organization}/{model_id}) otherwise.

Returns the repository name for a given model ID and optional organization.
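The naming rule described above can be sketched in a few lines. This is an illustration of the rule only, not the library's implementation: the real method looks up the username from the token, whereas the hypothetical `sketch_full_repo_name` takes it as an argument.

```python
def sketch_full_repo_name(model_id, username, organization=None):
    """Return "{organization}/{model_id}" if an organization is given,
    otherwise "{username}/{model_id}"."""
    namespace = organization if organization is not None else username
    return f"{namespace}/{model_id}"
```

For instance, `sketch_full_repo_name("my-model", "alice")` gives `"alice/my-model"`, and passing `organization="acme"` gives `"acme/my-model"`.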

get_model_tags

( )

Gets all valid model tags as a nested namespace object.

list_datasets

( filter: typing.Union[huggingface_hub.utils.endpoint_helpers.DatasetFilter, str, typing.Iterable[str], NoneType] = None author: typing.Optional[str] = None search: typing.Optional[str] = None sort: typing.Union[typing.Literal['lastModified'], str, NoneType] = None direction: typing.Optional[typing.Literal[-1]] = None limit: typing.Optional[int] = None cardData: typing.Optional[bool] = None full: typing.Optional[bool] = None use_auth_token: typing.Optional[str] = None )

Parameters

  • filter (DatasetFilter or str or Iterable, optional) — A string or DatasetFilter which can be used to identify datasets on the hub.
  • author (str, optional) — A string which identifies the author of the returned datasets.
  • search (str, optional) — A string that will be contained in the names of the returned datasets.
  • sort (Literal["lastModified"] or str, optional) — The key with which to sort the resulting datasets. Possible values are the properties of the DatasetInfo class.
  • direction (Literal[-1] or int, optional) — Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.
  • limit (int, optional) — The limit on the number of datasets fetched. Leaving this option to None fetches all datasets.
  • cardData (bool, optional) — Whether to grab the metadata for the dataset as well. Can contain useful information such as the PapersWithCode ID.
  • full (bool, optional) — Whether to fetch all dataset data, including the lastModified and the cardData.
  • use_auth_token (bool or str, optional) — Whether to use the auth_token provided from the huggingface_hub cli. If not logged in, a valid auth_token can be passed in as a string.

Get the public list of all the datasets on huggingface.co.

Example usage with the filter argument:

>>> from huggingface_hub import DatasetFilter, DatasetSearchArguments, HfApi

>>> api = HfApi()

>>> # List all datasets
>>> api.list_datasets()

>>> # Get all valid search arguments
>>> args = DatasetSearchArguments()

>>> # List only the text classification datasets
>>> api.list_datasets(filter="task_categories:text-classification")
>>> # Using the `DatasetFilter`
>>> filt = DatasetFilter(task_categories="text-classification")
>>> # With `DatasetSearchArguments`
>>> filt = DatasetFilter(task_categories=args.task_categories.text_classification)
>>> api.list_datasets(filter=filt)

>>> # List only the datasets in russian for language modeling
>>> api.list_datasets(
...     filter=("languages:ru", "task_ids:language-modeling")
... )
>>> # Using the `DatasetFilter`
>>> filt = DatasetFilter(languages="ru", task_ids="language-modeling")
>>> # With `DatasetSearchArguments`
>>> filt = DatasetFilter(
...     languages=args.languages.ru,
...     task_ids=args.task_ids.language_modeling,
... )
>>> api.list_datasets(filter=filt)

Example usage with the search argument:

>>> from huggingface_hub import HfApi

>>> api = HfApi()

>>> # List all datasets with "text" in their name
>>> api.list_datasets(search="text")

>>> # List all datasets with "text" in their name made by google
>>> api.list_datasets(search="text", author="google")
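The search call above returns info objects, so a small wrapper can reduce them to repository ids. This is a sketch under two assumptions: `dataset_ids` is a hypothetical helper, and each returned `DatasetInfo` exposes an `id` attribute. `api` is expected to be an `HfApi` instance.

```python
def dataset_ids(api, search, author=None, limit=10):
    """Return the ids of at most `limit` datasets whose name contains `search`."""
    results = api.list_datasets(search=search, author=author, limit=limit)
    # Assumption: every returned info object has an `id` attribute.
    return [info.id for info in results]


# Usage (requires huggingface_hub and network access):
#   from huggingface_hub import HfApi
#   print(dataset_ids(HfApi(), "text", author="google"))
```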

list_metrics

( ) List[MetricInfo]

Returns

List[MetricInfo]

A list of MetricInfo objects.

Get the public list of all the metrics on huggingface.co.

list_models

( filter: typing.Union[huggingface_hub.utils.endpoint_helpers.ModelFilter, str, typing.Iterable[str], NoneType] = None author: typing.Optional[str] = None search: typing.Optional[str] = None emissions_thresholds: typing.Union[typing.Tuple[float, float], NoneType] = None sort: typing.Union[typing.Literal['lastModified'], str, NoneType] = None direction: typing.Optional[typing.Literal[-1]] = None limit: typing.Optional[int] = None full: typing.Optional[bool] = None cardData: typing.Optional[bool] = None fetch_config: typing.Optional[bool] = None use_auth_token: typing.Union[bool, str, NoneType] = None )

Parameters

  • filter (ModelFilter or str or Iterable, optional) — A string or ModelFilter which can be used to identify models on the Hub.
  • author (str, optional) — A string which identifies the author (user or organization) of the returned models.
  • search (str, optional) — A string that will be contained in the names of the returned models.
  • emissions_thresholds (Tuple, optional) — A tuple of two ints or floats representing the minimum and maximum carbon footprint, in grams, used to filter the resulting models.
  • sort (Literal["lastModified"] or str, optional) — The key with which to sort the resulting models. Possible values are the properties of the ModelInfo class.
  • direction (Literal[-1] or int, optional) — Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.
  • limit (int, optional) — The limit on the number of models fetched. Leaving this option to None fetches all models.
  • full (bool, optional) — Whether to fetch all model data, including the lastModified, the sha, the files and the tags. This is set to True by default when using a filter.
  • cardData (bool, optional) — Whether to grab the metadata for the model as well. Can contain useful information such as carbon emissions, metrics, and datasets trained on.
  • fetch_config (bool, optional) — Whether to fetch the model configs as well. This is not included in full due to its size.
  • use_auth_token (bool or str, optional) — Whether to use the auth_token provided from the huggingface_hub cli. If not logged in, a valid auth_token can be passed in as a string.

Get the public list of all the models on huggingface.co.

Example usage with the filter argument:

>>> from huggingface_hub import HfApi, ModelFilter, ModelSearchArguments

>>> api = HfApi()

>>> # List all models
>>> api.list_models()

>>> # Get all valid search arguments
>>> args = ModelSearchArguments()

>>> # List only the text classification models
>>> api.list_models(filter="text-classification")
>>> # Using the `ModelFilter`
>>> filt = ModelFilter(task="text-classification")
>>> # With `ModelSearchArguments`
>>> filt = ModelFilter(task=args.pipeline_tags.TextClassification)
>>> api.list_models(filter=filt)

>>> # Using `ModelFilter` and `ModelSearchArguments` to find text classification in both PyTorch and TensorFlow
>>> filt = ModelFilter(
...     task=args.pipeline_tags.TextClassification,
...     library=[args.library.PyTorch, args.library.TensorFlow],
... )
>>> api.list_models(filter=filt)

>>> # List only models from the AllenNLP library
>>> api.list_models(filter="allennlp")
>>> # Using `ModelFilter` and `ModelSearchArguments`
>>> filt = ModelFilter(library=args.library.allennlp)

Example usage with the search argument:

>>> from huggingface_hub import HfApi

>>> api = HfApi()

>>> # List all models with "bert" in their name
>>> api.list_models(search="bert")

>>> # List all models with "bert" in their name made by google
>>> api.list_models(search="bert", author="google")
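The `sort`, `direction` and `limit` parameters documented above can be combined to fetch only the most recently modified models. The helper name `newest_models` is hypothetical. `api` is expected to be an `HfApi` instance, and the call mirrors the parameter descriptions on this page rather than any tested server behaviour.

```python
def newest_models(api, n=5, author=None):
    """Return up to `n` models, most recently modified first."""
    return list(
        api.list_models(
            author=author,
            sort="lastModified",
            direction=-1,  # -1 means descending order
            limit=n,
        )
    )


# Usage (requires huggingface_hub and network access):
#   from huggingface_hub import HfApi
#   for model in newest_models(HfApi(), n=3, author="google"):
#       print(model)
```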

list_repo_files

( repo_id: str revision: typing.Optional[str] = None repo_type: typing.Optional[str] = None token: typing.Optional[str] = None timeout: typing.Optional[float] = None ) List[str]

Parameters

  • repo_id (str) — A namespace (user or an organization) and a repo name separated by a /.
  • revision (str, optional) — The revision of the model repository from which to get the information.
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • timeout (float, optional) — Timeout, in seconds, for the request to the Hub.

Returns

List[str]

the list of files in a given repository.

Get the list of files in a given repo.
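Because the method returns plain path strings, filtering them is ordinary list processing. A minimal sketch follows, with `files_with_suffix` as a hypothetical helper and `api` expected to be an `HfApi` instance.

```python
def files_with_suffix(api, repo_id, suffix, revision=None, repo_type=None):
    """Return the sorted repo paths that end with `suffix` (for example ".json")."""
    files = api.list_repo_files(
        repo_id=repo_id, revision=revision, repo_type=repo_type
    )
    return sorted(path for path in files if path.endswith(suffix))


# Usage (requires huggingface_hub and network access):
#   from huggingface_hub import HfApi
#   print(files_with_suffix(HfApi(), "username/custom_transformers", ".json"))
```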

model_info

( repo_id: str revision: typing.Optional[str] = None token: typing.Optional[str] = None timeout: typing.Optional[float] = None securityStatus: typing.Optional[bool] = None ) huggingface_hub.hf_api.ModelInfo

Parameters

  • repo_id (str) — A namespace (user or an organization) and a repo name separated by a /.
  • revision (str, optional) — The revision of the model repository from which to get the information.
  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • timeout (float, optional) — Timeout, in seconds, for the request to the Hub.
  • securityStatus (bool, optional) — Whether to retrieve the security status from the model repository as well.

Returns

huggingface_hub.hf_api.ModelInfo

The model repository information.

Get info on one specific model on huggingface.co.

Model can be private if you pass an acceptable token or are logged in.

Raises the following errors:

  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
  • RevisionNotFoundError If the revision to download from cannot be found.

move_repo

( from_id: str to_id: str repo_type: typing.Optional[str] = None token: typing.Optional[str] = None )

Parameters

  • from_id (str) — A namespace (user or an organization) and a repo name separated by a /. Original repository identifier.
  • to_id (str) — A namespace (user or an organization) and a repo name separated by a /. Final repository identifier.
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)

Moves a repository from namespace1/repo_name1 to namespace2/repo_name2.

Note there are certain limitations. For more information about moving repositories, please see https://hf.co/docs/hub/main#how-can-i-rename-or-transfer-a-repo.

Raises the following errors:

  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
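Renaming a repository within the same namespace is a special case of moving it. The sketch below builds the target identifier and delegates to `move_repo`. `rename_repo` is a hypothetical helper and `api` is expected to be an `HfApi` instance.

```python
def rename_repo(api, repo_id, new_name, repo_type=None):
    """Rename `namespace/old_name` to `namespace/new_name` and return the new id."""
    namespace, _, old_name = repo_id.partition("/")
    if not namespace or not old_name:
        raise ValueError("repo_id must look like 'namespace/name'")
    to_id = f"{namespace}/{new_name}"
    api.move_repo(from_id=repo_id, to_id=to_id, repo_type=repo_type)
    return to_id


# Usage (requires huggingface_hub and a token with write access):
#   from huggingface_hub import HfApi
#   rename_repo(HfApi(), "username/old-name", "new-name")
```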

repo_info

( repo_id: str revision: typing.Optional[str] = None repo_type: typing.Optional[str] = None token: typing.Optional[str] = None timeout: typing.Optional[float] = None ) Union[SpaceInfo, DatasetInfo, ModelInfo]

Parameters

  • repo_id (str) — A namespace (user or an organization) and a repo name separated by a /.
  • revision (str, optional) — The revision of the repository from which to get the information.
  • repo_type (str, optional) — Set to "dataset" or "space" to get information about a dataset or space, None or "model" to get information about a model. Default is None.
  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • timeout (float, optional) — Timeout, in seconds, for the request to the Hub.

Returns

Union[SpaceInfo, DatasetInfo, ModelInfo]

The repository information.

Get the info object for a given repo of a given type.

Raises the following errors:

  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
  • RevisionNotFoundError If the revision to download from cannot be found.

set_access_token

( access_token: str )

Parameters

  • access_token (str) — The access token to save.

Saves the passed access token so git can correctly authenticate the user.

space_info

( repo_id: str revision: typing.Optional[str] = None token: typing.Optional[str] = None timeout: typing.Optional[float] = None ) SpaceInfo

Parameters

  • repo_id (str) — A namespace (user or an organization) and a repo name separated by a /.
  • revision (str, optional) — The revision of the space repository from which to get the information.
  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • timeout (float, optional) — Timeout, in seconds, for the request to the Hub.

Returns

SpaceInfo

The space repository information.

Get info on one specific Space on huggingface.co.

Space can be private if you pass an acceptable token.

Raises the following errors:

  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
  • RevisionNotFoundError If the revision to download from cannot be found.

unset_access_token

( )

Resets the user’s access token.

update_repo_visibility

( repo_id: str = None private: bool = False token: typing.Optional[str] = None organization: typing.Optional[str] = None repo_type: typing.Optional[str] = None name: str = None )

Parameters

  • repo_id (str, optional) — A namespace (user or an organization) and a repo name separated by a /.

    Version added: 0.5

  • private (bool, optional, defaults to False) — Whether the model repo should be private.
  • token (str, optional) — An authentication token (See https://huggingface.co/settings/token)
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

Update the visibility setting of a repository.

Raises the following errors:

  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

upload_file

( path_or_fileobj: typing.Union[str, bytes, typing.BinaryIO] path_in_repo: str repo_id: str token: typing.Optional[str] = None repo_type: typing.Optional[str] = None revision: typing.Optional[str] = None identical_ok: typing.Optional[bool] = None commit_message: typing.Optional[str] = None commit_description: typing.Optional[str] = None create_pr: typing.Optional[bool] = None ) str

Parameters

  • path_or_fileobj (str, bytes, or IO) — Path to a file on the local machine or binary data stream / fileobj / buffer.
  • path_in_repo (str) — Relative filepath in the repo, for example: "checkpoints/1fec34a/weights.bin"
  • repo_id (str) — The repository to which the file will be uploaded, for example: "username/custom_transformers"
  • token (str, optional) — Authentication token, obtained with HfApi.login method. Will default to the stored token.
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
  • revision (str, optional) — The git revision to commit from. Defaults to the head of the "main" branch.
  • identical_ok (bool, optional, defaults to True) — Deprecated: will be removed in 0.11.0. Changing this value has no effect.
  • commit_message (str, optional) — The summary / title / first line of the generated commit
  • commit_description (str, optional) — The description of the generated commit.
  • create_pr (boolean, optional) — Whether or not to create a Pull Request from revision with that commit. Defaults to False.

Returns

str

The URL to visualize the uploaded file on the hub

Upload a local file (up to 50 GB) to the given repo. The upload is done through an HTTP POST request, and doesn’t require git or git-lfs to be installed.

Raises the following errors:

  • HTTPError if the HuggingFace API returned an error
  • ValueError if some parameter value is invalid
  • RepositoryNotFoundError If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
  • RevisionNotFoundError If the revision to download from cannot be found.

Example usage:

>>> from huggingface_hub import upload_file

>>> with open("./local/filepath", "rb") as fobj:
...     upload_file(
...         path_or_fileobj=fobj,
...         path_in_repo="remote/file/path.h5",
...         repo_id="username/my-dataset",
...         repo_type="dataset",
...         token="my_token",
...     )
"https://huggingface.co/datasets/username/my-dataset/blob/main/remote/file/path.h5"

>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
... )
"https://huggingface.co/username/my-model/blob/main/remote/file/path.h5"

>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
...     create_pr=True,
... )
"https://huggingface.co/username/my-model/blob/refs%2Fpr%2F1/remote/file/path.h5"
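Since `path_or_fileobj` also accepts `bytes`, in-memory data can be uploaded without writing a temporary file. The sketch below serializes a dictionary to JSON and passes the bytes on. `upload_json` is a hypothetical helper and `api` is expected to be an `HfApi` instance.

```python
import json


def upload_json(api, repo_id, path_in_repo, payload, create_pr=False):
    """Serialize `payload` to JSON and upload it as `path_in_repo`."""
    data = json.dumps(payload, indent=2, sort_keys=True).encode("utf-8")
    return api.upload_file(
        path_or_fileobj=data,  # raw bytes, no local file needed
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        commit_message=f"Upload {path_in_repo}",
        create_pr=create_pr,
    )


# Usage (requires huggingface_hub and a token with write access):
#   from huggingface_hub import HfApi
#   upload_json(HfApi(), "username/my-model", "metrics.json", {"accuracy": 0.9})
```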

upload_folder

( repo_id: str folder_path: str path_in_repo: str commit_message: typing.Optional[str] = None commit_description: typing.Optional[str] = None token: typing.Optional[str] = None repo_type: typing.Optional[str] = None revision: typing.Optional[str] = None create_pr: typing.Optional[bool] = None ) str

Parameters

  • repo_id (str) — The repository to which the file will be uploaded, for example: "username/custom_transformers"
  • folder_path (str) — Path to the folder to upload on the local file system
  • path_in_repo (str) — Relative path of the directory in the repo, for example: "checkpoints/1fec34a/results"
  • token (str, optional) — Authentication token, obtained with HfApi.login method. Will default to the stored token.
  • repo_type (str, optional) — Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
  • revision (str, optional) — The git revision to commit from. Defaults to the head of the "main" branch.
  • commit_message (str, optional) — The summary / title / first line of the generated commit. Defaults to: f"Upload {path_in_repo} with huggingface_hub"
  • commit_description (str, optional) — The description of the generated commit.
  • create_pr (boolean, optional) — Whether or not to create a Pull Request from the pushed changes. Defaults to False.

Returns

str

A URL to visualize the uploaded folder on the hub

Upload a local folder to the given repo. The upload is done through HTTP requests, and doesn’t require git or git-lfs to be installed.

The structure of the folder will be preserved. Files with the same name already present in the repository will be overwritten, others will be left untouched.

Uses HfApi.create_commit under the hood.

Raises the following errors:

  • HTTPError if the HuggingFace API returned an error
  • ValueError if some parameter value is invalid

Example usage:

>>> from huggingface_hub import upload_folder

>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
... )
# "https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"

>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     create_pr=True,
... )
# "https://huggingface.co/datasets/username/my-dataset/tree/refs%2Fpr%2F1/remote/experiment/checkpoints"
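To preview how the folder structure is preserved, the mapping from local files to repo paths can be computed ahead of time. This is an illustrative sketch only: `planned_repo_paths` is a hypothetical helper that mimics the documented behaviour and is not the code `upload_folder` runs.

```python
import os
import posixpath


def planned_repo_paths(folder_path, path_in_repo):
    """Map each local file under `folder_path` to its expected path in the repo."""
    planned = {}
    for root, _dirs, files in os.walk(folder_path):
        for name in files:
            local = os.path.join(root, name)
            relative = os.path.relpath(local, folder_path)
            # Repo paths always use forward slashes, whatever the local OS.
            planned[local] = posixpath.join(path_in_repo, *relative.split(os.sep))
    return planned
```

Printing the values of `planned_repo_paths("local/checkpoints", "remote/experiment/checkpoints")` before uploading shows which existing files in the repository would be overwritten.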

whoami

( token: typing.Optional[str] = None )

Parameters

  • token (str, optional) — Hugging Face token. Will default to the locally saved token if not provided.

Calls the Hugging Face API to retrieve information about the user the token belongs to (“whoami”).
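The response can be used to look up the account name behind a token. The sketch below assumes the returned dictionary contains a `"name"` key and an optional `"orgs"` list of dictionaries that each have a `"name"`. That shape is an assumption, not something this page documents. `account_names` is a hypothetical helper and `api` is expected to be an `HfApi` instance.

```python
def account_names(api, token=None):
    """Return (username, [organization names]) for the given token."""
    info = api.whoami(token=token)
    # Assumed response shape: {"name": ..., "orgs": [{"name": ...}, ...]}
    orgs = [org["name"] for org in info.get("orgs", [])]
    return info["name"], orgs


# Usage (requires huggingface_hub and a stored or explicit token):
#   from huggingface_hub import HfApi
#   username, orgs = account_names(HfApi())
```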

Hugging Face local storage

huggingface_hub stores the authentication information locally so that it may be re-used in subsequent methods.

It does this using the HfFolder utility, which saves data in the user’s home directory.

class huggingface_hub.HfFolder

( )

delete_token

( )

Deletes the token from storage. Does not fail if token does not exist.

get_token

( ) str or None

Returns

str or None

The token, None if it doesn’t exist.

Get token or None if not existent.

Note that a token can be also provided using the HUGGING_FACE_HUB_TOKEN environment variable.

save_token

( token )

Parameters

  • token (str) — The token to save to the HfFolder

Save token, creating folder as needed.
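The note under get_token mentions two token sources: the `HUGGING_FACE_HUB_TOKEN` environment variable and the locally saved token. A minimal sketch of such a lookup follows. It assumes the environment variable takes precedence, which this page does not state. `resolve_token` is a hypothetical helper, and the token file location is passed in rather than guessed.

```python
def resolve_token(environ, token_path):
    """Return a token from the environment, else from `token_path`, else None."""
    token = environ.get("HUGGING_FACE_HUB_TOKEN")
    if token:
        return token  # assumption: the environment variable wins
    try:
        with open(token_path, encoding="utf-8") as handle:
            # An empty file is treated the same as a missing token.
            return handle.read().strip() or None
    except FileNotFoundError:
        return None
```

In application code the library's own `HfFolder.get_token()` should be preferred. The sketch is only meant to make the fallback order explicit.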

Filtering helpers

Some helpers to filter repositories on the Hub are available in the huggingface_hub package.

class huggingface_hub.DatasetFilter

( author: str = None benchmark: typing.Union[str, typing.List[str]] = None dataset_name: str = None language_creators: typing.Union[str, typing.List[str]] = None languages: typing.Union[str, typing.List[str]] = None multilinguality: typing.Union[str, typing.List[str]] = None size_categories: typing.Union[str, typing.List[str]] = None task_categories: typing.Union[str, typing.List[str]] = None task_ids: typing.Union[str, typing.List[str]] = None )

Parameters

  • author (str, optional) — A string or list of strings that can be used to identify datasets on the Hub by the original uploader (author or organization), such as facebook or huggingface.
  • benchmark (str or List, optional) — A string or list of strings that can be used to identify datasets on the Hub by their official benchmark.
  • dataset_name (str, optional) — A string or list of strings that can be used to identify datasets on the Hub by its name, such as SQAC or wikineural
  • language_creators (str or List, optional) — A string or list of strings that can be used to identify datasets on the Hub with how the data was curated, such as crowdsourced or machine_generated.
  • languages (str or List, optional) — A string or list of strings, each a two-character language code, to filter datasets by on the Hub.
  • multilinguality (str or List, optional) — A string or list of strings representing a filter for datasets that contain multiple languages.
  • size_categories (str or List, optional) — A string or list of strings that can be used to identify datasets on the Hub by the size of the dataset such as 100K<n<1M or 1M<n<10M.
  • task_categories (str or List, optional) — A string or list of strings that can be used to identify datasets on the Hub by the designed task, such as audio_classification or named_entity_recognition.
  • task_ids (str or List, optional) — A string or list of strings that can be used to identify datasets on the Hub by the specific task such as speech_emotion_recognition or paraphrase.

A class that converts human-readable dataset search parameters into ones compatible with the REST API. For all parameters capitalization does not matter.

Examples:

>>> from huggingface_hub import DatasetFilter

>>> # Using author
>>> new_filter = DatasetFilter(author="facebook")

>>> # Using benchmark
>>> new_filter = DatasetFilter(benchmark="raft")

>>> # Using dataset_name
>>> new_filter = DatasetFilter(dataset_name="wikineural")

>>> # Using language_creators
>>> new_filter = DatasetFilter(language_creators="crowdsourced")

>>> # Using languages
>>> new_filter = DatasetFilter(languages="en")

>>> # Using multilinguality
>>> new_filter = DatasetFilter(multilinguality="yes")

>>> # Using size_categories
>>> new_filter = DatasetFilter(size_categories="100K<n<1M")

>>> # Using task_categories
>>> new_filter = DatasetFilter(task_categories="audio_classification")

>>> # Using task_ids
>>> new_filter = DatasetFilter(task_ids="paraphrase")
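The list_datasets examples on this page pass filters as raw strings such as "languages:ru" and "task_ids:language-modeling". The sketch below shows how keyword arguments could be turned into strings of that `key:value` form. It illustrates the idea only and is not the library's conversion code. `sketch_filter_tags` is a hypothetical helper.

```python
def sketch_filter_tags(**criteria):
    """Turn keyword criteria into "key:value" strings, skipping None values."""
    tags = []
    for key, value in criteria.items():
        if value is None:
            continue
        # Accept either a single string or an iterable of strings.
        values = [value] if isinstance(value, str) else list(value)
        tags.extend(f"{key}:{item}" for item in values)
    return tuple(tags)
```

For example, `sketch_filter_tags(languages="ru", task_ids="language-modeling")` yields `("languages:ru", "task_ids:language-modeling")`, the same tuple used in the raw-string list_datasets example.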

class huggingface_hub.ModelFilter

( author: str = None library: typing.Union[str, typing.List[str]] = None language: typing.Union[str, typing.List[str]] = None model_name: str = None task: typing.Union[str, typing.List[str]] = None trained_dataset: typing.Union[str, typing.List[str]] = None tags: typing.Union[str, typing.List[str]] = None )

Parameters

  • author (str, optional) — A string that can be used to identify models on the Hub by the original uploader (author or organization), such as facebook or huggingface.
  • library (str or List, optional) — A string or list of strings of foundational libraries models were originally trained from, such as pytorch, tensorflow, or allennlp.
  • language (str or List, optional) — A string or list of strings of languages, both by name and country code, such as “en” or “English”
  • model_name (str, optional) — A string that contain complete or partial names for models on the Hub, such as “bert” or “bert-base-cased”
  • task (str or List, optional) — A string or list of strings of tasks models were designed for, such as: “fill-mask” or “automatic-speech-recognition”
  • tags (str or List, optional) — A string tag or a list of tags to filter models on the Hub by, such as text-generation or spacy.
  • trained_dataset (str or List, optional) — A string tag or a list of string tags of the trained dataset for a model on the Hub.

A class that converts human-readable model search parameters into ones compatible with the REST API. For all parameters capitalization does not matter.

>>> from huggingface_hub import ModelFilter

>>> # For the author
>>> new_filter = ModelFilter(author="facebook")

>>> # For the library
>>> new_filter = ModelFilter(library="pytorch")

>>> # For the language
>>> new_filter = ModelFilter(language="french")

>>> # For the model_name
>>> new_filter = ModelFilter(model_name="bert")

>>> # For the task
>>> new_filter = ModelFilter(task="text-classification")

>>> # Retrieving tags using the `HfApi.get_model_tags` method
>>> from huggingface_hub import HfApi

>>> api = HfApi()

>>> # To list model tags
>>> api.get_model_tags()

>>> # To list dataset tags
>>> api.get_dataset_tags()

>>> # Filtering by tag
>>> new_filter = ModelFilter(tags="benchmark:raft")

>>> # Related to the dataset
>>> new_filter = ModelFilter(trained_dataset="common_voice")

class huggingface_hub.DatasetSearchArguments

( )

A nested namespace object holding all possible values for properties of datasets currently hosted in the Hub with tab-completion. If a value starts with a number, it will only exist in the dictionary.

Example:

>>> from huggingface_hub import DatasetSearchArguments

>>> args = DatasetSearchArguments()
>>> args.author_or_organization.huggingface
>>> args.language.en

class huggingface_hub.ModelSearchArguments

( )

A nested namespace object holding all possible values for properties of models currently hosted in the Hub with tab-completion. If a value starts with a number, it will only exist in the dictionary.

Example:

>>> from huggingface_hub import ModelSearchArguments

>>> args = ModelSearchArguments()
>>> args.author_or_organization.huggingface
>>> args.language.en