The Hugging Face Hub is a collection of git repositories. Git is a widely used tool in software development to easily version projects when working collaboratively. This guide will show you how to interact with the repositories on the Hub, in particular how to:
- Create and delete a repository.
- Manage branches and tags.
- Rename your repository.
- Update your repository visibility.
- Manage a local copy of your repository.
If you are used to working with platforms such as GitLab, GitHub, or Bitbucket, your first instinct
might be to use the git CLI to clone your repo (git clone), commit changes (git add, git commit), and push them
(git push). This is valid when using the Hugging Face Hub. However, software engineering and machine learning do
not share the same requirements and workflows. Model repositories might hold large model weight files for different
frameworks and tools, so cloning a repository can leave you maintaining very large local folders. As
a result, it may be more efficient to use our custom HTTP methods. You can read our Git vs HTTP paradigm
explanation page for more details.
If you want to create and manage a repository on the Hub, your machine must be logged in. If you are not, please refer to this section. In the rest of this guide, we will assume that your machine is logged in.
The first step is to know how to create and delete repositories. You can only manage repositories that you own (under your username namespace) or that belong to organizations in which you have write permissions.
Create an empty repository with create_repo() and give it a name with the repo_id parameter. The repo_id is your namespace followed by the repository name:
>>> from huggingface_hub import create_repo
>>> create_repo("lysandre/test-model")
'https://huggingface.co/lysandre/test-model'
By default, create_repo() creates a model repository, but you can use the repo_type parameter to specify another repository type. For example, if you want to create a dataset repository:
>>> from huggingface_hub import create_repo
>>> create_repo("lysandre/test-dataset", repo_type="dataset")
'https://huggingface.co/datasets/lysandre/test-dataset'
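The two return values above suggest how a repo_id and a repo_type combine into a web URL. The following is a minimal offline sketch of that mapping, not part of huggingface_hub: the helper names split_repo_id and repo_web_url are hypothetical, the "spaces/" prefix is taken from the Space URL that appears later in this guide, and the helper only accepts the fully qualified namespace/name form.

```python
# URL prefixes as they appear in the URLs shown in this guide; models have none.
REPO_TYPE_PREFIXES = {"model": "", "dataset": "datasets/", "space": "spaces/"}


def split_repo_id(repo_id: str) -> tuple[str, str]:
    """Split a fully qualified repo_id into (namespace, repository name)."""
    namespace, sep, name = repo_id.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(f"Expected '<namespace>/<name>', got {repo_id!r}")
    return namespace, name


def repo_web_url(repo_id: str, repo_type: str = "model") -> str:
    """Build the web URL for a repository (hypothetical helper, no network access)."""
    split_repo_id(repo_id)  # validates the format; raises ValueError otherwise
    if repo_type not in REPO_TYPE_PREFIXES:
        raise ValueError(f"Unknown repo_type: {repo_type!r}")
    return f"https://huggingface.co/{REPO_TYPE_PREFIXES[repo_type]}{repo_id}"


print(split_repo_id("lysandre/test-model"))
print(repo_web_url("lysandre/test-dataset", repo_type="dataset"))
```

Keeping the prefixes in one dictionary makes the default (model repositories have no prefix) explicit and rejects unknown repository types early.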
When you create a repository, you can set your repository visibility with the private parameter:
>>> from huggingface_hub import create_repo
>>> create_repo("lysandre/test-private", private=True)
If you want to change the repository visibility at a later time, you can use the update_repo_visibility() function.
Delete a repository with delete_repo(). Make sure you really want to delete it, because this process is irreversible!
Specify the repo_id of the repository you want to delete:
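A direct call takes the form delete_repo(repo_id=..., repo_type=...). Because deletion cannot be undone, a script may want an explicit confirmation step first. The sketch below is one way to do that, under stated assumptions: delete_repo_with_confirmation and its confirm flag are hypothetical names introduced here, and only delete_repo itself comes from huggingface_hub.

```python
def delete_repo_with_confirmation(
    repo_id: str, repo_type: str = "model", confirm: bool = False
) -> None:
    """Delete `repo_id` from the Hub, but only when `confirm` is True."""
    if not confirm:
        raise ValueError(
            f"Refusing to delete {repo_id!r}: pass confirm=True to proceed "
            "(deletion cannot be undone)."
        )
    # Imported lazily so the guard above can be exercised without the library.
    from huggingface_hub import delete_repo

    # Requires a logged-in machine with write access to the repository.
    delete_repo(repo_id=repo_id, repo_type=repo_type)
```

Calling the helper without confirm=True raises a ValueError before any request is sent to the Hub.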
In some cases, you want to copy someone else’s repo to adapt it to your use case. This is possible for Spaces using the duplicate_space() method. It will duplicate the whole repository. You will still need to configure your own settings (hardware and secrets). Check out our Manage your Space guide for more details.
>>> from huggingface_hub import duplicate_space
>>> duplicate_space("multimodalart/dreambooth-training", private=False)
RepoUrl('https://huggingface.co/spaces/nateraw/dreambooth-training',...)
Now that you have created your repository, you are interested in pushing changes to it and downloading files from it.
Git repositories often make use of branches to store different versions of the same repository. Tags can also be used to flag a specific state of your repository, for example, when releasing a version. More generally, branches and tags are referred to as git references. You can create them with create_branch() and create_tag():
>>> from huggingface_hub import create_branch, create_tag
>>> # Create a branch on a Space repo from `main` branch
>>> create_branch("Matthijs/speecht5-tts-demo", repo_type="space", branch="handle-dog-speaker")
>>> # Create a tag on a Dataset repo from `v0.1-release` branch
>>> create_tag("bigcode/the-stack", repo_type="dataset", revision="v0.1-release", tag="v0.1.1", tag_message="Bump release version.")
You can also list the existing git refs from a repository using list_repo_refs():
>>> from huggingface_hub import list_repo_refs
>>> list_repo_refs("bigcode/the-stack", repo_type="dataset")
GitRefs(
   branches=[
         GitRefInfo(name='main', ref='refs/heads/main', target_commit='18edc1591d9ce72aa82f56c4431b3c969b210ae3'),
         GitRefInfo(name='v1.1.a1', ref='refs/heads/v1.1.a1', target_commit='f9826b862d1567f3822d3d25649b0d6d22ace714')
   ],
   converts=[],
   tags=[
         GitRefInfo(name='v1.0', ref='refs/tags/v1.0', target_commit='c37a8cd1e382064d8aced5e05543c5f7753834da')
   ]
)
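The returned object exposes branches and tags as lists whose items carry name, ref, and target_commit attributes, as the output above shows. A minimal sketch of turning that into a lookup table follows; refs_to_commits is a hypothetical helper, and fake_refs is an offline stand-in that only mimics the shape of the real result so the example runs without network access.

```python
from types import SimpleNamespace


def refs_to_commits(refs) -> dict:
    """Map each full ref path (e.g. 'refs/heads/main') to its target commit."""
    commits = {}
    for info in list(refs.branches) + list(refs.tags):
        commits[info.ref] = info.target_commit
    return commits


# Offline stand-in with the same attribute names as the output shown above.
fake_refs = SimpleNamespace(
    branches=[
        SimpleNamespace(
            name="main",
            ref="refs/heads/main",
            target_commit="18edc1591d9ce72aa82f56c4431b3c969b210ae3",
        )
    ],
    tags=[
        SimpleNamespace(
            name="v1.0",
            ref="refs/tags/v1.0",
            target_commit="c37a8cd1e382064d8aced5e05543c5f7753834da",
        )
    ],
)

for ref, commit in refs_to_commits(fake_refs).items():
    print(ref, commit)
```

In real use, the value returned by list_repo_refs() would be passed in place of fake_refs.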
Repositories come with some settings that you can configure. Most of the time, you will want to do that manually in the
repo settings page in your browser. You must have write access to a repo to configure it (either own it or be part of
an organization). In this section, we will see the settings that you can also configure programmatically using
huggingface_hub.
Some settings are specific to Spaces (hardware, environment variables,…). To configure those, please refer to our Manage your Spaces guide.
A repository can be public or private. A private repository is only visible to you or members of the organization in which the repository is located. Change a repository to private as shown in the following:
>>> from huggingface_hub import update_repo_visibility
>>> update_repo_visibility(repo_id=repo_id, private=True)
You can rename your repository on the Hub using move_repo(). Using this method, you can also move the repo from a user to an organization. When doing so, there are a few limitations that you should be aware of. For example, you can’t transfer your repo to another user.
>>> from huggingface_hub import move_repo
>>> move_repo(from_id="Wauplin/cool-model", to_id="huggingface/cool-model")
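A plain rename keeps the namespace and changes only the repository name. The sketch below illustrates that case, assuming a fully qualified repo_id: renamed_repo_id and rename_repo are hypothetical helpers introduced here, and only move_repo comes from huggingface_hub.

```python
def renamed_repo_id(repo_id: str, new_name: str) -> str:
    """Return the repo_id that results from renaming within the same namespace."""
    namespace, sep, _old_name = repo_id.rpartition("/")
    if not sep or not namespace:
        raise ValueError(f"Expected '<namespace>/<name>', got {repo_id!r}")
    if not new_name or "/" in new_name:
        raise ValueError(f"Invalid repository name: {new_name!r}")
    return f"{namespace}/{new_name}"


def rename_repo(repo_id: str, new_name: str, repo_type: str = "model") -> str:
    """Rename a repository on the Hub and return its new repo_id."""
    to_id = renamed_repo_id(repo_id, new_name)
    # Imported lazily so renamed_repo_id can be used without the library.
    from huggingface_hub import move_repo

    # Requires a logged-in machine with write access to the repository.
    move_repo(from_id=repo_id, to_id=to_id, repo_type=repo_type)
    return to_id


print(renamed_repo_id("Wauplin/cool-model", "cooler-model"))
```

Computing the target ID separately makes it possible to check the new name before any request reaches the Hub.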
All the actions described above can be done using HTTP requests. However, in some cases you might be interested in having a local copy of your repository and interacting with it using the Git commands you are familiar with.
The Repository class allows you to interact with files and repositories on the Hub with functions similar to Git commands. It is a wrapper over Git and Git-LFS methods to use the Git commands you already know and love. Before starting, please make sure you have Git-LFS installed (see here for installation instructions).
Instantiate a Repository object with a path to a local repository:
>>> from huggingface_hub import Repository
>>> repo = Repository(local_dir="<path>/<to>/<folder>")
The clone_from parameter clones a repository from a Hugging Face repository ID to a local directory specified by the local_dir argument:
>>> from huggingface_hub import Repository
>>> repo = Repository(local_dir="w2v2", clone_from="facebook/wav2vec2-large-960h-lv60")
clone_from can also clone a repository using a URL:
>>> repo = Repository(local_dir="huggingface-hub", clone_from="https://huggingface.co/facebook/wav2vec2-large-960h-lv60")
You can combine the clone_from parameter with create_repo() to create and clone a repository:
>>> repo_url = create_repo(repo_id="repo_name")
>>> repo = Repository(local_dir="repo_local_path", clone_from=repo_url)
You can also configure a Git username and email for a cloned repository by specifying the git_user and git_email parameters when you clone it. When users commit to that repository, Git will be aware of the commit author.
>>> repo = Repository(
...   "my-dataset",
...   clone_from="<user>/<dataset_id>",
...   token=True,
...   repo_type="dataset",
...   git_user="MyName",
...   git_email="firstname.lastname@example.org"
... )
Branches are important for collaboration and experimentation without impacting your current files and code. Switch between branches with git_checkout(). For example, if you want to switch from branch1 to branch2:
>>> from huggingface_hub import Repository
>>> repo = Repository(local_dir="huggingface-hub", clone_from="<user>/<dataset_id>", revision='branch1')
>>> repo.git_checkout("branch2")
git_pull() allows you to update a current local branch with changes from a remote repository:
>>> from huggingface_hub import Repository
>>> # `repo` is a Repository instance created as in the earlier examples
>>> repo.git_pull()
Set rebase=True if you want your local commits to occur after your branch is updated with the new commits from the remote: