A repository is a place where you can store your model or dataset files. This guide will show you how to:
- Create and delete a repository.
- Adjust repository visibility.
- Use the Repository class for common Git operations like clone, branch, and pull.
If you want to create a repository on the Hub, you first need to log in to your Hugging Face account. From a terminal, log in with the huggingface-cli login command.
Alternatively, you can log in programmatically using login() in a notebook or a script:
>>> from huggingface_hub import login
>>> login()
If run in a Jupyter or Colaboratory notebook, login() launches a widget in which you can enter your Hugging Face access token. Otherwise, you are prompted for the token in the terminal.
It is also possible to log in programmatically without the widget by passing the token directly to login(). If you do so, be careful when sharing your notebook. It is best practice to load the token from a secure vault instead of saving it in plain text in your Colaboratory notebook.
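As a minimal sketch of that advice, the token can be read from an environment variable rather than written into the notebook. The variable name HF_TOKEN and the helper names read_token and login_from_env are assumptions made for this illustration. Only login() with a token argument comes from the text above. huggingface_hub is imported inside the second function so that the token-reading helper can be used on its own.

```python
import os


def read_token(env_var="HF_TOKEN"):
    """Return the access token stored in an environment variable.

    Raises a clear error instead of returning an empty value when the
    variable is unset. The variable name is an assumption of this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your access token.")
    return token


def login_from_env(env_var="HF_TOKEN"):
    """Log in without the widget, keeping the token out of the notebook."""
    # Imported here so that read_token() has no third-party dependency.
    from huggingface_hub import login

    login(token=read_token(env_var))
```

Calling login_from_env() in a notebook would then log in without the token ever appearing in a cell.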
Create an empty repository with create_repo() and give it a name with the repo_id parameter. The repo_id is your namespace followed by the repository name:
>>> from huggingface_hub import create_repo
>>> create_repo("lysandre/test-model")
'https://huggingface.co/lysandre/test-model'
By default, create_repo() creates a model repository, but you can use the repo_type parameter to specify another repository type. For example, if you want to create a dataset repository:
>>> from huggingface_hub import create_repo
>>> create_repo("lysandre/test-dataset", repo_type="dataset")
'https://huggingface.co/datasets/lysandre/test-dataset'
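The two return values above suggest how a repository URL is derived from repo_id and repo_type. The helper below only illustrates that pattern and is not part of huggingface_hub: the name expected_repo_url is hypothetical, the spaces prefix is an assumption that does not appear in the examples above, and the sketch requires the full namespace/name form of repo_id.

```python
HUB_URL = "https://huggingface.co"

# Model repositories live at the root; other types get a URL prefix.
# The "space" entry is an assumption not shown in the guide's examples.
_TYPE_PREFIX = {"model": "", "dataset": "datasets/", "space": "spaces/"}


def expected_repo_url(repo_id, repo_type="model"):
    """Build the URL a repository is expected to have on the Hub."""
    if repo_type not in _TYPE_PREFIX:
        raise ValueError(f"Unknown repo_type: {repo_type!r}")
    namespace, _, name = repo_id.partition("/")
    if not namespace or not name or "/" in name:
        raise ValueError("repo_id must look like '<namespace>/<repo_name>'")
    return f"{HUB_URL}/{_TYPE_PREFIX[repo_type]}{repo_id}"


print(expected_repo_url("lysandre/test-model"))
print(expected_repo_url("lysandre/test-dataset", repo_type="dataset"))
```

The two printed URLs match the values returned by create_repo() in the examples above.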
When you create a repository, you can set its visibility with the private parameter. For example, if you want to create a private repository:
>>> from huggingface_hub import create_repo
>>> create_repo("lysandre/test-private", private=True)
If you want to change the repository visibility at a later time, you can use the update_repo_visibility() function.
Delete a repository with delete_repo(). Make sure you really want to delete the repository, because this process is irreversible!
Specify the repo_id of the repository you want to delete. You can also specify the type of repository to delete by adding the repo_type parameter.
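Because deletion cannot be undone, one cautious pattern is to require the repository name to be typed back before calling delete_repo(). The sketch below assumes a version of huggingface_hub whose delete_repo() accepts repo_id and repo_type as keyword arguments. The wrapper name delete_repo_if_confirmed and its confirmation argument are hypothetical and exist only for this illustration.

```python
def delete_repo_if_confirmed(repo_id, confirmation, repo_type=None):
    """Delete a repository only when its name was typed back exactly.

    Returns True if the deletion was requested, False if it was skipped.
    """
    if confirmation != repo_id:
        return False
    # Imported lazily: the call below needs network access and a logged-in session.
    from huggingface_hub import delete_repo

    delete_repo(repo_id=repo_id, repo_type=repo_type)
    return True


# The confirmation does not match the repo_id, so nothing is deleted here.
print(delete_repo_if_confirmed("lysandre/test-dataset", confirmation="no", repo_type="dataset"))
```

Passing confirmation="lysandre/test-dataset" instead would go ahead and request the deletion of the dataset repository.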
A repository can be public or private. A private repository is only visible to you or to members of the organization in which the repository is located. Change a repository to private as follows:
>>> from huggingface_hub import update_repo_visibility
>>> update_repo_visibility(name=REPO_NAME, private=True)
The Repository class allows you to interact with files and repositories on the Hub with functions similar to Git commands. Repository is a wrapper over Git and Git-LFS methods, so make sure you have Git-LFS installed and set up before you begin (see the Git-LFS installation instructions). With Repository, you can use the Git commands you already know and love.
Instantiate a Repository object with a path to a local repository:
>>> from huggingface_hub import Repository
>>> repo = Repository(local_dir="<path>/<to>/<folder>")
The clone_from parameter clones a repository from a Hugging Face repository ID to a local directory specified by the local_dir argument:
>>> from huggingface_hub import Repository
>>> repo = Repository(local_dir="w2v2", clone_from="facebook/wav2vec2-large-960h-lv60")
clone_from can also clone a repository from a URL (if you are working offline, this parameter should be None):
>>> repo = Repository(local_dir="huggingface-hub", clone_from="https://huggingface.co/facebook/wav2vec2-large-960h-lv60")
You can combine the clone_from parameter with create_repo() to create and clone a repository:
>>> repo_url = create_repo(repo_id="repo_name")
>>> repo = Repository(local_dir="repo_local_path", clone_from=repo_url)
You can also attribute a Git username and email to a cloned repository by specifying the git_user and git_email parameters when you clone a repository. When users commit to that repository, Git will be aware of the commit author.
>>> repo = Repository(
...     "my-dataset",
...     clone_from="<user>/<dataset_id>",
...     token=True,
...     repo_type="dataset",
...     git_user="MyName",
...     git_email="email@example.com",
... )
Branches are important for collaboration and experimentation without impacting your current files and code. Switch between branches with git_checkout(). For example, if you want to switch from branch1 to branch2:
>>> from huggingface_hub import Repository
>>> repo = Repository(local_dir="huggingface-hub", clone_from="<user>/<dataset_id>", revision='branch1')
>>> repo.git_checkout("branch2")
git_pull() allows you to update a current local branch with changes from a remote repository:
>>> from huggingface_hub import Repository
>>> # repo is a Repository object instantiated as in the earlier examples
>>> repo.git_pull()
Set rebase=True if you want your local commits to occur after your branch is updated with the new commits from the remote.
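The branch and pull steps can be combined in a small helper. This is only a sketch: sync_branch is a hypothetical name, and it assumes repo is a Repository object whose git_checkout() and git_pull() methods behave as described in this guide, with git_pull() accepting the rebase keyword.

```python
def sync_branch(repo, branch, rebase=True):
    """Check out `branch` and bring it up to date with the remote.

    `repo` is expected to be a Repository object. With rebase=True,
    local commits are replayed on top of the commits pulled from the remote.
    """
    repo.git_checkout(branch)
    repo.git_pull(rebase=rebase)
```

With the repo object from the earlier examples, sync_branch(repo, "branch2") would switch to branch2 and then pull with a rebase.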