Hub documentation

Authentication

Hugging Face's logo
Join the Hugging Face community

and get access to the augmented documentation experience

to get started

Authentication

In order to access private or gated datasets, you need to authenticate first. Authentication works by providing an access token which will be used to authenticate and authorize your access to gated and private datasets. The first step is to create an access token for your account. This can be done by visiting Hugging Face Settings - Tokens.

There are three ways to provide the token: setting an environment variable, passing a parameter to the reader or using the Hugging Face CLI.

Environment variable

If you set the environment variable HF_TOKEN, Polars will automatically use it when requesting datasets from Hugging Face.

export HF_TOKEN="hf_xxxxxxxxxxxxx"

Parameters

You can also explicitly provide the access token to the reader (e.g. read_parquet) through the storage_options parameter. For a full overview of all the parameters, check out the API reference guide.

pl.read_parquet(
    "hf://datasets/roneneldan/TinyStories/data/train-*.parquet",
    storage_options={"token": ACCESS_TOKEN},
)

CLI

Alternatively, you can you use the Hugging Face CLI to authenticate. After successfully logging in with huggingface-cli login an access token will be stored in the HF_HOME directory which defaults to ~/.cache/huggingface. Polars will then use this token for authentication.

If multiple methods are specified, they are prioritized in the following order:

  • Parameters (storage_options)
  • Environment variable (HF_TOKEN)
  • CLI
< > Update on GitHub