Datasets documentation


You are viewing version v2.0.0. A newer version, v2.19.0, is available.

Our how-to guides show you how to complete specific tasks and are intended to help you apply your knowledge of 🤗 Datasets to real-world problems. Want to flatten a column or load a dataset from a local file? We've got you covered! You should already be familiar and comfortable with the 🤗 Datasets basics; if you aren’t, we recommend reading our tutorial first.

The how-to guides will cover eight key areas of 🤗 Datasets:

  • How to load a dataset from other data sources.

  • How to process a dataset.

  • How to stream large datasets.

  • How to upload and share a dataset.

  • How to create a dataset loading script.

  • How to create a dataset card.

  • How to compute metrics.

  • How to manage the cache.

You can also find guides on how to process massive datasets with Beam, how to integrate with cloud storage providers, and how to add an index to search your dataset.