🤗 Datasets Server

Datasets Server is a lightweight web API for visualizing and exploring all types of datasets (computer vision, speech, text, and tabular) stored on the Hugging Face Hub. As datasets grow in size and in the richness of their data types, preprocessing them becomes increasingly expensive in storage and compute, and increasingly time-consuming. To help users access these modern datasets, Datasets Server generates the API responses ahead of time behind the scenes and stores them in a database, so they are returned instantly when you query the API.

Let Datasets Server take care of the heavy lifting so you can use a simple REST API on any of the 30,000+ datasets on Hugging Face to:

  • List the dataset splits, column names and data types
  • Get the dataset size (in number of rows or bytes)
  • Download and view rows at any index in the dataset
  • Get insightful statistics about the data
  • Access the dataset as parquet files to use in your favorite processing or analytics framework
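
The capabilities listed above can be sketched as plain HTTP GET requests. The following is a minimal illustration, not the official client: the base URL, the endpoint names (`splits`, `size`, `rows`, `statistics`, `parquet`), the query parameters, and the `OpenAssistant/oasst1` repository id with its `default` config and `train` split are all assumptions that should be checked against the API reference before use. The helper names `build_url`, `get_json`, and `example_urls` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the public API; confirm it in the official docs.
BASE_URL = "https://datasets-server.huggingface.co"


def build_url(endpoint, **params):
    """Return the full URL for an endpoint, with params as a query string."""
    query = urlencode(params)
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{query}" if query else url


def get_json(endpoint, **params):
    """Send a GET request and decode the JSON body (performs a network call)."""
    with urlopen(build_url(endpoint, **params), timeout=30) as response:
        return json.load(response)


def example_urls(dataset, config="default", split="train"):
    """Map each capability from the list above to an assumed endpoint URL."""
    return {
        # Splits, column names and data types
        "splits": build_url("splits", dataset=dataset),
        # Dataset size in rows and bytes
        "size": build_url("size", dataset=dataset),
        # A window of rows starting at a given index
        "rows": build_url("rows", dataset=dataset, config=config,
                          split=split, offset=0, length=5),
        # Per-column statistics
        "statistics": build_url("statistics", dataset=dataset,
                                config=config, split=split),
        # List of parquet files for the dataset
        "parquet": build_url("parquet", dataset=dataset),
    }
```

Calling `example_urls("OpenAssistant/oasst1")` only builds the URLs, which makes it easy to inspect them before sending anything. To fetch a response, something like `get_json("splits", dataset="OpenAssistant/oasst1")` would issue the request, assuming the endpoint exists as named here.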

Dataset viewer of the OpenAssistant dataset

Join the growing community on the forum or Discord today, and give the Datasets Server repository a ⭐️ if you’re interested in the latest updates!