Libraries

The Datasets Hub supports several libraries in the open source ecosystem. Thanks to the huggingface_hub Python library, it’s easy to share your datasets on the Hub. We’re happy to welcome to the Hub a set of open source libraries that are pushing machine learning forward.

The table below summarizes the supported libraries and their level of integration.

| Library | Description | Download from Hub | Push to Hub |
|---|---|---|---|
| Dask | Parallel and distributed computing library that scales the existing Python and PyData ecosystem. | | |
| Datasets | 🤗 Datasets is a library for accessing and sharing datasets for Audio, Computer Vision, and Natural Language Processing (NLP). | | |
| DuckDB | In-process SQL OLAP database management system. | | |
| Pandas | Python data analysis toolkit. | | |
| WebDataset | Library to write I/O pipelines for large datasets. | | |
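
As a rough illustration of what "Download from Hub" and "Push to Hub" can look like in practice, here is a minimal sketch using Pandas. It assumes that huggingface_hub is installed, which lets Pandas resolve `hf://` paths through fsspec. The helpers `hub_dataset_path`, `read_parquet_from_hub`, and `write_parquet_to_hub` are hypothetical names introduced for this example, and the repository and file names are placeholders, not real datasets.

```python
def hub_dataset_path(repo_id: str, filename: str) -> str:
    """Build an hf:// path to a file inside a dataset repository.

    Hypothetical helper: repo_id looks like "username/dataset-name" and
    filename is the path of the file inside that repository.
    """
    if not repo_id or not filename:
        raise ValueError("repo_id and filename must be non-empty")
    # Strip stray slashes so the joined path has exactly one separator.
    return f"hf://datasets/{repo_id.strip('/')}/{filename.lstrip('/')}"


def read_parquet_from_hub(repo_id: str, filename: str):
    """Download a Parquet file from the Hub into a DataFrame (needs network)."""
    import pandas as pd  # imported lazily so the path helper works without pandas

    return pd.read_parquet(hub_dataset_path(repo_id, filename))


def write_parquet_to_hub(df, repo_id: str, filename: str) -> None:
    """Push a DataFrame to the Hub as Parquet (needs network and a logged-in user)."""
    df.to_parquet(hub_dataset_path(repo_id, filename))


if __name__ == "__main__":
    # Placeholder repository and file names; only the path is built here,
    # so this runs without network access.
    print(hub_dataset_path("username/my-dataset", "data/train.parquet"))
```

The read and write helpers are not called above because they require network access and, for writing, authentication. The other libraries in the table offer their own entry points for the same two operations.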