
Hugging Face Librarian Bots

✨ Curating the Hugging Face Hub one PR at a time. ✨

A stable diffusion generated image of a bookshelf

The Hugging Face Hub is the primary place for sharing machine learning models, datasets, and demos. It currently holds over 200,000 models, 40,000 datasets, and 100,000 machine learning demos.

The Librarian Bots organization is an effort by Hugging Face's Machine Learning Librarian to use machine learning to enhance the metadata and documentation of material shared on the Hub. The ultimate goal is to make it easier for people (and bots!) to find what they are looking for on the Hub. This organization is used to share datasets, models, and Spaces that help achieve this goal.

👾 Spaces

📚 Spaces related to Hugging Face Papers

Recommend Similar Papers: a Space that allows you to find papers similar to a given paper.
Collections Reading List Generator: a Space that allows you to generate a reading list for a given Hugging Face Collection.
📄🔗: Extract linked papers from a Hugging Face Collection: extract all the papers associated with items in a Hugging Face Collection.
📃 Hugging Face Paper Claimer 📃: a Space that helps you claim papers you authored on the Hugging Face Hub.

Spaces related to metadata

🤖 Librarian Bot Metadata Request Service 🤖: With a few clicks, enrich your Hugging Face models with key metadata!
MetaRefine: refine Hub search results by metadata quality and model card length.
metadata explorer: a Space for exploring high-level information about the metadata associated with models hosted on the Hugging Face Hub.

Spaces for exploring and keeping track of repositories on the Hub

Dataset-to-Model Monitor: track datasets hosted on the Hugging Face Hub and get a notification when new models are trained on a dataset you are tracking.
Base Model Explorer: a Space that allows you to find child models (models fine-tuned from a given base model) and view how popular each base model is for fine-tuning.
Hugging Face Datasets Semantic Search: a Space that allows you to use semantic search to find relevant datasets on the Hugging Face Hub.

💽 Datasets

Datasets for model and dataset cards

Model Cards with metadata: a dataset containing model cards for models hosted on the Hugging Face Hub, with first commit information for each model. Model cards are intended to help communicate the strengths and weaknesses of machine learning models. Whilst these model cards are primarily intended to be read by a human, they are also an interesting corpus that can be used to explore models hosted on the Hub in various ways.
Dataset Cards With Metadata: a dataset containing dataset cards for datasets hosted on the Hugging Face Hub, with first commit information for each dataset. Dataset cards are intended to help communicate the strengths and weaknesses of machine learning datasets. Whilst these dataset cards are primarily intended to be read by a human, they are also an interesting corpus that can be used to explore datasets hosted on the Hub in various ways.
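
As a rough illustration of treating cards as a corpus, the sketch below splits each card into its YAML metadata block and its Markdown body, then counts which metadata keys and section headings appear. It is a minimal sketch under stated assumptions: the sample cards are hypothetical, hand-written strings standing in for rows of the datasets above (whose column names are not described here), and the key extraction is a crude regular-expression stand-in for a real YAML parser.

```python
import re
from collections import Counter

# Hypothetical card texts, written by hand for this example only.
SAMPLE_CARDS = [
    "---\nlicense: mit\ntags:\n- text-classification\n---\n"
    "# Model A\n\n## Bias, Risks, and Limitations\n\nSome text.\n",
    "# Model B\n\nNo metadata here.\n",
    "---\nlicense: apache-2.0\n---\n# Model C\n\n## Training data\n\nDetails.\n",
]

# A YAML block delimited by '---' lines at the very start of the card.
FRONT_MATTER = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*\n", re.DOTALL)
# Markdown ATX headings such as '## Training data'.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def split_card(text):
    """Return (front_matter, body); front_matter is None if there is no YAML block."""
    match = FRONT_MATTER.match(text)
    if match is None:
        return None, text
    return match.group(1), text[match.end():]


def top_level_keys(front_matter):
    """Collect un-indented 'key:' names (an approximation, not a YAML parser)."""
    keys = []
    for line in front_matter.splitlines():
        found = re.match(r"([A-Za-z_][\w-]*):", line)
        if found:
            keys.append(found.group(1))
    return keys


def summarize(cards):
    """Count metadata keys and lower-cased section headings across cards."""
    key_counts = Counter()
    heading_counts = Counter()
    for text in cards:
        front_matter, body = split_card(text)
        if front_matter is not None:
            key_counts.update(top_level_keys(front_matter))
        heading_counts.update(h.lower() for h in HEADING.findall(body))
    return key_counts, heading_counts


key_counts, heading_counts = summarize(SAMPLE_CARDS)
print(key_counts.most_common())
print(heading_counts.most_common())
```

Counting headings in this way gives a quick view of how many cards include, say, a bias or limitations section, before moving on to heavier analysis.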

🤖 Models

BERTopic model card bias topic model: a BERTopic model trained on the bias section of model cards hosted on the Hub. The goal of this model is to explore which topics are discussed in the bias section of model cards. In the future, models such as this could potentially also be used to detect 'drift' in the kinds of bias being discussed in model cards hosted on the Hub.

Getting in touch

If you want to collaborate on improving metadata on the Hugging Face Hub or have ideas for other related projects, reach out to Daniel on Twitter (@vanstriendaniel) or via email (Daniel (at) our website).