---
title: README
emoji: 🤖
colorFrom: red
colorTo: pink
sdk: static
pinned: false
---

<h1 style="text-align: center;"> Hugging Face Librarian Bots</h1>

<p style="text-align: center;">&#x2728; <em>Curating the Hugging Face Hub one PR at a time.</em> &#x2728;</p>

<div style="text-align: center;">
  <img src="https://huggingface.co/spaces/librarian-bots/README/resolve/main/image.png" alt="A stable diffusion generated image of a bookshelf" style="display: block; margin: auto;">
</div>


The Hugging Face Hub is the primary place for sharing machine learning models, datasets, and demos. It currently holds over 200,000 models, 40,000 datasets, and 100,000 machine learning demos. 

The `Librarian Bots` organization is an effort by Hugging Face's [Machine Learning Librarian](https://huggingface.co/davanstrien) to use machine learning to enhance metadata and documentation for material shared on the Hub. The ultimate goal is to make it easier for people (and bots!) to find what they are looking for on the Hub. This organization shares datasets, models, and Spaces that help achieve this goal.

## &#x1F47E; Spaces

<details>
  <summary>📚 Spaces Related to Hugging Face Papers</summary>
  
  - [Recommend Similar Papers](https://huggingface.co/spaces/librarian-bots/recommend_similar_papers): a Space that allows you to find papers similar to a given paper.
  - [Collections Reading List Generator](https://huggingface.co/spaces/librarian-bots/collection-reading-list-generator): a Space that allows you to generate a reading list for a given Hugging Face Collection.
  - [📄🔗: Extract linked papers from a Hugging Face Collection](https://huggingface.co/spaces/librarian-bots/collection_papers_extractor): extract all the papers associated with items in a Hugging Face Collection.
  - [📃 Hugging Face Paper Claimer 📃](https://huggingface.co/spaces/librarian-bots/claim-papers): a Space that helps you claim papers you authored on the Hugging Face Hub.
  </details>

<details>
  <summary> Spaces related to metadata </summary>
 
  - [🤖 Librarian Bot Metadata Request Service 🤖](https://huggingface.co/spaces/librarian-bots/metadata_request_service): With a few clicks, enrich your Hugging Face models with key metadata!
  - [MetaRefine](https://huggingface.co/spaces/librarian-bots/MetaRefine): refine Hub search results by metadata quality and model card length.
  - [metadata explorer](https://huggingface.co/spaces/librarian-bots/metadata_explorer): a Space for exploring high-level information about the metadata associated with models hosted on the Hugging Face Hub.

</details>

<details>
  <summary> Spaces for exploring and keeping track of repositories on the Hub </summary>
  
  - [Dataset-to-Model Monitor](https://huggingface.co/spaces/librarian-bots/dataset-to-model-monitor): track datasets hosted on the Hugging Face Hub and get a notification when new models are trained on the dataset you are tracking.
  - [Base Model Explorer](https://huggingface.co/spaces/librarian-bots/base_model_explorer): This Space allows you to find child models for a given base model and view the popularity of models for fine-tuning.
  - [Hugging Face Datasets Semantic Search](https://huggingface.co/spaces/librarian-bots/huggingface-datasets-semantic-search): a Space that allows you to use semantic search to find relevant datasets on the Hugging Face Hub. 

</details> 

<br>

## &#x1F4BD; Datasets 

<details>
  <summary> Datasets for model and dataset cards </summary>

- [Model Cards with metadata](https://huggingface.co/datasets/librarian-bots/model_cards_with_metadata): a dataset containing model cards for models hosted on the Hugging Face Hub, with first-commit information for each model. Model cards are intended to help communicate the strengths and weaknesses of machine learning models. Whilst these model cards are primarily intended to be read by humans, they also form an interesting corpus that can be used to explore models hosted on the Hub in various ways.

- [Dataset Cards With Metadata](https://huggingface.co/datasets/librarian-bots/dataset_cards_with_metadata): a dataset containing dataset cards for datasets hosted on the Hugging Face Hub, with first-commit information for each dataset. Dataset cards are intended to help communicate the strengths and weaknesses of machine learning datasets. Whilst these dataset cards are primarily intended to be read by humans, they also form an interesting corpus that can be used to explore datasets hosted on the Hub in various ways.

</details>
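
As a minimal sketch of how the card datasets above might be explored, the Python snippet below separates a card's YAML metadata block from its prose. The helper names (`split_front_matter`, `iter_cards`), the `train` split, and the `card` column mentioned afterwards are assumptions made for illustration; check the dataset viewer for the actual schema before relying on them.

```python
from typing import Iterator, Tuple


def split_front_matter(card_text: str) -> Tuple[str, str]:
    """Split a card into (yaml_front_matter, body).

    Cards on the Hub usually open with a YAML block delimited by lines
    containing only '---'. If no complete block is found, the front
    matter is returned as an empty string and the text is left intact.
    """
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", card_text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", card_text  # opening '---' without a closing one


def iter_cards(
    repo_id: str = "librarian-bots/model_cards_with_metadata",
    split: str = "train",  # assumed split name
) -> Iterator[dict]:
    """Stream rows from the Hub (needs `pip install datasets` and network access)."""
    # Imported lazily so split_front_matter works without the dependency.
    from datasets import load_dataset

    yield from load_dataset(repo_id, split=split, streaming=True)
```

For example, `split_front_matter("---\nlicense: mit\n---\n# My model")` returns `("license: mit", "# My model")`. Rows from `iter_cards()` could then be passed through the same helper, assuming each row stores the card text under a `card` key.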

<br>

## &#x1F916; Models 

- [BERTopic model card bias topic model](https://huggingface.co/librarian-bots/BERTopic_model_card_bias): a BERTopic model trained on the bias section of model cards hosted on the Hub. The goal of this model is to explore which topics are discussed in the bias section of model cards. In the future, models such as this one could potentially also be used to detect 'drift' in the kinds of bias discussed in model cards hosted on the Hub.


## Getting in touch

If you want to collaborate on improving metadata on the Hugging Face Hub or have ideas for other related projects, reach out to [Daniel](https://huggingface.co/davanstrien) on Twitter (@vanstriendaniel) or via email (Daniel (at) our website).