davanstrien HF staff committed on
Commit
e1b71d5
1 Parent(s): 0f99f6b

add collections tutorial

generate_collection_using_huggingface_hub.ipynb ADDED
@@ -0,0 +1,2299 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "# Automatic curation of the Hugging Face Hub using Collections and the `huggingface_hub` library\n",
8
+ "\n",
9
+ "In this short tutorial, we will see how to create a Hugging Face Collection automatically using the `huggingface_hub` library. We'll focus on creating a collection that will curate the top 10% most used instruction tuning datasets on the Hub. \n",
10
+ "\n",
11
+ "If you are already familiar with Collections and the `huggingface_hub` library, you can skip to the next section.\n",
12
+ "\n",
13
+ "## What is a Hugging Face Collection?\n",
14
+ "\n",
15
+ "Collections are a recently added feature on the Hugging Face Hub that unlocks powerful new ways of curating what is on the Hub. With the Hub becoming the de facto platform for open-source machine learning models, it is important to be able to curate its content. Collections allow you to do just that.\n",
16
+ "\n",
17
+ "Collections can be used to organize models, datasets, Spaces, and papers on the Hub in various ways. You could, for example, create collections around a particular use case, topic, or model architecture, or around a combination of these. In this tutorial, we will create a collection that curates the top 10% most used instruction tuning datasets on the Hub. We will do this using the `huggingface_hub` library.\n",
18
+ "\n",
19
+ "## So what is the `huggingface_hub` library?\n",
20
+ "\n",
21
+ "The `huggingface_hub` library is a Python library that allows you to interact with the Hugging Face Hub. It allows you to do things like upload and download models, datasets, and Spaces. Recently the library added support for creating and managing collections. This ability to programmatically create and manage collections unlocks a range of new use cases. In this tutorial we'll show a few possibilities of what you can do with the `huggingface_hub` library and Collections, but we're excited to see what you will do with it!"
22
+ ]
23
+ },
24
+ {
25
+ "cell_type": "markdown",
26
+ "metadata": {},
27
+ "source": [
28
+ "## Install packages\n",
29
+ "\n",
30
+ "For this tutorial, the only package we'll need outside of the Python standard library is the `huggingface_hub` library."
31
+ ]
32
+ },
33
+ {
34
+ "cell_type": "code",
35
+ "execution_count": 1,
36
+ "metadata": {
37
+ "colab": {
38
+ "base_uri": "https://localhost:8080/"
39
+ },
40
+ "id": "HKyybBVZ1hBh",
41
+ "outputId": "ceaa1d1a-85e6-4015-e4e1-04bbe88d05cf"
42
+ },
43
+ "outputs": [
44
+ {
45
+ "name": "stdout",
46
+ "output_type": "stream",
47
+ "text": [
48
+ "Collecting git+https://github.com/huggingface/huggingface_hub\n",
49
+ " Cloning https://github.com/huggingface/huggingface_hub to /private/var/folders/gf/nk18mwt53sb4d0zpvjzs40bw0000gn/T/pip-req-build-bdjiy2_a\n",
50
+ " Running command git clone --filter=blob:none --quiet https://github.com/huggingface/huggingface_hub /private/var/folders/gf/nk18mwt53sb4d0zpvjzs40bw0000gn/T/pip-req-build-bdjiy2_a\n",
51
+ " Resolved https://github.com/huggingface/huggingface_hub to commit c32d4b31b679c9e91b906709631901f6aa85324d\n",
52
+ " Installing build dependencies ... \u001b[?25ldone\n",
53
+ "\u001b[?25h Getting requirements to build wheel ... \u001b[?25ldone\n",
54
+ "\u001b[?25h Preparing metadata (pyproject.toml) ... \u001b[?25ldone\n",
55
+ "\u001b[?25hCollecting filelock (from huggingface-hub==0.18.0.dev0)\n",
56
+ " Obtaining dependency information for filelock from https://files.pythonhosted.org/packages/5e/5d/97afbafd9d584ff1b45fcb354a479a3609bd97f912f8f1f6c563cb1fae21/filelock-3.12.4-py3-none-any.whl.metadata\n",
57
+ " Using cached filelock-3.12.4-py3-none-any.whl.metadata (2.8 kB)\n",
58
+ "Collecting fsspec>=2023.5.0 (from huggingface-hub==0.18.0.dev0)\n",
59
+ " Obtaining dependency information for fsspec>=2023.5.0 from https://files.pythonhosted.org/packages/fe/d3/e1aa96437d944fbb9cc95d0316e25583886e9cd9e6adc07baad943524eda/fsspec-2023.9.2-py3-none-any.whl.metadata\n",
60
+ " Using cached fsspec-2023.9.2-py3-none-any.whl.metadata (6.7 kB)\n",
61
+ "Collecting requests (from huggingface-hub==0.18.0.dev0)\n",
62
+ " Obtaining dependency information for requests from https://files.pythonhosted.org/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl.metadata\n",
63
+ " Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)\n",
64
+ "Collecting tqdm>=4.42.1 (from huggingface-hub==0.18.0.dev0)\n",
65
+ " Obtaining dependency information for tqdm>=4.42.1 from https://files.pythonhosted.org/packages/00/e5/f12a80907d0884e6dff9c16d0c0114d81b8cd07dc3ae54c5e962cc83037e/tqdm-4.66.1-py3-none-any.whl.metadata\n",
66
+ " Using cached tqdm-4.66.1-py3-none-any.whl.metadata (57 kB)\n",
67
+ "Collecting pyyaml>=5.1 (from huggingface-hub==0.18.0.dev0)\n",
68
+ " Obtaining dependency information for pyyaml>=5.1 from https://files.pythonhosted.org/packages/28/09/55f715ddbf95a054b764b547f617e22f1d5e45d83905660e9a088078fe67/PyYAML-6.0.1-cp311-cp311-macosx_11_0_arm64.whl.metadata\n",
69
+ " Using cached PyYAML-6.0.1-cp311-cp311-macosx_11_0_arm64.whl.metadata (2.1 kB)\n",
70
+ "Collecting typing-extensions>=3.7.4.3 (from huggingface-hub==0.18.0.dev0)\n",
71
+ " Obtaining dependency information for typing-extensions>=3.7.4.3 from https://files.pythonhosted.org/packages/24/21/7d397a4b7934ff4028987914ac1044d3b7d52712f30e2ac7a2ae5bc86dd0/typing_extensions-4.8.0-py3-none-any.whl.metadata\n",
72
+ " Using cached typing_extensions-4.8.0-py3-none-any.whl.metadata (3.0 kB)\n",
73
+ "Requirement already satisfied: packaging>=20.9 in ./.venv/lib/python3.11/site-packages (from huggingface-hub==0.18.0.dev0) (23.1)\n",
74
+ "Collecting charset-normalizer<4,>=2 (from requests->huggingface-hub==0.18.0.dev0)\n",
75
+ " Obtaining dependency information for charset-normalizer<4,>=2 from https://files.pythonhosted.org/packages/91/e6/8fa919fc84a106e9b04109de62bdf8526899e2754a64da66e1cd50ac1faa/charset_normalizer-3.2.0-cp311-cp311-macosx_11_0_arm64.whl.metadata\n",
76
+ " Using cached charset_normalizer-3.2.0-cp311-cp311-macosx_11_0_arm64.whl.metadata (31 kB)\n",
77
+ "Collecting idna<4,>=2.5 (from requests->huggingface-hub==0.18.0.dev0)\n",
78
+ " Using cached idna-3.4-py3-none-any.whl (61 kB)\n",
79
+ "Collecting urllib3<3,>=1.21.1 (from requests->huggingface-hub==0.18.0.dev0)\n",
80
+ " Obtaining dependency information for urllib3<3,>=1.21.1 from https://files.pythonhosted.org/packages/37/dc/399e63f5d1d96bb643404ee830657f4dfcf8503f5ba8fa3c6d465d0c57fe/urllib3-2.0.5-py3-none-any.whl.metadata\n",
81
+ " Using cached urllib3-2.0.5-py3-none-any.whl.metadata (6.6 kB)\n",
82
+ "Collecting certifi>=2017.4.17 (from requests->huggingface-hub==0.18.0.dev0)\n",
83
+ " Obtaining dependency information for certifi>=2017.4.17 from https://files.pythonhosted.org/packages/4c/dd/2234eab22353ffc7d94e8d13177aaa050113286e93e7b40eae01fbf7c3d9/certifi-2023.7.22-py3-none-any.whl.metadata\n",
84
+ " Using cached certifi-2023.7.22-py3-none-any.whl.metadata (2.2 kB)\n",
85
+ "Using cached fsspec-2023.9.2-py3-none-any.whl (173 kB)\n",
86
+ "Using cached PyYAML-6.0.1-cp311-cp311-macosx_11_0_arm64.whl (167 kB)\n",
87
+ "Using cached tqdm-4.66.1-py3-none-any.whl (78 kB)\n",
88
+ "Using cached typing_extensions-4.8.0-py3-none-any.whl (31 kB)\n",
89
+ "Using cached filelock-3.12.4-py3-none-any.whl (11 kB)\n",
90
+ "Using cached requests-2.31.0-py3-none-any.whl (62 kB)\n",
91
+ "Using cached certifi-2023.7.22-py3-none-any.whl (158 kB)\n",
92
+ "Using cached charset_normalizer-3.2.0-cp311-cp311-macosx_11_0_arm64.whl (122 kB)\n",
93
+ "Using cached urllib3-2.0.5-py3-none-any.whl (123 kB)\n",
94
+ "Building wheels for collected packages: huggingface-hub\n",
95
+ " Building wheel for huggingface-hub (pyproject.toml) ... \u001b[?25ldone\n",
96
+ "\u001b[?25h Created wheel for huggingface-hub: filename=huggingface_hub-0.18.0.dev0-py3-none-any.whl size=298588 sha256=88b09ea2b9f009a9aeae12440af109575fc5b82e58a29b0b250cc9a95eaff3aa\n",
97
+ " Stored in directory: /private/var/folders/gf/nk18mwt53sb4d0zpvjzs40bw0000gn/T/pip-ephem-wheel-cache-5yfewvyz/wheels/0d/44/01/c6da8315f53a5f367cd4bb3e00643c462c8df2065b29a67f4f\n",
98
+ "Successfully built huggingface-hub\n",
99
+ "Installing collected packages: urllib3, typing-extensions, tqdm, pyyaml, idna, fsspec, filelock, charset-normalizer, certifi, requests, huggingface-hub\n",
100
+ "Successfully installed certifi-2023.7.22 charset-normalizer-3.2.0 filelock-3.12.4 fsspec-2023.9.2 huggingface-hub-0.18.0.dev0 idna-3.4 pyyaml-6.0.1 requests-2.31.0 tqdm-4.66.1 typing-extensions-4.8.0 urllib3-2.0.5\n",
101
+ "Note: you may need to restart the kernel to use updated packages.\n"
102
+ ]
103
+ }
104
+ ],
105
+ "source": [
106
+ "%pip install git+https://github.com/huggingface/huggingface_hub --upgrade"
107
+ ]
108
+ },
109
+ {
110
+ "cell_type": "markdown",
111
+ "metadata": {},
112
+ "source": [
113
+ "## Authenticate\n",
114
+ "\n",
115
+ "In order to create and manage collections, you need to be authenticated. You can do this via the `huggingface_hub` library using the `notebook_login` function if you're using a notebook, or the `login` function if you're using a script. "
116
+ ]
117
+ },
118
+ {
119
+ "cell_type": "code",
120
+ "execution_count": 5,
121
+ "metadata": {
122
+ "id": "Qn9p5Bsz2NN5"
123
+ },
124
+ "outputs": [],
125
+ "source": [
126
+ "from huggingface_hub import notebook_login"
127
+ ]
128
+ },
129
+ {
130
+ "cell_type": "code",
131
+ "execution_count": 7,
132
+ "metadata": {
133
+ "colab": {
134
+ "base_uri": "https://localhost:8080/",
135
+ "height": 145,
136
+ "referenced_widgets": [
137
+ "428d3687eb4342e59d23318099afe34f",
138
+ "18f533e671114b6385428a534364f10a",
139
+ "e4c0e23001254742a94898203a222c6c",
140
+ "9d970a88c8c04bc586473251393aaec7",
141
+ "9f8288bb8cae4796a067580ff7afce69",
142
+ "077012b6f63e4148848c9b9e8726fb18",
143
+ "14f46443f97c4b4fb46c5967aec1178f",
144
+ "5aaff54fecb84936a8dc9fee4393494d",
145
+ "aee4b5a2e361451dae879f37222245f3",
146
+ "9dfa5a7ee7794a5d8396674db2c0b683",
147
+ "3634abd523b7477082a0a8135f1fa770",
148
+ "023691d310634e6e83da20b9575759a2",
149
+ "dded08e463404a53abb86ac605968626",
150
+ "9d99d6e39a424145b017abff9021d9a0",
151
+ "38e81f2aed79485498035e9c418165b4",
152
+ "37807c4db5834365b86bc92c36835220",
153
+ "847b9e4085814f958a147286be4f56eb",
154
+ "6efac326ad7946e3a9ecc22b50568633",
155
+ "9d043d68e16440899a6fc9b740f5970d",
156
+ "88242ee08b884098ad743c1738b7dc97",
157
+ "862cc50e401845fa98054e6bd015a074",
158
+ "4fe39f9b54474fb4966f71c7df0cf93e",
159
+ "d97e9b98cade45eaa2b9b526b1a4bb98",
160
+ "b7f608ef35d84fd7a736260236025429",
161
+ "09ca4ed6420340e0b76d64949301bed4",
162
+ "b2261f5044db4af0bed02f76115d08f9",
163
+ "055c0da20e264ec896963f9edf372a7d",
164
+ "d0f55244ec614704a571f920ffa27bfd",
165
+ "3b77fe48c5e44c879998de497be7a381",
166
+ "15032b9578124624bcc42771cb5d5ad8",
167
+ "eca43f65d6c1407bbb16cd26f60d5b7f",
168
+ "934d899f6c604fa1bf4a8108aa09b190"
169
+ ]
170
+ },
171
+ "id": "Sv15J3mW2Ous",
172
+ "outputId": "e537a566-8cdd-4316-bb69-72f5353345da"
173
+ },
174
+ "outputs": [
175
+ {
176
+ "data": {
177
+ "application/vnd.jupyter.widget-view+json": {
178
+ "model_id": "79b9c67a0334432bad65c411b7560672",
179
+ "version_major": 2,
180
+ "version_minor": 0
181
+ },
182
+ "text/plain": [
183
+ "VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
184
+ ]
185
+ },
186
+ "metadata": {},
187
+ "output_type": "display_data"
188
+ }
189
+ ],
190
+ "source": [
191
+ "notebook_login()"
192
+ ]
193
+ },
194
+ {
195
+ "cell_type": "markdown",
196
+ "metadata": {},
197
+ "source": [
198
+ "### Finding the right datasets using the `huggingface_hub` library\n",
199
+ "\n",
200
+ "We can use the `huggingface_hub` library to list datasets on the Hub using the `list_datasets` function. We can optionally pass in a search query to restrict the results to matching datasets. We can also pass in other filters to further refine the datasets returned by the library. For example, we could filter to only include datasets for a particular task, or datasets that have a particular type of license."
201
+ ]
202
+ },
203
+ {
204
+ "cell_type": "code",
205
+ "execution_count": 8,
206
+ "metadata": {
207
+ "id": "SfV6YRentLI8"
208
+ },
209
+ "outputs": [],
210
+ "source": [
211
+ "from huggingface_hub import list_datasets"
212
+ ]
213
+ },
214
+ {
215
+ "cell_type": "markdown",
216
+ "metadata": {},
217
+ "source": [
218
+ "For this tutorial we'll keep our approach fairly simple and just look for datasets that have the word `instruction` in the name."
219
+ ]
220
+ },
221
+ {
222
+ "cell_type": "code",
223
+ "execution_count": 9,
224
+ "metadata": {},
225
+ "outputs": [],
226
+ "source": [
227
+ "datasets = list_datasets(search=\"instruction\", full=True)"
228
+ ]
229
+ },
230
+ {
231
+ "cell_type": "markdown",
232
+ "metadata": {},
233
+ "source": [
234
+ "`list_datasets` returns a generator. This means that results are produced lazily, so we can iterate over a large number of datasets without holding them all in memory at once."
235
+ ]
236
+ },
237
+ {
238
+ "cell_type": "code",
239
+ "execution_count": 10,
240
+ "metadata": {},
241
+ "outputs": [
242
+ {
243
+ "data": {
244
+ "text/plain": [
245
+ "generator"
246
+ ]
247
+ },
248
+ "execution_count": 10,
249
+ "metadata": {},
250
+ "output_type": "execute_result"
251
+ }
252
+ ],
253
+ "source": [
254
+ "type(datasets)"
255
+ ]
256
+ },
257
+ {
258
+ "cell_type": "markdown",
259
+ "metadata": {},
260
+ "source": [
261
+ "We can start filtering our results by keeping only datasets that have more than one download. Since we're doing this in a [list comprehension](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions), this will 'consume' the generator. This step takes a little time, since it is the point at which the Hugging Face Hub API is actually called to fetch the datasets."
262
+ ]
263
+ },
264
+ {
265
+ "cell_type": "code",
266
+ "execution_count": 11,
267
+ "metadata": {},
268
+ "outputs": [],
269
+ "source": [
270
+ "datasets = [dataset for dataset in datasets if dataset.downloads > 1]"
271
+ ]
272
+ },
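To make the "consuming a generator" point concrete, here is a minimal sketch in plain Python. `fake_list_datasets` and its records are hypothetical stand-ins for `list_datasets` and `DatasetInfo`, used only so the example runs without network access; just the `id` and `downloads` attribute names are borrowed from the real objects.

```python
from types import SimpleNamespace


def fake_list_datasets():
    """Hypothetical stand-in for `list_datasets`: yields records lazily."""
    for name, downloads in [("a/one", 0), ("b/two", 1), ("c/three", 340)]:
        # With the real library, this is roughly where results would be fetched.
        yield SimpleNamespace(id=name, downloads=downloads)


results = fake_list_datasets()  # nothing has been produced yet

# The list comprehension iterates the generator to exhaustion.
kept = [d for d in results if d.downloads > 1]
kept_ids = [d.id for d in kept]

# A second pass over the same generator yields nothing: it is already consumed.
leftover = list(results)

print(kept_ids)  # ['c/three']
print(leftover)  # []
```

Note that a record with exactly one download is dropped, because the filter is `downloads > 1`. If the unfiltered results are needed again, the generator would have to be recreated or stored in a list first.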
273
+ {
274
+ "cell_type": "markdown",
275
+ "metadata": {},
276
+ "source": [
277
+ "### Getting the top 10% instruction tuning datasets\n",
278
+ "\n",
279
+ "What do we mean by the top 10% of instruction tuning datasets? There are various metrics we could use to define the top 10%. For this tutorial we'll focus on the number of likes a dataset has. We could also look at other metrics, such as the number of downloads, the level of user discussion on the dataset, the quality of documentation, or the number of examples. Whilst likes isn't a perfect metric, it does give us a good starting point."
280
+ ]
281
+ },
282
+ {
283
+ "cell_type": "markdown",
284
+ "metadata": {},
285
+ "source": [
286
+ "## Getting the number of likes for all of our datasets\n",
287
+ "\n",
288
+ "To know what our cutoff point is for the top 10% of datasets, we need to know how many likes each dataset has. We can do this by going through all of our datasets and collecting the number of likes for each one. Let's take a peek at a single dataset from our list of datasets."
289
+ ]
290
+ },
291
+ {
292
+ "cell_type": "code",
293
+ "execution_count": 12,
294
+ "metadata": {},
295
+ "outputs": [
296
+ {
297
+ "data": {
298
+ "text/plain": [
299
+ "DatasetInfo: { \n",
300
+ " {'_id': '621ffdd236468d709f183185',\n",
301
+ " 'author': 'darkraipro',\n",
302
+ " 'cardData': None,\n",
303
+ " 'citation': None,\n",
304
+ " 'description': None,\n",
305
+ " 'disabled': False,\n",
306
+ " 'downloads': 340,\n",
307
+ " 'gated': False,\n",
308
+ " 'gitalyUid': 'f21ae5b0c1d4859c8ba1412e2aa6682e34e9596cd4cc027758b8557475f0ae11',\n",
309
+ " 'id': 'darkraipro/recipe-instructions',\n",
310
+ " 'lastModified': '2022-01-18T16:22:01.000Z',\n",
311
+ " 'likes': 0,\n",
312
+ " 'private': False,\n",
313
+ " 'sha': 'e7feba49dd438849ec3d309ec4ab52a0fd39fc39',\n",
314
+ " 'siblings': [],\n",
315
+ " 'tags': ['region:us']}\n",
316
+ "}"
317
+ ]
318
+ },
319
+ "execution_count": 12,
320
+ "metadata": {},
321
+ "output_type": "execute_result"
322
+ }
323
+ ],
324
+ "source": [
325
+ "datasets[0]"
326
+ ]
327
+ },
328
+ {
329
+ "cell_type": "markdown",
330
+ "metadata": {},
331
+ "source": [
332
+ "We can see that each dataset is a `DatasetInfo` object. This object contains a bunch of information about the dataset. We can see that the number of likes is stored in the `likes` attribute. We can use this to get the number of likes for each dataset."
333
+ ]
334
+ },
335
+ {
336
+ "cell_type": "code",
337
+ "execution_count": 13,
338
+ "metadata": {},
339
+ "outputs": [],
340
+ "source": [
341
+ "likes = [dataset.likes for dataset in datasets]"
342
+ ]
343
+ },
344
+ {
345
+ "cell_type": "markdown",
346
+ "metadata": {},
347
+ "source": [
348
+ "## Calculate the threshold for the top 10% of datasets\n",
349
+ "\n",
350
+ "To calculate the threshold for the top 10% of datasets, we'll create a function that takes our list of like counts and returns the threshold that separates the top 10% of likes from the rest. We can then use this function to get the threshold for our list of datasets."
351
+ ]
352
+ },
353
+ {
354
+ "cell_type": "code",
355
+ "execution_count": 14,
356
+ "metadata": {
357
+ "id": "Q3JCU5lj9dU3"
358
+ },
359
+ "outputs": [],
360
+ "source": [
361
+ "import math\n",
362
+ "from typing import List"
363
+ ]
364
+ },
365
+ {
366
+ "cell_type": "code",
367
+ "execution_count": 15,
368
+ "metadata": {
369
+ "id": "wKuyNPK09YJ5"
370
+ },
371
+ "outputs": [],
372
+ "source": [
373
+ "def get_threshold(numbers: List[int], threshold: float = 0.90) -> int:\n",
374
+ " sorted_numbers = sorted(numbers)\n",
375
+ " index = math.ceil(len(sorted_numbers) * threshold) - 1\n",
376
+ " return sorted_numbers[index]"
377
+ ]
378
+ },
379
+ {
380
+ "cell_type": "code",
381
+ "execution_count": 16,
382
+ "metadata": {
383
+ "id": "9mrqhvLq_0Yk"
384
+ },
385
+ "outputs": [
386
+ {
387
+ "data": {
388
+ "text/plain": [
389
+ "10"
390
+ ]
391
+ },
392
+ "execution_count": 16,
393
+ "metadata": {},
394
+ "output_type": "execute_result"
395
+ }
396
+ ],
397
+ "source": [
398
+ "threshold = get_threshold(likes)\n",
399
+ "threshold"
400
+ ]
401
+ },
402
+ {
403
+ "cell_type": "markdown",
404
+ "metadata": {},
405
+ "source": [
406
+ "### Filter our datasets to only include those with a number of likes above the threshold"
407
+ ]
408
+ },
409
+ {
410
+ "cell_type": "code",
411
+ "execution_count": 17,
412
+ "metadata": {
413
+ "id": "YguMpWpt7rlD"
414
+ },
415
+ "outputs": [],
416
+ "source": [
417
+ "datasets = [dataset for dataset in datasets if dataset.likes > threshold]"
418
+ ]
419
+ },
420
+ {
421
+ "cell_type": "code",
422
+ "execution_count": 18,
423
+ "metadata": {},
424
+ "outputs": [
425
+ {
426
+ "data": {
427
+ "text/plain": [
428
+ "13"
429
+ ]
430
+ },
431
+ "execution_count": 18,
432
+ "metadata": {},
433
+ "output_type": "execute_result"
434
+ }
435
+ ],
436
+ "source": [
437
+ "len(datasets)"
438
+ ]
439
+ },
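As a worked example of the threshold and filter steps, the sketch below runs the same `get_threshold` logic on a small, made-up list of like counts. The numbers in `toy_likes` are hypothetical and are not Hub data; they are chosen to show how ties at the threshold behave.

```python
import math
from typing import List


def get_threshold(numbers: List[int], threshold: float = 0.90) -> int:
    # Same logic as the function above: the value at the 90% position
    # of the sorted list.
    sorted_numbers = sorted(numbers)
    index = math.ceil(len(sorted_numbers) * threshold) - 1
    return sorted_numbers[index]


# Hypothetical like counts for ten datasets (not real Hub data).
toy_likes = [0, 0, 1, 2, 3, 5, 8, 10, 10, 40]

cutoff = get_threshold(toy_likes)  # index = ceil(10 * 0.9) - 1 = 8
strictly_above = [n for n in toy_likes if n > cutoff]
at_or_above = [n for n in toy_likes if n >= cutoff]

print(cutoff)          # 10
print(strictly_above)  # [40]
print(at_or_above)     # [10, 10, 40]
```

Because the filter uses a strict `>`, items whose like count equals the threshold are excluded. With `>=` they would be kept, and the selection could grow beyond 10% when several items tie at the cutoff.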
440
+ {
441
+ "cell_type": "markdown",
442
+ "metadata": {},
443
+ "source": [
444
+ "## Creating our collection \n",
445
+ "\n",
446
+ "Now that we've got a subset of datasets which match our curation criteria we can move to the next step of creating a Collection to which we can add these datasets.\n",
447
+ "\n",
448
+ "We can do this using the `create_collection` function. This function allows us to create a Collection programmatically. We must pass in a `title` and we can also specify a `description` and a `namespace`. If you don't specify a namespace, the collection will be created in your personal namespace, but since I want to add this collection to the `librarian-bots` organization I'll specify it explicitly here.\n",
449
+ "\n",
450
+ "The `exists_ok` parameter allows us to specify what to do if a collection with the same title already exists. If we set this to `True`, the function will return the existing collection. If we set this to `False`, the function will raise an error."
451
+ ]
452
+ },
453
+ {
454
+ "cell_type": "code",
455
+ "execution_count": 19,
456
+ "metadata": {},
457
+ "outputs": [],
458
+ "source": [
459
+ "from huggingface_hub import create_collection\n",
460
+ "\n",
461
+ "collection = create_collection(\n",
462
+ " title=\"Top 10% instruction tuning datasets\",\n",
463
+ " description=\"Collects datasets with 'instruction' in the name and more than 1 download and in the top 10% for the number of likes\",\n",
464
+ " namespace=\"librarian-bots\",\n",
465
+ " exists_ok=True,\n",
466
+ ")"
467
+ ]
468
+ },
469
+ {
470
+ "cell_type": "markdown",
471
+ "metadata": {},
472
+ "source": [
473
+ "Let's take a quick look at the collection we've created."
474
+ ]
475
+ },
476
+ {
477
+ "cell_type": "code",
478
+ "execution_count": 20,
479
+ "metadata": {},
480
+ "outputs": [
481
+ {
482
+ "data": {
483
+ "text/plain": [
484
+ "Collection: { \n",
485
+ " {'description': \"Collects datasets with 'instruction' in the name and more than 1 download and in the top 10% for the \"\n",
486
+ " 'number of likes',\n",
487
+ " 'items': [],\n",
488
+ " 'last_updated': datetime.datetime(2023, 9, 25, 11, 52, 53, 545000, tzinfo=datetime.timezone.utc),\n",
489
+ " 'owner': 'librarian-bots',\n",
490
+ " 'position': 0,\n",
491
+ " 'private': False,\n",
492
+ " 'slug': 'librarian-bots/top-10-instruction-tuning-datasets-65117495134fd906b070c410',\n",
493
+ " 'theme': 'indigo',\n",
494
+ " 'title': 'Top 10% instruction tuning datasets',\n",
495
+ " 'url': 'https://huggingface.co/collections/librarian-bots/top-10-instruction-tuning-datasets-65117495134fd906b070c410'}\n",
496
+ "}"
497
+ ]
498
+ },
499
+ "execution_count": 20,
500
+ "metadata": {},
501
+ "output_type": "execute_result"
502
+ }
503
+ ],
504
+ "source": [
505
+ "collection"
506
+ ]
507
+ },
508
+ {
509
+ "cell_type": "markdown",
510
+ "metadata": {},
511
+ "source": [
512
+ "When we call the `create_collection` function we get back a `Collection` object. This object contains a bunch of information about the collection. We can see, for example, the title, description, and owner of the collection.\n",
513
+ "\n",
514
+ "We can also see that at the moment the attribute `items` is an empty list. The `items` attribute stores the datasets, models, Spaces, and papers that are in the collection. We can add items to the collection using the `add_collection_item` function."
515
+ ]
516
+ },
517
+ {
518
+ "cell_type": "markdown",
519
+ "metadata": {},
520
+ "source": [
521
+ "Before we add our items to the collection we can do one more bit of curation: sorting by downloads. For this collection we don't have a huge number of items, so sorting isn't as important, but the order of a collection can be used to express some additional information about the collection. For example, you could sort a collection by the date an item was last updated, by the number of downloads, or by the number of likes. For our example we'll sort by the number of downloads."
522
+ ]
523
+ },
524
+ {
525
+ "cell_type": "code",
526
+ "execution_count": 21,
527
+ "metadata": {
528
+ "id": "8a9U5Bc376qD"
529
+ },
530
+ "outputs": [],
531
+ "source": [
532
+ "sorted_datasets = sorted(datasets, key=lambda dataset: dataset.downloads, reverse=True)"
533
+ ]
534
+ },
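The alternative orderings mentioned above can be sketched with a few toy records. The records below are hypothetical, and the attribute names (`downloads`, `likes`, `lastModified`) are copied from the printed `DatasetInfo` output earlier in the notebook; the real attribute names may differ between library versions.

```python
from types import SimpleNamespace

# Hypothetical records standing in for DatasetInfo objects.
toy = [
    SimpleNamespace(id="a/one", downloads=879, likes=46,
                    lastModified="2023-05-26T08:37:42.000Z"),
    SimpleNamespace(id="b/two", downloads=2968, likes=104,
                    lastModified="2023-06-01T03:19:51.000Z"),
    SimpleNamespace(id="c/three", downloads=340, likes=120,
                    lastModified="2022-01-18T16:22:01.000Z"),
]

by_downloads = [d.id for d in sorted(toy, key=lambda d: d.downloads, reverse=True)]
by_likes = [d.id for d in sorted(toy, key=lambda d: d.likes, reverse=True)]
# ISO 8601 timestamps in the same format and timezone sort chronologically
# when compared as plain strings, so no date parsing is needed here.
by_recency = [d.id for d in sorted(toy, key=lambda d: d.lastModified, reverse=True)]

print(by_downloads)  # ['b/two', 'a/one', 'c/three']
print(by_likes)      # ['c/three', 'b/two', 'a/one']
print(by_recency)    # ['b/two', 'a/one', 'c/three']
```

Only the `key` function changes between the three orderings, so switching the sort criterion for a collection is a one-line change.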
535
+ {
536
+ "cell_type": "markdown",
537
+ "metadata": {},
538
+ "source": [
539
+ "Let's take a quick peek at the first two examples to see if this looks okay!"
540
+ ]
541
+ },
542
+ {
543
+ "cell_type": "code",
544
+ "execution_count": 22,
545
+ "metadata": {
546
+ "colab": {
547
+ "base_uri": "https://localhost:8080/"
548
+ },
549
+ "id": "eQQHSZ-IABLb",
550
+ "outputId": "aec16d23-acd8-46be-f476-fa39d5b656eb"
551
+ },
552
+ "outputs": [
553
+ {
554
+ "data": {
555
+ "text/plain": [
556
+ "[DatasetInfo: { \n",
557
+ " {'_id': '64773a98906bb0203e52faad',\n",
558
+ " 'author': 'LinkSoul',\n",
559
+ " 'cardData': {'dataset_info': {'dataset_size': 13444870155,\n",
560
+ " 'download_size': 3542585235,\n",
561
+ " 'features': [{'dtype': 'string', 'name': 'id'},\n",
562
+ " {'list': [{'dtype': 'string', 'name': 'from'},\n",
563
+ " {'dtype': 'string', 'name': 'value'}],\n",
564
+ " 'name': 'conversations'},\n",
565
+ " {'dtype': 'string', 'name': 'instruction'}],\n",
566
+ " 'splits': [{'name': 'train', 'num_bytes': 13444870155, 'num_examples': 10077297}]}},\n",
567
+ " 'citation': None,\n",
568
+ " 'description': None,\n",
569
+ " 'disabled': False,\n",
570
+ " 'downloads': 2968,\n",
571
+ " 'gated': False,\n",
572
+ " 'gitalyUid': '27a10b39e75118535e9d37a774ac8e8f0af89e44385f38bb930c6d5474270e1d',\n",
573
+ " 'id': 'LinkSoul/instruction_merge_set',\n",
574
+ " 'lastModified': '2023-06-01T03:19:51.000Z',\n",
575
+ " 'likes': 104,\n",
576
+ " 'private': False,\n",
577
+ " 'sha': 'f26cd861df9df27973f0a07682462860ccdf14ae',\n",
578
+ " 'siblings': [],\n",
579
+ " 'tags': ['region:us']}\n",
580
+ " },\n",
581
+ " DatasetInfo: { \n",
582
+ " {'_id': '642ef3fe28a26b5c89afb5a8',\n",
583
+ " 'author': 'ArmelR',\n",
584
+ " 'cardData': {'pretty_name': 'stack exchange instruction'},\n",
585
+ " 'citation': None,\n",
586
+ " 'description': None,\n",
587
+ " 'disabled': False,\n",
588
+ " 'downloads': 879,\n",
589
+ " 'gated': False,\n",
590
+ " 'gitalyUid': 'db81393f8a3b85f258162beca50f86f7f54449e9ffe1a84b304b28cedbb76a51',\n",
591
+ " 'id': 'ArmelR/stack-exchange-instruction',\n",
592
+ " 'lastModified': '2023-05-26T08:37:42.000Z',\n",
593
+ " 'likes': 46,\n",
594
+ " 'private': False,\n",
595
+ " 'sha': '3e463817c02767cd64d9ad0276c6d291c7f120aa',\n",
596
+ " 'siblings': [],\n",
597
+ " 'tags': ['region:us']}\n",
598
+ " }]"
599
+ ]
600
+ },
601
+ "execution_count": 22,
602
+ "metadata": {},
603
+ "output_type": "execute_result"
604
+ }
605
+ ],
606
+ "source": [
607
+ "sorted_datasets[:2]"
608
+ ]
609
+ },
610
+ {
611
+ "cell_type": "markdown",
612
+ "metadata": {},
613
+ "source": [
614
+ "## Adding items to our collection \n",
615
+ "\n",
616
+ "Now we're ready to populate our collection. We'll use the `add_collection_item` function to add each dataset to our collection. In a notebook, we can append `?` to the function name to get more information about it."
617
+ ]
618
+ },
619
+ {
620
+ "cell_type": "code",
621
+ "execution_count": 23,
622
+ "metadata": {},
623
+ "outputs": [],
624
+ "source": [
625
+ "from huggingface_hub import add_collection_item"
626
+ ]
627
+ },
628
+ {
629
+ "cell_type": "code",
630
+ "execution_count": 24,
631
+ "metadata": {},
632
+ "outputs": [
633
+ {
634
+ "name": "stdout",
635
+ "output_type": "stream",
636
+ "text": [
637
+ "\u001b[0;31mSignature:\u001b[0m\n",
638
+ "\u001b[0madd_collection_item\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n",
639
+ "\u001b[0;34m\u001b[0m \u001b[0mcollection_slug\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
640
+ "\u001b[0;34m\u001b[0m \u001b[0mitem_id\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
641
+ "\u001b[0;34m\u001b[0m \u001b[0mitem_type\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'CollectionItemType_T'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
642
+ "\u001b[0;34m\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
643
+ "\u001b[0;34m\u001b[0m \u001b[0mnote\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
644
+ "\u001b[0;34m\u001b[0m \u001b[0mexists_ok\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'bool'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
645
+ "\u001b[0;34m\u001b[0m \u001b[0mtoken\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
646
+ "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0;34m'Collection'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
647
+ "\u001b[0;31mDocstring:\u001b[0m\n",
648
+ "Add an item to a collection on the Hub.\n",
649
+ "\n",
650
+ "Args:\n",
651
+ " collection_slug (`str`):\n",
652
+ " Slug of the collection to update. Example: `\"TheBloke/recent-models-64f9a55bb3115b4f513ec026\"`.\n",
653
+ " item_id (`str`):\n",
654
+ " ID of the item to add to the collection. It can be the ID of a repo on the Hub (e.g. `\"facebook/bart-large-mnli\"`)\n",
655
+ " or a paper id (e.g. `\"2307.09288\"`).\n",
656
+ " item_type (`str`):\n",
657
+ " Type of the item to add. Can be one of `\"model\"`, `\"dataset\"`, `\"space\"` or `\"paper\"`.\n",
658
+ " note (`str`, *optional*):\n",
659
+ " A note to attach to the item in the collection. The maximum size for a note is 500 characters.\n",
660
+ " exists_ok (`bool`, *optional*):\n",
661
+ " If `True`, do not raise an error if item already exists.\n",
662
+ " token (`str`, *optional*):\n",
663
+ " Hugging Face token. Will default to the locally saved token if not provided.\n",
664
+ "\n",
665
+ "Returns: [`Collection`]\n",
666
+ "\n",
667
+ "Example:\n",
668
+ "\n",
669
+ "```py\n",
670
+ ">>> from huggingface_hub import add_collection_item\n",
671
+ ">>> collection = add_collection_item(\n",
672
+ "... collection_slug=\"davanstrien/climate-64f99dc2a5067f6b65531bab\",\n",
673
+ "... item_id=\"pierre-loic/climate-news-articles\",\n",
674
+ "... item_type=\"dataset\"\n",
675
+ "... )\n",
676
+ ">>> collection.items[-1].item_id\n",
677
+ "\"pierre-loic/climate-news-articles\"\n",
678
+ "# ^item got added to the collection on last position\n",
679
+ "\n",
680
+ "# Add collection with a note\n",
681
+ ">>> add_collection_item(\n",
682
+ "... collection_slug=\"davanstrien/climate-64f99dc2a5067f6b65531bab\",\n",
683
+ "... item_id=\"datasets/climate_fever\",\n",
684
+ "... item_type=\"dataset\"\n",
685
+ "... note=\"This dataset adopts the FEVER methodology that consists of 1,535 real-world claims regarding climate-change collected on the internet.\"\n",
686
+ "... )\n",
687
+ "(...)\n",
688
+ "```\n",
689
+ "\u001b[0;31mFile:\u001b[0m ~/Documents/code/librarian-bot-work/tutorials/.venv/lib/python3.11/site-packages/huggingface_hub/hf_api.py\n",
690
+ "\u001b[0;31mType:\u001b[0m method"
691
+ ]
692
+ }
693
+ ],
694
+ "source": [
695
+ "?add_collection_item"
696
+ ]
697
+ },
698
+ {
699
+ "cell_type": "markdown",
700
+ "metadata": {},
701
+ "source": [
702
+ "As you can see, `add_collection_item` requires a `collection_slug` argument, which tells it which collection to add the item to. We can get the slug from the `slug` attribute of the `Collection` object we created earlier.\n",
703
+ "\n",
704
+ "We also need to specify the `item_id` of the item we want to add. For datasets, this is the `id` attribute of the `DatasetInfo` object. We also need to specify the `item_type`, which must be one of `dataset`, `model`, `space`, or `paper`.\n",
705
+ "\n",
706
+ "We can optionally attach a note (up to 500 characters) with extra information about the item, for example the reason we added it to the collection. In this case, we'll record the number of downloads and likes for each dataset."
707
+ ]
708
+ },
709
+ {
710
+ "cell_type": "code",
711
+ "execution_count": 25,
712
+ "metadata": {
713
+ "id": "mp2GYws46kBD"
714
+ },
715
+ "outputs": [],
716
+ "source": [
717
+ "for dataset in datasets:\n",
718
+ "    add_collection_item(\n",
719
+ "        collection.slug,\n",
720
+ "        item_id=dataset.id,\n",
721
+ "        item_type=\"dataset\",\n",
722
+ "        note=f\"Dataset has {dataset.downloads} downloads and {dataset.likes} likes\",\n",
723
+ "    )"
724
+ ]
725
+ },
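The docstring above says a note can be at most 500 characters. Our notes are short, but if we later build notes from longer text (for example a dataset description), we might want to truncate them first. The following is a minimal sketch of that idea; `build_note` and `MAX_NOTE_LENGTH` are hypothetical names introduced here, and the `SimpleNamespace` object is only a stand-in for a real `DatasetInfo`.

```python
from types import SimpleNamespace

# Limit taken from the add_collection_item docstring shown above.
MAX_NOTE_LENGTH = 500


def build_note(dataset, max_length=MAX_NOTE_LENGTH):
    """Return a note for a dataset, truncated to at most max_length characters."""
    note = f"Dataset has {dataset.downloads} downloads and {dataset.likes} likes"
    if len(note) <= max_length:
        return note
    # Leave room for a three-character ellipsis marker.
    return note[: max_length - 3] + "..."


# Stand-in for a DatasetInfo object (hypothetical helper object, values from the output above).
example = SimpleNamespace(id="ArmelR/stack-exchange-instruction", downloads=879, likes=46)
print(build_note(example))
```

We could then pass `note=build_note(dataset)` to `add_collection_item`. If we expect to re-run the cell, the `exists_ok=True` argument from the signature above should avoid an error for items that are already in the collection.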
726
+ {
727
+ "cell_type": "markdown",
728
+ "metadata": {},
729
+ "source": [
730
+ "## Taking a look at our collection\n",
731
+ "\n",
732
+ "The `huggingface_hub` library has a `get_collection` function which allows us to get a `Collection` object from the Hub. We can use this to take a look at our collection."
733
+ ]
734
+ },
735
+ {
736
+ "cell_type": "code",
737
+ "execution_count": 29,
738
+ "metadata": {},
739
+ "outputs": [],
740
+ "source": [
741
+ "from huggingface_hub import get_collection"
742
+ ]
743
+ },
744
+ {
745
+ "cell_type": "markdown",
746
+ "metadata": {},
747
+ "source": [
748
+ "We'll pass our collection's slug to `get_collection` and then look at the `items` attribute to see what the collection contains."
749
+ ]
750
+ },
751
+ {
752
+ "cell_type": "code",
753
+ "execution_count": 31,
754
+ "metadata": {},
755
+ "outputs": [
756
+ {
757
+ "data": {
758
+ "text/plain": [
759
+ "[CollectionItem: { \n",
760
+ " {'author': 'Muennighoff',\n",
761
+ " 'downloads': 313,\n",
762
+ " 'gated': False,\n",
763
+ " 'isLikedByUser': False,\n",
764
+ " 'item_id': 'Muennighoff/natural-instructions',\n",
765
+ " 'item_object_id': '6511749fbb66f847cc57a04f',\n",
766
+ " 'item_type': 'dataset',\n",
767
+ " 'lastModified': '2022-12-23T20:08:44.000Z',\n",
768
+ " 'likes': 18,\n",
769
+ " 'note': 'Dataset has 313 downloads and 18 likes',\n",
770
+ " 'position': 0,\n",
771
+ " 'private': False,\n",
772
+ " 'repoType': 'dataset',\n",
773
+ " 'viewer': 'viewer'}\n",
774
+ " },\n",
775
+ " CollectionItem: { \n",
776
+ " {'author': 'qwedsacf',\n",
777
+ " 'downloads': 225,\n",
778
+ " 'gated': False,\n",
779
+ " 'isLikedByUser': False,\n",
780
+ " 'item_id': 'qwedsacf/grade-school-math-instructions',\n",
781
+ " 'item_object_id': '6511749f01307d048b987e74',\n",
782
+ " 'item_type': 'dataset',\n",
783
+ " 'lastModified': '2023-02-11T01:59:26.000Z',\n",
784
+ " 'likes': 21,\n",
785
+ " 'note': 'Dataset has 225 downloads and 21 likes',\n",
786
+ " 'position': 1,\n",
787
+ " 'private': False,\n",
788
+ " 'repoType': 'dataset',\n",
789
+ " 'viewer': 'viewer'}\n",
790
+ " }]"
791
+ ]
792
+ },
793
+ "execution_count": 31,
794
+ "metadata": {},
795
+ "output_type": "execute_result"
796
+ }
797
+ ],
798
+ "source": [
799
+ "updated_collection = get_collection(collection.slug)\n",
800
+ "updated_collection.items[:2]"
801
+ ]
802
+ },
803
+ {
804
+ "cell_type": "markdown",
805
+ "metadata": {},
806
+ "source": [
807
+ "We can see that our collection now contains the datasets we added. We can also start to explore the collection programmatically. For example, we could compute the mean number of downloads across the datasets in it."
808
+ ]
809
+ },
810
+ {
811
+ "cell_type": "code",
812
+ "execution_count": 34,
813
+ "metadata": {},
814
+ "outputs": [
815
+ {
816
+ "data": {
817
+ "text/plain": [
818
+ "502.6923076923077"
819
+ ]
820
+ },
821
+ "execution_count": 34,
822
+ "metadata": {},
823
+ "output_type": "execute_result"
824
+ }
825
+ ],
826
+ "source": [
827
+ "from statistics import mean\n",
828
+ "\n",
829
+ "mean(item.downloads for item in updated_collection.items)"
830
+ ]
831
+ },
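The mean is only one way to summarize the collection. As a rough sketch, we could gather a few statistics at once. `summarize_downloads` is a hypothetical helper, and the `SimpleNamespace` objects stand in for the `CollectionItem` objects returned by `get_collection` (only the first two items from the output above are used here, so the numbers differ from the full collection).

```python
from statistics import mean, median
from types import SimpleNamespace


def summarize_downloads(items):
    """Return basic download statistics for objects with a `downloads` attribute."""
    counts = [item.downloads for item in items]
    return {
        "count": len(counts),
        "mean": mean(counts),
        "median": median(counts),
        "min": min(counts),
        "max": max(counts),
    }


# Stand-ins for CollectionItem objects, using values from the output above.
items = [
    SimpleNamespace(item_id="Muennighoff/natural-instructions", downloads=313),
    SimpleNamespace(item_id="qwedsacf/grade-school-math-instructions", downloads=225),
]
summary = summarize_downloads(items)
print(summary)
```

With a real collection we would call `summarize_downloads(updated_collection.items)` instead.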
832
+ {
833
+ "cell_type": "markdown",
834
+ "metadata": {},
835
+ "source": [
836
+ "We could also use other functionality from the `huggingface_hub` library to explore our collection. For example, we could use the `dataset_info` function to try to look up the language declared in each dataset's card metadata."
837
+ ]
838
+ },
839
+ {
840
+ "cell_type": "code",
841
+ "execution_count": 46,
842
+ "metadata": {},
843
+ "outputs": [],
844
+ "source": [
845
+ "from huggingface_hub import dataset_info"
846
+ ]
847
+ },
848
+ {
849
+ "cell_type": "code",
850
+ "execution_count": 47,
851
+ "metadata": {},
852
+ "outputs": [],
853
+ "source": [
854
+ "def try_get_languages(dataset):\n",
855
+ "    try:\n",
856
+ "        return dataset_info(dataset.id).cardData[\"language\"]\n",
857
+ "    except (KeyError, TypeError):  # no `language` key, or no card metadata at all\n",
858
+ "        return None"
859
+ ]
860
+ },
861
+ {
862
+ "cell_type": "code",
863
+ "execution_count": 48,
864
+ "metadata": {},
865
+ "outputs": [
866
+ {
867
+ "data": {
868
+ "text/plain": [
869
+ "[['en'],\n",
870
+ " None,\n",
871
+ " None,\n",
872
+ " ['en'],\n",
873
+ " None,\n",
874
+ " ['en'],\n",
875
+ " ['en'],\n",
876
+ " None,\n",
877
+ " None,\n",
878
+ " None,\n",
879
+ " ['en'],\n",
880
+ " None,\n",
881
+ " None]"
882
+ ]
883
+ },
884
+ "execution_count": 48,
885
+ "metadata": {},
886
+ "output_type": "execute_result"
887
+ }
888
+ ],
889
+ "source": [
890
+ "[try_get_languages(dataset) for dataset in datasets]"
891
+ ]
892
+ },
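To turn a list like the one above into counts, one option is `collections.Counter`. This is a small sketch rather than part of the original workflow: `count_languages` is a hypothetical helper, and the `languages` list is copied from the output above. The string check is an assumption that some dataset cards might store a single language as a plain string instead of a list.

```python
from collections import Counter

# Language values copied from the notebook output above.
languages = [
    ["en"], None, None, ["en"], None, ["en"], ["en"],
    None, None, None, ["en"], None, None,
]


def count_languages(language_lists):
    """Count language codes and the number of datasets with no language declared."""
    counts = Counter()
    missing = 0
    for langs in language_lists:
        if not langs:
            missing += 1
            continue
        if isinstance(langs, str):
            # Assumption: a card may hold one language as a bare string.
            langs = [langs]
        counts.update(langs)
    return counts, missing


counts, missing = count_languages(languages)
print(counts, missing)
```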
893
+ {
894
+ "cell_type": "markdown",
895
+ "metadata": {},
896
+ "source": [
897
+ "Every dataset in our collection that declares a language is tagged as English (`en`), but 8 of the 13 datasets don't declare a language at all. This might be a good opportunity to contribute to those datasets by adding language information to them!"
898
+ ]
899
+ },
900
+ {
901
+ "cell_type": "code",
902
+ "execution_count": 52,
903
+ "metadata": {},
904
+ "outputs": [
905
+ {
906
+ "name": "stdout",
907
+ "output_type": "stream",
908
+ "text": [
909
+ "qwedsacf/grade-school-math-instructions has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions/edit/main/README.md \n",
910
+ "HuggingFaceH4/instruction-dataset has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/HuggingFaceH4/instruction-dataset/edit/main/README.md \n",
911
+ "ArmelR/stack-exchange-instruction has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/ArmelR/stack-exchange-instruction/edit/main/README.md \n",
912
+ "openllmplayground/pandagpt_visual_instruction_dataset has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/openllmplayground/pandagpt_visual_instruction_dataset/edit/main/README.md \n",
913
+ "rewoo/planner_instruction_tuning_2k has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/rewoo/planner_instruction_tuning_2k/edit/main/README.md \n",
914
+ "LinkSoul/instruction_merge_set has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/LinkSoul/instruction_merge_set/edit/main/README.md \n",
915
+ "TokenBender/code_instructions_122k_alpaca_style has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/TokenBender/code_instructions_122k_alpaca_style/edit/main/README.md \n",
916
+ "codefuse-ai/Evol-instruction-66k has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/codefuse-ai/Evol-instruction-66k/edit/main/README.md \n"
917
+ ]
918
+ }
919
+ ],
920
+ "source": [
921
+ "for dataset in datasets:\n",
922
+ "    language = try_get_languages(dataset)\n",
923
+ "    if language is None:\n",
924
+ "        print(\n",
925
+ "            f\"{dataset.id} has no language and could benefit from a PR to add it! Here is the url to fix it: https://huggingface.co/datasets/{dataset.id}/edit/main/README.md \"\n",
926
+ "        )"
927
+ ]
928
+ },
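If we wanted to reuse these edit links elsewhere (for example in a report), we could factor the URL pattern used above into a small function. `readme_edit_url` is a hypothetical helper, and the `branch` parameter is an assumption for repositories whose default branch is not `main`.

```python
HUB_URL = "https://huggingface.co"


def readme_edit_url(dataset_id, branch="main"):
    """Build the Hub URL for editing a dataset's README.md (its dataset card)."""
    return f"{HUB_URL}/datasets/{dataset_id}/edit/{branch}/README.md"


print(readme_edit_url("qwedsacf/grade-school-math-instructions"))
```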
929
+ {
930
+ "cell_type": "markdown",
931
+ "metadata": {},
932
+ "source": [
933
+ "## Looking at the collection on the Hub\n",
934
+ "\n",
935
+ "We can also view our collection on the Hub. The `url` attribute of our `Collection` object gives us a direct link to it."
936
+ ]
937
+ },
938
+ {
939
+ "cell_type": "code",
940
+ "execution_count": 53,
941
+ "metadata": {},
942
+ "outputs": [
943
+ {
944
+ "data": {
945
+ "text/plain": [
946
+ "'https://huggingface.co/collections/librarian-bots/top-10-instruction-tuning-datasets-65117495134fd906b070c410'"
947
+ ]
948
+ },
949
+ "execution_count": 53,
950
+ "metadata": {},
951
+ "output_type": "execute_result"
952
+ }
953
+ ],
954
+ "source": [
955
+ "updated_collection.url"
956
+ ]
957
+ },
958
+ {
959
+ "cell_type": "markdown",
960
+ "metadata": {},
961
+ "source": [
962
+ "# Conclusion and other things to try\n",
963
+ "\n",
964
+ "In this tutorial we've seen how to use the `huggingface_hub` library to create a collection that curates the top 10% most used instruction tuning datasets on the Hub. We've also seen how we can use the `huggingface_hub` library to explore our collection and the datasets in it.\n",
965
+ "\n",
966
+ "There are many opportunities to build on this approach to curate useful collections automatically or semi-automatically."
967
+ ]
968
+ }
969
+ ],
970
+ "metadata": {
971
+ "colab": {
972
+ "provenance": []
973
+ },
974
+ "kernelspec": {
975
+ "display_name": "Python 3",
976
+ "name": "python3"
977
+ },
978
+ "language_info": {
979
+ "codemirror_mode": {
980
+ "name": "ipython",
981
+ "version": 3
982
+ },
983
+ "file_extension": ".py",
984
+ "mimetype": "text/x-python",
985
+ "name": "python",
986
+ "nbconvert_exporter": "python",
987
+ "pygments_lexer": "ipython3",
988
+ "version": "3.11.1"
989
+ },
990
+ "widgets": {
991
+ "application/vnd.jupyter.widget-state+json": {
992
+ "023691d310634e6e83da20b9575759a2": {
993
+ "model_module": "@jupyter-widgets/base",
994
+ "model_module_version": "1.2.0",
995
+ "model_name": "LayoutModel",
996
+ "state": {
997
+ "_model_module": "@jupyter-widgets/base",
998
+ "_model_module_version": "1.2.0",
999
+ "_model_name": "LayoutModel",
1000
+ "_view_count": null,
1001
+ "_view_module": "@jupyter-widgets/base",
1002
+ "_view_module_version": "1.2.0",
1003
+ "_view_name": "LayoutView",
1004
+ "align_content": null,
1005
+ "align_items": null,
1006
+ "align_self": null,
1007
+ "border": null,
1008
+ "bottom": null,
1009
+ "display": null,
1010
+ "flex": null,
1011
+ "flex_flow": null,
1012
+ "grid_area": null,
1013
+ "grid_auto_columns": null,
1014
+ "grid_auto_flow": null,
1015
+ "grid_auto_rows": null,
1016
+ "grid_column": null,
1017
+ "grid_gap": null,
1018
+ "grid_row": null,
1019
+ "grid_template_areas": null,
1020
+ "grid_template_columns": null,
1021
+ "grid_template_rows": null,
1022
+ "height": null,
1023
+ "justify_content": null,
1024
+ "justify_items": null,
1025
+ "left": null,
1026
+ "margin": null,
1027
+ "max_height": null,
1028
+ "max_width": null,
1029
+ "min_height": null,
1030
+ "min_width": null,
1031
+ "object_fit": null,
1032
+ "object_position": null,
1033
+ "order": null,
1034
+ "overflow": null,
1035
+ "overflow_x": null,
1036
+ "overflow_y": null,
1037
+ "padding": null,
1038
+ "right": null,
1039
+ "top": null,
1040
+ "visibility": null,
1041
+ "width": null
1042
+ }
1043
+ },
1044
+ "055c0da20e264ec896963f9edf372a7d": {
1045
+ "model_module": "@jupyter-widgets/base",
1046
+ "model_module_version": "1.2.0",
1047
+ "model_name": "LayoutModel",
1048
+ "state": {
1049
+ "_model_module": "@jupyter-widgets/base",
1050
+ "_model_module_version": "1.2.0",
1051
+ "_model_name": "LayoutModel",
1052
+ "_view_count": null,
1053
+ "_view_module": "@jupyter-widgets/base",
1054
+ "_view_module_version": "1.2.0",
1055
+ "_view_name": "LayoutView",
1056
+ "align_content": null,
1057
+ "align_items": null,
1058
+ "align_self": null,
1059
+ "border": null,
1060
+ "bottom": null,
1061
+ "display": null,
1062
+ "flex": null,
1063
+ "flex_flow": null,
1064
+ "grid_area": null,
1065
+ "grid_auto_columns": null,
1066
+ "grid_auto_flow": null,
1067
+ "grid_auto_rows": null,
1068
+ "grid_column": null,
1069
+ "grid_gap": null,
1070
+ "grid_row": null,
1071
+ "grid_template_areas": null,
1072
+ "grid_template_columns": null,
1073
+ "grid_template_rows": null,
1074
+ "height": null,
1075
+ "justify_content": null,
1076
+ "justify_items": null,
1077
+ "left": null,
1078
+ "margin": null,
1079
+ "max_height": null,
1080
+ "max_width": null,
1081
+ "min_height": null,
1082
+ "min_width": null,
1083
+ "object_fit": null,
1084
+ "object_position": null,
1085
+ "order": null,
1086
+ "overflow": null,
1087
+ "overflow_x": null,
1088
+ "overflow_y": null,
1089
+ "padding": null,
1090
+ "right": null,
1091
+ "top": null,
1092
+ "visibility": null,
1093
+ "width": null
1094
+ }
1095
+ },
1096
+ "077012b6f63e4148848c9b9e8726fb18": {
1097
+ "model_module": "@jupyter-widgets/controls",
1098
+ "model_module_version": "1.5.0",
1099
+ "model_name": "HTMLModel",
1100
+ "state": {
1101
+ "_dom_classes": [],
1102
+ "_model_module": "@jupyter-widgets/controls",
1103
+ "_model_module_version": "1.5.0",
1104
+ "_model_name": "HTMLModel",
1105
+ "_view_count": null,
1106
+ "_view_module": "@jupyter-widgets/controls",
1107
+ "_view_module_version": "1.5.0",
1108
+ "_view_name": "HTMLView",
1109
+ "description": "",
1110
+ "description_tooltip": null,
1111
+ "layout": "IPY_MODEL_37807c4db5834365b86bc92c36835220",
1112
+ "placeholder": "​",
1113
+ "style": "IPY_MODEL_847b9e4085814f958a147286be4f56eb",
1114
+ "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
1115
+ }
1116
+ },
1117
+ "09ca4ed6420340e0b76d64949301bed4": {
1118
+ "model_module": "@jupyter-widgets/base",
1119
+ "model_module_version": "1.2.0",
1120
+ "model_name": "LayoutModel",
1121
+ "state": {
1122
+ "_model_module": "@jupyter-widgets/base",
1123
+ "_model_module_version": "1.2.0",
1124
+ "_model_name": "LayoutModel",
1125
+ "_view_count": null,
1126
+ "_view_module": "@jupyter-widgets/base",
1127
+ "_view_module_version": "1.2.0",
1128
+ "_view_name": "LayoutView",
1129
+ "align_content": null,
1130
+ "align_items": null,
1131
+ "align_self": null,
1132
+ "border": null,
1133
+ "bottom": null,
1134
+ "display": null,
1135
+ "flex": null,
1136
+ "flex_flow": null,
1137
+ "grid_area": null,
1138
+ "grid_auto_columns": null,
1139
+ "grid_auto_flow": null,
1140
+ "grid_auto_rows": null,
1141
+ "grid_column": null,
1142
+ "grid_gap": null,
1143
+ "grid_row": null,
1144
+ "grid_template_areas": null,
1145
+ "grid_template_columns": null,
1146
+ "grid_template_rows": null,
1147
+ "height": null,
1148
+ "justify_content": null,
1149
+ "justify_items": null,
1150
+ "left": null,
1151
+ "margin": null,
1152
+ "max_height": null,
1153
+ "max_width": null,
1154
+ "min_height": null,
1155
+ "min_width": null,
1156
+ "object_fit": null,
1157
+ "object_position": null,
1158
+ "order": null,
1159
+ "overflow": null,
1160
+ "overflow_x": null,
1161
+ "overflow_y": null,
1162
+ "padding": null,
1163
+ "right": null,
1164
+ "top": null,
1165
+ "visibility": null,
1166
+ "width": null
1167
+ }
1168
+ },
1169
+ "14f46443f97c4b4fb46c5967aec1178f": {
1170
+ "model_module": "@jupyter-widgets/base",
1171
+ "model_module_version": "1.2.0",
1172
+ "model_name": "LayoutModel",
1173
+ "state": {
1174
+ "_model_module": "@jupyter-widgets/base",
1175
+ "_model_module_version": "1.2.0",
1176
+ "_model_name": "LayoutModel",
1177
+ "_view_count": null,
1178
+ "_view_module": "@jupyter-widgets/base",
1179
+ "_view_module_version": "1.2.0",
1180
+ "_view_name": "LayoutView",
1181
+ "align_content": null,
1182
+ "align_items": "center",
1183
+ "align_self": null,
1184
+ "border": null,
1185
+ "bottom": null,
1186
+ "display": "flex",
1187
+ "flex": null,
1188
+ "flex_flow": "column",
1189
+ "grid_area": null,
1190
+ "grid_auto_columns": null,
1191
+ "grid_auto_flow": null,
1192
+ "grid_auto_rows": null,
1193
+ "grid_column": null,
1194
+ "grid_gap": null,
1195
+ "grid_row": null,
1196
+ "grid_template_areas": null,
1197
+ "grid_template_columns": null,
1198
+ "grid_template_rows": null,
1199
+ "height": null,
1200
+ "justify_content": null,
1201
+ "justify_items": null,
1202
+ "left": null,
1203
+ "margin": null,
1204
+ "max_height": null,
1205
+ "max_width": null,
1206
+ "min_height": null,
1207
+ "min_width": null,
1208
+ "object_fit": null,
1209
+ "object_position": null,
1210
+ "order": null,
1211
+ "overflow": null,
1212
+ "overflow_x": null,
1213
+ "overflow_y": null,
1214
+ "padding": null,
1215
+ "right": null,
1216
+ "top": null,
1217
+ "visibility": null,
1218
+ "width": "50%"
1219
+ }
1220
+ },
1221
+ "15032b9578124624bcc42771cb5d5ad8": {
1222
+ "model_module": "@jupyter-widgets/controls",
1223
+ "model_module_version": "1.5.0",
1224
+ "model_name": "DescriptionStyleModel",
1225
+ "state": {
1226
+ "_model_module": "@jupyter-widgets/controls",
1227
+ "_model_module_version": "1.5.0",
1228
+ "_model_name": "DescriptionStyleModel",
1229
+ "_view_count": null,
1230
+ "_view_module": "@jupyter-widgets/base",
1231
+ "_view_module_version": "1.2.0",
1232
+ "_view_name": "StyleView",
1233
+ "description_width": ""
1234
+ }
1235
+ },
1236
+ "16bb0f6cf2a04a78a1f76d9fd2ddb74a": {
1237
+ "model_module": "@jupyter-widgets/base",
1238
+ "model_module_version": "1.2.0",
1239
+ "model_name": "LayoutModel",
1240
+ "state": {
1241
+ "_model_module": "@jupyter-widgets/base",
1242
+ "_model_module_version": "1.2.0",
1243
+ "_model_name": "LayoutModel",
1244
+ "_view_count": null,
1245
+ "_view_module": "@jupyter-widgets/base",
1246
+ "_view_module_version": "1.2.0",
1247
+ "_view_name": "LayoutView",
1248
+ "align_content": null,
1249
+ "align_items": null,
1250
+ "align_self": null,
1251
+ "border": null,
1252
+ "bottom": null,
1253
+ "display": null,
1254
+ "flex": null,
1255
+ "flex_flow": null,
1256
+ "grid_area": null,
1257
+ "grid_auto_columns": null,
1258
+ "grid_auto_flow": null,
1259
+ "grid_auto_rows": null,
1260
+ "grid_column": null,
1261
+ "grid_gap": null,
1262
+ "grid_row": null,
1263
+ "grid_template_areas": null,
1264
+ "grid_template_columns": null,
1265
+ "grid_template_rows": null,
1266
+ "height": null,
1267
+ "justify_content": null,
1268
+ "justify_items": null,
1269
+ "left": null,
1270
+ "margin": null,
1271
+ "max_height": null,
1272
+ "max_width": null,
1273
+ "min_height": null,
1274
+ "min_width": null,
1275
+ "object_fit": null,
1276
+ "object_position": null,
1277
+ "order": null,
1278
+ "overflow": null,
1279
+ "overflow_x": null,
1280
+ "overflow_y": null,
1281
+ "padding": null,
1282
+ "right": null,
1283
+ "top": null,
1284
+ "visibility": null,
1285
+ "width": null
1286
+ }
1287
+ },
1288
+ "18f533e671114b6385428a534364f10a": {
1289
+ "model_module": "@jupyter-widgets/controls",
1290
+ "model_module_version": "1.5.0",
1291
+ "model_name": "HTMLModel",
1292
+ "state": {
1293
+ "_dom_classes": [],
1294
+ "_model_module": "@jupyter-widgets/controls",
1295
+ "_model_module_version": "1.5.0",
1296
+ "_model_name": "HTMLModel",
1297
+ "_view_count": null,
1298
+ "_view_module": "@jupyter-widgets/controls",
1299
+ "_view_module_version": "1.5.0",
1300
+ "_view_name": "HTMLView",
1301
+ "description": "",
1302
+ "description_tooltip": null,
1303
+ "layout": "IPY_MODEL_5aaff54fecb84936a8dc9fee4393494d",
1304
+ "placeholder": "​",
1305
+ "style": "IPY_MODEL_aee4b5a2e361451dae879f37222245f3",
1306
+ "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
1307
+ }
1308
+ },
1309
+ "2b3d0284c2ce49fb86c2da7d1ea327b7": {
1310
+ "model_module": "@jupyter-widgets/controls",
1311
+ "model_module_version": "1.5.0",
1312
+ "model_name": "HTMLModel",
1313
+ "state": {
1314
+ "_dom_classes": [],
1315
+ "_model_module": "@jupyter-widgets/controls",
1316
+ "_model_module_version": "1.5.0",
1317
+ "_model_name": "HTMLModel",
1318
+ "_view_count": null,
1319
+ "_view_module": "@jupyter-widgets/controls",
1320
+ "_view_module_version": "1.5.0",
1321
+ "_view_name": "HTMLView",
1322
+ "description": "",
1323
+ "description_tooltip": null,
1324
+ "layout": "IPY_MODEL_7e7bb5e6d08842bdbd1b6ec41ae5b6e2",
1325
+ "placeholder": "​",
1326
+ "style": "IPY_MODEL_2e120a36c40f43118699e1c4615d1332",
1327
+ "value": " 340426/? [00:53&lt;00:00, 6620.19it/s]"
1328
+ }
1329
+ },
1330
+ "2e120a36c40f43118699e1c4615d1332": {
1331
+ "model_module": "@jupyter-widgets/controls",
1332
+ "model_module_version": "1.5.0",
1333
+ "model_name": "DescriptionStyleModel",
1334
+ "state": {
1335
+ "_model_module": "@jupyter-widgets/controls",
1336
+ "_model_module_version": "1.5.0",
1337
+ "_model_name": "DescriptionStyleModel",
1338
+ "_view_count": null,
1339
+ "_view_module": "@jupyter-widgets/base",
1340
+ "_view_module_version": "1.2.0",
1341
+ "_view_name": "StyleView",
1342
+ "description_width": ""
1343
+ }
1344
+ },
1345
+ "3592ba829079449583dda09d0e5aba1a": {
1346
+ "model_module": "@jupyter-widgets/controls",
1347
+ "model_module_version": "1.5.0",
1348
+ "model_name": "HTMLModel",
1349
+ "state": {
1350
+ "_dom_classes": [],
1351
+ "_model_module": "@jupyter-widgets/controls",
1352
+ "_model_module_version": "1.5.0",
1353
+ "_model_name": "HTMLModel",
1354
+ "_view_count": null,
1355
+ "_view_module": "@jupyter-widgets/controls",
1356
+ "_view_module_version": "1.5.0",
1357
+ "_view_name": "HTMLView",
1358
+ "description": "",
1359
+ "description_tooltip": null,
1360
+ "layout": "IPY_MODEL_16bb0f6cf2a04a78a1f76d9fd2ddb74a",
1361
+ "placeholder": "​",
1362
+ "style": "IPY_MODEL_9b26690a9e244eb5b8eb737eedebf9c9",
1363
+ "value": ""
1364
+ }
1365
+ },
1366
+ "3634abd523b7477082a0a8135f1fa770": {
1367
+ "model_module": "@jupyter-widgets/controls",
1368
+ "model_module_version": "1.5.0",
1369
+ "model_name": "DescriptionStyleModel",
1370
+ "state": {
1371
+ "_model_module": "@jupyter-widgets/controls",
1372
+ "_model_module_version": "1.5.0",
1373
+ "_model_name": "DescriptionStyleModel",
1374
+ "_view_count": null,
1375
+ "_view_module": "@jupyter-widgets/base",
1376
+ "_view_module_version": "1.2.0",
1377
+ "_view_name": "StyleView",
1378
+ "description_width": ""
1379
+ }
1380
+ },
1381
+ "37807c4db5834365b86bc92c36835220": {
1382
+ "model_module": "@jupyter-widgets/base",
1383
+ "model_module_version": "1.2.0",
1384
+ "model_name": "LayoutModel",
1385
+ "state": {
1386
+ "_model_module": "@jupyter-widgets/base",
1387
+ "_model_module_version": "1.2.0",
1388
+ "_model_name": "LayoutModel",
1389
+ "_view_count": null,
1390
+ "_view_module": "@jupyter-widgets/base",
1391
+ "_view_module_version": "1.2.0",
1392
+ "_view_name": "LayoutView",
1393
+ "align_content": null,
1394
+ "align_items": null,
1395
+ "align_self": null,
1396
+ "border": null,
1397
+ "bottom": null,
1398
+ "display": null,
1399
+ "flex": null,
1400
+ "flex_flow": null,
1401
+ "grid_area": null,
1402
+ "grid_auto_columns": null,
1403
+ "grid_auto_flow": null,
1404
+ "grid_auto_rows": null,
1405
+ "grid_column": null,
1406
+ "grid_gap": null,
1407
+ "grid_row": null,
1408
+ "grid_template_areas": null,
1409
+ "grid_template_columns": null,
1410
+ "grid_template_rows": null,
1411
+ "height": null,
1412
+ "justify_content": null,
1413
+ "justify_items": null,
1414
+ "left": null,
1415
+ "margin": null,
1416
+ "max_height": null,
1417
+ "max_width": null,
1418
+ "min_height": null,
1419
+ "min_width": null,
1420
+ "object_fit": null,
1421
+ "object_position": null,
1422
+ "order": null,
1423
+ "overflow": null,
1424
+ "overflow_x": null,
1425
+ "overflow_y": null,
1426
+ "padding": null,
1427
+ "right": null,
1428
+ "top": null,
1429
+ "visibility": null,
1430
+ "width": null
1431
+ }
1432
+ },
1433
+ "38e81f2aed79485498035e9c418165b4": {
1434
+ "model_module": "@jupyter-widgets/controls",
1435
+ "model_module_version": "1.5.0",
1436
+ "model_name": "ButtonStyleModel",
1437
+ "state": {
1438
+ "_model_module": "@jupyter-widgets/controls",
1439
+ "_model_module_version": "1.5.0",
1440
+ "_model_name": "ButtonStyleModel",
1441
+ "_view_count": null,
1442
+ "_view_module": "@jupyter-widgets/base",
1443
+ "_view_module_version": "1.2.0",
1444
+ "_view_name": "StyleView",
1445
+ "button_color": null,
1446
+ "font_weight": ""
1447
+ }
1448
+ },
1449
+ "3b77fe48c5e44c879998de497be7a381": {
1450
+ "model_module": "@jupyter-widgets/base",
1451
+ "model_module_version": "1.2.0",
1452
+ "model_name": "LayoutModel",
1453
+ "state": {
1454
+ "_model_module": "@jupyter-widgets/base",
1455
+ "_model_module_version": "1.2.0",
1456
+ "_model_name": "LayoutModel",
1457
+ "_view_count": null,
1458
+ "_view_module": "@jupyter-widgets/base",
1459
+ "_view_module_version": "1.2.0",
1460
+ "_view_name": "LayoutView",
1461
+ "align_content": null,
1462
+ "align_items": null,
1463
+ "align_self": null,
1464
+ "border": null,
1465
+ "bottom": null,
1466
+ "display": null,
1467
+ "flex": null,
1468
+ "flex_flow": null,
1469
+ "grid_area": null,
1470
+ "grid_auto_columns": null,
1471
+ "grid_auto_flow": null,
1472
+ "grid_auto_rows": null,
1473
+ "grid_column": null,
1474
+ "grid_gap": null,
1475
+ "grid_row": null,
1476
+ "grid_template_areas": null,
1477
+ "grid_template_columns": null,
1478
+ "grid_template_rows": null,
1479
+ "height": null,
1480
+ "justify_content": null,
1481
+ "justify_items": null,
1482
+ "left": null,
1483
+ "margin": null,
1484
+ "max_height": null,
1485
+ "max_width": null,
1486
+ "min_height": null,
1487
+ "min_width": null,
1488
+ "object_fit": null,
1489
+ "object_position": null,
1490
+ "order": null,
1491
+ "overflow": null,
1492
+ "overflow_x": null,
1493
+ "overflow_y": null,
1494
+ "padding": null,
1495
+ "right": null,
1496
+ "top": null,
1497
+ "visibility": null,
1498
+ "width": null
1499
+ }
1500
+ },
1501
+ "428d3687eb4342e59d23318099afe34f": {
1502
+ "model_module": "@jupyter-widgets/controls",
1503
+ "model_module_version": "1.5.0",
1504
+ "model_name": "VBoxModel",
1505
+ "state": {
1506
+ "_dom_classes": [],
1507
+ "_model_module": "@jupyter-widgets/controls",
1508
+ "_model_module_version": "1.5.0",
1509
+ "_model_name": "VBoxModel",
1510
+ "_view_count": null,
1511
+ "_view_module": "@jupyter-widgets/controls",
1512
+ "_view_module_version": "1.5.0",
1513
+ "_view_name": "VBoxView",
1514
+ "box_style": "",
1515
+ "children": [
1516
+ "IPY_MODEL_862cc50e401845fa98054e6bd015a074",
1517
+ "IPY_MODEL_4fe39f9b54474fb4966f71c7df0cf93e",
1518
+ "IPY_MODEL_d97e9b98cade45eaa2b9b526b1a4bb98",
1519
+ "IPY_MODEL_b7f608ef35d84fd7a736260236025429"
1520
+ ],
1521
+ "layout": "IPY_MODEL_14f46443f97c4b4fb46c5967aec1178f"
1522
+ }
1523
+ },
1524
+ "4fc0c071d3f24730ab9264ea62092d76": {
1525
+ "model_module": "@jupyter-widgets/controls",
1526
+ "model_module_version": "1.5.0",
1527
+ "model_name": "HBoxModel",
1528
+ "state": {
1529
+ "_dom_classes": [],
1530
+ "_model_module": "@jupyter-widgets/controls",
1531
+ "_model_module_version": "1.5.0",
1532
+ "_model_name": "HBoxModel",
1533
+ "_view_count": null,
1534
+ "_view_module": "@jupyter-widgets/controls",
1535
+ "_view_module_version": "1.5.0",
1536
+ "_view_name": "HBoxView",
1537
+ "box_style": "",
1538
+ "children": [
1539
+ "IPY_MODEL_3592ba829079449583dda09d0e5aba1a",
1540
+ "IPY_MODEL_e2c41567b4eb400ba534374edb3a0df9",
1541
+ "IPY_MODEL_2b3d0284c2ce49fb86c2da7d1ea327b7"
1542
+ ],
1543
+ "layout": "IPY_MODEL_c26831761f8e4fdf9477663d15d95e78"
1544
+ }
1545
+ },
1546
+ "4fe39f9b54474fb4966f71c7df0cf93e": {
1547
+ "model_module": "@jupyter-widgets/controls",
1548
+ "model_module_version": "1.5.0",
1549
+ "model_name": "LabelModel",
1550
+ "state": {
1551
+ "_dom_classes": [],
1552
+ "_model_module": "@jupyter-widgets/controls",
1553
+ "_model_module_version": "1.5.0",
1554
+ "_model_name": "LabelModel",
1555
+ "_view_count": null,
1556
+ "_view_module": "@jupyter-widgets/controls",
1557
+ "_view_module_version": "1.5.0",
1558
+ "_view_name": "LabelView",
1559
+ "description": "",
1560
+ "description_tooltip": null,
1561
+ "layout": "IPY_MODEL_055c0da20e264ec896963f9edf372a7d",
1562
+ "placeholder": "​",
1563
+ "style": "IPY_MODEL_d0f55244ec614704a571f920ffa27bfd",
1564
+ "value": "Your token has been saved in your configured git credential helpers (store)."
1565
+ }
1566
+ },
1567
+ "5aaff54fecb84936a8dc9fee4393494d": {
1568
+ "model_module": "@jupyter-widgets/base",
1569
+ "model_module_version": "1.2.0",
1570
+ "model_name": "LayoutModel",
1571
+ "state": {
1572
+ "_model_module": "@jupyter-widgets/base",
1573
+ "_model_module_version": "1.2.0",
1574
+ "_model_name": "LayoutModel",
1575
+ "_view_count": null,
1576
+ "_view_module": "@jupyter-widgets/base",
1577
+ "_view_module_version": "1.2.0",
1578
+ "_view_name": "LayoutView",
1579
+ "align_content": null,
1580
+ "align_items": null,
1581
+ "align_self": null,
1582
+ "border": null,
1583
+ "bottom": null,
1584
+ "display": null,
1585
+ "flex": null,
1586
+ "flex_flow": null,
1587
+ "grid_area": null,
1588
+ "grid_auto_columns": null,
1589
+ "grid_auto_flow": null,
1590
+ "grid_auto_rows": null,
1591
+ "grid_column": null,
1592
+ "grid_gap": null,
1593
+ "grid_row": null,
1594
+ "grid_template_areas": null,
1595
+ "grid_template_columns": null,
1596
+ "grid_template_rows": null,
1597
+ "height": null,
1598
+ "justify_content": null,
1599
+ "justify_items": null,
1600
+ "left": null,
1601
+ "margin": null,
1602
+ "max_height": null,
1603
+ "max_width": null,
1604
+ "min_height": null,
1605
+ "min_width": null,
1606
+ "object_fit": null,
1607
+ "object_position": null,
1608
+ "order": null,
1609
+ "overflow": null,
1610
+ "overflow_x": null,
1611
+ "overflow_y": null,
1612
+ "padding": null,
1613
+ "right": null,
1614
+ "top": null,
1615
+ "visibility": null,
1616
+ "width": null
1617
+ }
1618
+ },
1619
+ "6efac326ad7946e3a9ecc22b50568633": {
1620
+ "model_module": "@jupyter-widgets/controls",
1621
+ "model_module_version": "1.5.0",
1622
+ "model_name": "LabelModel",
1623
+ "state": {
1624
+ "_dom_classes": [],
1625
+ "_model_module": "@jupyter-widgets/controls",
1626
+ "_model_module_version": "1.5.0",
1627
+ "_model_name": "LabelModel",
1628
+ "_view_count": null,
1629
+ "_view_module": "@jupyter-widgets/controls",
1630
+ "_view_module_version": "1.5.0",
1631
+ "_view_name": "LabelView",
1632
+ "description": "",
1633
+ "description_tooltip": null,
1634
+ "layout": "IPY_MODEL_9d043d68e16440899a6fc9b740f5970d",
1635
+ "placeholder": "​",
1636
+ "style": "IPY_MODEL_88242ee08b884098ad743c1738b7dc97",
1637
+ "value": "Connecting..."
1638
+ }
1639
+ },
1640
+ "7e7bb5e6d08842bdbd1b6ec41ae5b6e2": {
1641
+ "model_module": "@jupyter-widgets/base",
1642
+ "model_module_version": "1.2.0",
1643
+ "model_name": "LayoutModel",
1644
+ "state": {
1645
+ "_model_module": "@jupyter-widgets/base",
1646
+ "_model_module_version": "1.2.0",
1647
+ "_model_name": "LayoutModel",
1648
+ "_view_count": null,
1649
+ "_view_module": "@jupyter-widgets/base",
1650
+ "_view_module_version": "1.2.0",
1651
+ "_view_name": "LayoutView",
1652
+ "align_content": null,
1653
+ "align_items": null,
1654
+ "align_self": null,
1655
+ "border": null,
1656
+ "bottom": null,
1657
+ "display": null,
1658
+ "flex": null,
1659
+ "flex_flow": null,
1660
+ "grid_area": null,
1661
+ "grid_auto_columns": null,
1662
+ "grid_auto_flow": null,
1663
+ "grid_auto_rows": null,
1664
+ "grid_column": null,
1665
+ "grid_gap": null,
1666
+ "grid_row": null,
1667
+ "grid_template_areas": null,
1668
+ "grid_template_columns": null,
1669
+ "grid_template_rows": null,
1670
+ "height": null,
1671
+ "justify_content": null,
1672
+ "justify_items": null,
1673
+ "left": null,
1674
+ "margin": null,
1675
+ "max_height": null,
1676
+ "max_width": null,
1677
+ "min_height": null,
1678
+ "min_width": null,
1679
+ "object_fit": null,
1680
+ "object_position": null,
1681
+ "order": null,
1682
+ "overflow": null,
1683
+ "overflow_x": null,
1684
+ "overflow_y": null,
1685
+ "padding": null,
1686
+ "right": null,
1687
+ "top": null,
1688
+ "visibility": null,
1689
+ "width": null
1690
+ }
1691
+ },
1692
+ "847b9e4085814f958a147286be4f56eb": {
1693
+ "model_module": "@jupyter-widgets/controls",
1694
+ "model_module_version": "1.5.0",
1695
+ "model_name": "DescriptionStyleModel",
1696
+ "state": {
1697
+ "_model_module": "@jupyter-widgets/controls",
1698
+ "_model_module_version": "1.5.0",
1699
+ "_model_name": "DescriptionStyleModel",
1700
+ "_view_count": null,
1701
+ "_view_module": "@jupyter-widgets/base",
1702
+ "_view_module_version": "1.2.0",
1703
+ "_view_name": "StyleView",
1704
+ "description_width": ""
1705
+ }
1706
+ },
1707
+ "862cc50e401845fa98054e6bd015a074": {
1708
+ "model_module": "@jupyter-widgets/controls",
1709
+ "model_module_version": "1.5.0",
1710
+ "model_name": "LabelModel",
1711
+ "state": {
1712
+ "_dom_classes": [],
1713
+ "_model_module": "@jupyter-widgets/controls",
1714
+ "_model_module_version": "1.5.0",
1715
+ "_model_name": "LabelModel",
1716
+ "_view_count": null,
1717
+ "_view_module": "@jupyter-widgets/controls",
1718
+ "_view_module_version": "1.5.0",
1719
+ "_view_name": "LabelView",
1720
+ "description": "",
1721
+ "description_tooltip": null,
1722
+ "layout": "IPY_MODEL_09ca4ed6420340e0b76d64949301bed4",
1723
+ "placeholder": "​",
1724
+ "style": "IPY_MODEL_b2261f5044db4af0bed02f76115d08f9",
1725
+ "value": "Token is valid (permission: write)."
1726
+ }
1727
+ },
1728
+ "88242ee08b884098ad743c1738b7dc97": {
1729
+ "model_module": "@jupyter-widgets/controls",
1730
+ "model_module_version": "1.5.0",
1731
+ "model_name": "DescriptionStyleModel",
1732
+ "state": {
1733
+ "_model_module": "@jupyter-widgets/controls",
1734
+ "_model_module_version": "1.5.0",
1735
+ "_model_name": "DescriptionStyleModel",
1736
+ "_view_count": null,
1737
+ "_view_module": "@jupyter-widgets/base",
1738
+ "_view_module_version": "1.2.0",
1739
+ "_view_name": "StyleView",
1740
+ "description_width": ""
1741
+ }
1742
+ },
1743
+ "8851c4aa5ec04a368d518110692e1d67": {
1744
+ "model_module": "@jupyter-widgets/controls",
1745
+ "model_module_version": "1.5.0",
1746
+ "model_name": "ProgressStyleModel",
1747
+ "state": {
1748
+ "_model_module": "@jupyter-widgets/controls",
1749
+ "_model_module_version": "1.5.0",
1750
+ "_model_name": "ProgressStyleModel",
1751
+ "_view_count": null,
1752
+ "_view_module": "@jupyter-widgets/base",
1753
+ "_view_module_version": "1.2.0",
1754
+ "_view_name": "StyleView",
1755
+ "bar_color": null,
1756
+ "description_width": ""
1757
+ }
1758
+ },
1759
+ "934d899f6c604fa1bf4a8108aa09b190": {
1760
+ "model_module": "@jupyter-widgets/controls",
1761
+ "model_module_version": "1.5.0",
1762
+ "model_name": "DescriptionStyleModel",
1763
+ "state": {
1764
+ "_model_module": "@jupyter-widgets/controls",
1765
+ "_model_module_version": "1.5.0",
1766
+ "_model_name": "DescriptionStyleModel",
1767
+ "_view_count": null,
1768
+ "_view_module": "@jupyter-widgets/base",
1769
+ "_view_module_version": "1.2.0",
1770
+ "_view_name": "StyleView",
1771
+ "description_width": ""
1772
+ }
1773
+ },
1774
+ "9b26690a9e244eb5b8eb737eedebf9c9": {
1775
+ "model_module": "@jupyter-widgets/controls",
1776
+ "model_module_version": "1.5.0",
1777
+ "model_name": "DescriptionStyleModel",
1778
+ "state": {
1779
+ "_model_module": "@jupyter-widgets/controls",
1780
+ "_model_module_version": "1.5.0",
1781
+ "_model_name": "DescriptionStyleModel",
1782
+ "_view_count": null,
1783
+ "_view_module": "@jupyter-widgets/base",
1784
+ "_view_module_version": "1.2.0",
1785
+ "_view_name": "StyleView",
1786
+ "description_width": ""
1787
+ }
1788
+ },
1789
+ "9d043d68e16440899a6fc9b740f5970d": {
1790
+ "model_module": "@jupyter-widgets/base",
1791
+ "model_module_version": "1.2.0",
1792
+ "model_name": "LayoutModel",
1793
+ "state": {
1794
+ "_model_module": "@jupyter-widgets/base",
1795
+ "_model_module_version": "1.2.0",
1796
+ "_model_name": "LayoutModel",
1797
+ "_view_count": null,
1798
+ "_view_module": "@jupyter-widgets/base",
1799
+ "_view_module_version": "1.2.0",
1800
+ "_view_name": "LayoutView",
1801
+ "align_content": null,
1802
+ "align_items": null,
1803
+ "align_self": null,
1804
+ "border": null,
1805
+ "bottom": null,
1806
+ "display": null,
1807
+ "flex": null,
1808
+ "flex_flow": null,
1809
+ "grid_area": null,
1810
+ "grid_auto_columns": null,
1811
+ "grid_auto_flow": null,
1812
+ "grid_auto_rows": null,
1813
+ "grid_column": null,
1814
+ "grid_gap": null,
1815
+ "grid_row": null,
1816
+ "grid_template_areas": null,
1817
+ "grid_template_columns": null,
1818
+ "grid_template_rows": null,
1819
+ "height": null,
1820
+ "justify_content": null,
1821
+ "justify_items": null,
1822
+ "left": null,
1823
+ "margin": null,
1824
+ "max_height": null,
1825
+ "max_width": null,
1826
+ "min_height": null,
1827
+ "min_width": null,
1828
+ "object_fit": null,
1829
+ "object_position": null,
1830
+ "order": null,
1831
+ "overflow": null,
1832
+ "overflow_x": null,
1833
+ "overflow_y": null,
1834
+ "padding": null,
1835
+ "right": null,
1836
+ "top": null,
1837
+ "visibility": null,
1838
+ "width": null
1839
+ }
1840
+ },
1841
+ "9d970a88c8c04bc586473251393aaec7": {
1842
+ "model_module": "@jupyter-widgets/controls",
1843
+ "model_module_version": "1.5.0",
1844
+ "model_name": "CheckboxModel",
1845
+ "state": {
1846
+ "_dom_classes": [],
1847
+ "_model_module": "@jupyter-widgets/controls",
1848
+ "_model_module_version": "1.5.0",
1849
+ "_model_name": "CheckboxModel",
1850
+ "_view_count": null,
1851
+ "_view_module": "@jupyter-widgets/controls",
1852
+ "_view_module_version": "1.5.0",
1853
+ "_view_name": "CheckboxView",
1854
+ "description": "Add token as git credential?",
1855
+ "description_tooltip": null,
1856
+ "disabled": false,
1857
+ "indent": true,
1858
+ "layout": "IPY_MODEL_023691d310634e6e83da20b9575759a2",
1859
+ "style": "IPY_MODEL_dded08e463404a53abb86ac605968626",
1860
+ "value": true
1861
+ }
1862
+ },
1863
+ "9d99d6e39a424145b017abff9021d9a0": {
1864
+ "model_module": "@jupyter-widgets/base",
1865
+ "model_module_version": "1.2.0",
1866
+ "model_name": "LayoutModel",
1867
+ "state": {
1868
+ "_model_module": "@jupyter-widgets/base",
1869
+ "_model_module_version": "1.2.0",
1870
+ "_model_name": "LayoutModel",
1871
+ "_view_count": null,
1872
+ "_view_module": "@jupyter-widgets/base",
1873
+ "_view_module_version": "1.2.0",
1874
+ "_view_name": "LayoutView",
1875
+ "align_content": null,
1876
+ "align_items": null,
1877
+ "align_self": null,
1878
+ "border": null,
1879
+ "bottom": null,
1880
+ "display": null,
1881
+ "flex": null,
1882
+ "flex_flow": null,
1883
+ "grid_area": null,
1884
+ "grid_auto_columns": null,
1885
+ "grid_auto_flow": null,
1886
+ "grid_auto_rows": null,
1887
+ "grid_column": null,
1888
+ "grid_gap": null,
1889
+ "grid_row": null,
1890
+ "grid_template_areas": null,
1891
+ "grid_template_columns": null,
1892
+ "grid_template_rows": null,
1893
+ "height": null,
1894
+ "justify_content": null,
1895
+ "justify_items": null,
1896
+ "left": null,
1897
+ "margin": null,
1898
+ "max_height": null,
1899
+ "max_width": null,
1900
+ "min_height": null,
1901
+ "min_width": null,
1902
+ "object_fit": null,
1903
+ "object_position": null,
1904
+ "order": null,
1905
+ "overflow": null,
1906
+ "overflow_x": null,
1907
+ "overflow_y": null,
1908
+ "padding": null,
1909
+ "right": null,
1910
+ "top": null,
1911
+ "visibility": null,
1912
+ "width": null
1913
+ }
1914
+ },
1915
+ "9dfa5a7ee7794a5d8396674db2c0b683": {
1916
+ "model_module": "@jupyter-widgets/base",
1917
+ "model_module_version": "1.2.0",
1918
+ "model_name": "LayoutModel",
1919
+ "state": {
1920
+ "_model_module": "@jupyter-widgets/base",
1921
+ "_model_module_version": "1.2.0",
1922
+ "_model_name": "LayoutModel",
1923
+ "_view_count": null,
1924
+ "_view_module": "@jupyter-widgets/base",
1925
+ "_view_module_version": "1.2.0",
1926
+ "_view_name": "LayoutView",
1927
+ "align_content": null,
1928
+ "align_items": null,
1929
+ "align_self": null,
1930
+ "border": null,
1931
+ "bottom": null,
1932
+ "display": null,
1933
+ "flex": null,
1934
+ "flex_flow": null,
1935
+ "grid_area": null,
1936
+ "grid_auto_columns": null,
1937
+ "grid_auto_flow": null,
1938
+ "grid_auto_rows": null,
1939
+ "grid_column": null,
1940
+ "grid_gap": null,
1941
+ "grid_row": null,
1942
+ "grid_template_areas": null,
1943
+ "grid_template_columns": null,
1944
+ "grid_template_rows": null,
1945
+ "height": null,
1946
+ "justify_content": null,
1947
+ "justify_items": null,
1948
+ "left": null,
1949
+ "margin": null,
1950
+ "max_height": null,
1951
+ "max_width": null,
1952
+ "min_height": null,
1953
+ "min_width": null,
1954
+ "object_fit": null,
1955
+ "object_position": null,
1956
+ "order": null,
1957
+ "overflow": null,
1958
+ "overflow_x": null,
1959
+ "overflow_y": null,
1960
+ "padding": null,
1961
+ "right": null,
1962
+ "top": null,
1963
+ "visibility": null,
1964
+ "width": null
1965
+ }
1966
+ },
1967
+ "9f8288bb8cae4796a067580ff7afce69": {
1968
+ "model_module": "@jupyter-widgets/controls",
1969
+ "model_module_version": "1.5.0",
1970
+ "model_name": "ButtonModel",
1971
+ "state": {
1972
+ "_dom_classes": [],
1973
+ "_model_module": "@jupyter-widgets/controls",
1974
+ "_model_module_version": "1.5.0",
1975
+ "_model_name": "ButtonModel",
1976
+ "_view_count": null,
1977
+ "_view_module": "@jupyter-widgets/controls",
1978
+ "_view_module_version": "1.5.0",
1979
+ "_view_name": "ButtonView",
1980
+ "button_style": "",
1981
+ "description": "Login",
1982
+ "disabled": false,
1983
+ "icon": "",
1984
+ "layout": "IPY_MODEL_9d99d6e39a424145b017abff9021d9a0",
1985
+ "style": "IPY_MODEL_38e81f2aed79485498035e9c418165b4",
1986
+ "tooltip": ""
1987
+ }
1988
+ },
1989
+ "a264d9f9f65846f9b606dd4e3e7eda8e": {
1990
+ "model_module": "@jupyter-widgets/base",
1991
+ "model_module_version": "1.2.0",
1992
+ "model_name": "LayoutModel",
1993
+ "state": {
1994
+ "_model_module": "@jupyter-widgets/base",
1995
+ "_model_module_version": "1.2.0",
1996
+ "_model_name": "LayoutModel",
1997
+ "_view_count": null,
1998
+ "_view_module": "@jupyter-widgets/base",
1999
+ "_view_module_version": "1.2.0",
2000
+ "_view_name": "LayoutView",
2001
+ "align_content": null,
2002
+ "align_items": null,
2003
+ "align_self": null,
2004
+ "border": null,
2005
+ "bottom": null,
2006
+ "display": null,
2007
+ "flex": null,
2008
+ "flex_flow": null,
2009
+ "grid_area": null,
2010
+ "grid_auto_columns": null,
2011
+ "grid_auto_flow": null,
2012
+ "grid_auto_rows": null,
2013
+ "grid_column": null,
2014
+ "grid_gap": null,
2015
+ "grid_row": null,
2016
+ "grid_template_areas": null,
2017
+ "grid_template_columns": null,
2018
+ "grid_template_rows": null,
2019
+ "height": null,
2020
+ "justify_content": null,
2021
+ "justify_items": null,
2022
+ "left": null,
2023
+ "margin": null,
2024
+ "max_height": null,
2025
+ "max_width": null,
2026
+ "min_height": null,
2027
+ "min_width": null,
2028
+ "object_fit": null,
2029
+ "object_position": null,
2030
+ "order": null,
2031
+ "overflow": null,
2032
+ "overflow_x": null,
2033
+ "overflow_y": null,
2034
+ "padding": null,
2035
+ "right": null,
2036
+ "top": null,
2037
+ "visibility": null,
2038
+ "width": "20px"
2039
+ }
2040
+ },
2041
+ "aee4b5a2e361451dae879f37222245f3": {
2042
+ "model_module": "@jupyter-widgets/controls",
2043
+ "model_module_version": "1.5.0",
2044
+ "model_name": "DescriptionStyleModel",
2045
+ "state": {
2046
+ "_model_module": "@jupyter-widgets/controls",
2047
+ "_model_module_version": "1.5.0",
2048
+ "_model_name": "DescriptionStyleModel",
2049
+ "_view_count": null,
2050
+ "_view_module": "@jupyter-widgets/base",
2051
+ "_view_module_version": "1.2.0",
2052
+ "_view_name": "StyleView",
2053
+ "description_width": ""
2054
+ }
2055
+ },
2056
+ "b2261f5044db4af0bed02f76115d08f9": {
2057
+ "model_module": "@jupyter-widgets/controls",
2058
+ "model_module_version": "1.5.0",
2059
+ "model_name": "DescriptionStyleModel",
2060
+ "state": {
2061
+ "_model_module": "@jupyter-widgets/controls",
2062
+ "_model_module_version": "1.5.0",
2063
+ "_model_name": "DescriptionStyleModel",
2064
+ "_view_count": null,
2065
+ "_view_module": "@jupyter-widgets/base",
2066
+ "_view_module_version": "1.2.0",
2067
+ "_view_name": "StyleView",
2068
+ "description_width": ""
2069
+ }
2070
+ },
2071
+ "b7f608ef35d84fd7a736260236025429": {
2072
+ "model_module": "@jupyter-widgets/controls",
2073
+ "model_module_version": "1.5.0",
2074
+ "model_name": "LabelModel",
2075
+ "state": {
2076
+ "_dom_classes": [],
2077
+ "_model_module": "@jupyter-widgets/controls",
2078
+ "_model_module_version": "1.5.0",
2079
+ "_model_name": "LabelModel",
2080
+ "_view_count": null,
2081
+ "_view_module": "@jupyter-widgets/controls",
2082
+ "_view_module_version": "1.5.0",
2083
+ "_view_name": "LabelView",
2084
+ "description": "",
2085
+ "description_tooltip": null,
2086
+ "layout": "IPY_MODEL_eca43f65d6c1407bbb16cd26f60d5b7f",
2087
+ "placeholder": "​",
2088
+ "style": "IPY_MODEL_934d899f6c604fa1bf4a8108aa09b190",
2089
+ "value": "Login successful"
2090
+ }
2091
+ },
2092
+ "c26831761f8e4fdf9477663d15d95e78": {
2093
+ "model_module": "@jupyter-widgets/base",
2094
+ "model_module_version": "1.2.0",
2095
+ "model_name": "LayoutModel",
2096
+ "state": {
2097
+ "_model_module": "@jupyter-widgets/base",
2098
+ "_model_module_version": "1.2.0",
2099
+ "_model_name": "LayoutModel",
2100
+ "_view_count": null,
2101
+ "_view_module": "@jupyter-widgets/base",
2102
+ "_view_module_version": "1.2.0",
2103
+ "_view_name": "LayoutView",
2104
+ "align_content": null,
2105
+ "align_items": null,
2106
+ "align_self": null,
2107
+ "border": null,
2108
+ "bottom": null,
2109
+ "display": null,
2110
+ "flex": null,
2111
+ "flex_flow": null,
2112
+ "grid_area": null,
2113
+ "grid_auto_columns": null,
2114
+ "grid_auto_flow": null,
2115
+ "grid_auto_rows": null,
2116
+ "grid_column": null,
2117
+ "grid_gap": null,
2118
+ "grid_row": null,
2119
+ "grid_template_areas": null,
2120
+ "grid_template_columns": null,
2121
+ "grid_template_rows": null,
2122
+ "height": null,
2123
+ "justify_content": null,
2124
+ "justify_items": null,
2125
+ "left": null,
2126
+ "margin": null,
2127
+ "max_height": null,
2128
+ "max_width": null,
2129
+ "min_height": null,
2130
+ "min_width": null,
2131
+ "object_fit": null,
2132
+ "object_position": null,
2133
+ "order": null,
2134
+ "overflow": null,
2135
+ "overflow_x": null,
2136
+ "overflow_y": null,
2137
+ "padding": null,
2138
+ "right": null,
2139
+ "top": null,
2140
+ "visibility": null,
2141
+ "width": null
2142
+ }
2143
+ },
2144
+ "d0f55244ec614704a571f920ffa27bfd": {
2145
+ "model_module": "@jupyter-widgets/controls",
2146
+ "model_module_version": "1.5.0",
2147
+ "model_name": "DescriptionStyleModel",
2148
+ "state": {
2149
+ "_model_module": "@jupyter-widgets/controls",
2150
+ "_model_module_version": "1.5.0",
2151
+ "_model_name": "DescriptionStyleModel",
2152
+ "_view_count": null,
2153
+ "_view_module": "@jupyter-widgets/base",
2154
+ "_view_module_version": "1.2.0",
2155
+ "_view_name": "StyleView",
2156
+ "description_width": ""
2157
+ }
2158
+ },
2159
+ "d97e9b98cade45eaa2b9b526b1a4bb98": {
2160
+ "model_module": "@jupyter-widgets/controls",
2161
+ "model_module_version": "1.5.0",
2162
+ "model_name": "LabelModel",
2163
+ "state": {
2164
+ "_dom_classes": [],
2165
+ "_model_module": "@jupyter-widgets/controls",
2166
+ "_model_module_version": "1.5.0",
2167
+ "_model_name": "LabelModel",
2168
+ "_view_count": null,
2169
+ "_view_module": "@jupyter-widgets/controls",
2170
+ "_view_module_version": "1.5.0",
2171
+ "_view_name": "LabelView",
2172
+ "description": "",
2173
+ "description_tooltip": null,
2174
+ "layout": "IPY_MODEL_3b77fe48c5e44c879998de497be7a381",
2175
+ "placeholder": "​",
2176
+ "style": "IPY_MODEL_15032b9578124624bcc42771cb5d5ad8",
2177
+ "value": "Your token has been saved to /root/.cache/huggingface/token"
2178
+ }
2179
+ },
2180
+ "dded08e463404a53abb86ac605968626": {
2181
+ "model_module": "@jupyter-widgets/controls",
2182
+ "model_module_version": "1.5.0",
2183
+ "model_name": "DescriptionStyleModel",
2184
+ "state": {
2185
+ "_model_module": "@jupyter-widgets/controls",
2186
+ "_model_module_version": "1.5.0",
2187
+ "_model_name": "DescriptionStyleModel",
2188
+ "_view_count": null,
2189
+ "_view_module": "@jupyter-widgets/base",
2190
+ "_view_module_version": "1.2.0",
2191
+ "_view_name": "StyleView",
2192
+ "description_width": ""
2193
+ }
2194
+ },
2195
+ "e2c41567b4eb400ba534374edb3a0df9": {
2196
+ "model_module": "@jupyter-widgets/controls",
2197
+ "model_module_version": "1.5.0",
2198
+ "model_name": "FloatProgressModel",
2199
+ "state": {
2200
+ "_dom_classes": [],
2201
+ "_model_module": "@jupyter-widgets/controls",
2202
+ "_model_module_version": "1.5.0",
2203
+ "_model_name": "FloatProgressModel",
2204
+ "_view_count": null,
2205
+ "_view_module": "@jupyter-widgets/controls",
2206
+ "_view_module_version": "1.5.0",
2207
+ "_view_name": "ProgressView",
2208
+ "bar_style": "success",
2209
+ "description": "",
2210
+ "description_tooltip": null,
2211
+ "layout": "IPY_MODEL_a264d9f9f65846f9b606dd4e3e7eda8e",
2212
+ "max": 1,
2213
+ "min": 0,
2214
+ "orientation": "horizontal",
2215
+ "style": "IPY_MODEL_8851c4aa5ec04a368d518110692e1d67",
2216
+ "value": 1
2217
+ }
2218
+ },
2219
+ "e4c0e23001254742a94898203a222c6c": {
2220
+ "model_module": "@jupyter-widgets/controls",
2221
+ "model_module_version": "1.5.0",
2222
+ "model_name": "PasswordModel",
2223
+ "state": {
2224
+ "_dom_classes": [],
2225
+ "_model_module": "@jupyter-widgets/controls",
2226
+ "_model_module_version": "1.5.0",
2227
+ "_model_name": "PasswordModel",
2228
+ "_view_count": null,
2229
+ "_view_module": "@jupyter-widgets/controls",
2230
+ "_view_module_version": "1.5.0",
2231
+ "_view_name": "PasswordView",
2232
+ "continuous_update": true,
2233
+ "description": "Token:",
2234
+ "description_tooltip": null,
2235
+ "disabled": false,
2236
+ "layout": "IPY_MODEL_9dfa5a7ee7794a5d8396674db2c0b683",
2237
+ "placeholder": "​",
2238
+ "style": "IPY_MODEL_3634abd523b7477082a0a8135f1fa770",
2239
+ "value": ""
2240
+ }
2241
+ },
2242
+ "eca43f65d6c1407bbb16cd26f60d5b7f": {
2243
+ "model_module": "@jupyter-widgets/base",
2244
+ "model_module_version": "1.2.0",
2245
+ "model_name": "LayoutModel",
2246
+ "state": {
2247
+ "_model_module": "@jupyter-widgets/base",
2248
+ "_model_module_version": "1.2.0",
2249
+ "_model_name": "LayoutModel",
2250
+ "_view_count": null,
2251
+ "_view_module": "@jupyter-widgets/base",
2252
+ "_view_module_version": "1.2.0",
2253
+ "_view_name": "LayoutView",
2254
+ "align_content": null,
2255
+ "align_items": null,
2256
+ "align_self": null,
2257
+ "border": null,
2258
+ "bottom": null,
2259
+ "display": null,
2260
+ "flex": null,
2261
+ "flex_flow": null,
2262
+ "grid_area": null,
2263
+ "grid_auto_columns": null,
2264
+ "grid_auto_flow": null,
2265
+ "grid_auto_rows": null,
2266
+ "grid_column": null,
2267
+ "grid_gap": null,
2268
+ "grid_row": null,
2269
+ "grid_template_areas": null,
2270
+ "grid_template_columns": null,
2271
+ "grid_template_rows": null,
2272
+ "height": null,
2273
+ "justify_content": null,
2274
+ "justify_items": null,
2275
+ "left": null,
2276
+ "margin": null,
2277
+ "max_height": null,
2278
+ "max_width": null,
2279
+ "min_height": null,
2280
+ "min_width": null,
2281
+ "object_fit": null,
2282
+ "object_position": null,
2283
+ "order": null,
2284
+ "overflow": null,
2285
+ "overflow_x": null,
2286
+ "overflow_y": null,
2287
+ "padding": null,
2288
+ "right": null,
2289
+ "top": null,
2290
+ "visibility": null,
2291
+ "width": null
2292
+ }
2293
+ }
2294
+ }
2295
+ }
2296
+ },
2297
+ "nbformat": 4,
2298
+ "nbformat_minor": 0
2299
+ }