{ "cells": [ { "cell_type": "markdown", "id": "75b58048-7d14-4fc6-8085-1fc08c81b4a6", "metadata": { "id": "75b58048-7d14-4fc6-8085-1fc08c81b4a6" }, "source": [ "# Fine-Tune Whisper With 🤗 Transformers and Streaming Mode" ] }, { "cell_type": "markdown", "id": "fbfa8ad5-4cdc-4512-9058-836cbbf65e1a", "metadata": { "id": "fbfa8ad5-4cdc-4512-9058-836cbbf65e1a" }, "source": [ "In this Colab, we present a step-by-step guide on fine-tuning Whisper with Hugging Face 🤗 Transformers on 400 hours of speech data! Using streaming mode, we'll show how you can train a speech recongition model on any dataset, irrespective of size. With streaming mode, storage requirements are no longer a consideration: you can train a model on whatever dataset you want, even if it's download size exceeds your devices disk space. How can this be possible? It simply seems too good to be true! Well, rest assured it's not 😉 Carry on reading to find out more." ] }, { "cell_type": "markdown", "id": "afe0d503-ae4e-4aa7-9af4-dbcba52db41e", "metadata": { "id": "afe0d503-ae4e-4aa7-9af4-dbcba52db41e" }, "source": [ "## Introduction" ] }, { "cell_type": "markdown", "id": "9ae91ed4-9c3e-4ade-938e-f4c2dcfbfdc0", "metadata": { "id": "9ae91ed4-9c3e-4ade-938e-f4c2dcfbfdc0" }, "source": [ "Speech recognition datasets are large. A typical speech dataset consists of approximately 100 hours of audio-transcription data, requiring upwards of 130GB of storage space for download and preparation. For most ASR researchers, this is already at the upper limit of what is feasible for disk space. So what happens when we want to train on a larger dataset? The full [LibriSpeech](https://huggingface.co/datasets/librispeech_asr) dataset consists of 960 hours of audio data. Kensho's [SPGISpeech](https://huggingface.co/datasets/kensho/spgispeech) contains 5,000 hours of audio data. ML Commons [People's Speech](https://huggingface.co/datasets/MLCommons/peoples_speech) contains **30,000+** hours of audio data! 
Do we need to bite the bullet and buy additional storage? Or is there a way we can train on all of these datasets with no disk drive requirements?\n", "\n", "When training machine learning systems, we rarely use the entire dataset at once. We typically _batch_ our data into smaller subsets, and pass these incrementally through our training pipeline. This is because we train our system on an accelerator device, such as a GPU or TPU, which has a memory limit typically around 16GB. We have to fit our model, optimiser and training data all on the same accelerator device, so we usually have to divide the dataset up into smaller batches and move them from the CPU to the GPU when required.\n", "\n", "Consequently, we don't require the entire dataset to be downloaded at once; we simply need the batch of data that we pass to our model at any one time. We can leverage this principle of partial dataset loading when preparing our dataset: rather than downloading the entire dataset at the start, we can load each piece of data as and when we need it. For each batch, we load the relevant data from a remote server and pass it through the training pipeline. For the next batch, we load the next items and again pass them through the training pipeline. At no point do we have to save data to our disk drive; we simply load it in memory and use it in our pipeline. In doing so, we only ever need as much memory as each individual batch requires.\n", "\n", "This is analogous to downloading a TV show versus streaming it 📺 When we download a TV show, we download the entire video offline and save it to our disk. Compare this to when we stream a TV show. Here, we don't save any part of the video to disk, but iterate over the video file and load each part into memory in real-time as required. It's this same principle that we can apply to our ML training pipeline! 
We want to iterate over the dataset and load each sample of data as required.\n", "\n", "While the principle of partial dataset loading sounds ideal, it also seems **pretty** difficult to do. Luckily for us, 🤗 Datasets allows us to do this with minimal code changes! We'll make use of the principle of [_streaming_](https://huggingface.co/docs/datasets/stream), depicted graphically in Figure 1. Streaming does exactly this: the data is loaded progressively as we iterate over the dataset, meaning it is only loaded as and when we need it. If you've used 🤗 Transformers and Datasets before, the content of this notebook will be very familiar, with some small extensions to support streaming mode." ] }, { "cell_type": "markdown", "id": "1c87f76e-47be-4a5d-bc52-7b1c2e9d4f5a", "metadata": { "id": "1c87f76e-47be-4a5d-bc52-7b1c2e9d4f5a" }, "source": [ "" ] }, { "cell_type": "markdown", "id": "d44b85a2-3465-4cd5-bcca-8ddb302ab71b", "metadata": { "id": "d44b85a2-3465-4cd5-bcca-8ddb302ab71b", "tags": [] }, "source": [ "## Prepare Environment" ] }, { "cell_type": "code", "execution_count": 1, "id": "a0e8a3b5-2c0b-4ee6-98cc-21a571266a5d", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "a0e8a3b5-2c0b-4ee6-98cc-21a571266a5d", "outputId": "09b1863a-eb05-4610-b763-2a7b69cd77bf" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Reading package lists... Done\n", "Building dependency tree \n", "Reading state information... 
Done\n", "git-lfs is already the newest version (2.9.2-1).\n", "0 upgraded, 0 newly installed, 0 to remove and 154 not upgraded.\n", "Error: Failed to call git rev-parse --git-dir: exit status 128 \n", "Git LFS initialized.\n" ] } ], "source": [ "!sudo apt-get install git-lfs\n", "!sudo git lfs install\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "QJBETye7FkvV", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "QJBETye7FkvV", "outputId": "e055cc0a-0a62-4a14-f360-2a64782a5a35" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Defaulting to user installation because normal site-packages is not writeable\n", "Requirement already satisfied: pip in ./.local/lib/python3.8/site-packages (22.3.1)\n", "Defaulting to user installation because normal site-packages is not writeable\n", "Requirement already satisfied: numpy<1.23.0 in ./.local/lib/python3.8/site-packages (1.22.4)\n", "Defaulting to user installation because normal site-packages is not writeable\n", "Collecting torch\n", " Using cached torch-1.13.0-cp38-cp38-manylinux1_x86_64.whl (890.2 MB)\n", "Collecting torchaudio\n", " Using cached torchaudio-0.13.0-cp38-cp38-manylinux1_x86_64.whl (4.2 MB)\n", "Collecting torchvision\n", " Downloading torchvision-0.14.0-cp38-cp38-manylinux1_x86_64.whl (24.3 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m24.3/24.3 MB\u001b[0m \u001b[31m103.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", "\u001b[?25hCollecting nvidia-cuda-runtime-cu11==11.7.99\n", " Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)\n", "Collecting typing-extensions\n", " Using cached typing_extensions-4.4.0-py3-none-any.whl (26 kB)\n", "Collecting nvidia-cuda-nvrtc-cu11==11.7.99\n", " Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)\n", "Collecting nvidia-cudnn-cu11==8.5.0.96\n", " Using cached 
nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)\n", "Collecting nvidia-cublas-cu11==11.10.3.66\n", " Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)\n", "Collecting setuptools\n", " Using cached setuptools-65.6.3-py3-none-any.whl (1.2 MB)\n", "Collecting wheel\n", " Using cached wheel-0.38.4-py3-none-any.whl (36 kB)\n", "Collecting numpy\n", " Downloading numpy-1.24.0rc2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m17.3/17.3 MB\u001b[0m \u001b[31m100.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", "\u001b[?25hCollecting pillow!=8.3.*,>=5.3.0\n", " Downloading Pillow-9.3.0-cp38-cp38-manylinux_2_28_x86_64.whl (3.3 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.3/3.3 MB\u001b[0m \u001b[31m145.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting requests\n", " Using cached requests-2.28.1-py3-none-any.whl (62 kB)\n", "Collecting certifi>=2017.4.17\n", " Downloading certifi-2022.12.7-py3-none-any.whl (155 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m155.3/155.3 kB\u001b[0m \u001b[31m44.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting charset-normalizer<3,>=2\n", " Using cached charset_normalizer-2.1.1-py3-none-any.whl (39 kB)\n", "Collecting urllib3<1.27,>=1.21.1\n", " Downloading urllib3-1.26.13-py2.py3-none-any.whl (140 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m140.6/140.6 kB\u001b[0m \u001b[31m38.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting idna<4,>=2.5\n", " Downloading idna-3.4-py3-none-any.whl (61 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m61.5/61.5 kB\u001b[0m \u001b[31m16.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", 
"\u001b[?25hInstalling collected packages: wheel, urllib3, typing-extensions, setuptools, pillow, nvidia-cuda-nvrtc-cu11, numpy, idna, charset-normalizer, certifi, requests, nvidia-cuda-runtime-cu11, nvidia-cublas-cu11, nvidia-cudnn-cu11, torch, torchvision, torchaudio\n", " Attempting uninstall: typing-extensions\n", " Found existing installation: typing_extensions 4.4.0\n", " Uninstalling typing_extensions-4.4.0:\n", " Successfully uninstalled typing_extensions-4.4.0\n", " Attempting uninstall: nvidia-cuda-nvrtc-cu11\n", " Found existing installation: nvidia-cuda-nvrtc-cu11 11.7.99\n", " Uninstalling nvidia-cuda-nvrtc-cu11-11.7.99:\n", " Successfully uninstalled nvidia-cuda-nvrtc-cu11-11.7.99\n", " Attempting uninstall: numpy\n", " Found existing installation: numpy 1.22.4\n", " Uninstalling numpy-1.22.4:\n", " Successfully uninstalled numpy-1.22.4\n", " Attempting uninstall: charset-normalizer\n", " Found existing installation: charset-normalizer 2.1.1\n", " Uninstalling charset-normalizer-2.1.1:\n", " Successfully uninstalled charset-normalizer-2.1.1\n", " Attempting uninstall: requests\n", " Found existing installation: requests 2.28.1\n", " Uninstalling requests-2.28.1:\n", " Successfully uninstalled requests-2.28.1\n", " Attempting uninstall: nvidia-cuda-runtime-cu11\n", " Found existing installation: nvidia-cuda-runtime-cu11 11.7.99\n", " Uninstalling nvidia-cuda-runtime-cu11-11.7.99:\n", " Successfully uninstalled nvidia-cuda-runtime-cu11-11.7.99\n", " Attempting uninstall: nvidia-cublas-cu11\n", " Found existing installation: nvidia-cublas-cu11 11.10.3.66\n", " Uninstalling nvidia-cublas-cu11-11.10.3.66:\n", " Successfully uninstalled nvidia-cublas-cu11-11.10.3.66\n", " Attempting uninstall: nvidia-cudnn-cu11\n", " Found existing installation: nvidia-cudnn-cu11 8.5.0.96\n", " Uninstalling nvidia-cudnn-cu11-8.5.0.96:\n", " Successfully uninstalled nvidia-cudnn-cu11-8.5.0.96\n", " Attempting uninstall: torch\n", " Found existing installation: torch 
1.13.0\n", " Uninstalling torch-1.13.0:\n", " Successfully uninstalled torch-1.13.0\n", " Attempting uninstall: torchaudio\n", " Found existing installation: torchaudio 0.13.0\n", " Uninstalling torchaudio-0.13.0:\n", " Successfully uninstalled torchaudio-0.13.0\n", "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", "launchpadlib 1.10.13 requires testresources, which is not installed.\n", "pandas-profiling 3.4.0 requires numpy<1.24,>=1.16.0, but you have numpy 1.24.0rc2 which is incompatible.\n", "numba 0.56.4 requires numpy<1.24,>=1.18, but you have numpy 1.24.0rc2 which is incompatible.\u001b[0m\u001b[31m\n", "\u001b[0mSuccessfully installed certifi-2022.12.7 charset-normalizer-2.1.1 idna-3.4 numpy-1.24.0rc2 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 pillow-9.3.0 requests-2.28.1 setuptools-65.6.3 torch-1.13.0 torchaudio-0.13.0 torchvision-0.14.0 typing-extensions-4.4.0 urllib3-1.26.13 wheel-0.38.4\n" ] } ], "source": [ "!pip3 install --upgrade pip\n", "!pip3 install \"numpy<1.23.0\"\n", "\n", "!pip3 install --pre torch torchaudio torchvision --force-reinstall\n", "\n", "!pip3 install bitsandbytes\n", "\n", "\n", "#!pip3 install --pre torch torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cu116\n", "#!pip3 install numpy --pre torch[dynamo] torchvision torchaudio --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu116\n", "#!pip3 install numpy --pre torch[dynamo] torchaudio --force-reinstall --extra-index-url https://download.pytorch.org/whl/nightly/cu116\n", "\n", "#!pip3 install numpy --pre torch[dynamo] torchaudio --upgrade --extra-index-url https://download.pytorch.org/whl/nightly/cu117\n", "\n" ] }, { "cell_type": "markdown", "id": "a47bbac5-b44b-41ac-a948-1b57cec2b6f1", "metadata": { "id": 
"a47bbac5-b44b-41ac-a948-1b57cec2b6f1" }, "source": [ "First of all, let's try to secure a decent GPU for our Colab! Unfortunately, it's becoming much harder to get access to a good GPU with the free version of Google Colab. However, with Google Colab Pro / Pro+ one should have no issues in being allocated a V100 or P100 GPU.\n", "\n", "To get a GPU, click _Runtime_ -> _Change runtime type_, then change _Hardware accelerator_ from _None_ to _GPU_." ] }, { "cell_type": "markdown", "id": "47686bd5-cbb1-4352-81cf-0fcf7bbd45c3", "metadata": { "id": "47686bd5-cbb1-4352-81cf-0fcf7bbd45c3" }, "source": [ "We can verify that we've been assigned a GPU and view its specifications:" ] }, { "cell_type": "code", "execution_count": 3, "id": "d74b38c5-a1fb-4214-b4f4-b5bf0869f169", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "d74b38c5-a1fb-4214-b4f4-b5bf0869f169", "outputId": "18ca6853-0836-4cba-f06a-02fe1cecd715", "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Thu Dec 8 18:45:37 2022 \n", "+-----------------------------------------------------------------------------+\n", "| NVIDIA-SMI 515.65.01 Driver Version: 515.65.01 CUDA Version: 11.7 |\n", "|-------------------------------+----------------------+----------------------+\n", "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", "| | | MIG M. |\n", "|===============================+======================+======================|\n", "| 0 NVIDIA A100-SXM... 
On | 00000000:06:00.0 Off | 0 |\n", "| N/A 31C P0 47W / 400W | 0MiB / 40960MiB | 0% Default |\n", "| | | Disabled |\n", "+-------------------------------+----------------------+----------------------+\n", " \n", "+-----------------------------------------------------------------------------+\n", "| Processes: |\n", "| GPU GI CI PID Type Process name GPU Memory |\n", "| ID ID Usage |\n", "|=============================================================================|\n", "| No running processes found |\n", "+-----------------------------------------------------------------------------+\n" ] } ], "source": [ "gpu_info = !nvidia-smi\n", "gpu_info = '\\n'.join(gpu_info)\n", "if gpu_info.find('failed') >= 0:\n", " print('Not connected to a GPU')\n", "else:\n", " print(gpu_info)" ] }, { "cell_type": "markdown", "id": "be67f92a-2f3b-4941-a1c0-5ed2de6e0a6a", "metadata": { "id": "be67f92a-2f3b-4941-a1c0-5ed2de6e0a6a", "tags": [] }, "source": [ "Next, we need to update the Unix package `ffmpeg` to version 4:" ] }, { "cell_type": "code", "execution_count": 4, "id": "15493a84-8b7c-4b35-9aeb-2b0a57a4e937", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "15493a84-8b7c-4b35-9aeb-2b0a57a4e937", "outputId": "9463f72d-a888-4980-abc4-2b6a2ece61b2", "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Get:1 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 InRelease [1484 B]\n", "Hit:2 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64 InRelease\n", "Hit:3 https://download.docker.com/linux/ubuntu focal InRelease \n", "Hit:4 http://archive.lambdalabs.com/ubuntu focal InRelease \n", "Hit:5 https://packages.cloud.google.com/apt cloud-sdk InRelease \n", "Hit:6 http://security.ubuntu.com/ubuntu focal-security InRelease \n", "Ign:7 http://ppa.launchpad.net/jonathonf/ffmpeg-4/ubuntu focal InRelease \n", "Hit:8 http://archive.ubuntu.com/ubuntu focal InRelease \n", "Hit:9 
https://packages.microsoft.com/repos/azure-cli focal InRelease \n", "Hit:10 http://archive.ubuntu.com/ubuntu focal-updates InRelease \n", "Hit:11 https://pkg.cloudflare.com/cloudflared focal InRelease \n", "Hit:12 http://archive.ubuntu.com/ubuntu focal-backports InRelease \n", "Err:13 http://ppa.launchpad.net/jonathonf/ffmpeg-4/ubuntu focal Release \n", " 404 Not Found [IP: 185.125.190.52 80]\n", "Hit:14 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu focal InRelease \n", "Reading package lists... Done\n", "E: The repository 'http://ppa.launchpad.net/jonathonf/ffmpeg-4/ubuntu focal Release' does not have a Release file.\n", "N: Updating from such a repository can't be done securely, and is therefore disabled by default.\n", "N: See apt-secure(8) manpage for repository creation and user configuration details.\n", "Get:1 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 InRelease [1484 B]\n", "Hit:2 https://download.docker.com/linux/ubuntu focal InRelease \u001b[0m\u001b[33m\n", "Hit:3 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/amd64 InRelease\n", "Hit:4 https://packages.cloud.google.com/apt cloud-sdk InRelease \u001b[0m\u001b[33m\u001b[33m\n", "Hit:5 http://archive.lambdalabs.com/ubuntu focal InRelease \u001b[0m\n", "Hit:6 http://archive.ubuntu.com/ubuntu focal InRelease \u001b[0m\u001b[33m\n", "Hit:7 http://security.ubuntu.com/ubuntu focal-security InRelease \u001b[0m\n", "Ign:8 http://ppa.launchpad.net/jonathonf/ffmpeg-4/ubuntu focal InRelease \u001b[0m\n", "Hit:9 https://packages.microsoft.com/repos/azure-cli focal InRelease \u001b[0m\u001b[33m\n", "Hit:10 http://archive.ubuntu.com/ubuntu focal-updates InRelease \u001b[0m\n", "Hit:11 https://pkg.cloudflare.com/cloudflared focal InRelease \u001b[0m\u001b[33m\n", "Hit:12 http://archive.ubuntu.com/ubuntu focal-backports InRelease \u001b[0m \u001b[0m\u001b[33m\n", "Hit:13 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu focal InRelease\n", "Err:14 
http://ppa.launchpad.net/jonathonf/ffmpeg-4/ubuntu focal Release\n", " 404 Not Found [IP: 185.125.190.52 80]\n", "Reading package lists... Done\u001b[33m\n", "\u001b[1;31mE: \u001b[0mThe repository 'http://ppa.launchpad.net/jonathonf/ffmpeg-4/ubuntu focal Release' does not have a Release file.\u001b[0m\n", "\u001b[33mN: \u001b[0mUpdating from such a repository can't be done securely, and is therefore disabled by default.\u001b[0m\n", "\u001b[33mN: \u001b[0mSee apt-secure(8) manpage for repository creation and user configuration details.\u001b[0m\n", "Reading package lists... Done\n", "Building dependency tree \n", "Reading state information... Done\n", "ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).\n", "0 upgraded, 0 newly installed, 0 to remove and 154 not upgraded.\n" ] } ], "source": [ "!sudo add-apt-repository -y ppa:jonathonf/ffmpeg-4\n", "!sudo apt update\n", "!sudo apt install -y ffmpeg" ] }, { "cell_type": "markdown", "id": "ab471347-a547-4d14-9d11-f151dc9547a7", "metadata": { "id": "ab471347-a547-4d14-9d11-f151dc9547a7" }, "source": [ "We'll employ several popular Python packages to fine-tune the Whisper model.\n", "We'll use `datasets` to download and prepare our training data and \n", "`transformers` to load and train our Whisper model. We'll also require\n", "the `soundfile` package to pre-process audio files, and `evaluate` and `jiwer` to\n", "assess the performance of our model. Finally, we'll\n", "use `gradio` to build a flashy demo of our fine-tuned model." 
] }, { "cell_type": "code", "execution_count": 5, "id": "4e106846-3620-46aa-989d-5e35e27c8057", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "4e106846-3620-46aa-989d-5e35e27c8057", "outputId": "6bcef5d6-c7de-45de-abd4-ab5883dfaab4" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Defaulting to user installation because normal site-packages is not writeable\n", "Collecting git+https://github.com/huggingface/datasets\n", " Cloning https://github.com/huggingface/datasets to /tmp/pip-req-build-_aqg2yxr\n", " Running command git clone --filter=blob:none --quiet https://github.com/huggingface/datasets /tmp/pip-req-build-_aqg2yxr\n", " Resolved https://github.com/huggingface/datasets to commit 45508f7d8858579c62d93779873ef5eb6b05bc74\n", " Installing build dependencies ... \u001b[?25ldone\n", "\u001b[?25h Getting requirements to build wheel ... \u001b[?25ldone\n", "\u001b[?25h Preparing metadata (pyproject.toml) ... \u001b[?25ldone\n", "\u001b[?25hRequirement already satisfied: dill<0.3.7 in /usr/local/lib/python3.8/dist-packages (from datasets==2.7.1.dev0) (0.3.6)\n", "Requirement already satisfied: pyarrow>=6.0.0 in /usr/local/lib/python3.8/dist-packages (from datasets==2.7.1.dev0) (10.0.1)\n", "Requirement already satisfied: pyyaml>=5.1 in /usr/lib/python3/dist-packages (from datasets==2.7.1.dev0) (5.3.1)\n", "Requirement already satisfied: pandas in ./.local/lib/python3.8/site-packages (from datasets==2.7.1.dev0) (1.5.1)\n", "Requirement already satisfied: huggingface-hub<1.0.0,>=0.2.0 in /usr/local/lib/python3.8/dist-packages (from datasets==2.7.1.dev0) (0.11.1)\n", "Requirement already satisfied: aiohttp in /usr/local/lib/python3.8/dist-packages (from datasets==2.7.1.dev0) (3.8.3)\n", "Requirement already satisfied: numpy>=1.17 in ./.local/lib/python3.8/site-packages (from datasets==2.7.1.dev0) (1.24.0rc2)\n", "Requirement already satisfied: multiprocess in /usr/local/lib/python3.8/dist-packages (from 
datasets==2.7.1.dev0) (0.70.14)\n", "Requirement already satisfied: fsspec[http]>=2021.11.1 in /usr/local/lib/python3.8/dist-packages (from datasets==2.7.1.dev0) (2022.11.0)\n", "Requirement already satisfied: packaging in ./.local/lib/python3.8/site-packages (from datasets==2.7.1.dev0) (21.3)\n", "Requirement already satisfied: responses<0.19 in /usr/local/lib/python3.8/dist-packages (from datasets==2.7.1.dev0) (0.18.0)\n", "Requirement already satisfied: requests>=2.19.0 in ./.local/lib/python3.8/site-packages (from datasets==2.7.1.dev0) (2.28.1)\n", "Requirement already satisfied: tqdm>=4.62.1 in ./.local/lib/python3.8/site-packages (from datasets==2.7.1.dev0) (4.64.1)\n", "Requirement already satisfied: xxhash in /usr/local/lib/python3.8/dist-packages (from datasets==2.7.1.dev0) (3.1.0)\n", "Requirement already satisfied: attrs>=17.3.0 in /usr/lib/python3/dist-packages (from aiohttp->datasets==2.7.1.dev0) (19.3.0)\n", "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.8/dist-packages (from aiohttp->datasets==2.7.1.dev0) (1.3.3)\n", "Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.8/dist-packages (from aiohttp->datasets==2.7.1.dev0) (4.0.2)\n", "Requirement already satisfied: charset-normalizer<3.0,>=2.0 in ./.local/lib/python3.8/site-packages (from aiohttp->datasets==2.7.1.dev0) (2.1.1)\n", "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.8/dist-packages (from aiohttp->datasets==2.7.1.dev0) (1.8.2)\n", "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.8/dist-packages (from aiohttp->datasets==2.7.1.dev0) (6.0.3)\n", "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.8/dist-packages (from aiohttp->datasets==2.7.1.dev0) (1.3.1)\n", "Requirement already satisfied: filelock in /usr/lib/python3/dist-packages (from huggingface-hub<1.0.0,>=0.2.0->datasets==2.7.1.dev0) (3.0.12)\n", "Requirement already satisfied: 
typing-extensions>=3.7.4.3 in ./.local/lib/python3.8/site-packages (from huggingface-hub<1.0.0,>=0.2.0->datasets==2.7.1.dev0) (4.4.0)\n", "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/lib/python3/dist-packages (from packaging->datasets==2.7.1.dev0) (2.4.6)\n", "Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./.local/lib/python3.8/site-packages (from requests>=2.19.0->datasets==2.7.1.dev0) (1.26.13)\n", "Requirement already satisfied: certifi>=2017.4.17 in ./.local/lib/python3.8/site-packages (from requests>=2.19.0->datasets==2.7.1.dev0) (2022.12.7)\n", "Requirement already satisfied: idna<4,>=2.5 in ./.local/lib/python3.8/site-packages (from requests>=2.19.0->datasets==2.7.1.dev0) (3.4)\n", "Requirement already satisfied: python-dateutil>=2.8.1 in ./.local/lib/python3.8/site-packages (from pandas->datasets==2.7.1.dev0) (2.8.2)\n", "Requirement already satisfied: pytz>=2020.1 in ./.local/lib/python3.8/site-packages (from pandas->datasets==2.7.1.dev0) (2022.5)\n", "Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.1->pandas->datasets==2.7.1.dev0) (1.14.0)\n", "Defaulting to user installation because normal site-packages is not writeable\n", "Collecting git+https://github.com/huggingface/transformers\n", " Cloning https://github.com/huggingface/transformers to /tmp/pip-req-build-x539p2ep\n", " Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-x539p2ep\n", " Resolved https://github.com/huggingface/transformers to commit e3cc4487fe66e03ec85970ea2db8e5fb34c455f4\n", " Installing build dependencies ... \u001b[?25ldone\n", "\u001b[?25h Getting requirements to build wheel ... \u001b[?25ldone\n", "\u001b[?25h Preparing metadata (pyproject.toml) ... 
\u001b[?25ldone\n", "\u001b[?25hRequirement already satisfied: huggingface-hub<1.0,>=0.10.0 in /usr/local/lib/python3.8/dist-packages (from transformers==4.26.0.dev0) (0.11.1)\n", "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.8/dist-packages (from transformers==4.26.0.dev0) (2022.10.31)\n", "Requirement already satisfied: numpy>=1.17 in ./.local/lib/python3.8/site-packages (from transformers==4.26.0.dev0) (1.24.0rc2)\n", "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.8/dist-packages (from transformers==4.26.0.dev0) (0.13.2)\n", "Requirement already satisfied: filelock in /usr/lib/python3/dist-packages (from transformers==4.26.0.dev0) (3.0.12)\n", "Requirement already satisfied: requests in ./.local/lib/python3.8/site-packages (from transformers==4.26.0.dev0) (2.28.1)\n", "Requirement already satisfied: pyyaml>=5.1 in /usr/lib/python3/dist-packages (from transformers==4.26.0.dev0) (5.3.1)\n", "Requirement already satisfied: packaging>=20.0 in ./.local/lib/python3.8/site-packages (from transformers==4.26.0.dev0) (21.3)\n", "Requirement already satisfied: tqdm>=4.27 in ./.local/lib/python3.8/site-packages (from transformers==4.26.0.dev0) (4.64.1)\n", "Requirement already satisfied: typing-extensions>=3.7.4.3 in ./.local/lib/python3.8/site-packages (from huggingface-hub<1.0,>=0.10.0->transformers==4.26.0.dev0) (4.4.0)\n", "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/lib/python3/dist-packages (from packaging>=20.0->transformers==4.26.0.dev0) (2.4.6)\n", "Requirement already satisfied: idna<4,>=2.5 in ./.local/lib/python3.8/site-packages (from requests->transformers==4.26.0.dev0) (3.4)\n", "Requirement already satisfied: charset-normalizer<3,>=2 in ./.local/lib/python3.8/site-packages (from requests->transformers==4.26.0.dev0) (2.1.1)\n", "Requirement already satisfied: certifi>=2017.4.17 in ./.local/lib/python3.8/site-packages (from requests->transformers==4.26.0.dev0) 
(2022.12.7)\n" ] } ], "source": [ "!pip install git+https://github.com/huggingface/datasets\n", "!pip install git+https://github.com/huggingface/transformers\n", "!pip3 install \"numexpr>=2.7.1\"\n", "!pip install librosa\n", "!pip install \"evaluate>=0.3.0\"\n", "!pip install jiwer\n", "!pip install gradio\n", "!pip install more-itertools" ] }, { "cell_type": "markdown", "id": "5b185650-af09-48c6-a67b-0e4368b74b3b", "metadata": { "id": "5b185650-af09-48c6-a67b-0e4368b74b3b", "tags": [] }, "source": [ "Linking the notebook to the 
Hugging Face Hub is straightforward: it simply requires entering your Hub authentication token when prompted. Find your Hub authentication token [here](https://huggingface.co/settings/tokens):" ] }, { "cell_type": "code", "execution_count": 6, "id": "dff27c76-575c-432b-8916-b1b810efef4a", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331, "referenced_widgets": [ "7f16af38d92e4cac84284de5e3756ce6", "87a7ee7ff6d44cd7881ea185796628a6", "e1bb1e29bac248f793290223324a4c4b", "3ae98fe05f88443d821d5df7488a123b", "21188bee327c4aedb689117ac9587842", "3b4a918dcadb4b18903907fcab930dfe", "50b3d6dda7504241961ab0bf9c9c033a", "b57d12eff9424744bd7e79cc039a22e5", "38135b51abf54c749ccb3db099f10b1d", "5782da4956ce4aeeafff328e0b821936", "481fdd626960471e94d31f00e20344ac", "40d85caeaa614d918e5f199e1c5136ef", "88dc4572ad5c495cb0489a5bc6467ee2", "a011f026bff54c2cbfcf32d839f69a38", "c8f6afbce8ca417d979c579760fe7311", "ed3ad08826e24e03ba6d611550249160", "f736a64d34b94efabfdd36c297177aed" ] }, "id": "dff27c76-575c-432b-8916-b1b810efef4a", "outputId": "5beb152e-9d4f-4581-8063-c5752890c4fa" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "cee65d4b203d4b2a910d65aba8ff273c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='