Spaces: ANDRYHA/FakeNewsClassifier


Andrey Moskalenko committed on Mar 23, 2022

Commit 3d38624 • 1 Parent(s): c94270d

Upload Train_fakenews_detector.ipynb

Files changed (1)

Train_fakenews_detector.ipynb +1465 -0

Train_fakenews_detector.ipynb ADDED

	@@ -0,0 +1,1465 @@

+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Data Preparation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "I found three fake-news classification datasets on Kaggle. They are all in English, so to support Russian-language articles we will use wmt19-ru-en, a model trained specifically for translating news.\n",
+    "\n",
+    "Selected datasets:\n",
+    "* https://www.kaggle.com/c/fake-news/data\n",
+    "* https://www.kaggle.com/c/fakenewskdd2020/data\n",
+    "* https://www.kaggle.com/c/classifying-the-fake-news/data"
+   ]
+  },
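The translation step mentioned above could look roughly like the sketch below. It assumes that "wmt19-ru-en" refers to the `facebook/wmt19-ru-en` checkpoint on the Hugging Face Hub and that it is loaded through the FSMT classes in `transformers`. The `looks_russian` heuristic, its threshold, and the `to_english` wrapper are hypothetical helpers added for illustration; they are not part of the notebook.

```python
import re
from functools import lru_cache

# Assumption: the Hub checkpoint behind the "wmt19-ru-en" model named above.
MODEL_NAME = "facebook/wmt19-ru-en"

_CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def looks_russian(text: str, threshold: float = 0.3) -> bool:
    """Hypothetical heuristic: True if enough of the letters are Cyrillic."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    cyrillic = sum(1 for ch in letters if _CYRILLIC.match(ch))
    return cyrillic / len(letters) >= threshold


@lru_cache(maxsize=1)
def _load_translator():
    """Load tokenizer and model once; downloads the weights on first use."""
    # Imported lazily so the heuristic above works without transformers.
    from transformers import FSMTForConditionalGeneration, FSMTTokenizer

    tokenizer = FSMTTokenizer.from_pretrained(MODEL_NAME)
    model = FSMTForConditionalGeneration.from_pretrained(MODEL_NAME)
    return tokenizer, model


def translate_ru_en(text: str) -> str:
    """Translate one Russian text to English, truncating very long inputs."""
    tokenizer, model = _load_translator()
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**batch)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def to_english(text: str) -> str:
    """Translate only when the input looks Russian; pass English through."""
    return translate_ru_en(text) if looks_russian(text) else text
```

With this layout, English articles never touch the translation model, so the classifier sees them unchanged.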
+  {
+   "cell_type": "code",
+   "execution_count": 95,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "df1_train = pd.read_csv('./data1/train.csv')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 96,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>id</th>\n",
+       "      <th>title</th>\n",
+       "      <th>author</th>\n",
+       "      <th>text</th>\n",
+       "      <th>label</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>0</td>\n",
+       "      <td>House Dem Aide: We Didn’t Even See Comey’s Let...</td>\n",
+       "      <td>Darrell Lucus</td>\n",
+       "      <td>House Dem Aide: We Didn’t Even See Comey’s Let...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>1</td>\n",
+       "      <td>FLYNN: Hillary Clinton, Big Woman on Campus - ...</td>\n",
+       "      <td>Daniel J. Flynn</td>\n",
+       "      <td>Ever get the feeling your life circles the rou...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2</td>\n",
+       "      <td>Why the Truth Might Get You Fired</td>\n",
+       "      <td>Consortiumnews.com</td>\n",
+       "      <td>Why the Truth Might Get You Fired October 29, ...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3</td>\n",
+       "      <td>15 Civilians Killed In Single US Airstrike Hav...</td>\n",
+       "      <td>Jessica Purkiss</td>\n",
+       "      <td>Videos 15 Civilians Killed In Single US Airstr...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4</td>\n",
+       "      <td>Iranian woman jailed for fictional unpublished...</td>\n",
+       "      <td>Howard Portnoy</td>\n",
+       "      <td>Print \\nAn Iranian woman has been sentenced to...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>20795</th>\n",
+       "      <td>20795</td>\n",
+       "      <td>Rapper T.I.: Trump a ’Poster Child For White S...</td>\n",
+       "      <td>Jerome Hudson</td>\n",
+       "      <td>Rapper T. I. unloaded on black celebrities who...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>20796</th>\n",
+       "      <td>20796</td>\n",
+       "      <td>N.F.L. Playoffs: Schedule, Matchups and Odds -...</td>\n",
+       "      <td>Benjamin Hoffman</td>\n",
+       "      <td>When the Green Bay Packers lost to the Washing...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>20797</th>\n",
+       "      <td>20797</td>\n",
+       "      <td>Macy’s Is Said to Receive Takeover Approach by...</td>\n",
+       "      <td>Michael J. de la Merced and Rachel Abrams</td>\n",
+       "      <td>The Macy’s of today grew from the union of sev...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>20798</th>\n",
+       "      <td>20798</td>\n",
+       "      <td>NATO, Russia To Hold Parallel Exercises In Bal...</td>\n",
+       "      <td>Alex Ansary</td>\n",
+       "      <td>NATO, Russia To Hold Parallel Exercises In Bal...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>20799</th>\n",
+       "      <td>20799</td>\n",
+       "      <td>What Keeps the F-35 Alive</td>\n",
+       "      <td>David Swanson</td>\n",
+       "      <td>David Swanson is an author, activist, journa...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>20800 rows × 5 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "          id                                              title  \\\n",
+       "0          0  House Dem Aide: We Didn’t Even See Comey’s Let...   \n",
+       "1          1  FLYNN: Hillary Clinton, Big Woman on Campus - ...   \n",
+       "2          2                  Why the Truth Might Get You Fired   \n",
+       "3          3  15 Civilians Killed In Single US Airstrike Hav...   \n",
+       "4          4  Iranian woman jailed for fictional unpublished...   \n",
+       "...      ...                                                ...   \n",
+       "20795  20795  Rapper T.I.: Trump a ’Poster Child For White S...   \n",
+       "20796  20796  N.F.L. Playoffs: Schedule, Matchups and Odds -...   \n",
+       "20797  20797  Macy’s Is Said to Receive Takeover Approach by...   \n",
+       "20798  20798  NATO, Russia To Hold Parallel Exercises In Bal...   \n",
+       "20799  20799                          What Keeps the F-35 Alive   \n",
+       "\n",
+       "                                          author  \\\n",
+       "0                                  Darrell Lucus   \n",
+       "1                                Daniel J. Flynn   \n",
+       "2                             Consortiumnews.com   \n",
+       "3                                Jessica Purkiss   \n",
+       "4                                 Howard Portnoy   \n",
+       "...                                          ...   \n",
+       "20795                              Jerome Hudson   \n",
+       "20796                           Benjamin Hoffman   \n",
+       "20797  Michael J. de la Merced and Rachel Abrams   \n",
+       "20798                                Alex Ansary   \n",
+       "20799                              David Swanson   \n",
+       "\n",
+       "                                                    text  label  \n",
+       "0      House Dem Aide: We Didn’t Even See Comey’s Let...      1  \n",
+       "1      Ever get the feeling your life circles the rou...      0  \n",
+       "2      Why the Truth Might Get You Fired October 29, ...      1  \n",
+       "3      Videos 15 Civilians Killed In Single US Airstr...      1  \n",
+       "4      Print \\nAn Iranian woman has been sentenced to...      1  \n",
+       "...                                                  ...    ...  \n",
+       "20795  Rapper T. I. unloaded on black celebrities who...      0  \n",
+       "20796  When the Green Bay Packers lost to the Washing...      0  \n",
+       "20797  The Macy’s of today grew from the union of sev...      0  \n",
+       "20798  NATO, Russia To Hold Parallel Exercises In Bal...      1  \n",
+       "20799    David Swanson is an author, activist, journa...      1  \n",
+       "\n",
+       "[20800 rows x 5 columns]"
+      ]
+     },
+     "execution_count": 96,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df1_train"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 97,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df1_train['text'] = df1_train.apply(lambda x: str(x.title) + '. ' + str(x.text), axis=1)\n",
+    "df1_train = df1_train[['text', 'label']]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 98,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df2_train = pd.read_csv('./data2/train.csv', sep='\\t')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 99,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Drop the corrupted row\n",
+    "df2_train = df2_train.drop([1615])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 100,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>text</th>\n",
+       "      <th>label</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Get the latest from TODAY Sign up for our news...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>2d  Conan On The Funeral Trump Will Be Invited...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>It’s safe to say that Instagram Stories has fa...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Much like a certain Amazon goddess with a lass...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>At a time when the perfect outfit is just one ...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4982</th>\n",
+       "      <td>The storybook romance of WWE stars John Cena a...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4983</th>\n",
+       "      <td>The actor told friends he’s responsible for en...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4984</th>\n",
+       "      <td>Sarah Hyland is getting real.  The Modern Fami...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4985</th>\n",
+       "      <td>Production has been suspended on the sixth and...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4986</th>\n",
+       "      <td>A jury ruled against Bill Cosby in his sexual ...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>4986 rows × 2 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                                   text label\n",
+       "0     Get the latest from TODAY Sign up for our news...     1\n",
+       "1     2d  Conan On The Funeral Trump Will Be Invited...     1\n",
+       "2     It’s safe to say that Instagram Stories has fa...     0\n",
+       "3     Much like a certain Amazon goddess with a lass...     0\n",
+       "4     At a time when the perfect outfit is just one ...     0\n",
+       "...                                                 ...   ...\n",
+       "4982  The storybook romance of WWE stars John Cena a...     0\n",
+       "4983  The actor told friends he’s responsible for en...     0\n",
+       "4984  Sarah Hyland is getting real.  The Modern Fami...     0\n",
+       "4985  Production has been suspended on the sixth and...     0\n",
+       "4986  A jury ruled against Bill Cosby in his sexual ...     0\n",
+       "\n",
+       "[4986 rows x 2 columns]"
+      ]
+     },
+     "execution_count": 100,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df2_train"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 104,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df3_train = pd.read_csv('./data3/training.csv')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 105,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df3_train['text'] = df3_train.apply(lambda x: str(x.title) + '. ' + str(x.text), axis=1)\n",
+    "df3_train = df3_train[['text', 'label']]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 106,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# pd.concat replaces DataFrame.append, which was removed in pandas 2.0\n",
+    "all_data_train = pd.concat([df1_train, df2_train, df3_train])\n",
+    "all_data_train.to_csv('./train.csv', index=False)"
+   ]
+  },
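One detail of the `str(x.title) + '. ' + str(x.text)` pattern used above: if a title or text cell is empty, `str()` turns the missing value into the literal string "nan". Whether the Kaggle files contain such rows is not shown here, so the following is only a defensive sketch in plain Python; `is_missing` and `join_title_text` are hypothetical helper names.

```python
import math


def is_missing(value) -> bool:
    """True for None or a float NaN, which is how pandas marks empty cells."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def join_title_text(title, text) -> str:
    """Join title and body as 'title. text', skipping missing parts."""
    parts = [str(part).strip() for part in (title, text) if not is_missing(part)]
    return ". ".join(parts)


# str() on a missing value produces the literal string "nan":
naive = str(float("nan")) + ". " + "Body of the article"
print(naive)  # nan. Body of the article

print(join_title_text(float("nan"), "Body of the article"))  # Body of the article
print(join_title_text("Headline", "Body of the article"))    # Headline. Body of the article
```

In the notebook this helper would slot into the existing call as `df1_train.apply(lambda x: join_title_text(x.title, x.text), axis=1)`.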
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Training"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {
+    "id": "zriTdjauH8iQ"
+   },
+   "outputs": [],
+   "source": [
+    "#!pip install transformers\n",
+    "import transformers"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "id": "TFh3upySL3XG"
+   },
+   "outputs": [],
+   "source": [
+    "from transformers import Trainer, TrainingArguments, LineByLineTextDataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "id": "H2Ym6YhyNfON"
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {
+    "id": "ueRyDnvgNgpW"
+   },
+   "outputs": [],
+   "source": [
+    "from datasets import Dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {
+    "id": "HVBCtqyjNhLn"
+   },
+   "outputs": [],
+   "source": [
+    "df = pd.read_csv('./train.csv')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 424
+    },
+    "id": "f7j8fEl1Nogb",
+    "outputId": "3b5b13a0-4c34-412c-9718-5b0decb855cc"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>text</th>\n",
+       "      <th>label</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>House Dem Aide: We Didn’t Even See Comey’s Let...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>FLYNN: Hillary Clinton, Big Woman on Campus - ...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Why the Truth Might Get You Fired.Why the Trut...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>15 Civilians Killed In Single US Airstrike Hav...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>Iranian woman jailed for fictional unpublished...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>...</th>\n",
+       "      <td>...</td>\n",
+       "      <td>...</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>57209</th>\n",
+       "      <td>CHICAGO TRUMP RALLY CANCELLED: Radicals And BL...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>57210</th>\n",
+       "      <td>Trump supports completion of Dakota Access Pip...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>57211</th>\n",
+       "      <td>Obama Can’t Stop Winning As New Jobs Report S...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>57212</th>\n",
+       "      <td>Turkey bank regulator dismisses 'rumors' after...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>57213</th>\n",
+       "      <td>California mayors ask for governor's support f...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>57214 rows × 2 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                                    text  label\n",
+       "0      House Dem Aide: We Didn’t Even See Comey’s Let...      1\n",
+       "1      FLYNN: Hillary Clinton, Big Woman on Campus - ...      0\n",
+       "2      Why the Truth Might Get You Fired.Why the Trut...      1\n",
+       "3      15 Civilians Killed In Single US Airstrike Hav...      1\n",
+       "4      Iranian woman jailed for fictional unpublished...      1\n",
+       "...                                                  ...    ...\n",
+       "57209  CHICAGO TRUMP RALLY CANCELLED: Radicals And BL...      1\n",
+       "57210  Trump supports completion of Dakota Access Pip...      0\n",
+       "57211   Obama Can’t Stop Winning As New Jobs Report S...      1\n",
+       "57212  Turkey bank regulator dismisses 'rumors' after...      0\n",
+       "57213  California mayors ask for governor's support f...      0\n",
+       "\n",
+       "[57214 rows x 2 columns]"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {
+    "id": "L0ET6Z83Pcxu"
+   },
+   "outputs": [],
+   "source": [
+    "df['labels'] = df['label']"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {
+    "id": "39Zv6HBJPgEt"
+   },
+   "outputs": [],
+   "source": [
+    "df = df[['text', 'labels']]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {
+    "id": "bPGVPY17NI7x"
+   },
+   "outputs": [],
+   "source": [
+    "dataset = Dataset.from_pandas(df)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "3LTGwWrINmZq",
+    "outputId": "177d8749-68cf-4f81-a91b-1097bf155478"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Dataset({\n",
+       "    features: ['text', 'labels'],\n",
+       "    num_rows: 57214\n",
+       "})"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "3DrWrMiDd7e-",
+    "outputId": "d331ebe6-5ed4-4fef-8a8d-41d25ed4b638"
+   },
+   "outputs": [],
+   "source": [
+    "import torch\n",
+    "from transformers import AutoTokenizer, AutoModel, pipeline\n",
+    "\n",
+    "model_name = 'distilbert-base-uncased-finetuned-sst-2-english'\n",
+    "tokenizer = AutoTokenizer.from_pretrained(model_name)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {
+    "id": "dRJOO2c5PT3V"
+   },
+   "outputs": [],
+   "source": [
+    "def preprocess_function(examples):\n",
+    "    return tokenizer(examples[\"text\"], padding=True, truncation=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 49,
+     "referenced_widgets": [
+      "5b49dc833234406da3da7435b9045fd2",
+      "300b70ed57dd493997afb0b3f25f4245",
+      "c03cc68b079c4e23b339e9de5ba38d29",
+      "57c3794731c84c42bb49618482b6b8cc",
+      "e306828f6d7444ddafce604e9a170467",
+      "9e11898bc51e483d91301387099368a4",
+      "a43574fa5fdf47ba9d5598b2b31f2082",
+      "482bae742d2a461cad525888e6ee8b91",
+      "e9c56275d73545a6961efe5704308ede",
+      "d604380b5e444f62ad36c4598230c561",
+      "c52ad745acb3423494b4ea5af5a934c7"
+     ]
+    },
+    "id": "hCxs-HasPQ7s",
+    "outputId": "be4f8483-316c-4677-f804-12c78f358fac"
+   },
+   "outputs": [
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "67689f0c8fb842b2969c4fc584fa3a4b",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "  0%|          | 0/58 [00:00<?, ?ba/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "dataset = dataset.map(preprocess_function, batched=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset_splitted = dataset.shuffle(seed=1337).train_test_split(test_size=0.1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "DatasetDict({\n",
+       "    train: Dataset({\n",
+       "        features: ['text', 'labels', 'input_ids', 'attention_mask'],\n",
+       "        num_rows: 51492\n",
+       "    })\n",
+       "    test: Dataset({\n",
+       "        features: ['text', 'labels', 'input_ids', 'attention_mask'],\n",
+       "        num_rows: 5722\n",
+       "    })\n",
+       "})"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "dataset_splitted"
+   ]
+  },
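The row counts printed above, and the step counts in the training log further down, can be checked with a few lines of arithmetic. This sketch assumes the test share is rounded up and that `Dataset.map` used its default batch size of 1000; both assumptions match the reported numbers.

```python
import math

total_rows = 57214       # num_rows of the merged dataset
test_fraction = 0.1      # value passed to train_test_split
map_batch_size = 1000    # assumption: default batch_size of Dataset.map
train_batch_size = 64    # per-device batch size shown in the training log
epochs = 10              # number of epochs shown in the training log

test_rows = math.ceil(total_rows * test_fraction)
train_rows = total_rows - test_rows
map_batches = math.ceil(total_rows / map_batch_size)
steps_per_epoch = math.ceil(train_rows / train_batch_size)
total_steps = steps_per_epoch * epochs

print(test_rows, train_rows)         # 5722 51492
print(map_batches)                   # 58
print(steps_per_epoch, total_steps)  # 805 8050
```

The 805 steps per epoch also explain the checkpoint names in the log (`checkpoint-805`, `checkpoint-1610`, `checkpoint-2415`): one checkpoint is written at the end of each epoch.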
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {
+    "id": "NyHknkwcYi6L"
+   },
+   "outputs": [],
+   "source": [
+    "from transformers import AutoModelForSequenceClassification"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    },
+    "id": "gv_fYzmEYlUm",
+    "outputId": "7a97df03-8f7b-4d54-f8d7-6a6b71d4c8c4"
+   },
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "loading configuration file https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json from cache at C:\\Users\\andry/.cache\\huggingface\\transformers\\4e60bb8efad3d4b7dc9969bf204947c185166a0a3cf37ddb6f481a876a3777b5.9f8326d0b7697c7fd57366cdde57032f46bc10e37ae81cb7eb564d66d23ec96b\n",
+      "Model config DistilBertConfig {\n",
+      "  \"_name_or_path\": \"distilbert-base-uncased-finetuned-sst-2-english\",\n",
+      "  \"activation\": \"gelu\",\n",
+      "  \"architectures\": [\n",
+      "    \"DistilBertForSequenceClassification\"\n",
+      "  ],\n",
+      "  \"attention_dropout\": 0.1,\n",
+      "  \"dim\": 768,\n",
+      "  \"dropout\": 0.1,\n",
+      "  \"finetuning_task\": \"sst-2\",\n",
+      "  \"hidden_dim\": 3072,\n",
+      "  \"id2label\": {\n",
+      "    \"0\": \"NEGATIVE\",\n",
+      "    \"1\": \"POSITIVE\"\n",
+      "  },\n",
+      "  \"initializer_range\": 0.02,\n",
+      "  \"label2id\": {\n",
+      "    \"NEGATIVE\": 0,\n",
+      "    \"POSITIVE\": 1\n",
+      "  },\n",
+      "  \"max_position_embeddings\": 512,\n",
+      "  \"model_type\": \"distilbert\",\n",
+      "  \"n_heads\": 12,\n",
+      "  \"n_layers\": 6,\n",
+      "  \"output_past\": true,\n",
+      "  \"pad_token_id\": 0,\n",
+      "  \"qa_dropout\": 0.1,\n",
+      "  \"seq_classif_dropout\": 0.2,\n",
+      "  \"sinusoidal_pos_embds\": false,\n",
+      "  \"tie_weights_\": true,\n",
+      "  \"transformers_version\": \"4.17.0\",\n",
+      "  \"vocab_size\": 30522\n",
+      "}\n",
+      "\n",
+      "loading weights file https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/pytorch_model.bin from cache at C:\\Users\\andry/.cache\\huggingface\\transformers\\8d04c767d9d4c14d929ce7ad8e067b80c74dbdb212ef4c3fb743db4ee109fae0.9d268a35da669ead745c44d369dc9948b408da5010c6bac414414a7e33d5748c\n",
+      "All model checkpoint weights were used when initializing DistilBertForSequenceClassification.\n",
+      "\n",
+      "All the weights of DistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.\n",
+      "If your task is similar to the task the model of the checkpoint was trained on, you can already use DistilBertForSequenceClassification for predictions without further training.\n"
+     ]
+    }
+   ],
+   "source": [
+    "model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 24,
+   "metadata": {
+    "id": "YqcdtMXZelbm"
+   },
+   "outputs": [],
+   "source": [
+    "# Freeze every parameter except the final classifier layer\n",
+    "for name, param in model.named_parameters():\n",
+    "    if name in ['classifier.weight', 'classifier.bias']:\n",
+    "        param.requires_grad = True\n",
+    "    else:\n",
+    "        param.requires_grad = False"
+   ]
+  },
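The loop above leaves only the final `classifier` layer trainable. As a rough illustration of how small that is, the sketch below counts parameters in the classification head, assuming the usual `DistilBertForSequenceClassification` layout: a `pre_classifier` linear layer of size dim × dim followed by a `classifier` layer of size dim × num_labels, with dim=768 taken from the config printed earlier. Encoder parameters are left out, and the shapes are written by hand rather than read from the model.

```python
# Head parameter shapes, assuming dim=768 and num_labels=2.
head_shapes = {
    "pre_classifier.weight": (768, 768),
    "pre_classifier.bias": (768,),
    "classifier.weight": (2, 768),
    "classifier.bias": (2,),
}
TRAINABLE = {"classifier.weight", "classifier.bias"}


def numel(shape):
    """Number of elements in a tensor of the given shape."""
    count = 1
    for size in shape:
        count *= size
    return count


# Same rule as the notebook loop: trainable only if the name is whitelisted.
requires_grad = {name: name in TRAINABLE for name in head_shapes}

trainable = sum(numel(s) for n, s in head_shapes.items() if requires_grad[n])
frozen_head = sum(numel(s) for n, s in head_shapes.items() if not requires_grad[n])

print(trainable)    # 1538
print(frozen_head)  # 590592
```

Note that `pre_classifier` stays frozen under this rule, so training adjusts only 1538 weights on top of fixed features.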
+  {
+   "cell_type": "code",
+   "execution_count": 25,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sklearn.metrics import accuracy_score\n",
+    "\n",
+    "def compute_metrics(pred):\n",
+    "    labels = pred.label_ids\n",
+    "    preds = pred.predictions.argmax(-1)\n",
+    "    acc = accuracy_score(labels, preds)\n",
+    "    return {'accuracy': acc}"
+   ]
+  },
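What `compute_metrics` does can be shown on toy numbers without any libraries: take the index of the largest logit in each row, then measure how often it equals the label. The logits and labels below are made up for illustration.

```python
def argmax(row):
    """Index of the largest value in a list of logits."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits, labels):
    """Share of rows whose highest-scoring class equals the true label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy logits for four articles: column 0 scores label 0, column 1 scores label 1.
toy_logits = [[2.0, -1.0], [0.1, 0.9], [-0.5, 0.3], [1.2, 0.4]]
toy_labels = [0, 1, 0, 0]

print(accuracy(toy_logits, toy_labels))  # 0.75
```

The third row is the only miss: its larger logit is in column 1 while the label is 0.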
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "metadata": {
+    "colab": {
+     "base_uri": "https://localhost:8080/",
+     "height": 608
+    },
+    "id": "DkBWiEiyIgnV",
+    "outputId": "07f58180-8005-4f7e-fd72-62a5d2c78717",
+    "scrolled": false
+   },
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "PyTorch: setting up devices\n",
+      "The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).\n",
+      "The following columns in the training set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running training *****\n",
+      "  Num examples = 51492\n",
+      "  Num Epochs = 10\n",
+      "  Instantaneous batch size per device = 64\n",
+      "  Total train batch size (w. parallel, distributed & accumulation) = 64\n",
+      "  Gradient Accumulation steps = 1\n",
+      "  Total optimization steps = 8050\n"
+     ]
+    },
+    {
+     "data": {
+      "text/html": [
+       "\n",
+       "    <div>\n",
+       "      \n",
+       "      <progress value='8050' max='8050' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
+       "      [8050/8050 1:31:55, Epoch 10/10]\n",
+       "    </div>\n",
+       "    <table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       " <tr style=\"text-align: left;\">\n",
+       "      <th>Epoch</th>\n",
+       "      <th>Training Loss</th>\n",
+       "      <th>Validation Loss</th>\n",
+       "      <th>Accuracy</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <td>1</td>\n",
+       "      <td>1.124500</td>\n",
+       "      <td>0.655170</td>\n",
+       "      <td>0.631423</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2</td>\n",
+       "      <td>0.635900</td>\n",
+       "      <td>0.616928</td>\n",
+       "      <td>0.696435</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>3</td>\n",
+       "      <td>0.617400</td>\n",
+       "      <td>0.592879</td>\n",
+       "      <td>0.727019</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>4</td>\n",
+       "      <td>0.591200</td>\n",
+       "      <td>0.577941</td>\n",
+       "      <td>0.734533</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>5</td>\n",
+       "      <td>0.577100</td>\n",
+       "      <td>0.564665</td>\n",
+       "      <td>0.747466</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>6</td>\n",
+       "      <td>0.569300</td>\n",
+       "      <td>0.556096</td>\n",
+       "      <td>0.749913</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>7</td>\n",
+       "      <td>0.563200</td>\n",
+       "      <td>0.551389</td>\n",
+       "      <td>0.755330</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>8</td>\n",
+       "      <td>0.559900</td>\n",
+       "      <td>0.546756</td>\n",
+       "      <td>0.754981</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>9</td>\n",
+       "      <td>0.554800</td>\n",
+       "      <td>0.544496</td>\n",
+       "      <td>0.759000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>10</td>\n",
+       "      <td>0.554000</td>\n",
+       "      <td>0.543604</td>\n",
+       "      <td>0.760398</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table><p>"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-805\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-805\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-805\\pytorch_model.bin\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-1610\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-1610\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-1610\\pytorch_model.bin\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-2415\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-2415\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-2415\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-805] due to args.save_total_limit\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-3220\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-3220\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-3220\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-1610] due to args.save_total_limit\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-4025\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-4025\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-4025\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-2415] due to args.save_total_limit\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-4830\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-4830\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-4830\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-3220] due to args.save_total_limit\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-5635\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-5635\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-5635\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-4025] due to args.save_total_limit\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-6440\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-6440\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-6440\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-4830] due to args.save_total_limit\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-7245\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-7245\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-7245\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-5635] due to args.save_total_limit\n",
+      "The following columns in the evaluation set  don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.\n",
+      "***** Running Evaluation *****\n",
+      "  Num examples = 5722\n",
+      "  Batch size = 64\n",
+      "Saving model checkpoint to ./my_saved_model\\checkpoint-8050\n",
+      "Configuration saved in ./my_saved_model\\checkpoint-8050\\config.json\n",
+      "Model weights saved in ./my_saved_model\\checkpoint-8050\\pytorch_model.bin\n",
+      "Deleting older checkpoint [my_saved_model\\checkpoint-6440] due to args.save_total_limit\n",
+      "\n",
+      "\n",
+      "Training completed. Do not forget to share your model on huggingface.co/models =)\n",
+      "\n",
+      "\n",
+      "Loading best model from ./my_saved_model\\checkpoint-8050 (score: 0.543603777885437).\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "TrainOutput(global_step=8050, training_loss=0.6166538418598057, metrics={'train_runtime': 5516.6092, 'train_samples_per_second': 93.34, 'train_steps_per_second': 1.459, 'total_flos': 6.821011291594752e+16, 'train_loss': 0.6166538418598057, 'epoch': 10.0})"
+      ]
+     },
+     "execution_count": 26,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from transformers import Trainer, TrainingArguments\n",
+    "\n",
+    "trainer = Trainer(\n",
+    "    model=model,\n",
+    "    train_dataset=dataset_splitted['train'],\n",
+    "    eval_dataset=dataset_splitted['test'],\n",
+    "    compute_metrics=compute_metrics,\n",
+    "    args=TrainingArguments(\n",
+    "        output_dir=\"./my_saved_model\",\n",
+    "        overwrite_output_dir=True,\n",
+    "        num_train_epochs=10,\n",
+    "        per_device_train_batch_size=64,\n",
+    "        per_device_eval_batch_size=64,\n",
+    "        # Evaluate and save a checkpoint at the end of every epoch.\n",
+    "        evaluation_strategy=\"epoch\",\n",
+    "        save_strategy=\"epoch\",\n",
+    "        # Keep at most two checkpoints on disk; older ones are deleted.\n",
+    "        save_total_limit=2,\n",
+    "        # After training, reload the checkpoint with the lowest eval loss.\n",
+    "        load_best_model_at_end=True,\n",
+    "    ),\n",
+    ")\n",
+    "\n",
+    "trainer.train()"
+   ]
+  }
+ ],
+ "metadata": {
+  "accelerator": "GPU",
+  "colab": {
+   "collapsed_sections": [],
+   "name": "Copy of notebook \"ysda_2022.03.07.ipynb\"",
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.12"
+  },
+  "widgets": {
+   "application/vnd.jupyter.widget-state+json": {
+    "300b70ed57dd493997afb0b3f25f4245": {
+     "model_module": "@jupyter-widgets/controls",
+     "model_module_version": "1.5.0",
+     "model_name": "HTMLModel",
+     "state": {
+      "_dom_classes": [],
+      "_model_module": "@jupyter-widgets/controls",
+      "_model_module_version": "1.5.0",
+      "_model_name": "HTMLModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/controls",
+      "_view_module_version": "1.5.0",
+      "_view_name": "HTMLView",
+      "description": "",
+      "description_tooltip": null,
+      "layout": "IPY_MODEL_9e11898bc51e483d91301387099368a4",
+      "placeholder": "",
+      "style": "IPY_MODEL_a43574fa5fdf47ba9d5598b2b31f2082",
+      "value": "100%"
+     }
+    },
+    "482bae742d2a461cad525888e6ee8b91": {
+     "model_module": "@jupyter-widgets/base",
+     "model_module_version": "1.2.0",
+     "model_name": "LayoutModel",
+     "state": {
+      "_model_module": "@jupyter-widgets/base",
+      "_model_module_version": "1.2.0",
+      "_model_name": "LayoutModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/base",
+      "_view_module_version": "1.2.0",
+      "_view_name": "LayoutView",
+      "align_content": null,
+      "align_items": null,
+      "align_self": null,
+      "border": null,
+      "bottom": null,
+      "display": null,
+      "flex": null,
+      "flex_flow": null,
+      "grid_area": null,
+      "grid_auto_columns": null,
+      "grid_auto_flow": null,
+      "grid_auto_rows": null,
+      "grid_column": null,
+      "grid_gap": null,
+      "grid_row": null,
+      "grid_template_areas": null,
+      "grid_template_columns": null,
+      "grid_template_rows": null,
+      "height": null,
+      "justify_content": null,
+      "justify_items": null,
+      "left": null,
+      "margin": null,
+      "max_height": null,
+      "max_width": null,
+      "min_height": null,
+      "min_width": null,
+      "object_fit": null,
+      "object_position": null,
+      "order": null,
+      "overflow": null,
+      "overflow_x": null,
+      "overflow_y": null,
+      "padding": null,
+      "right": null,
+      "top": null,
+      "visibility": null,
+      "width": null
+     }
+    },
+    "57c3794731c84c42bb49618482b6b8cc": {
+     "model_module": "@jupyter-widgets/controls",
+     "model_module_version": "1.5.0",
+     "model_name": "HTMLModel",
+     "state": {
+      "_dom_classes": [],
+      "_model_module": "@jupyter-widgets/controls",
+      "_model_module_version": "1.5.0",
+      "_model_name": "HTMLModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/controls",
+      "_view_module_version": "1.5.0",
+      "_view_name": "HTMLView",
+      "description": "",
+      "description_tooltip": null,
+      "layout": "IPY_MODEL_d604380b5e444f62ad36c4598230c561",
+      "placeholder": "",
+      "style": "IPY_MODEL_c52ad745acb3423494b4ea5af5a934c7",
+      "value": " 58/58 [02:02&lt;00:00,  1.83s/ba]"
+     }
+    },
+    "5b49dc833234406da3da7435b9045fd2": {
+     "model_module": "@jupyter-widgets/controls",
+     "model_module_version": "1.5.0",
+     "model_name": "HBoxModel",
+     "state": {
+      "_dom_classes": [],
+      "_model_module": "@jupyter-widgets/controls",
+      "_model_module_version": "1.5.0",
+      "_model_name": "HBoxModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/controls",
+      "_view_module_version": "1.5.0",
+      "_view_name": "HBoxView",
+      "box_style": "",
+      "children": [
+       "IPY_MODEL_300b70ed57dd493997afb0b3f25f4245",
+       "IPY_MODEL_c03cc68b079c4e23b339e9de5ba38d29",
+       "IPY_MODEL_57c3794731c84c42bb49618482b6b8cc"
+      ],
+      "layout": "IPY_MODEL_e306828f6d7444ddafce604e9a170467"
+     }
+    },
+    "9e11898bc51e483d91301387099368a4": {
+     "model_module": "@jupyter-widgets/base",
+     "model_module_version": "1.2.0",
+     "model_name": "LayoutModel",
+     "state": {
+      "_model_module": "@jupyter-widgets/base",
+      "_model_module_version": "1.2.0",
+      "_model_name": "LayoutModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/base",
+      "_view_module_version": "1.2.0",
+      "_view_name": "LayoutView",
+      "align_content": null,
+      "align_items": null,
+      "align_self": null,
+      "border": null,
+      "bottom": null,
+      "display": null,
+      "flex": null,
+      "flex_flow": null,
+      "grid_area": null,
+      "grid_auto_columns": null,
+      "grid_auto_flow": null,
+      "grid_auto_rows": null,
+      "grid_column": null,
+      "grid_gap": null,
+      "grid_row": null,
+      "grid_template_areas": null,
+      "grid_template_columns": null,
+      "grid_template_rows": null,
+      "height": null,
+      "justify_content": null,
+      "justify_items": null,
+      "left": null,
+      "margin": null,
+      "max_height": null,
+      "max_width": null,
+      "min_height": null,
+      "min_width": null,
+      "object_fit": null,
+      "object_position": null,
+      "order": null,
+      "overflow": null,
+      "overflow_x": null,
+      "overflow_y": null,
+      "padding": null,
+      "right": null,
+      "top": null,
+      "visibility": null,
+      "width": null
+     }
+    },
+    "a43574fa5fdf47ba9d5598b2b31f2082": {
+     "model_module": "@jupyter-widgets/controls",
+     "model_module_version": "1.5.0",
+     "model_name": "DescriptionStyleModel",
+     "state": {
+      "_model_module": "@jupyter-widgets/controls",
+      "_model_module_version": "1.5.0",
+      "_model_name": "DescriptionStyleModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/base",
+      "_view_module_version": "1.2.0",
+      "_view_name": "StyleView",
+      "description_width": ""
+     }
+    },
+    "c03cc68b079c4e23b339e9de5ba38d29": {
+     "model_module": "@jupyter-widgets/controls",
+     "model_module_version": "1.5.0",
+     "model_name": "FloatProgressModel",
+     "state": {
+      "_dom_classes": [],
+      "_model_module": "@jupyter-widgets/controls",
+      "_model_module_version": "1.5.0",
+      "_model_name": "FloatProgressModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/controls",
+      "_view_module_version": "1.5.0",
+      "_view_name": "ProgressView",
+      "bar_style": "success",
+      "description": "",
+      "description_tooltip": null,
+      "layout": "IPY_MODEL_482bae742d2a461cad525888e6ee8b91",
+      "max": 58,
+      "min": 0,
+      "orientation": "horizontal",
+      "style": "IPY_MODEL_e9c56275d73545a6961efe5704308ede",
+      "value": 58
+     }
+    },
+    "c52ad745acb3423494b4ea5af5a934c7": {
+     "model_module": "@jupyter-widgets/controls",
+     "model_module_version": "1.5.0",
+     "model_name": "DescriptionStyleModel",
+     "state": {
+      "_model_module": "@jupyter-widgets/controls",
+      "_model_module_version": "1.5.0",
+      "_model_name": "DescriptionStyleModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/base",
+      "_view_module_version": "1.2.0",
+      "_view_name": "StyleView",
+      "description_width": ""
+     }
+    },
+    "d604380b5e444f62ad36c4598230c561": {
+     "model_module": "@jupyter-widgets/base",
+     "model_module_version": "1.2.0",
+     "model_name": "LayoutModel",
+     "state": {
+      "_model_module": "@jupyter-widgets/base",
+      "_model_module_version": "1.2.0",
+      "_model_name": "LayoutModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/base",
+      "_view_module_version": "1.2.0",
+      "_view_name": "LayoutView",
+      "align_content": null,
+      "align_items": null,
+      "align_self": null,
+      "border": null,
+      "bottom": null,
+      "display": null,
+      "flex": null,
+      "flex_flow": null,
+      "grid_area": null,
+      "grid_auto_columns": null,
+      "grid_auto_flow": null,
+      "grid_auto_rows": null,
+      "grid_column": null,
+      "grid_gap": null,
+      "grid_row": null,
+      "grid_template_areas": null,
+      "grid_template_columns": null,
+      "grid_template_rows": null,
+      "height": null,
+      "justify_content": null,
+      "justify_items": null,
+      "left": null,
+      "margin": null,
+      "max_height": null,
+      "max_width": null,
+      "min_height": null,
+      "min_width": null,
+      "object_fit": null,
+      "object_position": null,
+      "order": null,
+      "overflow": null,
+      "overflow_x": null,
+      "overflow_y": null,
+      "padding": null,
+      "right": null,
+      "top": null,
+      "visibility": null,
+      "width": null
+     }
+    },
+    "e306828f6d7444ddafce604e9a170467": {
+     "model_module": "@jupyter-widgets/base",
+     "model_module_version": "1.2.0",
+     "model_name": "LayoutModel",
+     "state": {
+      "_model_module": "@jupyter-widgets/base",
+      "_model_module_version": "1.2.0",
+      "_model_name": "LayoutModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/base",
+      "_view_module_version": "1.2.0",
+      "_view_name": "LayoutView",
+      "align_content": null,
+      "align_items": null,
+      "align_self": null,
+      "border": null,
+      "bottom": null,
+      "display": null,
+      "flex": null,
+      "flex_flow": null,
+      "grid_area": null,
+      "grid_auto_columns": null,
+      "grid_auto_flow": null,
+      "grid_auto_rows": null,
+      "grid_column": null,
+      "grid_gap": null,
+      "grid_row": null,
+      "grid_template_areas": null,
+      "grid_template_columns": null,
+      "grid_template_rows": null,
+      "height": null,
+      "justify_content": null,
+      "justify_items": null,
+      "left": null,
+      "margin": null,
+      "max_height": null,
+      "max_width": null,
+      "min_height": null,
+      "min_width": null,
+      "object_fit": null,
+      "object_position": null,
+      "order": null,
+      "overflow": null,
+      "overflow_x": null,
+      "overflow_y": null,
+      "padding": null,
+      "right": null,
+      "top": null,
+      "visibility": null,
+      "width": null
+     }
+    },
+    "e9c56275d73545a6961efe5704308ede": {
+     "model_module": "@jupyter-widgets/controls",
+     "model_module_version": "1.5.0",
+     "model_name": "ProgressStyleModel",
+     "state": {
+      "_model_module": "@jupyter-widgets/controls",
+      "_model_module_version": "1.5.0",
+      "_model_name": "ProgressStyleModel",
+      "_view_count": null,
+      "_view_module": "@jupyter-widgets/base",
+      "_view_module_version": "1.2.0",
+      "_view_name": "StyleView",
+      "bar_color": null,
+      "description_width": ""
+     }
+    }
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}