Dfbenavidesr committed on
Commit
613d916
1 Parent(s): d9bbec3

Upload Train.ipynb

This notebook contains the fine-tuning routine for the original model.

Files changed (1)
  1. Train.ipynb +1347 -0
Train.ipynb ADDED
@@ -0,0 +1,1347 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {
6
+ "id": "GZiMfnKVCniS"
7
+ },
8
+ "source": [
9
+ "# PROJECT III, ADVANCED MLDS TRAINING PROGRAM\n",
10
+ "## Daniel F. Benavides R. \n",
11
+ "## Module VI - Training a neural network model and making it available locally. \n",
12
+ "\n",
13
+ "### OBJECTIVE\n",
14
+ "\n",
15
+ "The goal of this project is to deploy a model locally. The work is done in two parts; this notebook covers the first, in which the model is trained and then saved locally for later use. \n",
16
+ "\n",
17
+ "What follows is the fine-tuning of the pretrained transformers model [_'distilbert-base-uncased'_](https://huggingface.co/distilbert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France.). This model was originally trained for _fill mask_ tasks and is adapted here into a classifier for unwanted (spam) **SMS** messages. \n"
18
+ ]
19
+ },
20
+ {
21
+ "cell_type": "code",
22
+ "execution_count": 33,
23
+ "metadata": {
24
+ "colab": {
25
+ "base_uri": "https://localhost:8080/"
26
+ },
27
+ "id": "gTYjO-PJIUT-",
28
+ "outputId": "27f1d993-d34f-4f4b-bbe1-ab6c96bc2825"
29
+ },
30
+ "outputs": [
31
+ {
32
+ "output_type": "stream",
33
+ "name": "stdout",
34
+ "text": [
35
+ "Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n"
36
+ ]
37
+ }
38
+ ],
39
+ "source": [
40
+ "from google.colab import drive\n",
41
+ "drive.mount('/content/drive')"
42
+ ]
43
+ },
44
+ {
45
+ "cell_type": "markdown",
46
+ "metadata": {
47
+ "id": "GRFpQTqzDCLT"
48
+ },
49
+ "source": [
50
+ "### Loading and preparing the data \n",
51
+ "\n",
52
+ "Next we import pandas and use it to load the dataset. The file is tab-separated, with the label in the first column and the message in the second.\n",
53
+ "\n",
54
+ "Then, with the _list_ function, we convert the messages and the labels into a pair of lists, and we encode the labels as a dummy variable, since the output is binary _(the message is either spam or it is not)_.\n",
55
+ "\n",
56
+ "## Importing the training dataset"
57
+ ]
58
+ },
59
+ {
60
+ "cell_type": "code",
61
+ "execution_count": 34,
62
+ "metadata": {
63
+ "colab": {
64
+ "base_uri": "https://localhost:8080/"
65
+ },
66
+ "id": "3Dt9fFKs74zR",
67
+ "outputId": "271ff620-a449-4e87-a163-14621651b48e"
68
+ },
69
+ "outputs": [
70
+ {
71
+ "output_type": "stream",
72
+ "name": "stdout",
73
+ "text": [
74
+ "Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n"
75
+ ]
76
+ }
77
+ ],
78
+ "source": [
79
+ "from google.colab import drive\n",
80
+ "drive.mount('/content/drive')\n",
81
+ "path= \"/content/drive/MyDrive/MLDS-2/MODULO II/Talleres/SMSSpamCollection\""
82
+ ]
83
+ },
84
+ {
85
+ "cell_type": "code",
86
+ "execution_count": 35,
87
+ "metadata": {
88
+ "id": "awPXefiYqQsF"
89
+ },
90
+ "outputs": [],
91
+ "source": [
92
+ "\n",
93
+ "import pandas as pd\n",
94
+ "df=messages = pd.read_csv(path, sep='\\t',\n",
95
+ " names=[\"label\", \"message\"])\n",
96
+ "X=list(df['message'])\n",
97
+ "y=list(df['label'])\n",
98
+ "y=list(pd.get_dummies(y,drop_first=True)['spam'])\n"
99
+ ]
100
+ },
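The label encoding described in this step can be sketched without pandas. This is only a minimal illustration, assuming the labels are the strings `ham` and `spam`; `encode_labels` is a hypothetical helper and not part of the notebook. An explicit integer mapping also avoids depending on the dtype returned by `pd.get_dummies`, which is boolean by default in recent pandas releases.

```python
def encode_labels(labels, positive="spam"):
    """Map string labels to 0/1, with 1 marking the positive (spam) class."""
    return [1 if label == positive else 0 for label in labels]


# Toy labels for illustration only; they are not taken from the dataset.
sample_labels = ["ham", "spam", "ham", "ham", "spam"]
encoded = encode_labels(sample_labels)
print(encoded)  # [0, 1, 0, 0, 1]
```

In the notebook, the same idea would apply to `df['label']` before the split.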
101
+ {
102
+ "cell_type": "markdown",
103
+ "metadata": {
104
+ "id": "1T8CpN3YDar6"
105
+ },
106
+ "source": [
107
+ "### Preprocessing \n",
108
+ "\n",
109
+ "Now we import the *train_test_split* function from the *model_selection* module of *scikit-learn* and use it to split the data into training and test sets. The test set is 20% of the sample. We also set *random_state* so that the shuffle is seeded and the same two sets are produced on every run. \n",
110
+ "\n",
111
+ "Next we install the transformers library, although in my case it was already installed. \n"
112
+ ]
113
+ },
114
+ {
115
+ "cell_type": "code",
116
+ "execution_count": 36,
117
+ "metadata": {
118
+ "id": "dLFDWda0rIKw"
119
+ },
120
+ "outputs": [],
121
+ "source": [
122
+ "from sklearn.model_selection import train_test_split\n",
123
+ "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20, random_state = 0)"
124
+ ]
125
+ },
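To illustrate what fixing the seed buys, here is a rough standard-library sketch of a seeded shuffle-and-split. It is not scikit-learn's algorithm, and `seeded_split` is a hypothetical helper; it only shows that the same seed reproduces the same partition.

```python
import math
import random


def seeded_split(items, test_size=0.2, seed=0):
    """Shuffle a copy of items with a seeded RNG and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = math.ceil(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


# Toy messages for illustration only.
messages = [f"msg {i}" for i in range(10)]
train_a, test_a = seeded_split(messages, seed=0)
train_b, test_b = seeded_split(messages, seed=0)

print(len(train_a), len(test_a))  # 8 2
print(test_a == test_b)           # True: same seed, same split
```

With a different seed the partition would generally change, which is why the notebook pins `random_state`.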
126
+ {
127
+ "cell_type": "code",
128
+ "execution_count": 37,
129
+ "metadata": {
130
+ "colab": {
131
+ "base_uri": "https://localhost:8080/"
132
+ },
133
+ "id": "AqOBGiGErZgj",
134
+ "outputId": "1a461d33-55ae-4a22-9746-8c01e99d49bd"
135
+ },
136
+ "outputs": [
137
+ {
138
+ "output_type": "stream",
139
+ "name": "stdout",
140
+ "text": [
141
+ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
142
+ "Requirement already satisfied: transformers in /usr/local/lib/python3.8/dist-packages (4.25.1)\n",
143
+ "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.8/dist-packages (from transformers) (21.3)\n",
144
+ "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.8/dist-packages (from transformers) (2022.6.2)\n",
145
+ "Requirement already satisfied: huggingface-hub<1.0,>=0.10.0 in /usr/local/lib/python3.8/dist-packages (from transformers) (0.11.1)\n",
146
+ "Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from transformers) (3.8.2)\n",
147
+ "Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from transformers) (2.23.0)\n",
148
+ "Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.8/dist-packages (from transformers) (4.64.1)\n",
149
+ "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.8/dist-packages (from transformers) (1.21.6)\n",
150
+ "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.8/dist-packages (from transformers) (6.0)\n",
151
+ "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.8/dist-packages (from transformers) (0.13.2)\n",
152
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.8/dist-packages (from huggingface-hub<1.0,>=0.10.0->transformers) (4.4.0)\n",
153
+ "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.8/dist-packages (from packaging>=20.0->transformers) (3.0.9)\n",
154
+ "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (2022.12.7)\n",
155
+ "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (1.24.3)\n",
156
+ "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (2.10)\n",
157
+ "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (3.0.4)\n"
158
+ ]
159
+ }
160
+ ],
161
+ "source": [
162
+ "!pip install transformers"
163
+ ]
164
+ },
165
+ {
166
+ "cell_type": "markdown",
167
+ "metadata": {
168
+ "id": "31aeQ6u-Dq-K"
169
+ },
170
+ "source": [
171
+ "Now we need to load the pieces of the transformers library that we are going to use, in the following steps: \n",
172
+ "\n",
173
+ "* Load the pretrained model\n",
174
+ "* Load the tokenizer \n",
175
+ "\n",
176
+ "The tokenizer has to be applied to our dataset. \n",
177
+ "\n",
178
+ "So next we import the _\"DistilBertTokenizerFast\"_ tokenizer from the transformers library and define it as our **tokenizer**, specifying that it comes from the pretrained model [_'distilbert-base-uncased'_](https://huggingface.co/distilbert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France.)"
179
+ ]
180
+ },
181
+ {
182
+ "cell_type": "code",
183
+ "execution_count": 38,
184
+ "metadata": {
185
+ "id": "bcNEJ6perOSs"
186
+ },
187
+ "outputs": [],
188
+ "source": [
189
+ "from transformers import DistilBertTokenizerFast\n",
190
+ "tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')"
191
+ ]
192
+ },
193
+ {
194
+ "cell_type": "markdown",
195
+ "metadata": {
196
+ "id": "3RdR0eZaDyyi"
197
+ },
198
+ "source": [
199
+ "Then we apply the tokenizer we just defined to our training and test messages. Since the SMS messages do not all have the same length (number of tokens), we set the truncation and padding parameters to True so that all sequences end up the same size: padding fills the shorter sequences with padding tokens (id 0 in this vocabulary), and truncation cuts sequences longer than the model's maximum length. This yields rectangular arrays that can later be turned into tensors. "
200
+ ]
201
+ },
202
+ {
203
+ "cell_type": "code",
204
+ "execution_count": 39,
205
+ "metadata": {
206
+ "id": "-OL3fgLvrXvH"
207
+ },
208
+ "outputs": [],
209
+ "source": [
210
+ "train_encodings = tokenizer(X_train, truncation=True, padding=True)\n",
211
+ "test_encodings = tokenizer(X_test, truncation=True, padding=True)\n"
212
+ ]
213
+ },
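The effect of padding and truncation can be sketched in plain Python. This is a simplified illustration, not the tokenizer's implementation: `pad_and_truncate` is a hypothetical helper, the integers are arbitrary stand-ins for token ids, and a real tokenizer also keeps the closing special token when it truncates.

```python
def pad_and_truncate(sequences, max_length, pad_id=0):
    """Return (input_ids, attention_mask) with every row the same length."""
    # Pad to the longest sequence, but never beyond max_length.
    target = min(max_length, max(len(seq) for seq in sequences))
    input_ids, attention_mask = [], []
    for seq in sequences:
        kept = seq[:target]                 # truncation
        n_pad = target - len(kept)          # padding needed for this row
        input_ids.append(kept + [pad_id] * n_pad)
        attention_mask.append([1] * len(kept) + [0] * n_pad)
    return input_ids, attention_mask


# Arbitrary integers standing in for token ids.
toy_ids = [[5, 6, 7], [5, 6, 7, 8, 9, 10], [5, 6]]
ids, mask = pad_and_truncate(toy_ids, max_length=5)

print(ids)   # [[5, 6, 7, 0, 0], [5, 6, 7, 8, 9], [5, 6, 0, 0, 0]]
print(mask)  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1], [1, 1, 0, 0, 0]]
```

The attention mask marks which positions hold real tokens, so the model can ignore the padding.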
214
+ {
215
+ "cell_type": "markdown",
216
+ "metadata": {
217
+ "id": "7JTdRQNVD4AK"
218
+ },
219
+ "source": [
220
+ "Next we import TensorFlow in order to convert the encodings generated in the previous step into tensors. Here each encoding is paired with its corresponding label. "
221
+ ]
222
+ },
223
+ {
224
+ "cell_type": "code",
225
+ "execution_count": 40,
226
+ "metadata": {
227
+ "id": "9B42CTCnrrEx"
228
+ },
229
+ "outputs": [],
230
+ "source": [
231
+ "import tensorflow as tf\n",
232
+ "\n",
233
+ "train_dataset = tf.data.Dataset.from_tensor_slices((\n",
234
+ " dict(train_encodings),\n",
235
+ " y_train\n",
236
+ "))\n",
237
+ "\n",
238
+ "test_dataset = tf.data.Dataset.from_tensor_slices((\n",
239
+ " dict(test_encodings),\n",
240
+ " y_test\n",
241
+ "))"
242
+ ]
243
+ },
244
+ {
245
+ "cell_type": "markdown",
246
+ "metadata": {
247
+ "id": "G3Wj2cqXD5hx"
248
+ },
249
+ "source": [
250
+ "### Training\n",
251
+ "\n",
252
+ "Next we import *TFDistilBertForSequenceClassification*, the model class used for sequence classification (here, spam versus non-spam). We also import *TFTrainingArguments* and *TFTrainer*, which define the training arguments and then configure the **trainer** that trains the model and makes new predictions. \n"
253
+ ]
254
+ },
255
+ {
256
+ "cell_type": "code",
257
+ "execution_count": 41,
258
+ "metadata": {
259
+ "id": "NH1dupK0rzfn"
260
+ },
261
+ "outputs": [],
262
+ "source": [
263
+ "from transformers import TFDistilBertForSequenceClassification, TFTrainer, TFTrainingArguments\n",
264
+ "\n",
265
+ "training_args = TFTrainingArguments(\n",
266
+ " eval_steps = 10, \n",
267
+ " output_dir='./results', # output directory\n",
268
+ " num_train_epochs=2, # total number of training epochs\n",
269
+ " per_device_train_batch_size=8, # batch size per device during training\n",
270
+ " per_device_eval_batch_size=8, # batch size for evaluation\n",
271
+ " warmup_steps=500, # number of warmup steps for learning rate scheduler\n",
272
+ " weight_decay=0.01, # strength of weight decay\n",
273
+ " logging_dir='./logs', # directory for storing logs\n",
274
+ " logging_steps=10,\n",
275
+ ")"
276
+ ]
277
+ },
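A back-of-the-envelope check helps put `warmup_steps=500` in context. The sketch below assumes a single device and a corpus of about 5,572 rows (an assumption; only the test size of 1,115 appears in the notebook output), and it approximates linear warmup with a hypothetical `warmup_factor` helper. The trainer's own step accounting and schedule may differ.

```python
import math

n_test = 1115        # test-set size shown in the notebook output
n_total = 5572       # ASSUMPTION: approximate size of the SMS Spam Collection
n_train = n_total - n_test
batch_size = 8       # per_device_train_batch_size, assuming one device
epochs = 2           # num_train_epochs
warmup_steps = 500

steps_per_epoch = math.ceil(n_train / batch_size)
total_steps = steps_per_epoch * epochs
warmup_fraction = warmup_steps / total_steps


def warmup_factor(step, warmup_steps):
    """Fraction of the peak learning rate applied at a step under linear warmup."""
    return min(1.0, step / warmup_steps)


print(steps_per_epoch, total_steps)  # 558 1116
print(round(warmup_fraction, 2))     # 0.45
```

Under these assumptions, the learning rate is still ramping up for roughly 45% of the run, which may be worth revisiting for such a short training schedule.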
278
+ {
279
+ "cell_type": "markdown",
280
+ "metadata": {
281
+ "id": "cqmXOhkSEAgC"
282
+ },
283
+ "source": [
284
+ "We have now set the arguments that will be used to fine-tune the model; they are stored in the *training_args* object. Next we define the model by referring to the pretrained checkpoint we are going to use, which in this case is _\"distilbert-base-uncased\"_. We then create the **trainer**, passing it the arguments defined above and the training and test datasets, and use it to train the model."
285
+ ]
286
+ },
287
+ {
288
+ "cell_type": "code",
289
+ "execution_count": 42,
290
+ "metadata": {
291
+ "colab": {
292
+ "base_uri": "https://localhost:8080/"
293
+ },
294
+ "id": "PZvTrEcfr7k-",
295
+ "outputId": "f8fe7e3c-c9c8-4b92-c3fc-ea4821386beb"
296
+ },
297
+ "outputs": [
298
+ {
299
+ "output_type": "stream",
300
+ "name": "stderr",
301
+ "text": [
302
+ "Some layers from the model checkpoint at distilbert-base-uncased were not used when initializing TFDistilBertForSequenceClassification: ['vocab_projector', 'activation_13', 'vocab_layer_norm', 'vocab_transform']\n",
303
+ "- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
304
+ "- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
305
+ "Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['dropout_39', 'pre_classifier', 'classifier']\n",
306
+ "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n",
307
+ "/usr/local/lib/python3.8/dist-packages/transformers/trainer_tf.py:115: FutureWarning: The class `TFTrainer` is deprecated and will be removed in version 5 of Transformers. We recommend using native Keras instead, by calling methods like `fit()` and `predict()` directly on the model object. Detailed examples of the Keras style can be found in our examples at https://github.com/huggingface/transformers/tree/main/examples/tensorflow\n",
308
+ " warnings.warn(\n"
309
+ ]
310
+ }
311
+ ],
312
+ "source": [
313
+ "with training_args.strategy.scope():\n",
314
+ " model = TFDistilBertForSequenceClassification.from_pretrained(\"distilbert-base-uncased\")\n",
315
+ "\n",
316
+ "trainer = TFTrainer(\n",
317
+ " model=model, # the instantiated 🤗 Transformers model to be trained\n",
318
+ " args=training_args, # training arguments, defined above\n",
319
+ " train_dataset=train_dataset, # training dataset\n",
320
+ " eval_dataset=test_dataset # evaluation dataset\n",
321
+ ")\n"
322
+ ]
323
+ },
324
+ {
325
+ "cell_type": "markdown",
326
+ "metadata": {
327
+ "id": "tnAE3agZ21dq"
328
+ },
329
+ "source": [
330
+ "Once the model to be fine-tuned has been instantiated and the arguments configured, the trainer takes the data and fine-tunes the model. "
331
+ ]
332
+ },
333
+ {
334
+ "cell_type": "code",
335
+ "execution_count": 43,
336
+ "metadata": {
337
+ "id": "bIba4vQg7Ecp"
338
+ },
339
+ "outputs": [],
340
+ "source": [
341
+ "trainer.train()"
342
+ ]
343
+ },
344
+ {
345
+ "cell_type": "markdown",
346
+ "metadata": {
347
+ "id": "Zerz-bv8EENp"
348
+ },
349
+ "source": [
350
+ "All that remains is to apply the model we fine-tuned on the **training** dataset, make predictions, and evaluate them. This procedure is described in the [fine-tuning guide](https://huggingface.co/transformers/v3.5.1/training.html) that Hugging Face provides. "
351
+ ]
352
+ },
353
+ {
354
+ "cell_type": "code",
355
+ "execution_count": 44,
356
+ "metadata": {
357
+ "colab": {
358
+ "base_uri": "https://localhost:8080/"
359
+ },
360
+ "id": "R534aDi3xD0s",
361
+ "outputId": "65c5ac93-eb67-4413-e048-f7b4d9fd8931"
362
+ },
363
+ "outputs": [
364
+ {
365
+ "output_type": "execute_result",
366
+ "data": {
367
+ "text/plain": [
368
+ "{'eval_loss': 0.02398163080215454}"
369
+ ]
370
+ },
371
+ "metadata": {},
372
+ "execution_count": 44
373
+ }
374
+ ],
375
+ "source": [
376
+ "trainer.evaluate(test_dataset)"
377
+ ]
378
+ },
379
+ {
380
+ "cell_type": "markdown",
381
+ "metadata": {
382
+ "id": "4rLF3nApUndt"
383
+ },
384
+ "source": [
385
+ "Next we apply the fine-tuned model to the test set to classify each of its samples. "
386
+ ]
387
+ },
388
+ {
389
+ "cell_type": "markdown",
390
+ "metadata": {
391
+ "id": "jpGNNvWEWU9u"
392
+ },
393
+ "source": [
394
+ "### Model prediction\n",
395
+ "\n",
396
+ "The fine-tuned model is applied to the test dataset *test_dataset* and evaluated by its accuracy; that is, we pass it messages and ask it to predict whether or not they are spam. The recorded run reports an accuracy of 1.0, but that figure was obtained by comparing *y_test* with the *label_ids* field returned by *predict*, which holds the true labels rather than the model's predictions, so it does not measure the model's performance. The model's predicted classes are the argmax of the logits in the *predictions* field. "
397
+ ]
398
+ },
399
+ {
400
+ "cell_type": "code",
401
+ "execution_count": 45,
402
+ "metadata": {
403
+ "colab": {
404
+ "base_uri": "https://localhost:8080/"
405
+ },
406
+ "id": "UyBmI1WcxKjG",
407
+ "outputId": "53067a82-55bf-4500-a38e-d890be6f7bf5"
408
+ },
409
+ "outputs": [
410
+ {
411
+ "output_type": "execute_result",
412
+ "data": {
413
+ "text/plain": [
414
+ "PredictionOutput(predictions=array([[ 3.4155877, -3.1767924],\n",
415
+ " [-3.2374823, 3.135958 ],\n",
416
+ " [ 3.348417 , -3.1216612],\n",
417
+ " ...,\n",
418
+ " [ 3.04905 , -2.8354154],\n",
419
+ " [-3.1865208, 3.0687277],\n",
420
+ " [ 3.212608 , -3.0316095]], dtype=float32), label_ids=array([0, 1, 0, ..., 0, 1, 0], dtype=int32), metrics={'eval_loss': 0.023984665530068533})"
421
+ ]
422
+ },
423
+ "metadata": {},
424
+ "execution_count": 45
425
+ }
426
+ ],
427
+ "source": [
428
+ "trainer.predict(test_dataset)"
429
+ ]
430
+ },
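The `predictions` array printed above holds one pair of logits per message. A minimal sketch of turning logits into class labels and probabilities follows, using the first three rows from that output; `softmax` and `predict_class` are hypothetical helpers written here for illustration.

```python
import math


def softmax(logits):
    """Convert a row of logits into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Index of the largest logit (0 = not spam, 1 = spam in this notebook)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# First three rows of the logits shown in the notebook output.
sample_logits = [
    [3.4155877, -3.1767924],
    [-3.2374823, 3.135958],
    [3.348417, -3.1216612],
]
predicted = [predict_class(row) for row in sample_logits]
probabilities = [softmax(row) for row in sample_logits]

print(predicted)  # [0, 1, 0]
```

These predicted classes, not `label_ids`, are what should be compared against `y_test` when evaluating the model.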
431
+ {
432
+ "cell_type": "code",
433
+ "execution_count": 46,
434
+ "metadata": {
435
+ "colab": {
436
+ "base_uri": "https://localhost:8080/"
437
+ },
438
+ "id": "9Qc5FtM8xn9A",
439
+ "outputId": "0d517424-b5d1-4324-be3c-6f90335aa4fd"
440
+ },
441
+ "outputs": [
442
+ {
443
+ "output_type": "execute_result",
444
+ "data": {
445
+ "text/plain": [
446
+ "(1115,)"
447
+ ]
448
+ },
449
+ "metadata": {},
450
+ "execution_count": 46
451
+ }
452
+ ],
453
+ "source": [
454
+ "trainer.predict(test_dataset)[1].shape"
455
+ ]
456
+ },
457
+ {
458
+ "cell_type": "markdown",
459
+ "metadata": {
460
+ "id": "LUHX_tCTWFuu"
461
+ },
462
+ "source": [
463
+ "#### Outputs"
464
+ ]
465
+ },
466
+ {
467
+ "cell_type": "code",
468
+ "execution_count": 47,
469
+ "metadata": {
470
+ "colab": {
471
+ "base_uri": "https://localhost:8080/"
472
+ },
473
+ "id": "fUVX_IhWxkxg",
474
+ "outputId": "a2e94ee6-54a2-414f-c2e7-98950deb7732"
475
+ },
476
+ "outputs": [
477
+ {
478
+ "output_type": "execute_result",
479
+ "data": {
480
+ "text/plain": [
481
+ "array([0, 1, 0, ..., 0, 1, 0], dtype=int32)"
482
+ ]
483
+ },
484
+ "metadata": {},
485
+ "execution_count": 47
486
+ }
487
+ ],
488
+ "source": [
489
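+ "output = trainer.predict(test_dataset).predictions.argmax(axis=1)  # predicted class = argmax of the logits; index [1] would return label_ids (the true labels)\n",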
490
+ "output"
491
+ ]
492
+ },
493
+ {
494
+ "cell_type": "markdown",
495
+ "metadata": {
496
+ "id": "lUxvb6JcYB7_"
497
+ },
498
+ "source": [
499
+ "#### Confusion matrix, accuracy"
500
+ ]
501
+ },
502
+ {
503
+ "cell_type": "code",
504
+ "execution_count": 48,
505
+ "metadata": {
506
+ "colab": {
507
+ "base_uri": "https://localhost:8080/"
508
+ },
509
+ "id": "cfCE06jQu5cI",
510
+ "outputId": "a1d10897-a36f-47a8-e038-0f68ec5e7ded"
511
+ },
512
+ "outputs": [
513
+ {
514
+ "output_type": "execute_result",
515
+ "data": {
516
+ "text/plain": [
517
+ "array([[955, 0],\n",
518
+ " [ 0, 160]])"
519
+ ]
520
+ },
521
+ "metadata": {},
522
+ "execution_count": 48
523
+ }
524
+ ],
525
+ "source": [
526
+ "from sklearn.metrics import confusion_matrix, accuracy_score\n",
527
+ "\n",
528
+ "cm = confusion_matrix(y_test, output)  # separate name so the imported function is not shadowed\n",
528
+ "cm\n"
530
+ ]
531
+ },
532
+ {
533
+ "cell_type": "code",
534
+ "execution_count": 49,
535
+ "metadata": {
536
+ "colab": {
537
+ "base_uri": "https://localhost:8080/"
538
+ },
539
+ "id": "mv83DD8sl8JO",
540
+ "outputId": "97612c62-b15f-453f-d51e-5cd12e554421"
541
+ },
542
+ "outputs": [
543
+ {
544
+ "output_type": "execute_result",
545
+ "data": {
546
+ "text/plain": [
547
+ "1.0"
548
+ ]
549
+ },
550
+ "metadata": {},
551
+ "execution_count": 49
552
+ }
553
+ ],
554
+ "source": [
555
+ "acc=accuracy_score(y_test,output)\n",
556
+ "acc"
557
+ ]
558
+ },
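For reference, the two metrics used here can be computed by hand. The sketch below uses toy labels, not the notebook's data, and follows the scikit-learn convention of rows for true classes and columns for predicted classes; `confusion_counts` and `accuracy` are hypothetical helpers.

```python
def confusion_counts(y_true, y_pred, n_classes=2):
    """Confusion matrix as nested lists: rows = true class, columns = predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Share of positions where the prediction equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels for illustration only.
toy_true = [0, 0, 0, 1, 1, 0]
toy_pred = [0, 0, 1, 1, 0, 0]
toy_matrix = confusion_counts(toy_true, toy_pred)
toy_accuracy = accuracy(toy_true, toy_pred)

print(toy_matrix)                    # [[3, 1], [1, 1]]
print(accuracy(toy_true, toy_true))  # 1.0: labels compared with themselves
```

The last line shows why a perfectly diagonal matrix deserves a second look: comparing the true labels with themselves always yields an accuracy of 1.0.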
559
+ {
560
+ "cell_type": "markdown",
561
+ "metadata": {
562
+ "id": "Zm3mF58zYYze"
563
+ },
564
+ "source": [
565
+ "#### Saving the model"
566
+ ]
567
+ },
568
+ {
569
+ "cell_type": "code",
570
+ "execution_count": 59,
571
+ "metadata": {
572
+ "id": "okD5we1NwhQW"
573
+ },
574
+ "outputs": [],
575
+ "source": [
576
+ "trainer.save_model('ft_model')\n",
577
+ "trainer.save_model('/content/drive/MyDrive/MLDS-2/MODULO III/Talleres/Modelo Entrenado')\n"
578
+ ]
579
+ },
580
+ {
581
+ "cell_type": "code",
582
+ "source": [
583
+ "!pip install transformers"
584
+ ],
585
+ "metadata": {
586
+ "colab": {
587
+ "base_uri": "https://localhost:8080/"
588
+ },
589
+ "id": "iuipIt7zN8Ct",
590
+ "outputId": "f078f829-8827-4e20-daed-9baf1e007394"
591
+ },
592
+ "execution_count": 60,
593
+ "outputs": [
594
+ {
595
+ "output_type": "stream",
596
+ "name": "stdout",
597
+ "text": [
598
+ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
599
+ "Requirement already satisfied: transformers in /usr/local/lib/python3.8/dist-packages (4.25.1)\n",
600
+ "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.8/dist-packages (from transformers) (6.0)\n",
601
+ "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.8/dist-packages (from transformers) (1.21.6)\n",
602
+ "Requirement already satisfied: huggingface-hub<1.0,>=0.10.0 in /usr/local/lib/python3.8/dist-packages (from transformers) (0.11.1)\n",
603
+ "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.8/dist-packages (from transformers) (2022.6.2)\n",
604
+ "Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.8/dist-packages (from transformers) (4.64.1)\n",
605
+ "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.8/dist-packages (from transformers) (0.13.2)\n",
606
+ "Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from transformers) (3.8.2)\n",
607
+ "Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from transformers) (2.23.0)\n",
608
+ "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.8/dist-packages (from transformers) (21.3)\n",
609
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.8/dist-packages (from huggingface-hub<1.0,>=0.10.0->transformers) (4.4.0)\n",
610
+ "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.8/dist-packages (from packaging>=20.0->transformers) (3.0.9)\n",
611
+ "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (1.24.3)\n",
612
+ "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (2.10)\n",
613
+ "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (3.0.4)\n",
614
+ "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->transformers) (2022.12.7)\n"
615
+ ]
616
+ }
617
+ ]
618
+ },
619
+ {
620
+ "cell_type": "code",
621
+ "source": [
622
+ "!pip install huggingface_hub"
623
+ ],
624
+ "metadata": {
625
+ "colab": {
626
+ "base_uri": "https://localhost:8080/"
627
+ },
628
+ "id": "Oo5x6eVZN7_9",
629
+ "outputId": "21c914ea-8540-4192-e11d-f77d3774fef1"
630
+ },
631
+ "execution_count": 61,
632
+ "outputs": [
633
+ {
634
+ "output_type": "stream",
635
+ "name": "stdout",
636
+ "text": [
637
+ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
638
+ "Requirement already satisfied: huggingface_hub in /usr/local/lib/python3.8/dist-packages (0.11.1)\n",
639
+ "Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (21.3)\n",
640
+ "Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (3.8.2)\n",
641
+ "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (6.0)\n",
642
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (4.4.0)\n",
643
+ "Requirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (4.64.1)\n",
644
+ "Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (2.23.0)\n",
645
+ "Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.8/dist-packages (from packaging>=20.9->huggingface_hub) (3.0.9)\n",
646
+ "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (2.10)\n",
647
+ "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (3.0.4)\n",
648
+ "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (1.24.3)\n",
649
+ "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (2022.12.7)\n"
650
+ ]
651
+ }
652
+ ]
653
+ },
654
+ {
655
+ "cell_type": "code",
656
+ "source": [
657
+ "import torch\n",
658
+ "from transformers import BertTokenizer, BertForSequenceClassification, TFDistilBertForSequenceClassification"
659
+ ],
660
+ "metadata": {
661
+ "id": "XXn00BW7N79l"
662
+ },
663
+ "execution_count": 62,
664
+ "outputs": []
665
+ },
666
+ {
667
+ "cell_type": "code",
668
+ "source": [
669
+ "model2 = TFDistilBertForSequenceClassification.from_pretrained('/content/drive/MyDrive/MLDS-2/MODULO III/Talleres/Modelo Entrenado/')\n"
670
+ ],
671
+ "metadata": {
672
+ "colab": {
673
+ "base_uri": "https://localhost:8080/"
674
+ },
675
+ "id": "nnajL6gxN7zN",
676
+ "outputId": "a5477437-8a60-44f3-fa99-c83de5010cc6"
677
+ },
678
+ "execution_count": 63,
679
+ "outputs": [
680
+ {
681
+ "output_type": "stream",
682
+ "name": "stderr",
683
+ "text": [
684
+ "Some layers from the model checkpoint at /content/drive/MyDrive/MLDS-2/MODULO III/Talleres/Modelo Entrenado/ were not used when initializing TFDistilBertForSequenceClassification: ['dropout_39']\n",
685
+ "- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
686
+ "- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
687
+ "Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at /content/drive/MyDrive/MLDS-2/MODULO III/Talleres/Modelo Entrenado/ and are newly initialized: ['dropout_59']\n",
688
+ "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from huggingface_hub import notebook_login\n",
+ "\n",
+ "notebook_login()\n"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 359,
+ "referenced_widgets": [
+ "eae418e551f44d90b53b67ab19f2681a",
+ "d339a1c63e24406faf2c231028ab0d7f",
+ "e4665b549c0f48698529d99f475e07ca",
+ "92b55d980482446da8d2aed8b58593ff",
+ "2dfe4e10862342918c0c151f3f719fef",
+ "5e73bccd9598457b856dd44741261ed3",
+ "cdc2b5d89c814b6ca8c2e7a30f17b1e2",
+ "d76091dbf8334024b4edf2ec9e5bf32d",
+ "f43c0faafc754458bce32f3eee55a153",
+ "ff5b269f82684bc9ab81c3caae7299ca",
+ "902f6a0765a0469ebd1994f789ad80e2",
+ "c14909df983c43aeb3f62b670c376b60",
+ "05e9de72aac143918ec0e31263075982",
+ "b92abd39e834458d95ed024468835ff4",
+ "46133dced4ec4122bbca398b70f4aadb",
+ "a026cfb4f04a4e5f95fc6d6dcf53b9e1",
+ "40af31601c0b4396bdf2da2d81ba1f1b"
+ ]
+ },
+ "id": "smWmyyktyYKr",
+ "outputId": "7ae0dee8-eafb-4134-962e-f3ed0187f2c9"
+ },
+ "execution_count": 65,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Token is valid.\n",
+ "Your token has been saved in your configured git credential helpers (store).\n",
+ "Your token has been saved to /root/.huggingface/token\n",
+ "Login successful\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "model2.push_to_hub(\"Dfbenavidesr/distilbert-base-uncased-finetuned_clf-spam\")"
+ ],
+ "metadata": {
+ "id": "1HTnPWjUTDCt"
+ },
+ "execution_count": 66,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "model2 = TFDistilBertForSequenceClassification.from_pretrained(\"Dfbenavidesr/distilbert-base-uncased-finetuned_clf-spam\")"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "YNboOPrSTwe1",
+ "outputId": "7382c934-1760-41d8-dd80-3513fd37168c"
+ },
+ "execution_count": 70,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stderr",
+ "text": [
+ "Some layers from the model checkpoint at Dfbenavidesr/distilbert-base-uncased-finetuned_clf-spam were not used when initializing TFDistilBertForSequenceClassification: ['dropout_59']\n",
+ "- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
+ "- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
+ "Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at Dfbenavidesr/distilbert-base-uncased-finetuned_clf-spam and are newly initialized: ['dropout_99']\n",
+ "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "\n"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "5THCF3RsgirM",
+ "outputId": "ddbc0b75-c437-4d97-c623-a62d421d865f"
+ },
+ "execution_count": 71,
+ "outputs": [
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "<transformers.models.distilbert.modeling_tf_distilbert.TFDistilBertForSequenceClassification at 0x7fd8b2fec8e0>"
+ ]
+ },
+ "metadata": {},
+ "execution_count": 71
+ }
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "machine_shape": "hm",
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python"
+ },
+ "widgets": {
+ "application/vnd.jupyter.widget-state+json": {
+ "eae418e551f44d90b53b67ab19f2681a": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "VBoxModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "VBoxModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "VBoxView",
+ "box_style": "",
+ "children": [
+ "IPY_MODEL_d339a1c63e24406faf2c231028ab0d7f",
+ "IPY_MODEL_e4665b549c0f48698529d99f475e07ca",
+ "IPY_MODEL_92b55d980482446da8d2aed8b58593ff",
+ "IPY_MODEL_2dfe4e10862342918c0c151f3f719fef",
+ "IPY_MODEL_5e73bccd9598457b856dd44741261ed3"
+ ],
+ "layout": "IPY_MODEL_cdc2b5d89c814b6ca8c2e7a30f17b1e2"
+ }
+ },
+ "d339a1c63e24406faf2c231028ab0d7f": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "HTMLModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_d76091dbf8334024b4edf2ec9e5bf32d",
+ "placeholder": "​",
+ "style": "IPY_MODEL_f43c0faafc754458bce32f3eee55a153",
+ "value": "<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.svg\nalt='Hugging Face'> <br> Copy a token from <a\nhref=\"https://huggingface.co/settings/tokens\" target=\"_blank\">your Hugging Face\ntokens page</a> and paste it below. <br> Immediately click login after copying\nyour token or it might be stored in plain text in this notebook file. </center>"
+ }
+ },
+ "e4665b549c0f48698529d99f475e07ca": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "PasswordModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "PasswordModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "PasswordView",
+ "continuous_update": true,
+ "description": "Token:",
+ "description_tooltip": null,
+ "disabled": false,
+ "layout": "IPY_MODEL_ff5b269f82684bc9ab81c3caae7299ca",
+ "placeholder": "​",
+ "style": "IPY_MODEL_902f6a0765a0469ebd1994f789ad80e2",
+ "value": ""
+ }
+ },
+ "92b55d980482446da8d2aed8b58593ff": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "CheckboxModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "CheckboxModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "CheckboxView",
+ "description": "Add token as git credential?",
+ "description_tooltip": null,
+ "disabled": false,
+ "indent": true,
+ "layout": "IPY_MODEL_c14909df983c43aeb3f62b670c376b60",
+ "style": "IPY_MODEL_05e9de72aac143918ec0e31263075982",
+ "value": true
+ }
+ },
+ "2dfe4e10862342918c0c151f3f719fef": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "ButtonModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "ButtonModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "ButtonView",
+ "button_style": "",
+ "description": "Login",
+ "disabled": false,
+ "icon": "",
+ "layout": "IPY_MODEL_b92abd39e834458d95ed024468835ff4",
+ "style": "IPY_MODEL_46133dced4ec4122bbca398b70f4aadb",
+ "tooltip": ""
+ }
+ },
+ "5e73bccd9598457b856dd44741261ed3": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "HTMLModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_a026cfb4f04a4e5f95fc6d6dcf53b9e1",
+ "placeholder": "​",
+ "style": "IPY_MODEL_40af31601c0b4396bdf2da2d81ba1f1b",
+ "value": "\n<b>Pro Tip:</b> If you don't already have one, you can create a dedicated\n'notebooks' token with 'write' access, that you can then easily reuse for all\nnotebooks. </center>"
+ }
+ },
+ "cdc2b5d89c814b6ca8c2e7a30f17b1e2": {
+ "model_module": "@jupyter-widgets/base",
+ "model_name": "LayoutModel",
+ "model_module_version": "1.2.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": "center",
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": "flex",
+ "flex": null,
+ "flex_flow": "column",
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": "50%"
+ }
+ },
+ "d76091dbf8334024b4edf2ec9e5bf32d": {
+ "model_module": "@jupyter-widgets/base",
+ "model_name": "LayoutModel",
+ "model_module_version": "1.2.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "f43c0faafc754458bce32f3eee55a153": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "DescriptionStyleModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "ff5b269f82684bc9ab81c3caae7299ca": {
+ "model_module": "@jupyter-widgets/base",
+ "model_name": "LayoutModel",
+ "model_module_version": "1.2.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "902f6a0765a0469ebd1994f789ad80e2": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "DescriptionStyleModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "c14909df983c43aeb3f62b670c376b60": {
+ "model_module": "@jupyter-widgets/base",
+ "model_name": "LayoutModel",
+ "model_module_version": "1.2.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "05e9de72aac143918ec0e31263075982": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "DescriptionStyleModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "b92abd39e834458d95ed024468835ff4": {
+ "model_module": "@jupyter-widgets/base",
+ "model_name": "LayoutModel",
+ "model_module_version": "1.2.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "46133dced4ec4122bbca398b70f4aadb": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "ButtonStyleModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "ButtonStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "button_color": null,
+ "font_weight": ""
+ }
+ },
+ "a026cfb4f04a4e5f95fc6d6dcf53b9e1": {
+ "model_module": "@jupyter-widgets/base",
+ "model_name": "LayoutModel",
+ "model_module_version": "1.2.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "40af31601c0b4396bdf2da2d81ba1f1b": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_name": "DescriptionStyleModel",
+ "model_module_version": "1.5.0",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ }
+ }
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+ }