{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "albert_glue_fine_tuning_tutorial", "provenance": [], "collapsed_sections": [], "toc_visible": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "accelerator": "TPU" }, "cells": [ { "cell_type": "code", "metadata": { "id": "wHQH4OCHZ9bq", "colab_type": "code", "cellView": "form", "colab": {} }, "source": [ "# @title Copyright 2020 The ALBERT Authors. All Rights Reserved.\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# http://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License.\n", "# ==============================================================================" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "rkTLZ3I4_7c_", "colab_type": "text" }, "source": [ "# ALBERT End to End (Fine-tuning + Predicting) with Cloud TPU" ] }, { "cell_type": "markdown", "metadata": { "id": "1wtjs1QDb3DX", "colab_type": "text" }, "source": [ "## Overview\n", "\n", "ALBERT is \"A Lite\" version of BERT, a popular unsupervised language representation learning algorithm. 
ALBERT uses parameter-reduction techniques that allow for large-scale configurations, overcome previous memory limitations, and achieve better behavior with respect to model degradation.\n", "\n", "For a technical description of the algorithm, see our paper:\n", "\n", "https://arxiv.org/abs/1909.11942\n", "\n", "Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut\n", "\n", "This Colab demonstrates using a free Colab Cloud TPU to fine-tune pretrained ALBERT models on GLUE tasks and \n", "run predictions with the tuned model. The Colab demonstrates loading pretrained ALBERT models from both [TF Hub](https://www.tensorflow.org/hub) and checkpoints.\n", "\n", "**Note:** You will need a GCP (Google Cloud Platform) account and a GCS (Google Cloud \n", "Storage) bucket for this Colab to run.\n", "\n", "Please follow the [Google Cloud TPU quickstart](https://cloud.google.com/tpu/docs/quickstart) to learn how to create a GCP account and a GCS bucket. You have [$300 free credit](https://cloud.google.com/free/) to get started with any GCP product. You can learn more about Cloud TPU at https://cloud.google.com/tpu/docs.\n", "\n", "This notebook is hosted on GitHub. To view it in its original repository, after opening the notebook, select **File > View on GitHub**." ] }, { "cell_type": "markdown", "metadata": { "id": "Ld-JXlueIuPH", "colab_type": "text" }, "source": [ "## Instructions" ] }, { "cell_type": "markdown", "metadata": { "id": "POkof5uHaQ_c", "colab_type": "text" }, "source": [ "
### Train on TPU
\n", "\n", " 1. Create a Cloud Storage bucket for your model checkpoints and evaluation results at http://console.cloud.google.com/storage and fill in the BUCKET parameter in the \"Prepare for training\" section below.\n", " \n", " 1. On the main menu, click Runtime and select **Change runtime type**. Set \"TPU\" as the hardware accelerator.\n", " 1. Click Runtime again and select **Runtime > Run All** (watch out: the authentication step in the \"Set up your TPU environment\" cell requires user input). You can also run the cells manually with Shift+Enter." ] }, { "cell_type": "markdown", "metadata": { "id": "UdMmwCJFaT8F", "colab_type": "text" }, "source": [ "### Set up your TPU environment\n", "\n", "In this section, you perform the following tasks:\n", "\n", "* Set up a Colab TPU running environment.\n", "* Verify that you are connected to a TPU device.\n", "* Upload your credentials to the TPU so that it can access your GCS bucket." ] }, { "cell_type": "code", "metadata": { "id": "191zq3ZErihP", "colab_type": "code", "colab": {} }, "source": [ "# TODO(lanzhzh): Add support for 2.x.\n", "%tensorflow_version 1.x\n", "import os\n", "import pprint\n", "import json\n", "import tensorflow as tf\n", "\n", "# Colab sets COLAB_TPU_ADDR only when a TPU runtime is attached.\n", "assert \"COLAB_TPU_ADDR\" in os.environ, \"ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!\"\n", "TPU_ADDRESS = \"grpc://\" + os.environ[\"COLAB_TPU_ADDR\"]\n", "TPU_TOPOLOGY = \"2x2\"\n", "print(\"TPU address is\", TPU_ADDRESS)\n", "\n", "# Authenticate this Colab session; this step requires user input.\n", "from google.colab import auth\n", "auth.authenticate_user()\n", "with tf.Session(TPU_ADDRESS) as session:\n", "  print('TPU devices:')\n", "  pprint.pprint(session.list_devices())\n", "\n", "  # Upload credentials to TPU.\n", "  with open('/content/adc.json', 'r') as f:\n", "    auth_info = json.load(f)\n", "  tf.contrib.cloud.configure_gcs(session, credentials=auth_info)\n", "  # Now credentials are set for all future sessions on this TPU." 
], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "HUBP35oCDmbF", "colab_type": "text" }, "source": [ "### Prepare and import ALBERT modules\n", "\n", "With your environment configured, you can now prepare and import the ALBERT modules. The following step clones the source code from GitHub and installs the sentencepiece package." ] }, { "cell_type": "code", "metadata": { "id": "7wzwke0sxS6W", "colab_type": "code", "colab": {}, "cellView": "code" }, "source": [ "#TODO(lanzhzh): Add pip support\n", "import sys\n", "\n", "# Clone the ALBERT repository only if it is not already present.\n", "!test -d albert || git clone https://github.com/google-research/albert albert\n", "if 'albert' not in sys.path:\n", "  sys.path += ['albert']\n", "\n", "!pip install sentencepiece\n" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "RRu1aKO1D7-Z", "colab_type": "text" }, "source": [ "### Prepare for training\n", "\n", "This next section of code performs the following tasks:\n", "\n", "* Specify the GCS bucket and create the output directory for model checkpoints and eval results.\n", "* Specify the task and download the training data.\n", "* Specify the pretrained ALBERT model.\n" ] }, { "cell_type": "code", "metadata": { "id": "tYkaAlJNfhul", "colab_type": "code", "colab": {}, "cellView": "form" }, "source": [ "# The full list of tasks and their fine-tuning hyperparameters is at\n", "# https://github.com/google-research/albert/blob/master/run_glue.sh\n", "\n", "BUCKET = \"albert_tutorial_glue\" #@param { type: \"string\" }\n", "TASK = 'MRPC' #@param {type:\"string\"}\n", "# Available pretrained model checkpoints:\n", "# base, large, xlarge, xxlarge\n", "ALBERT_MODEL = 'base' #@param {type:\"string\"}\n", "\n", "TASK_DATA_DIR = 'glue_data'\n", "\n", "BASE_DIR = \"gs://\" + BUCKET\n", "if not BASE_DIR or BASE_DIR == \"gs://\":\n", "  raise ValueError(\"You must enter a BUCKET.\")\n", "DATA_DIR = os.path.join(BASE_DIR, \"data\")\n", "MODELS_DIR = os.path.join(BASE_DIR, \"models\")\n", "OUTPUT_DIR = 'gs://{}/albert-tfhub/models/{}'.format(BUCKET, TASK)\n", "tf.gfile.MakeDirs(OUTPUT_DIR)\n", "print('***** Model output directory: {} *****'.format(OUTPUT_DIR))\n", "\n", "# Download GLUE data.\n", "! test -d download_glue_repo || git clone https://gist.github.com/60c2bdb54d156a41194446737ce03e2e.git download_glue_repo\n", "!python download_glue_repo/download_glue_data.py --data_dir=$TASK_DATA_DIR --tasks=$TASK\n", "print('***** Task data directory: {} *****'.format(TASK_DATA_DIR))\n", "\n", "ALBERT_MODEL_HUB = 'https://tfhub.dev/google/albert_' + ALBERT_MODEL + '/3'" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "Hcpfl4N2EdOk", "colab_type": "text" }, "source": [ "Now let's run the fine-tuning script. If you use the default MRPC task, this should finish in around 10 minutes and you will get an accuracy of around 86.5%." ] }, { "cell_type": "code", "metadata": { "id": "o8qXPxv8-kBO", "colab_type": "code", "colab": {} }, "source": [ "# Cache the TF Hub module in the GCS output directory.\n", "os.environ['TFHUB_CACHE_DIR'] = OUTPUT_DIR\n", "!python -m albert.run_classifier \\\n", "  --data_dir=\"glue_data/\" \\\n", "  --output_dir=$OUTPUT_DIR \\\n", "  --albert_hub_module_handle=$ALBERT_MODEL_HUB \\\n", "  --spm_model_file=\"from_tf_hub\" \\\n", "  --do_train=True \\\n", "  --do_eval=True \\\n", "  --do_predict=False \\\n", "  --max_seq_length=512 \\\n", "  --optimizer=adamw \\\n", "  --task_name=$TASK \\\n", "  --warmup_step=200 \\\n", "  --learning_rate=2e-5 \\\n", "  --train_step=800 \\\n", "  --save_checkpoints_steps=100 \\\n", "  --train_batch_size=32 \\\n", "  --tpu_name=$TPU_ADDRESS \\\n", "  --use_tpu=True" ], "execution_count": 0, "outputs": [] } ] }