{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Q-bj6K7Qv4ft" }, "source": [ "# Fine-Tuning a Generative Pretrained Transformer (`GPT`)\n", "\n", "1. Install required libraries." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "SBWCrz5GfBXo", "outputId": "1e1ab31b-3b54-4c39-a872-fb39e4c5c8a4" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (4.30.2)\n", "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers) (3.12.0)\n", "Requirement already satisfied: huggingface-hub<1.0,>=0.14.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.15.1)\n", "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (1.22.4)\n", "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (23.1)\n", "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0)\n", "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2022.10.31)\n", "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.27.1)\n", "Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.13.3)\n", "Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.3.1)\n", "Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers) (4.65.0)\n", "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.14.1->transformers) (2023.4.0)\n", "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.14.1->transformers) (4.5.0)\n", "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (1.26.15)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2022.12.7)\n", "Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2.0.12)\n", "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.4)\n", "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", "Requirement already satisfied: codecarbon in /usr/local/lib/python3.10/dist-packages (2.2.3)\n", "Requirement already satisfied: arrow in /usr/local/lib/python3.10/dist-packages (from codecarbon) (1.2.3)\n", "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from codecarbon) (1.5.3)\n", "Requirement already satisfied: pynvml in /usr/local/lib/python3.10/dist-packages (from codecarbon) (11.5.0)\n", "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from codecarbon) (2.27.1)\n", "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from codecarbon) (5.9.5)\n", "Requirement already satisfied: py-cpuinfo in /usr/local/lib/python3.10/dist-packages (from codecarbon) (9.0.0)\n", "Requirement already satisfied: fuzzywuzzy in /usr/local/lib/python3.10/dist-packages (from codecarbon) (0.18.0)\n", "Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from codecarbon) (8.1.3)\n", "Requirement already satisfied: python-dateutil>=2.7.0 in /usr/local/lib/python3.10/dist-packages (from arrow->codecarbon) (2.8.2)\n", "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->codecarbon) (2022.7.1)\n", "Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas->codecarbon) (1.22.4)\n", "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->codecarbon) (1.26.15)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->codecarbon) (2022.12.7)\n", "Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests->codecarbon) (2.0.12)\n", "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->codecarbon) (3.4)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7.0->arrow->codecarbon) (1.16.0)\n" ] } ], "source": [ "!pip install transformers\n", "!pip install codecarbon" ] }, { "cell_type": "markdown", "metadata": { "id": "y5XnfvSH7w4z" }, "source": [ "2. Load the data from the hub." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 467 }, "id": "7MbpXGu-v4f1", "outputId": "3d8858dc-6ea1-4871-cb04-44dcecce51a6" }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " | prompt | \n", "completion | \n", "
---|---|---|
0 | \n", "Which is a species of fish? Tope or Rope | \n", "Tope | \n", "
1 | \n", "Why can camels survive for long without water? | \n", "Camels use the fat in their humps to keep them... | \n", "
2 | \n", "Alice's parents have three daughters: Amy, Jes... | \n", "The name of the third daughter is Alice | \n", "
3 | \n", "Who gave the UN the land in NY to build their HQ | \n", "John D Rockerfeller | \n", "
4 | \n", "Why mobile is bad for human | \n", "We are always engaged one phone which is not g... | \n", "
... | \n", "... | \n", "... | \n", "
53129 | \n", "How do computers communicate and network with ... | \n", "Computers communicate and network with each ot... | \n", "
53130 | \n", "How are websites different from web applications? | \n", "Websites and web applications are similar in t... | \n", "
53131 | \n", "What is open-source software and its benefits? | \n", "Open-source software is software that is made ... | \n", "
53132 | \n", "What is a cookie and how is it used in web bro... | \n", "A cookie is a small piece of data that a websi... | \n", "
53133 | \n", "What is cloud storage and its advantages for d... | \n", "Cloud storage is a service that allows you to ... | \n", "
53134 rows × 2 columns
\n", "