{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "xf3pUNyVO3WS" }, "source": [ "# Check GPU's Memory Capacity\n", "\n", "By running `nvidia-smi` command, you can find out the GPU's memory capacity on the current system. \n", "\n", "With the standard GPU instance(___T4___) which is free, you can run 7B and 13B models. With the premium GPU instance(___A100 40GB___) which is paid with the compute unit that you own, you can even run 30B model! Choose the instance at the menu `Runtime` -> `Change runtime type` -> `Hardware accelerator (GPU)` -> `GPU class (Standard or Premium)`" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "L2MoM27rfaKK", "outputId": "53175950-3269-4296-9425-3652c81ce9b7" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wed Mar 22 12:11:41 2023 \n", "+-----------------------------------------------------------------------------+\n", "| NVIDIA-SMI 525.85.12 Driver Version: 525.85.12 CUDA Version: 12.0 |\n", "|-------------------------------+----------------------+----------------------+\n", "| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", "| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", "| | | MIG M. |\n", "|===============================+======================+======================|\n", "| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |\n", "| N/A 41C P0 24W / 70W | 0MiB / 15360MiB | 0% Default |\n", "| | | N/A |\n", "+-------------------------------+----------------------+----------------------+\n", " \n", "+-----------------------------------------------------------------------------+\n", "| Processes: |\n", "| GPU GI CI PID Type Process name GPU Memory |\n", "| ID ID Usage |\n", "|=============================================================================|\n", "| No running processes found |\n", "+-----------------------------------------------------------------------------+\n" ] } ], "source": [ "!nvidia-smi" ] }, { "cell_type": "markdown", "metadata": { "id": "N0MDD9TuPTfJ" }, "source": [ "# Clone the repository" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "a_i5DKBNnzAK" }, "outputs": [], "source": [ "!git clone https://github.com/deep-diver/LLM-As-Chatbot.git" ] }, { "cell_type": "markdown", "metadata": { "id": "HUuzxWGuPYLq" }, "source": [ "# Move into the directory of the cloned repository" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "wR-M8u7gsQqg", "outputId": "eb7b24ba-10e4-46d5-cf8f-852d9fac8170" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/content/Alpaca-LoRA-Serve\n" ] } ], "source": [ "%cd LLM-As-Chatbot" ] }, { "cell_type": "markdown", "metadata": { "id": "XG8oy7BBPdMh" }, "source": [ "# Install dependencies" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "moN-15x_ifHE", "outputId": "a7ec61ff-28cb-4ac4-a0ca-6a5cba060579" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Building wheel for transformers (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for transformers: filename=transformers-4.28.0.dev0-py3-none-any.whl size=6758864 sha256=028619344608e01338ac944ad0d4e6496fe5c743c90a15dd20c2e436e56106a9\n", " Stored in directory: /tmp/pip-ephem-wheel-cache-vqcgstta/wheels/f7/92/8c/752ff3bfcd3439805d8bbf641614da38ef3226e127ebea86ee\n", " Building wheel for peft (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for peft: filename=peft-0.3.0.dev0-py3-none-any.whl size=40669 sha256=bb0afa4164ac44e0a604c781f61767ea3e7255b85b70e2d4cf76a4252119ac27\n", " Stored in directory: /tmp/pip-ephem-wheel-cache-vqcgstta/wheels/2d/60/1b/0edd9dc0f0c489738b1166bc1b0b560ee368f7721f89d06e3a\n", " Building wheel for ffmpy (setup.py) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for ffmpy: filename=ffmpy-0.3.0-py3-none-any.whl size=4707 sha256=5f7dae7c29ab50f6251f5c864c70d4e485a4338a98c5cc1ee51523ace2758bf1\n", " Stored in directory: /root/.cache/pip/wheels/91/e2/96/f676aa08bfd789328c6576cd0f1fde4a3d686703bb0c247697\n", "Successfully built transformers peft ffmpy\n", "Installing collected packages: tokenizers, sentencepiece, rfc3986, pydub, ffmpy, bitsandbytes, xxhash, websockets, uc-micro-py, python-multipart, pycryptodome, orjson, multidict, mdurl, loralib, h11, frozenlist, dill, async-timeout, aiofiles, yarl, uvicorn, starlette, responses, multiprocess, markdown-it-py, linkify-it-py, huggingface-hub, httpcore, aiosignal, accelerate, transformers, mdit-py-plugins, httpx, fastapi, aiohttp, peft, gradio, datasets\n", "Successfully installed accelerate-0.17.1 aiofiles-23.1.0 aiohttp-3.8.4 aiosignal-1.3.1 async-timeout-4.0.2 bitsandbytes-0.37.2 datasets-2.10.1 dill-0.3.6 fastapi-0.95.0 ffmpy-0.3.0 frozenlist-1.3.3 gradio-3.20.0 h11-0.14.0 httpcore-0.16.3 httpx-0.23.3 huggingface-hub-0.13.3 linkify-it-py-2.0.0 loralib-0.1.1 markdown-it-py-2.2.0 mdit-py-plugins-0.3.3 mdurl-0.1.2 multidict-6.0.4 multiprocess-0.70.14 orjson-3.8.8 peft-0.3.0.dev0 pycryptodome-3.17 pydub-0.25.1 python-multipart-0.0.6 responses-0.18.0 rfc3986-1.5.0 sentencepiece-0.1.97 starlette-0.26.1 tokenizers-0.13.2 transformers-4.28.0.dev0 uc-micro-py-1.0.1 uvicorn-0.21.1 websockets-10.4 xxhash-3.2.0 yarl-1.8.2\n" ] } ], "source": [ "!pip install -r requirements.txt" ] }, { "cell_type": "markdown", "metadata": { "id": "Cr3bQkSePfrG" }, "source": [ "# Run the application" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "id": "4Wg0eqnkPnq-" }, "outputs": [], "source": [ "#@title Choose models\n", "\n", "base_model = 'decapoda-research/llama-13b-hf' #@param [\"decapoda-research/llama-7b-hf\", \"decapoda-research/llama-13b-hf\", \"decapoda-research/llama-30b-hf\"]\n", "finetuned_model = 'chansung/alpaca-lora-13b' #@param [\"tloen/alpaca-lora-7b\", \"chansung/alpaca-lora-13b\", \"chansung/koalpaca-lora-13b\", \"chansung/alpaca-lora-30b\"]\n" ] }, { "cell_type": "markdown", "metadata": { "id": "b81jhdtcQyOP" }, "source": [ "## Run the application\n", "\n", "It will take some time since LLaMA weights are huge. \n", "\n", "Click the URL appeared in the `Running on public URL:` field from the log. That will bring you to a new browser tab which opens up the running application." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "y3qpzBw2jMHq" }, "outputs": [], "source": [ "!python app.py --base-url $base_model --ft-ckpt-url $finetuned_model --share" ] } ], "metadata": { "accelerator": "GPU", "colab": { "machine_shape": "hm", "provenance": [] }, "gpuClass": "premium", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 4 }