Spaces:
Runtime error
Runtime error
File size: 8,414 Bytes
4df8249 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 |
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "xf3pUNyVO3WS"
},
"source": [
"# Check GPU's Memory Capacity\n",
"\n",
"By running `nvidia-smi` command, you can find out the GPU's memory capacity on the current system. \n",
"\n",
"With the standard GPU instance(___T4___) which is free, you can run 7B and 13B models. With the premium GPU instance(___A100 40GB___) which is paid with the compute unit that you own, you can even run 30B model! Choose the instance at the menu `Runtime` -> `Change runtime type` -> `Hardware accelerator (GPU)` -> `GPU class (Standard or Premium)`"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "L2MoM27rfaKK",
"outputId": "53175950-3269-4296-9425-3652c81ce9b7"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Wed Mar 22 12:11:41 2023 \n",
"+-----------------------------------------------------------------------------+\n",
"| NVIDIA-SMI 525.85.12 Driver Version: 525.85.12 CUDA Version: 12.0 |\n",
"|-------------------------------+----------------------+----------------------+\n",
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n",
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n",
"| | | MIG M. |\n",
"|===============================+======================+======================|\n",
"| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |\n",
"| N/A 41C P0 24W / 70W | 0MiB / 15360MiB | 0% Default |\n",
"| | | N/A |\n",
"+-------------------------------+----------------------+----------------------+\n",
" \n",
"+-----------------------------------------------------------------------------+\n",
"| Processes: |\n",
"| GPU GI CI PID Type Process name GPU Memory |\n",
"| ID ID Usage |\n",
"|=============================================================================|\n",
"| No running processes found |\n",
"+-----------------------------------------------------------------------------+\n"
]
}
],
"source": [
"!nvidia-smi"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "N0MDD9TuPTfJ"
},
"source": [
"# Clone the repository"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "a_i5DKBNnzAK"
},
"outputs": [],
"source": [
"!git clone https://github.com/deep-diver/LLM-As-Chatbot.git"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HUuzxWGuPYLq"
},
"source": [
"# Move into the directory of the cloned repository"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "wR-M8u7gsQqg",
"outputId": "eb7b24ba-10e4-46d5-cf8f-852d9fac8170"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/content/Alpaca-LoRA-Serve\n"
]
}
],
"source": [
"%cd LLM-As-Chatbot"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XG8oy7BBPdMh"
},
"source": [
"# Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "moN-15x_ifHE",
"outputId": "a7ec61ff-28cb-4ac4-a0ca-6a5cba060579"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Building wheel for transformers (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
" Created wheel for transformers: filename=transformers-4.28.0.dev0-py3-none-any.whl size=6758864 sha256=028619344608e01338ac944ad0d4e6496fe5c743c90a15dd20c2e436e56106a9\n",
" Stored in directory: /tmp/pip-ephem-wheel-cache-vqcgstta/wheels/f7/92/8c/752ff3bfcd3439805d8bbf641614da38ef3226e127ebea86ee\n",
" Building wheel for peft (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
" Created wheel for peft: filename=peft-0.3.0.dev0-py3-none-any.whl size=40669 sha256=bb0afa4164ac44e0a604c781f61767ea3e7255b85b70e2d4cf76a4252119ac27\n",
" Stored in directory: /tmp/pip-ephem-wheel-cache-vqcgstta/wheels/2d/60/1b/0edd9dc0f0c489738b1166bc1b0b560ee368f7721f89d06e3a\n",
" Building wheel for ffmpy (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Created wheel for ffmpy: filename=ffmpy-0.3.0-py3-none-any.whl size=4707 sha256=5f7dae7c29ab50f6251f5c864c70d4e485a4338a98c5cc1ee51523ace2758bf1\n",
" Stored in directory: /root/.cache/pip/wheels/91/e2/96/f676aa08bfd789328c6576cd0f1fde4a3d686703bb0c247697\n",
"Successfully built transformers peft ffmpy\n",
"Installing collected packages: tokenizers, sentencepiece, rfc3986, pydub, ffmpy, bitsandbytes, xxhash, websockets, uc-micro-py, python-multipart, pycryptodome, orjson, multidict, mdurl, loralib, h11, frozenlist, dill, async-timeout, aiofiles, yarl, uvicorn, starlette, responses, multiprocess, markdown-it-py, linkify-it-py, huggingface-hub, httpcore, aiosignal, accelerate, transformers, mdit-py-plugins, httpx, fastapi, aiohttp, peft, gradio, datasets\n",
"Successfully installed accelerate-0.17.1 aiofiles-23.1.0 aiohttp-3.8.4 aiosignal-1.3.1 async-timeout-4.0.2 bitsandbytes-0.37.2 datasets-2.10.1 dill-0.3.6 fastapi-0.95.0 ffmpy-0.3.0 frozenlist-1.3.3 gradio-3.20.0 h11-0.14.0 httpcore-0.16.3 httpx-0.23.3 huggingface-hub-0.13.3 linkify-it-py-2.0.0 loralib-0.1.1 markdown-it-py-2.2.0 mdit-py-plugins-0.3.3 mdurl-0.1.2 multidict-6.0.4 multiprocess-0.70.14 orjson-3.8.8 peft-0.3.0.dev0 pycryptodome-3.17 pydub-0.25.1 python-multipart-0.0.6 responses-0.18.0 rfc3986-1.5.0 sentencepiece-0.1.97 starlette-0.26.1 tokenizers-0.13.2 transformers-4.28.0.dev0 uc-micro-py-1.0.1 uvicorn-0.21.1 websockets-10.4 xxhash-3.2.0 yarl-1.8.2\n"
]
}
],
"source": [
"!pip install -r requirements.txt"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cr3bQkSePfrG"
},
"source": [
"# Run the application"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"id": "4Wg0eqnkPnq-"
},
"outputs": [],
"source": [
"#@title Choose models\n",
"\n",
"base_model = 'decapoda-research/llama-13b-hf' #@param [\"decapoda-research/llama-7b-hf\", \"decapoda-research/llama-13b-hf\", \"decapoda-research/llama-30b-hf\"]\n",
"finetuned_model = 'chansung/alpaca-lora-13b' #@param [\"tloen/alpaca-lora-7b\", \"chansung/alpaca-lora-13b\", \"chansung/koalpaca-lora-13b\", \"chansung/alpaca-lora-30b\"]\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "b81jhdtcQyOP"
},
"source": [
"## Run the application\n",
"\n",
"It will take some time since LLaMA weights are huge. \n",
"\n",
"Click the URL appeared in the `Running on public URL:` field from the log. That will bring you to a new browser tab which opens up the running application."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "y3qpzBw2jMHq"
},
"outputs": [],
"source": [
"!python app.py --base-url $base_model --ft-ckpt-url $finetuned_model --share"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"machine_shape": "hm",
"provenance": []
},
"gpuClass": "premium",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
|