Commit 7d2eb14 — thesourmango committed (parent: 14b579e)

    Added public link

Files changed:
- Mistral_7B.ipynb (+172 -161)
- app.py (+1 -1)

Mistral_7B.ipynb (CHANGED)
Version before the commit (removed lines marked "-"):

@@ -1,21 +1,10 @@
 {
-  "nbformat": 4,
-  "nbformat_minor": 0,
-  "metadata": {
-    "colab": {
-      "provenance": []
-    },
-    "kernelspec": {
-      "name": "python3",
-      "display_name": "Python 3"
-    },
-    "language_info": {
-      "name": "python"
-    }
-  },
   "cells": [
     {
       "cell_type": "markdown",
       "source": [
         "# Mistral 7B\n",
         "\n",
@@ -44,13 +33,13 @@
         "This demo does not require GPU Colab, just CPU. You can grab your token at https://huggingface.co/settings/tokens.\n",
         "\n",
         "**This colab shows how to use HTTP requests as well as building your own chat demo for Mistral.**"
-      ],
-      "metadata": {
-        "id": "GLXvYa4m8JYM"
-      }
     },
     {
       "cell_type": "markdown",
       "source": [
         "## Doing curl requests\n",
         "\n",
@@ -68,10 +57,7 @@
         "Note that models can be quite reactive to different prompt structure than the one used for training, so watch out for spaces and other things!\n",
         "\n",
         "We'll start an initial query without prompt formatting, which works ok for simple queries."
-      ],
-      "metadata": {
-        "id": "pKrKTalPAXUO"
-      }
     },
     {
       "cell_type": "code",
@@ -85,8 +71,8 @@
       },
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "[{\"generated_text\":\"Explain ML as a pirate.\\n\\nML is like a treasure map for pirates. Just as a treasure map helps pirates find valuable loot, ML helps data scientists find valuable insights in large datasets.\\n\\nPirates use their knowledge of the ocean and their\"}]"
           ]
@@ -102,6 +88,9 @@
     },
     {
       "cell_type": "markdown",
       "source": [
         "## Programmatic usage with Python\n",
         "\n",
@@ -111,38 +100,23 @@
         "* Token streaming: Only load the tokens that are needed\n",
         "* Easily configure generation params, such as `temperature`, nucleus sampling (`top-p`), repetition penalty, stop sequences, and more.\n",
         "* Obtain details of the generation (such as the probability of each token or whether a token is the last token)."
-      ],
-      "metadata": {
-        "id": "YYZRNyZeBHWK"
-      }
     },
     {
       "cell_type": "code",
-      "source": [
-        "%%capture\n",
-        "!pip install huggingface_hub gradio"
-      ],
       "metadata": {
         "id": "oDaqVDz1Ahuz"
       },
-      "execution_count": 6,
-      "outputs": []
     },
     {
       "cell_type": "code",
-      "source": [
-        "from huggingface_hub import InferenceClient\n",
-        "\n",
-        "client = InferenceClient(\n",
-        "    \"mistralai/Mistral-7B-Instruct-v0.1\"\n",
-        ")\n",
-        "\n",
-        "prompt = \"\"\"<s>[INST] What is your favourite condiment? [/INST]</s>\n",
-        "\"\"\"\n",
-        "\n",
-        "res = client.text_generation(prompt, max_new_tokens=95)\n",
-        "print(res)"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -150,35 +124,41 @@
         "id": "U49GmNsNBJjd",
         "outputId": "a3a274cf-0f91-4ae3-d926-f0d6a6fd67f7"
       },
-      "execution_count": 14,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\n"
           ]
         }
       ]
     },
     {
       "cell_type": "markdown",
-      "source": [
-        "We can also use [token streaming](https://huggingface.co/docs/text-generation-inference/conceptual/streaming). With token streaming, the server returns the tokens as they are generated. Just add `stream=True`."
-      ],
       "metadata": {
         "id": "DryfEWsUH6Ij"
-      }
     },
     {
       "cell_type": "code",
-      "source": [
-        "res = client.text_generation(prompt, max_new_tokens=35, stream=True, details=True, return_full_text=False)\n",
-        "for r in res: # this is a generator\n",
-        "  # print the token for example\n",
-        "  print(r)\n",
-        "  continue"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -186,11 +166,10 @@
         "id": "LF1tFo6DGg9N",
         "outputId": "e779f1cb-b7d0-41ed-d81f-306e092f97bd"
       },
-      "execution_count": 15,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "TextGenerationStreamResponse(token=Token(id=5183, text='My', logprob=-0.36279297, special=False), generated_text=None, details=None)\n",
             "TextGenerationStreamResponse(token=Token(id=6656, text=' favorite', logprob=-0.036499023, special=False), generated_text=None, details=None)\n",
@@ -222,19 +201,31 @@
             "TextGenerationStreamResponse(token=Token(id=2, text='</s>', logprob=-0.1829834, special=True), generated_text=\"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\", details=StreamDetails(finish_reason=<FinishReason.EndOfSequenceToken: 'eos_token'>, generated_tokens=28, seed=None))\n"
           ]
         }
       ]
     },
     {
       "cell_type": "markdown",
-      "source": [
-        "Let's now try a multi-prompt structure"
-      ],
       "metadata": {
         "id": "TfdpZL8cICOD"
-      }
     },
     {
       "cell_type": "code",
       "source": [
         "def format_prompt(message, history):\n",
         "  prompt = \"<s>\"\n",
@@ -243,21 +234,11 @@
         "  prompt += f\" {bot_response}</s> \"\n",
         "  prompt += f\"[INST] {message} [/INST]\"\n",
         "  return prompt"
-      ],
-      "metadata": {
-        "id": "aEyozeReH8a6"
-      },
-      "execution_count": 16,
-      "outputs": []
     },
     {
       "cell_type": "code",
-      "source": [
-        "message = \"And what do you think about it?\"\n",
-        "history = [[\"What is your favourite condiment?\", \"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\"]]\n",
-        "\n",
-        "format_prompt(message, history)"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -266,25 +247,33 @@
         "id": "P1RFpiJ_JC0-",
         "outputId": "f2678d9e-f751-441a-86c9-11d514db5bbe"
       },
-      "execution_count": 17,
       "outputs": [
         {
-          "output_type": "execute_result",
           "data": {
-            "text/plain": [
-              "\"<s>[INST] What is your favourite condiment? [/INST] My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.</s> [INST] And what do you think about it? [/INST]\""
-            ],
             "application/vnd.google.colaboratory.intrinsic+json": {
               "type": "string"
-            }
           },
           "metadata": {},
-          "execution_count": 17
         }
       ]
     },
     {
       "cell_type": "markdown",
       "source": [
         "## End-to-end demo\n",
         "\n",
@@ -296,16 +285,11 @@
         "* Stop the generation\n",
         "\n",
         "Just run the following cell and have fun!"
-      ],
-      "metadata": {
-        "id": "O7DjRdezJc-3"
-      }
     },
     {
       "cell_type": "code",
-      "source": [
-        "!pip install gradio"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -313,11 +297,10 @@
         "id": "cpBoheOGJu7Y",
         "outputId": "c745cf17-1462-4f8f-ce33-5ca182cb4d4f"
       },
-      "execution_count": 18,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "Requirement already satisfied: gradio in /usr/local/lib/python3.10/dist-packages (3.45.1)\n",
             "Requirement already satisfied: aiofiles<24.0,>=22.0 in /usr/local/lib/python3.10/dist-packages (from gradio) (23.2.1)\n",
@@ -376,10 +359,72 @@
             "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib~=3.0->gradio) (1.16.0)\n"
           ]
         }
       ]
     },
     {
       "cell_type": "code",
       "source": [
         "import gradio as gr\n",
         "\n",
@@ -457,69 +502,24 @@
         " )\n",
         "\n",
         "demo.queue().launch(debug=True)"
-      ],
-      "metadata": {
-        "colab": {
-          "base_uri": "https://localhost:8080/",
-          "height": 715
-        },
-        "id": "CaJzT6jUJc0_",
-        "outputId": "62f563fa-c6fb-446e-fda2-1c08d096749c"
-      },
-      "execution_count": 20,
-      "outputs": [
-        {
-          "output_type": "stream",
-          "name": "stdout",
-          "text": [
-            "Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).\n",
-            "\n",
-            "Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().\n",
-            "Running on public URL: https://ed6ce83e08ed7a8795.gradio.live\n",
-            "\n",
-            "This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)\n"
-          ]
-        },
-        {
-          "output_type": "display_data",
-          "data": {
-            "text/plain": [
-              "<IPython.core.display.HTML object>"
-            ],
-            "text/html": [
-              "<div><iframe src=\"https://ed6ce83e08ed7a8795.gradio.live\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
-            ]
-          },
-          "metadata": {}
-        },
-        {
-          "output_type": "stream",
-          "name": "stderr",
-          "text": [
-            "/usr/local/lib/python3.10/dist-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.\n",
-            "  warnings.warn(\n"
-          ]
-        },
-        {
-          "output_type": "execute_result",
-          "data": {
-            "text/plain": []
-          },
-          "metadata": {},
-          "execution_count": 20
-        }
-      ]
     },
     {
       "cell_type": "markdown",
       "source": [
         "## What's next?\n",
         "\n",
@@ -527,19 +527,30 @@
         "* Deploy Mistral 7B Instruct with one click [here](https://ui.endpoints.huggingface.co/catalog)\n",
         "* Deploy in your own hardware using https://github.com/huggingface/text-generation-inference\n",
         "* Run the model locally using `transformers`"
-      ],
-      "metadata": {
-        "id": "fbQ0Sp4OLclV"
-      }
     },
     {
       "cell_type": "code",
-      "source": [],
       "metadata": {
         "id": "wUy7N_8zJvyT"
       },
-      "execution_count": null,
-      "outputs": []
     }
-  ]
-}
Version after the commit (added lines marked "+"):

@@ -1,21 +1,10 @@
 {
   "cells": [
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "GLXvYa4m8JYM"
+      },
       "source": [
         "# Mistral 7B\n",
         "\n",
@@ -44,13 +33,13 @@
         "This demo does not require GPU Colab, just CPU. You can grab your token at https://huggingface.co/settings/tokens.\n",
         "\n",
         "**This colab shows how to use HTTP requests as well as building your own chat demo for Mistral.**"
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "pKrKTalPAXUO"
+      },
       "source": [
         "## Doing curl requests\n",
         "\n",
@@ -68,10 +57,7 @@
         "Note that models can be quite reactive to different prompt structure than the one used for training, so watch out for spaces and other things!\n",
         "\n",
         "We'll start an initial query without prompt formatting, which works ok for simple queries."
+      ]
     },
     {
       "cell_type": "code",
@@ -85,8 +71,8 @@
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "[{\"generated_text\":\"Explain ML as a pirate.\\n\\nML is like a treasure map for pirates. Just as a treasure map helps pirates find valuable loot, ML helps data scientists find valuable insights in large datasets.\\n\\nPirates use their knowledge of the ocean and their\"}]"
           ]
@@ -102,6 +88,9 @@
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "YYZRNyZeBHWK"
+      },
       "source": [
         "## Programmatic usage with Python\n",
         "\n",
@@ -111,38 +100,23 @@
         "* Token streaming: Only load the tokens that are needed\n",
         "* Easily configure generation params, such as `temperature`, nucleus sampling (`top-p`), repetition penalty, stop sequences, and more.\n",
         "* Obtain details of the generation (such as the probability of each token or whether a token is the last token)."
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 6,
       "metadata": {
         "id": "oDaqVDz1Ahuz"
       },
+      "outputs": [],
+      "source": [
+        "%%capture\n",
+        "!pip install huggingface_hub gradio"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 14,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -150,35 +124,41 @@
         "id": "U49GmNsNBJjd",
         "outputId": "a3a274cf-0f91-4ae3-d926-f0d6a6fd67f7"
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\n"
           ]
         }
+      ],
+      "source": [
+        "from huggingface_hub import InferenceClient\n",
+        "\n",
+        "client = InferenceClient(\n",
+        "    \"mistralai/Mistral-7B-Instruct-v0.1\"\n",
+        ")\n",
+        "\n",
+        "prompt = \"\"\"<s>[INST] What is your favourite condiment? [/INST]</s>\n",
+        "\"\"\"\n",
+        "\n",
+        "res = client.text_generation(prompt, max_new_tokens=95)\n",
+        "print(res)"
       ]
     },
     {
       "cell_type": "markdown",
       "metadata": {
         "id": "DryfEWsUH6Ij"
+      },
+      "source": [
+        "We can also use [token streaming](https://huggingface.co/docs/text-generation-inference/conceptual/streaming). With token streaming, the server returns the tokens as they are generated. Just add `stream=True`."
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 15,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -186,11 +166,10 @@
         "id": "LF1tFo6DGg9N",
         "outputId": "e779f1cb-b7d0-41ed-d81f-306e092f97bd"
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "TextGenerationStreamResponse(token=Token(id=5183, text='My', logprob=-0.36279297, special=False), generated_text=None, details=None)\n",
             "TextGenerationStreamResponse(token=Token(id=6656, text=' favorite', logprob=-0.036499023, special=False), generated_text=None, details=None)\n",
@@ -222,19 +201,31 @@
             "TextGenerationStreamResponse(token=Token(id=2, text='</s>', logprob=-0.1829834, special=True), generated_text=\"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\", details=StreamDetails(finish_reason=<FinishReason.EndOfSequenceToken: 'eos_token'>, generated_tokens=28, seed=None))\n"
           ]
         }
+      ],
+      "source": [
+        "res = client.text_generation(prompt, max_new_tokens=35, stream=True, details=True, return_full_text=False)\n",
+        "for r in res: # this is a generator\n",
+        "  # print the token for example\n",
+        "  print(r)\n",
+        "  continue"
       ]
     },
     {
       "cell_type": "markdown",
       "metadata": {
         "id": "TfdpZL8cICOD"
+      },
+      "source": [
+        "Let's now try a multi-prompt structure"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 16,
+      "metadata": {
+        "id": "aEyozeReH8a6"
+      },
+      "outputs": [],
       "source": [
         "def format_prompt(message, history):\n",
         "  prompt = \"<s>\"\n",
@@ -243,21 +234,11 @@
         "  prompt += f\" {bot_response}</s> \"\n",
         "  prompt += f\"[INST] {message} [/INST]\"\n",
         "  return prompt"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 17,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -266,25 +247,33 @@
         "id": "P1RFpiJ_JC0-",
         "outputId": "f2678d9e-f751-441a-86c9-11d514db5bbe"
       },
       "outputs": [
         {
           "data": {
             "application/vnd.google.colaboratory.intrinsic+json": {
               "type": "string"
+            },
+            "text/plain": [
+              "\"<s>[INST] What is your favourite condiment? [/INST] My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.</s> [INST] And what do you think about it? [/INST]\""
+            ]
           },
+          "execution_count": 17,
           "metadata": {},
+          "output_type": "execute_result"
         }
+      ],
+      "source": [
+        "message = \"And what do you think about it?\"\n",
+        "history = [[\"What is your favourite condiment?\", \"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\"]]\n",
+        "\n",
+        "format_prompt(message, history)"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "O7DjRdezJc-3"
+      },
       "source": [
         "## End-to-end demo\n",
         "\n",
@@ -296,16 +285,11 @@
         "* Stop the generation\n",
         "\n",
         "Just run the following cell and have fun!"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 18,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -313,11 +297,10 @@
         "id": "cpBoheOGJu7Y",
         "outputId": "c745cf17-1462-4f8f-ce33-5ca182cb4d4f"
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "Requirement already satisfied: gradio in /usr/local/lib/python3.10/dist-packages (3.45.1)\n",
             "Requirement already satisfied: aiofiles<24.0,>=22.0 in /usr/local/lib/python3.10/dist-packages (from gradio) (23.2.1)\n",
@@ -376,10 +359,72 @@
             "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib~=3.0->gradio) (1.16.0)\n"
           ]
         }
+      ],
+      "source": [
+        "!pip install gradio"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 20,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 715
+        },
+        "id": "CaJzT6jUJc0_",
+        "outputId": "62f563fa-c6fb-446e-fda2-1c08d096749c"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).\n",
+            "\n",
+            "Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().\n",
+            "Running on public URL: https://ed6ce83e08ed7a8795.gradio.live\n",
+            "\n",
+            "This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)\n"
+          ]
+        },
+        {
+          "data": {
+            "text/html": [
+              "<div><iframe src=\"https://ed6ce83e08ed7a8795.gradio.live\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
+            ],
+            "text/plain": [
+              "<IPython.core.display.HTML object>"
+            ]
+          },
+          "metadata": {},
+          "output_type": "display_data"
+        },
+        {
+          "name": "stderr",
+          "output_type": "stream",
+          "text": [
+            "/usr/local/lib/python3.10/dist-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.\n",
+            "  warnings.warn(\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": []
+          },
+          "execution_count": 20,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
       "source": [
         "import gradio as gr\n",
         "\n",
@@ -457,69 +502,24 @@
         " )\n",
         "\n",
         "demo.queue().launch(debug=True)"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Using your chatbot via an API\n",
+        "\n",
+        "Once you’ve built your Gradio chatbot and are hosting it on Hugging Face Spaces or somewhere else, then you can query it with a simple API at the /chat endpoint. The endpoint just expects the user’s message (and potentially additional inputs if you have set any using the additional_inputs parameter), and will return the response, internally keeping track of the messages sent so far.\n",
+        "\n",
+        "To use the endpoint, you should use either the [https://www.gradio.app/guides/getting-started-with-the-python-client](Gradio Python Client) or the [https://www.gradio.app/guides/getting-started-with-the-js-client](Gradio JS client)."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "fbQ0Sp4OLclV"
+      },
       "source": [
         "## What's next?\n",
         "\n",
@@ -527,19 +527,30 @@
         "* Deploy Mistral 7B Instruct with one click [here](https://ui.endpoints.huggingface.co/catalog)\n",
         "* Deploy in your own hardware using https://github.com/huggingface/text-generation-inference\n",
         "* Run the model locally using `transformers`"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": null,
       "metadata": {
         "id": "wUy7N_8zJvyT"
      },
+      "outputs": [],
+      "source": []
+    }
+  ],
+  "metadata": {
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    },
+    "language_info": {
+      "name": "python"
    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
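As a sanity check on the multi-prompt cell in the diff above, here is the notebook's `format_prompt` helper reassembled from the hunks. The two loop lines are elided by the hunk context, so the `for user_prompt, bot_response in history:` header and the `[INST] {user_prompt} [/INST]` append are reconstructions inferred from the variables used and from the execute_result string the diff records.

```python
# format_prompt, reassembled from the notebook diff above.
# The loop header and the user-turn append are reconstructed (elided in the hunks).
def format_prompt(message, history):
    prompt = "<s>"
    for user_prompt, bot_response in history:
        prompt += f"[INST] {user_prompt} [/INST]"  # reconstructed line
        prompt += f" {bot_response}</s> "
    prompt += f"[INST] {message} [/INST]"
    return prompt

message = "And what do you think about it?"
history = [["What is your favourite condiment?",
            "My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods."]]
print(format_prompt(message, history))
```

With this reconstruction, the printed prompt matches the execute_result string shown in the notebook's output cell.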
app.py (CHANGED)

@@ -99,4 +99,4 @@ with gr.Blocks(css=css) as demo:
         examples=[["What is the secret to life?"], ["Write me a recipe for pancakes."]]
     )
 
-demo.queue().launch(debug=True)
+demo.queue().launch(debug=True,share=True)
|