Commit 7d2eb14 — thesourmango committed (parent: 14b579e)

    Added public link

Files changed:
- Mistral_7B.ipynb (+172 -161)
- app.py (+1 -1)

Mistral_7B.ipynb (CHANGED)
Version before the commit (removed lines marked "-"):

@@ -1,21 +1,10 @@
 {
-  "nbformat": 4,
-  "nbformat_minor": 0,
-  "metadata": {
-    "colab": {
-      "provenance": []
-    },
-    "kernelspec": {
-      "name": "python3",
-      "display_name": "Python 3"
-    },
-    "language_info": {
-      "name": "python"
-    }
-  },
   "cells": [
     {
       "cell_type": "markdown",
       "source": [
         "# Mistral 7B\n",
         "\n",
@@ -44,13 +33,13 @@
         "This demo does not require GPU Colab, just CPU. You can grab your token at https://huggingface.co/settings/tokens.\n",
         "\n",
         "**This colab shows how to use HTTP requests as well as building your own chat demo for Mistral.**"
-      ],
-      "metadata": {
-        "id": "GLXvYa4m8JYM"
-      }
     },
     {
       "cell_type": "markdown",
       "source": [
         "## Doing curl requests\n",
         "\n",
@@ -68,10 +57,7 @@
         "Note that models can be quite reactive to different prompt structure than the one used for training, so watch out for spaces and other things!\n",
         "\n",
         "We'll start an initial query without prompt formatting, which works ok for simple queries."
-      ],
-      "metadata": {
-        "id": "pKrKTalPAXUO"
-      }
     },
     {
       "cell_type": "code",
@@ -85,8 +71,8 @@
       },
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "[{\"generated_text\":\"Explain ML as a pirate.\\n\\nML is like a treasure map for pirates. Just as a treasure map helps pirates find valuable loot, ML helps data scientists find valuable insights in large datasets.\\n\\nPirates use their knowledge of the ocean and their\"}]"
           ]
@@ -102,6 +88,9 @@
     },
     {
       "cell_type": "markdown",
       "source": [
         "## Programmatic usage with Python\n",
         "\n",
@@ -111,38 +100,23 @@
         "* Token streaming: Only load the tokens that are needed\n",
         "* Easily configure generation params, such as `temperature`, nucleus sampling (`top-p`), repetition penalty, stop sequences, and more.\n",
         "* Obtain details of the generation (such as the probability of each token or whether a token is the last token)."
-      ],
-      "metadata": {
-        "id": "YYZRNyZeBHWK"
-      }
     },
     {
       "cell_type": "code",
-      "source": [
-        "%%capture\n",
-        "!pip install huggingface_hub gradio"
-      ],
       "metadata": {
         "id": "oDaqVDz1Ahuz"
       },
-      "execution_count": 6,
-      "outputs": []
     },
     {
       "cell_type": "code",
-      "source": [
-        "from huggingface_hub import InferenceClient\n",
-        "\n",
-        "client = InferenceClient(\n",
-        "    \"mistralai/Mistral-7B-Instruct-v0.1\"\n",
-        ")\n",
-        "\n",
-        "prompt = \"\"\"<s>[INST] What is your favourite condiment? [/INST]</s>\n",
-        "\"\"\"\n",
-        "\n",
-        "res = client.text_generation(prompt, max_new_tokens=95)\n",
-        "print(res)"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -150,35 +124,41 @@
         "id": "U49GmNsNBJjd",
         "outputId": "a3a274cf-0f91-4ae3-d926-f0d6a6fd67f7"
       },
-      "execution_count": 14,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\n"
           ]
         }
       ]
     },
     {
       "cell_type": "markdown",
-      "source": [
-        "We can also use [token streaming](https://huggingface.co/docs/text-generation-inference/conceptual/streaming). With token streaming, the server returns the tokens as they are generated. Just add `stream=True`."
-      ],
       "metadata": {
         "id": "DryfEWsUH6Ij"
-      }
     },
     {
       "cell_type": "code",
-      "source": [
-        "res = client.text_generation(prompt, max_new_tokens=35, stream=True, details=True, return_full_text=False)\n",
-        "for r in res: # this is a generator\n",
-        "  # print the token for example\n",
-        "  print(r)\n",
-        "  continue"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -186,11 +166,10 @@
         "id": "LF1tFo6DGg9N",
         "outputId": "e779f1cb-b7d0-41ed-d81f-306e092f97bd"
       },
-      "execution_count": 15,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "TextGenerationStreamResponse(token=Token(id=5183, text='My', logprob=-0.36279297, special=False), generated_text=None, details=None)\n",
             "TextGenerationStreamResponse(token=Token(id=6656, text=' favorite', logprob=-0.036499023, special=False), generated_text=None, details=None)\n",
@@ -222,19 +201,31 @@
             "TextGenerationStreamResponse(token=Token(id=2, text='</s>', logprob=-0.1829834, special=True), generated_text=\"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\", details=StreamDetails(finish_reason=<FinishReason.EndOfSequenceToken: 'eos_token'>, generated_tokens=28, seed=None))\n"
           ]
         }
       ]
     },
     {
       "cell_type": "markdown",
-      "source": [
-        "Let's now try a multi-prompt structure"
-      ],
       "metadata": {
         "id": "TfdpZL8cICOD"
-      }
     },
     {
       "cell_type": "code",
       "source": [
         "def format_prompt(message, history):\n",
         "  prompt = \"<s>\"\n",
@@ -243,21 +234,11 @@
         "  prompt += f\" {bot_response}</s> \"\n",
         "  prompt += f\"[INST] {message} [/INST]\"\n",
         "  return prompt"
-      ],
-      "metadata": {
-        "id": "aEyozeReH8a6"
-      },
-      "execution_count": 16,
-      "outputs": []
     },
     {
       "cell_type": "code",
-      "source": [
-        "message = \"And what do you think about it?\"\n",
-        "history = [[\"What is your favourite condiment?\", \"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\"]]\n",
-        "\n",
-        "format_prompt(message, history)"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -266,25 +247,33 @@
         "id": "P1RFpiJ_JC0-",
         "outputId": "f2678d9e-f751-441a-86c9-11d514db5bbe"
       },
-      "execution_count": 17,
       "outputs": [
         {
-          "output_type": "execute_result",
           "data": {
-            "text/plain": [
-              "\"<s>[INST] What is your favourite condiment? [/INST] My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.</s> [INST] And what do you think about it? [/INST]\""
-            ],
             "application/vnd.google.colaboratory.intrinsic+json": {
               "type": "string"
-            }
           },
           "metadata": {},
-          "execution_count": 17
         }
       ]
     },
     {
       "cell_type": "markdown",
       "source": [
         "## End-to-end demo\n",
         "\n",
@@ -296,16 +285,11 @@
         "* Stop the generation\n",
         "\n",
         "Just run the following cell and have fun!"
-      ],
-      "metadata": {
-        "id": "O7DjRdezJc-3"
-      }
     },
     {
       "cell_type": "code",
-      "source": [
-        "!pip install gradio"
-      ],
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -313,11 +297,10 @@
         "id": "cpBoheOGJu7Y",
         "outputId": "c745cf17-1462-4f8f-ce33-5ca182cb4d4f"
       },
-      "execution_count": 18,
       "outputs": [
         {
-          "output_type": "stream",
           "name": "stdout",
           "text": [
             "Requirement already satisfied: gradio in /usr/local/lib/python3.10/dist-packages (3.45.1)\n",
             "Requirement already satisfied: aiofiles<24.0,>=22.0 in /usr/local/lib/python3.10/dist-packages (from gradio) (23.2.1)\n",
@@ -376,10 +359,72 @@
             "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib~=3.0->gradio) (1.16.0)\n"
           ]
         }
       ]
     },
     {
       "cell_type": "code",
       "source": [
         "import gradio as gr\n",
         "\n",
@@ -457,69 +502,24 @@
         " )\n",
         "\n",
         "demo.queue().launch(debug=True)"
-      ],
-      "metadata": {
-        "colab": {
-          "base_uri": "https://localhost:8080/",
-          "height": 715
-        },
-        "id": "CaJzT6jUJc0_",
-        "outputId": "62f563fa-c6fb-446e-fda2-1c08d096749c"
-      },
-      "execution_count": 20,
-      "outputs": [
-        {
-          "output_type": "stream",
-          "name": "stdout",
-          "text": [
-            "Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).\n",
-            "\n",
-            "Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().\n",
-            "Running on public URL: https://ed6ce83e08ed7a8795.gradio.live\n",
-            "\n",
-            "This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)\n"
-          ]
-        },
-        {
-          "output_type": "display_data",
-          "data": {
-            "text/plain": [
-              "<IPython.core.display.HTML object>"
-            ],
-            "text/html": [
-              "<div><iframe src=\"https://ed6ce83e08ed7a8795.gradio.live\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
-            ]
-          },
-          "metadata": {}
-        },
-        {
-          "output_type": "stream",
-          "name": "stderr",
-          "text": [
-            "/usr/local/lib/python3.10/dist-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.\n",
-            "  warnings.warn(\n"
-          ]
-        },
-        {
-          "output_type": "execute_result",
-          "data": {
-            "text/plain": []
-          },
-          "metadata": {},
-          "execution_count": 20
-        }
-      ]
     },
     {
       "cell_type": "markdown",
       "source": [
         "## What's next?\n",
         "\n",
@@ -527,19 +527,30 @@
         "* Deploy Mistral 7B Instruct with one click [here](https://ui.endpoints.huggingface.co/catalog)\n",
         "* Deploy in your own hardware using https://github.com/huggingface/text-generation-inference\n",
         "* Run the model locally using `transformers`"
-      ],
-      "metadata": {
-        "id": "fbQ0Sp4OLclV"
-      }
     },
     {
       "cell_type": "code",
-      "source": [],
       "metadata": {
         "id": "wUy7N_8zJvyT"
       },
-      "execution_count": null,
-      "outputs": []
     }
-  ]
-}
Version after the commit (added lines marked "+"):

@@ -1,21 +1,10 @@
 {
   "cells": [
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "GLXvYa4m8JYM"
+      },
       "source": [
         "# Mistral 7B\n",
         "\n",
@@ -44,13 +33,13 @@
         "This demo does not require GPU Colab, just CPU. You can grab your token at https://huggingface.co/settings/tokens.\n",
         "\n",
         "**This colab shows how to use HTTP requests as well as building your own chat demo for Mistral.**"
+      ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "pKrKTalPAXUO"
+      },
       "source": [
         "## Doing curl requests\n",
         "\n",
@@ -68,10 +57,7 @@
         "Note that models can be quite reactive to different prompt structure than the one used for training, so watch out for spaces and other things!\n",
         "\n",
         "We'll start an initial query without prompt formatting, which works ok for simple queries."
+      ]
     },
     {
       "cell_type": "code",
@@ -85,8 +71,8 @@
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "[{\"generated_text\":\"Explain ML as a pirate.\\n\\nML is like a treasure map for pirates. Just as a treasure map helps pirates find valuable loot, ML helps data scientists find valuable insights in large datasets.\\n\\nPirates use their knowledge of the ocean and their\"}]"
           ]
@@ -102,6 +88,9 @@
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "YYZRNyZeBHWK"
+      },
       "source": [
         "## Programmatic usage with Python\n",
         "\n",
@@ -111,38 +100,23 @@
         "* Token streaming: Only load the tokens that are needed\n",
         "* Easily configure generation params, such as `temperature`, nucleus sampling (`top-p`), repetition penalty, stop sequences, and more.\n",
         "* Obtain details of the generation (such as the probability of each token or whether a token is the last token)."
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 6,
       "metadata": {
         "id": "oDaqVDz1Ahuz"
       },
+      "outputs": [],
+      "source": [
+        "%%capture\n",
+        "!pip install huggingface_hub gradio"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 14,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -150,35 +124,41 @@
         "id": "U49GmNsNBJjd",
         "outputId": "a3a274cf-0f91-4ae3-d926-f0d6a6fd67f7"
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\n"
           ]
         }
+      ],
+      "source": [
+        "from huggingface_hub import InferenceClient\n",
+        "\n",
+        "client = InferenceClient(\n",
+        "    \"mistralai/Mistral-7B-Instruct-v0.1\"\n",
+        ")\n",
+        "\n",
+        "prompt = \"\"\"<s>[INST] What is your favourite condiment? [/INST]</s>\n",
+        "\"\"\"\n",
+        "\n",
+        "res = client.text_generation(prompt, max_new_tokens=95)\n",
+        "print(res)"
       ]
     },
     {
       "cell_type": "markdown",
       "metadata": {
         "id": "DryfEWsUH6Ij"
+      },
+      "source": [
+        "We can also use [token streaming](https://huggingface.co/docs/text-generation-inference/conceptual/streaming). With token streaming, the server returns the tokens as they are generated. Just add `stream=True`."
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 15,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -186,11 +166,10 @@
         "id": "LF1tFo6DGg9N",
         "outputId": "e779f1cb-b7d0-41ed-d81f-306e092f97bd"
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "TextGenerationStreamResponse(token=Token(id=5183, text='My', logprob=-0.36279297, special=False), generated_text=None, details=None)\n",
             "TextGenerationStreamResponse(token=Token(id=6656, text=' favorite', logprob=-0.036499023, special=False), generated_text=None, details=None)\n",
@@ -222,19 +201,31 @@
             "TextGenerationStreamResponse(token=Token(id=2, text='</s>', logprob=-0.1829834, special=True), generated_text=\"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\", details=StreamDetails(finish_reason=<FinishReason.EndOfSequenceToken: 'eos_token'>, generated_tokens=28, seed=None))\n"
           ]
         }
+      ],
+      "source": [
+        "res = client.text_generation(prompt, max_new_tokens=35, stream=True, details=True, return_full_text=False)\n",
+        "for r in res: # this is a generator\n",
+        "  # print the token for example\n",
+        "  print(r)\n",
+        "  continue"
       ]
     },
     {
       "cell_type": "markdown",
       "metadata": {
         "id": "TfdpZL8cICOD"
+      },
+      "source": [
+        "Let's now try a multi-prompt structure"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 16,
+      "metadata": {
+        "id": "aEyozeReH8a6"
+      },
+      "outputs": [],
       "source": [
         "def format_prompt(message, history):\n",
         "  prompt = \"<s>\"\n",
@@ -243,21 +234,11 @@
         "  prompt += f\" {bot_response}</s> \"\n",
         "  prompt += f\"[INST] {message} [/INST]\"\n",
         "  return prompt"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 17,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/",
@@ -266,25 +247,33 @@
         "id": "P1RFpiJ_JC0-",
         "outputId": "f2678d9e-f751-441a-86c9-11d514db5bbe"
       },
       "outputs": [
         {
           "data": {
             "application/vnd.google.colaboratory.intrinsic+json": {
               "type": "string"
+            },
+            "text/plain": [
+              "\"<s>[INST] What is your favourite condiment? [/INST] My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.</s> [INST] And what do you think about it? [/INST]\""
+            ]
           },
+          "execution_count": 17,
           "metadata": {},
+          "output_type": "execute_result"
         }
+      ],
+      "source": [
+        "message = \"And what do you think about it?\"\n",
+        "history = [[\"What is your favourite condiment?\", \"My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods.\"]]\n",
+        "\n",
+        "format_prompt(message, history)"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {
+        "id": "O7DjRdezJc-3"
+      },
       "source": [
         "## End-to-end demo\n",
         "\n",
@@ -296,16 +285,11 @@
         "* Stop the generation\n",
         "\n",
         "Just run the following cell and have fun!"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 18,
       "metadata": {
         "colab": {
           "base_uri": "https://localhost:8080/"
@@ -313,11 +297,10 @@
         "id": "cpBoheOGJu7Y",
         "outputId": "c745cf17-1462-4f8f-ce33-5ca182cb4d4f"
       },
       "outputs": [
         {
           "name": "stdout",
+          "output_type": "stream",
           "text": [
             "Requirement already satisfied: gradio in /usr/local/lib/python3.10/dist-packages (3.45.1)\n",
             "Requirement already satisfied: aiofiles<24.0,>=22.0 in /usr/local/lib/python3.10/dist-packages (from gradio) (23.2.1)\n",
@@ -376,10 +359,72 @@
             "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib~=3.0->gradio) (1.16.0)\n"
           ]
         }
+      ],
+      "source": [
+        "!pip install gradio"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": 20,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 715
+        },
+        "id": "CaJzT6jUJc0_",
+        "outputId": "62f563fa-c6fb-446e-fda2-1c08d096749c"
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).\n",
+            "\n",
+            "Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().\n",
+            "Running on public URL: https://ed6ce83e08ed7a8795.gradio.live\n",
+            "\n",
+            "This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)\n"
+          ]
+        },
+        {
+          "data": {
+            "text/html": [
+              "<div><iframe src=\"https://ed6ce83e08ed7a8795.gradio.live\" width=\"100%\" height=\"500\" allow=\"autoplay; camera; microphone; clipboard-read; clipboard-write;\" frameborder=\"0\" allowfullscreen></iframe></div>"
+            ],
+            "text/plain": [
+              "<IPython.core.display.HTML object>"
+            ]
+          },
+          "metadata": {},
+          "output_type": "display_data"
+        },
+        {
+          "name": "stderr",
+          "output_type": "stream",
+          "text": [
+            "/usr/local/lib/python3.10/dist-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.\n",
+            "  warnings.warn(\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": []
+          },
+          "execution_count": 20,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
       "source": [
         "import gradio as gr\n",
         "\n",
@@ -457,69 +502,24 @@
         " )\n",
         "\n",
         "demo.queue().launch(debug=True)"
       ]
     },
     {
       "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Using your chatbot via an API\n",
+        "\n",
+        "Once you’ve built your Gradio chatbot and are hosting it on Hugging Face Spaces or somewhere else, then you can query it with a simple API at the /chat endpoint. The endpoint just expects the user’s message (and potentially additional inputs if you have set any using the additional_inputs parameter), and will return the response, internally keeping track of the messages sent so far.\n",
+        "\n",
+        "To use the endpoint, you should use either the [https://www.gradio.app/guides/getting-started-with-the-python-client](Gradio Python Client) or the [https://www.gradio.app/guides/getting-started-with-the-js-client](Gradio JS client)."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "fbQ0Sp4OLclV"
+      },
       "source": [
         "## What's next?\n",
         "\n",
@@ -527,19 +527,30 @@
         "* Deploy Mistral 7B Instruct with one click [here](https://ui.endpoints.huggingface.co/catalog)\n",
         "* Deploy in your own hardware using https://github.com/huggingface/text-generation-inference\n",
         "* Run the model locally using `transformers`"
+      ]
     },
     {
       "cell_type": "code",
+      "execution_count": null,
       "metadata": {
         "id": "wUy7N_8zJvyT"
      },
+      "outputs": [],
+      "source": []
+    }
+  ],
+  "metadata": {
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    },
+    "language_info": {
+      "name": "python"
    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
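As a sanity check on the multi-prompt cell in the diff above, here is the notebook's `format_prompt` helper reassembled from the hunks. The two loop lines are elided by the hunk context, so the `for user_prompt, bot_response in history:` header and the `[INST] {user_prompt} [/INST]` append are reconstructions inferred from the variables used and from the execute_result string the diff records.

```python
# format_prompt, reassembled from the notebook diff above.
# The loop header and the user-turn append are reconstructed (elided in the hunks).
def format_prompt(message, history):
    prompt = "<s>"
    for user_prompt, bot_response in history:
        prompt += f"[INST] {user_prompt} [/INST]"  # reconstructed line
        prompt += f" {bot_response}</s> "
    prompt += f"[INST] {message} [/INST]"
    return prompt

message = "And what do you think about it?"
history = [["What is your favourite condiment?",
            "My favorite condiment is ketchup. It's versatile, tasty, and goes well with a variety of foods."]]
print(format_prompt(message, history))
```

With this reconstruction, the printed prompt matches the execute_result string shown in the notebook's output cell.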
app.py (CHANGED)

@@ -99,4 +99,4 @@ with gr.Blocks(css=css) as demo:
         examples=[["What is the secret to life?"], ["Write me a recipe for pancakes."]]
     )
 
-demo.queue().launch(debug=True)
+demo.queue().launch(debug=True,share=True)
|