Prachidwi committed on
Commit
da750df
1 Parent(s): 645b8dc

Upload 16 files

Files changed (17)
  1. .env +1 -0
  2. .gitattributes +3 -0
  3. GPT4o_Lanchain_RAG.ipynb +353 -0
  4. acsbr-015.pdf +0 -0
  5. acsbr-016.pdf +3 -0
  6. acsbr-017.pdf +3 -0
  7. agents.ipynb +392 -0
  8. app.py +52 -0
  9. attention.pdf +3 -0
  10. client.py +27 -0
  11. groq.ipynb +278 -0
  12. llama3.py +82 -0
  13. localama.py +33 -0
  14. p70-178.pdf +0 -0
  15. retriever.ipynb +0 -0
  16. simplerag.ipynb +0 -0
  17. speech.txt +13 -0
.env ADDED
@@ -0,0 +1 @@
1
+
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ acsbr-016.pdf filter=lfs diff=lfs merge=lfs -text
37
+ acsbr-017.pdf filter=lfs diff=lfs merge=lfs -text
38
+ attention.pdf filter=lfs diff=lfs merge=lfs -text
GPT4o_Lanchain_RAG.ipynb ADDED
@@ -0,0 +1,353 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 16,
6
+ "metadata": {},
7
+ "outputs": [],
8
+ "source": [
9
+ "## RAG Application With GPT-4o"
10
+ ]
11
+ },
12
+ {
13
+ "cell_type": "code",
14
+ "execution_count": 1,
15
+ "metadata": {},
16
+ "outputs": [],
17
+ "source": [
18
+ "from dotenv import load_dotenv\n",
19
+ "\n",
20
+ "load_dotenv()\n",
21
+ "import os\n",
22
+ "os.environ['OPENAI_API_KEY']=os.getenv(\"OPENAI_API_KEY\")"
23
+ ]
24
+ },
25
+ {
26
+ "cell_type": "markdown",
27
+ "metadata": {},
28
+ "source": [
29
+ "# 1. Get a Data Loader\n"
30
+ ]
31
+ },
32
+ {
33
+ "cell_type": "code",
34
+ "execution_count": 2,
35
+ "metadata": {},
36
+ "outputs": [],
37
+ "source": [
38
+ "from langchain_community.document_loaders import WebBaseLoader\n"
39
+ ]
40
+ },
41
+ {
42
+ "cell_type": "code",
43
+ "execution_count": 3,
44
+ "metadata": {},
45
+ "outputs": [
46
+ {
47
+ "data": {
48
+ "text/plain": [
49
+ "[Document(page_content=\"\\n\\n\\n\\n\\nLangSmith User Guide | 🦜️🛠️ LangSmith\\n\\n\\n\\n\\n\\n\\n\\nSkip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookThis is outdated documentation for 🦜️🛠️ LangSmith, which is no longer actively maintained.For up-to-date documentation, see the latest version.User GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\\nThe ability to rapidly understand how the model is performing — and debug where it is failing — is incredibly important for this phase.Debugging\\u200bWhen developing new LLM applications, we suggest having LangSmith tracing enabled by default.\\nOftentimes, it isn’t necessary to look at every single trace. However, when things go wrong (an unexpected end result, infinite agent loop, slower than expected execution, higher than expected token usage), it’s extremely helpful to debug by looking through the application traces. LangSmith gives clear visibility and debugging information at each step of an LLM sequence, making it much easier to identify and root-cause issues.\\nWe provide native rendering of chat messages, functions, and retrieve documents.Initial Test Set\\u200bWhile many developers still ship an initial version of their application based on “vibe checks”, we’ve seen an increasing number of engineering teams start to adopt a more test driven approach. LangSmith allows developers to create datasets, which are collections of inputs and reference outputs, and use these to run tests on their LLM applications.\\nThese test cases can be uploaded in bulk, created on the fly, or exported from application traces. LangSmith also makes it easy to run custom evaluations (both LLM and heuristic based) to score test results.Comparison View\\u200bWhen prototyping different versions of your applications and making changes, it’s important to see whether or not you’ve regressed with respect to your initial test cases.\\nOftentimes, changes in the prompt, retrieval strategy, or model choice can have huge implications in responses produced by your application.\\nIn order to get a sense for which variant is performing better, it’s useful to be able to view results for different configurations on the same datapoints side-by-side. We’ve invested heavily in a user-friendly comparison view for test runs to track and diagnose regressions in test scores across multiple revisions of your application.Playground\\u200bLangSmith provides a playground environment for rapid iteration and experimentation.\\nThis allows you to quickly test out different prompts and models. You can open the playground from any prompt or model run in your trace.\\nEvery playground run is logged in the system and can be used to create test cases or compare with other runs.Beta Testing\\u200bBeta testing allows developers to collect more data on how their LLM applications are performing in real-world scenarios. 
In this phase, it’s important to develop an understanding for the types of inputs the app is performing well or poorly on and how exactly it’s breaking down in those cases. Both feedback collection and run annotation are critical for this workflow. This will help in curation of test cases that can help track regressions/improvements and development of automatic evaluations.Capturing Feedback\\u200bWhen launching your application to an initial set of users, it’s important to gather human feedback on the responses it’s producing. This helps draw attention to the most interesting runs and highlight edge cases that are causing problematic responses. LangSmith allows you to attach feedback scores to logged traces (oftentimes, this is hooked up to a feedback button in your app), then filter on traces that have a specific feedback tag and score. A common workflow is to filter on traces that receive a poor user feedback score, then drill down into problematic points using the detailed trace view.Annotating Traces\\u200bLangSmith also supports sending runs to annotation queues, which allow annotators to closely inspect interesting traces and annotate them with respect to different criteria. Annotators can be PMs, engineers, or even subject matter experts. This allows users to catch regressions across important evaluation criteria.Adding Runs to a Dataset\\u200bAs your application progresses through the beta testing phase, it's essential to continue collecting data to refine and improve its performance. LangSmith enables you to add runs as examples to datasets (from both the project page and within an annotation queue), expanding your test coverage on real-world scenarios. This is a key benefit in having your logging system and your evaluation/testing system in the same platform.Production\\u200bClosely inspecting key data points, growing benchmarking datasets, annotating traces, and drilling down into important data in trace view are workflows you’ll also want to do once your app hits production.However, especially at the production stage, it’s crucial to get a high-level overview of application performance with respect to latency, cost, and feedback scores. This ensures that it's delivering desirable results at scale.Online evaluations and automations allow you to process and score production traces in near real-time.Additionally, threads provide a seamless way to group traces from a single conversation, making it easier to track the performance of your application across multiple turns.Monitoring and A/B Testing\\u200bLangSmith provides monitoring charts that allow you to track key metrics over time. You can expand to view metrics for a given period and drill down into a specific data point to get a trace table for that time period — this is especially handy for debugging production issues.LangSmith also allows for tag and metadata grouping, which allows users to mark different versions of their applications with different identifiers and view how they are performing side-by-side within each chart. This is helpful for A/B testing changes in prompt, model, or retrieval strategy.Automations\\u200bAutomations are a powerful feature in LangSmith that allow you to perform actions on traces in near real-time. This can be used to automatically score traces, send them to annotation queues, or send them to datasets.To define an automation, simply provide a filter condition, a sampling rate, and an action to perform. 
Automations are particularly helpful for processing traces at production scale.Threads\\u200bMany LLM applications are multi-turn, meaning that they involve a series of interactions between the user and the application. LangSmith provides a threads view that groups traces from a single conversation together, making it easier to track the performance of and annotate your application across multiple turns.Was this page helpful?PreviousQuick StartNextOverviewPrototypingBeta TestingProductionCommunityDiscordTwitterGitHubDocs CodeLangSmith SDKPythonJS/TSMoreHomepageBlogLangChain Python DocsLangChain JS/TS DocsCopyright © 2024 LangChain, Inc.\\n\\n\\n\\n\", metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'})]"
50
+ ]
51
+ },
52
+ "execution_count": 3,
53
+ "metadata": {},
54
+ "output_type": "execute_result"
55
+ }
56
+ ],
57
+ "source": [
58
+ "loader = WebBaseLoader(\"https://docs.smith.langchain.com/user_guide\")\n",
59
+ "data = loader.load()\n",
60
+ "data"
61
+ ]
62
+ },
63
+ {
64
+ "cell_type": "markdown",
65
+ "metadata": {},
66
+ "source": [
67
+ "# 2. Convert data to Vector Database\n"
68
+ ]
69
+ },
70
+ {
71
+ "cell_type": "code",
72
+ "execution_count": 4,
73
+ "metadata": {},
74
+ "outputs": [],
75
+ "source": [
76
+ "\n",
77
+ "from langchain_objectbox.vectorstores import ObjectBox ##vector Database\n",
78
+ "from langchain_openai import OpenAIEmbeddings\n"
79
+ ]
80
+ },
81
+ {
82
+ "cell_type": "code",
83
+ "execution_count": 7,
84
+ "metadata": {},
85
+ "outputs": [],
86
+ "source": []
87
+ },
88
+ {
89
+ "cell_type": "code",
90
+ "execution_count": 5,
91
+ "metadata": {},
92
+ "outputs": [],
93
+ "source": [
94
+ "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
95
+ "\n",
96
+ "text_splitter = RecursiveCharacterTextSplitter()\n",
97
+ "documents = text_splitter.split_documents(data)"
98
+ ]
99
+ },
100
+ {
101
+ "cell_type": "code",
102
+ "execution_count": 6,
103
+ "metadata": {},
104
+ "outputs": [
105
+ {
106
+ "data": {
107
+ "text/plain": [
108
+ "[Document(page_content='LangSmith User Guide | 🦜️🛠️ LangSmith', metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'}),\n",
109
+ " Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookThis is outdated documentation for 🦜️🛠️ LangSmith, which is no longer actively maintained.For up-to-date documentation, see the latest version.User GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\\nThe ability to rapidly understand how the model is performing — and debug where it is failing — is incredibly important for this phase.Debugging\\u200bWhen developing new LLM applications, we suggest having LangSmith tracing enabled by default.\\nOftentimes, it isn’t necessary to look at every single trace. However, when things go wrong (an unexpected end result, infinite agent loop, slower than expected execution, higher than expected token usage), it’s extremely helpful to debug by looking through the application traces. LangSmith gives clear visibility and debugging information at each step of an LLM sequence, making it much easier to identify and root-cause issues.\\nWe provide native rendering of chat messages, functions, and retrieve documents.Initial Test Set\\u200bWhile many developers still ship an initial version of their application based on “vibe checks”, we’ve seen an increasing number of engineering teams start to adopt a more test driven approach. LangSmith allows developers to create datasets, which are collections of inputs and reference outputs, and use these to run tests on their LLM applications.\\nThese test cases can be uploaded in bulk, created on the fly, or exported from application traces. LangSmith also makes it easy to run custom evaluations (both LLM and heuristic based) to score test results.Comparison View\\u200bWhen prototyping different versions of your applications and making changes, it’s important to see whether or not you’ve regressed with respect to your initial test cases.\\nOftentimes, changes in the prompt, retrieval strategy, or model choice can have huge implications in responses produced by your application.\\nIn order to get a sense for which variant is performing better, it’s useful to be able to view results for different configurations on the same datapoints side-by-side. We’ve invested heavily in a user-friendly comparison view for test runs to track and diagnose regressions in test scores across multiple revisions of your application.Playground\\u200bLangSmith provides a playground environment for rapid iteration and experimentation.\\nThis allows you to quickly test out different prompts and models. You can open the playground from any prompt or model run in your trace.', metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. 
We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'}),\n",
110
+ " Document(page_content=\"Every playground run is logged in the system and can be used to create test cases or compare with other runs.Beta Testing\\u200bBeta testing allows developers to collect more data on how their LLM applications are performing in real-world scenarios. In this phase, it’s important to develop an understanding for the types of inputs the app is performing well or poorly on and how exactly it’s breaking down in those cases. Both feedback collection and run annotation are critical for this workflow. This will help in curation of test cases that can help track regressions/improvements and development of automatic evaluations.Capturing Feedback\\u200bWhen launching your application to an initial set of users, it’s important to gather human feedback on the responses it’s producing. This helps draw attention to the most interesting runs and highlight edge cases that are causing problematic responses. LangSmith allows you to attach feedback scores to logged traces (oftentimes, this is hooked up to a feedback button in your app), then filter on traces that have a specific feedback tag and score. A common workflow is to filter on traces that receive a poor user feedback score, then drill down into problematic points using the detailed trace view.Annotating Traces\\u200bLangSmith also supports sending runs to annotation queues, which allow annotators to closely inspect interesting traces and annotate them with respect to different criteria. Annotators can be PMs, engineers, or even subject matter experts. This allows users to catch regressions across important evaluation criteria.Adding Runs to a Dataset\\u200bAs your application progresses through the beta testing phase, it's essential to continue collecting data to refine and improve its performance. LangSmith enables you to add runs as examples to datasets (from both the project page and within an annotation queue), expanding your test coverage on real-world scenarios. This is a key benefit in having your logging system and your evaluation/testing system in the same platform.Production\\u200bClosely inspecting key data points, growing benchmarking datasets, annotating traces, and drilling down into important data in trace view are workflows you’ll also want to do once your app hits production.However, especially at the production stage, it’s crucial to get a high-level overview of application performance with respect to latency, cost, and feedback scores. This ensures that it's delivering desirable results at scale.Online evaluations and automations allow you to process and score production traces in near real-time.Additionally, threads provide a seamless way to group traces from a single conversation, making it easier to track the performance of your application across multiple turns.Monitoring and A/B Testing\\u200bLangSmith provides monitoring charts that allow you to track key metrics over time. You can expand to view metrics for a given period and drill down into a specific data point to get a trace table for that time period — this is especially handy for debugging production issues.LangSmith also allows for tag and metadata grouping, which allows users to mark different versions of their applications with different identifiers and view how they are performing side-by-side within each chart. This is helpful for A/B testing changes in prompt, model, or retrieval strategy.Automations\\u200bAutomations are a powerful feature in LangSmith that allow you to perform actions on traces in near real-time. 
This can be used to automatically score traces, send them to annotation queues, or send them to datasets.To define an automation, simply provide a filter condition, a sampling rate, and an action to perform. Automations are particularly helpful for processing traces at production scale.Threads\\u200bMany LLM applications are multi-turn, meaning that they involve a series of interactions between the user and the application. LangSmith provides a threads view that groups traces from a single conversation together, making it easier to\", metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'}),\n",
111
+ " Document(page_content='meaning that they involve a series of interactions between the user and the application. LangSmith provides a threads view that groups traces from a single conversation together, making it easier to track the performance of and annotate your application across multiple turns.Was this page helpful?PreviousQuick StartNextOverviewPrototypingBeta TestingProductionCommunityDiscordTwitterGitHubDocs CodeLangSmith SDKPythonJS/TSMoreHomepageBlogLangChain Python DocsLangChain JS/TS DocsCopyright © 2024 LangChain, Inc.', metadata={'source': 'https://docs.smith.langchain.com/user_guide', 'title': 'LangSmith User Guide | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.', 'language': 'en'})]"
112
+ ]
113
+ },
114
+ "execution_count": 6,
115
+ "metadata": {},
116
+ "output_type": "execute_result"
117
+ }
118
+ ],
119
+ "source": [
120
+ "documents"
121
+ ]
122
+ },
123
+ {
124
+ "cell_type": "markdown",
125
+ "metadata": {},
126
+ "source": []
127
+ },
128
+ {
129
+ "cell_type": "code",
130
+ "execution_count": 7,
131
+ "metadata": {},
132
+ "outputs": [
133
+ {
134
+ "data": {
135
+ "text/plain": [
136
+ "<langchain_objectbox.vectorstores.ObjectBox at 0x2bbbfb9ac80>"
137
+ ]
138
+ },
139
+ "execution_count": 7,
140
+ "metadata": {},
141
+ "output_type": "execute_result"
142
+ }
143
+ ],
144
+ "source": [
145
+ "from langchain_openai import OpenAIEmbeddings\n",
146
+ "vector = ObjectBox.from_documents(documents, OpenAIEmbeddings(), embedding_dimensions=768)\n",
147
+ "vector"
148
+ ]
149
+ },
150
+ {
151
+ "cell_type": "markdown",
152
+ "metadata": {},
153
+ "source": [
154
+ "# 3. Make a RAG pipeline\n"
155
+ ]
156
+ },
157
+ {
158
+ "cell_type": "code",
159
+ "execution_count": 8,
160
+ "metadata": {},
161
+ "outputs": [],
162
+ "source": [
163
+ "from langchain_openai import ChatOpenAI\n",
164
+ "from langchain_core.output_parsers import StrOutputParser\n",
165
+ "from langchain_core.prompts import ChatPromptTemplate\n",
166
+ "from langchain.chains import RetrievalQA\n",
167
+ "from langchain import hub"
168
+ ]
169
+ },
170
+ {
171
+ "cell_type": "code",
172
+ "execution_count": 9,
173
+ "metadata": {},
174
+ "outputs": [
175
+ {
176
+ "data": {
177
+ "text/plain": [
178
+ "ChatPromptTemplate(input_variables=['context', 'question'], metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template=\"You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\\nQuestion: {question} \\nContext: {context} \\nAnswer:\"))])"
179
+ ]
180
+ },
181
+ "execution_count": 9,
182
+ "metadata": {},
183
+ "output_type": "execute_result"
184
+ }
185
+ ],
186
+ "source": [
187
+ "llm = ChatOpenAI(model=\"gpt-4o\") ## Calling Gpt-4o\n",
188
+ "prompt = hub.pull(\"rlm/rag-prompt\")\n",
189
+ "prompt\n",
190
+ "\n"
191
+ ]
192
+ },
193
+ {
194
+ "cell_type": "code",
195
+ "execution_count": 12,
196
+ "metadata": {},
197
+ "outputs": [
198
+ {
199
+ "data": {
200
+ "text/plain": [
201
+ "ChatPromptTemplate(input_variables=['context', 'question'], metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template=\"You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\\nQuestion: {question} \\nContext: {context} \\nAnswer:\"))])"
202
+ ]
203
+ },
204
+ "execution_count": 12,
205
+ "metadata": {},
206
+ "output_type": "execute_result"
207
+ }
208
+ ],
209
+ "source": [
210
+ "prompt"
211
+ ]
212
+ },
213
+ {
214
+ "cell_type": "code",
215
+ "execution_count": 10,
216
+ "metadata": {},
217
+ "outputs": [],
218
+ "source": [
219
+ "qa_chain = RetrievalQA.from_chain_type(\n",
220
+ " llm,\n",
221
+ " retriever=vector.as_retriever(),\n",
222
+ " chain_type_kwargs={\"prompt\": prompt}\n",
223
+ " )"
224
+ ]
225
+ },
226
+ {
227
+ "cell_type": "code",
228
+ "execution_count": 11,
229
+ "metadata": {},
230
+ "outputs": [
231
+ {
232
+ "name": "stderr",
233
+ "output_type": "stream",
234
+ "text": [
235
+ "e:\\Langchain\\venv\\lib\\site-packages\\langchain_core\\_api\\deprecation.py:119: LangChainDeprecationWarning: The method `Chain.__call__` was deprecated in langchain 0.1.0 and will be removed in 0.2.0. Use invoke instead.\n",
236
+ " warn_deprecated(\n"
237
+ ]
238
+ },
239
+ {
240
+ "data": {
241
+ "text/plain": [
242
+ "{'query': 'Explain what is langsmith',\n",
243
+ " 'result': 'LangSmith is a platform designed for the development, monitoring, and testing of LLM (Large Language Model) applications. It supports various stages of the application lifecycle, including prototyping, debugging, beta testing, and production. The platform offers tools for tracing, evaluation, feedback collection, A/B testing, and automations to ensure optimal performance and quality of LLM applications.'}"
244
+ ]
245
+ },
246
+ "execution_count": 11,
247
+ "metadata": {},
248
+ "output_type": "execute_result"
249
+ }
250
+ ],
251
+ "source": [
252
+ "question = \"Explain what is langsmith\"\n",
253
+ "result = qa_chain({\"query\": question })\n",
254
+ "result"
255
+ ]
256
+ },
257
+ {
258
+ "cell_type": "code",
259
+ "execution_count": 12,
260
+ "metadata": {},
261
+ "outputs": [
262
+ {
263
+ "data": {
264
+ "text/plain": [
265
+ "'LangSmith is a platform designed for the development, monitoring, and testing of LLM (Large Language Model) applications. It supports various stages of the application lifecycle, including prototyping, debugging, beta testing, and production. The platform offers tools for tracing, evaluation, feedback collection, A/B testing, and automations to ensure optimal performance and quality of LLM applications.'"
266
+ ]
267
+ },
268
+ "execution_count": 12,
269
+ "metadata": {},
270
+ "output_type": "execute_result"
271
+ }
272
+ ],
273
+ "source": [
274
+ "result[\"result\"]"
275
+ ]
276
+ },
277
+ {
278
+ "cell_type": "code",
279
+ "execution_count": 13,
280
+ "metadata": {},
281
+ "outputs": [
282
+ {
283
+ "name": "stdout",
284
+ "output_type": "stream",
285
+ "text": [
286
+ "('LangSmith is a platform designed for the development, monitoring, and '\n",
287
+ " 'testing of LLM (Large Language Model) applications. It supports various '\n",
288
+ " 'stages of the application lifecycle, including prototyping, debugging, beta '\n",
289
+ " 'testing, and production. The platform offers tools for tracing, evaluation, '\n",
290
+ " 'feedback collection, A/B testing, and automations to ensure optimal '\n",
291
+ " 'performance and quality of LLM applications.')\n"
292
+ ]
293
+ }
294
+ ],
295
+ "source": [
296
+ "import pprint\n",
297
+ "pp = pprint.PrettyPrinter(indent=5)\n",
298
+ "pp.pprint(result[\"result\"])"
299
+ ]
300
+ },
301
+ {
302
+ "cell_type": "code",
303
+ "execution_count": 14,
304
+ "metadata": {},
305
+ "outputs": [
306
+ {
307
+ "data": {
308
+ "text/plain": [
309
+ "{'query': 'Explain Monitoring and A/B Testing in langsmith',\n",
310
+ " 'result': 'LangSmith provides monitoring charts to track key metrics over time, allowing users to drill down into specific data points for debugging production issues. It also supports A/B testing by enabling tag and metadata grouping, which allows users to compare the performance of different versions of their applications side-by-side within each chart. This helps in evaluating changes in prompt, model, or retrieval strategy effectively.'}"
311
+ ]
312
+ },
313
+ "execution_count": 14,
314
+ "metadata": {},
315
+ "output_type": "execute_result"
316
+ }
317
+ ],
318
+ "source": [
319
+ "question = \"Explain Monitoring and A/B Testing in langsmith\"\n",
320
+ "result = qa_chain({\"query\": question })\n",
321
+ "result"
322
+ ]
323
+ },
324
+ {
325
+ "cell_type": "code",
326
+ "execution_count": null,
327
+ "metadata": {},
328
+ "outputs": [],
329
+ "source": []
330
+ }
331
+ ],
332
+ "metadata": {
333
+ "kernelspec": {
334
+ "display_name": "base",
335
+ "language": "python",
336
+ "name": "python3"
337
+ },
338
+ "language_info": {
339
+ "codemirror_mode": {
340
+ "name": "ipython",
341
+ "version": 3
342
+ },
343
+ "file_extension": ".py",
344
+ "mimetype": "text/x-python",
345
+ "name": "python",
346
+ "nbconvert_exporter": "python",
347
+ "pygments_lexer": "ipython3",
348
+ "version": "3.10.0"
349
+ }
350
+ },
351
+ "nbformat": 4,
352
+ "nbformat_minor": 2
353
+ }
acsbr-015.pdf ADDED
Binary file (872 kB).
acsbr-016.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:efdd4140ab4bfd3801771525f4c784dedeaec7c4f83aaa382517aae37ea05eed
3
+ size 2286774
acsbr-017.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4cacfe8c64d32bf3a5a7729a271cbf7a526c3bea798c866e075af033f50d5d81
3
+ size 1389492
agents.ipynb ADDED
@@ -0,0 +1,392 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 3,
6
+ "metadata": {},
7
+ "outputs": [
8
+ {
9
+ "name": "stdout",
10
+ "output_type": "stream",
11
+ "text": [
12
+ "Collecting arxiv\n",
13
+ " Using cached arxiv-2.1.0-py3-none-any.whl.metadata (6.1 kB)\n",
14
+ "Collecting feedparser==6.0.10 (from arxiv)\n",
15
+ " Using cached feedparser-6.0.10-py3-none-any.whl.metadata (2.3 kB)\n",
16
+ "Requirement already satisfied: requests==2.31.0 in e:\\langchain\\venv\\lib\\site-packages (from arxiv) (2.31.0)\n",
17
+ "Collecting sgmllib3k (from feedparser==6.0.10->arxiv)\n",
18
+ " Using cached sgmllib3k-1.0.0-py3-none-any.whl\n",
19
+ "Requirement already satisfied: charset-normalizer<4,>=2 in e:\\langchain\\venv\\lib\\site-packages (from requests==2.31.0->arxiv) (3.3.2)\n",
20
+ "Requirement already satisfied: idna<4,>=2.5 in e:\\langchain\\venv\\lib\\site-packages (from requests==2.31.0->arxiv) (3.6)\n",
21
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in e:\\langchain\\venv\\lib\\site-packages (from requests==2.31.0->arxiv) (2.2.1)\n",
22
+ "Requirement already satisfied: certifi>=2017.4.17 in e:\\langchain\\venv\\lib\\site-packages (from requests==2.31.0->arxiv) (2024.2.2)\n",
23
+ "Using cached arxiv-2.1.0-py3-none-any.whl (11 kB)\n",
24
+ "Using cached feedparser-6.0.10-py3-none-any.whl (81 kB)\n",
25
+ "Installing collected packages: sgmllib3k, feedparser, arxiv\n",
26
+ "Successfully installed arxiv-2.1.0 feedparser-6.0.10 sgmllib3k-1.0.0\n"
27
+ ]
28
+ }
29
+ ],
30
+ "source": [
31
+ "!pip install arxiv"
32
+ ]
33
+ },
34
+ {
35
+ "cell_type": "code",
36
+ "execution_count": 6,
37
+ "metadata": {},
38
+ "outputs": [],
39
+ "source": [
40
+ "from langchain_community.tools import WikipediaQueryRun\n",
41
+ "from langchain_community.utilities import WikipediaAPIWrapper"
42
+ ]
43
+ },
44
+ {
45
+ "cell_type": "code",
46
+ "execution_count": 16,
47
+ "metadata": {},
48
+ "outputs": [],
49
+ "source": [
50
+ "api_wrapper=WikipediaAPIWrapper(top_k_results=1,doc_content_chars_max=200)\n",
51
+ "wiki=WikipediaQueryRun(api_wrapper=api_wrapper)"
52
+ ]
53
+ },
54
+ {
55
+ "cell_type": "code",
56
+ "execution_count": 17,
57
+ "metadata": {},
58
+ "outputs": [
59
+ {
60
+ "data": {
61
+ "text/plain": [
62
+ "'wikipedia'"
63
+ ]
64
+ },
65
+ "execution_count": 17,
66
+ "metadata": {},
67
+ "output_type": "execute_result"
68
+ }
69
+ ],
70
+ "source": [
71
+ "wiki.name"
72
+ ]
73
+ },
74
+ {
75
+ "cell_type": "code",
76
+ "execution_count": 12,
77
+ "metadata": {},
78
+ "outputs": [
79
+ {
80
+ "data": {
81
+ "text/plain": [
82
+ "VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000023FFF852E60>)"
83
+ ]
84
+ },
85
+ "execution_count": 12,
86
+ "metadata": {},
87
+ "output_type": "execute_result"
88
+ }
89
+ ],
90
+ "source": [
91
+ "from langchain_community.document_loaders import WebBaseLoader\n",
92
+ "from langchain_community.vectorstores import FAISS\n",
93
+ "from langchain_openai import OpenAIEmbeddings\n",
94
+ "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
95
+ "\n",
96
+ "loader=WebBaseLoader(\"https://docs.smith.langchain.com/\")\n",
97
+ "docs=loader.load()\n",
98
+ "documents=RecursiveCharacterTextSplitter(chunk_size=1000,chunk_overlap=200).split_documents(docs)\n",
99
+ "vectordb=FAISS.from_documents(documents,OpenAIEmbeddings())\n",
100
+ "retriever=vectordb.as_retriever()\n",
101
+ "retriever\n"
102
+ ]
103
+ },
104
+ {
105
+ "cell_type": "code",
106
+ "execution_count": 13,
107
+ "metadata": {},
108
+ "outputs": [],
109
+ "source": [
110
+ "from langchain.tools.retriever import create_retriever_tool\n",
111
+ "retriever_tool=create_retriever_tool(retriever,\"langsmith_search\",\n",
112
+ " \"Search for information about LangSmith. For any questions about LangSmith, you must use this tool!\")"
113
+ ]
114
+ },
115
+ {
116
+ "cell_type": "code",
117
+ "execution_count": 14,
118
+ "metadata": {},
119
+ "outputs": [
120
+ {
121
+ "data": {
122
+ "text/plain": [
123
+ "'langsmith_search'"
124
+ ]
125
+ },
126
+ "execution_count": 14,
127
+ "metadata": {},
128
+ "output_type": "execute_result"
129
+ }
130
+ ],
131
+ "source": [
132
+ "retriever_tool.name"
133
+ ]
134
+ },
135
+ {
136
+ "cell_type": "code",
137
+ "execution_count": 15,
138
+ "metadata": {},
139
+ "outputs": [
140
+ {
141
+ "data": {
142
+ "text/plain": [
143
+ "'arxiv'"
144
+ ]
145
+ },
146
+ "execution_count": 15,
147
+ "metadata": {},
148
+ "output_type": "execute_result"
149
+ }
150
+ ],
151
+ "source": [
152
+ "## Arxiv Tool\n",
153
+ "from langchain_community.utilities import ArxivAPIWrapper\n",
154
+ "from langchain_community.tools import ArxivQueryRun\n",
155
+ "\n",
156
+ "arxiv_wrapper=ArxivAPIWrapper(top_k_results=1, doc_content_chars_max=200)\n",
157
+ "arxiv=ArxivQueryRun(api_wrapper=arxiv_wrapper)\n",
158
+ "arxiv.name"
159
+ ]
160
+ },
161
+ {
162
+ "cell_type": "code",
163
+ "execution_count": 18,
164
+ "metadata": {},
165
+ "outputs": [],
166
+ "source": [
167
+ "tools=[wiki,arxiv,retriever_tool]"
168
+ ]
169
+ },
170
+ {
171
+ "cell_type": "code",
172
+ "execution_count": 19,
173
+ "metadata": {},
174
+ "outputs": [
175
+ {
176
+ "data": {
177
+ "text/plain": [
178
+ "[WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(wiki_client=<module 'wikipedia' from 'e:\\\\Langchain\\\\venv\\\\lib\\\\site-packages\\\\wikipedia\\\\__init__.py'>, top_k_results=1, lang='en', load_all_available_meta=False, doc_content_chars_max=200)),\n",
179
+ " ArxivQueryRun(api_wrapper=ArxivAPIWrapper(arxiv_search=<class 'arxiv.Search'>, arxiv_exceptions=(<class 'arxiv.ArxivError'>, <class 'arxiv.UnexpectedEmptyPageError'>, <class 'arxiv.HTTPError'>), top_k_results=1, ARXIV_MAX_QUERY_LENGTH=300, load_max_docs=100, load_all_available_meta=False, doc_content_chars_max=200, arxiv_result=<class 'arxiv.Result'>)),\n",
180
+ " Tool(name='langsmith_search', description='Search for information about LangSmith. For any questions about LangSmith, you must use this tool!', args_schema=<class 'langchain.tools.retriever.RetrieverInput'>, func=functools.partial(<function _get_relevant_documents at 0x0000023F870F6290>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000023FFF852E60>), document_prompt=PromptTemplate(input_variables=['page_content'], template='{page_content}'), document_separator='\\n\\n'), coroutine=functools.partial(<function _aget_relevant_documents at 0x0000023F870F6440>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000023FFF852E60>), document_prompt=PromptTemplate(input_variables=['page_content'], template='{page_content}'), document_separator='\\n\\n'))]"
181
+ ]
182
+ },
183
+ "execution_count": 19,
184
+ "metadata": {},
185
+ "output_type": "execute_result"
186
+ }
187
+ ],
188
+ "source": [
189
+ "tools"
190
+ ]
191
+ },
192
+ {
193
+ "cell_type": "code",
194
+ "execution_count": 20,
195
+ "metadata": {},
196
+ "outputs": [],
197
+ "source": [
198
+ "from dotenv import load_dotenv\n",
199
+ "\n",
200
+ "load_dotenv()\n",
201
+ "import os\n",
202
+ "os.environ['OPENAI_API_KEY']=os.getenv(\"OPENAI_API_KEY\")\n",
203
+ "from langchain_openai import ChatOpenAI\n",
204
+ "\n",
205
+ "llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
206
+ ]
207
+ },
208
+ {
209
+ "cell_type": "code",
210
+ "execution_count": 22,
211
+ "metadata": {},
212
+ "outputs": [
213
+ {
214
+ "data": {
215
+ "text/plain": [
216
+ "[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant')),\n",
217
+ " MessagesPlaceholder(variable_name='chat_history', optional=True),\n",
218
+ " HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')),\n",
219
+ " MessagesPlaceholder(variable_name='agent_scratchpad')]"
220
+ ]
221
+ },
222
+ "execution_count": 22,
223
+ "metadata": {},
224
+ "output_type": "execute_result"
225
+ }
226
+ ],
227
+ "source": [
228
+ "from langchain import hub\n",
229
+ "# Get the prompt to use - you can modify this!\n",
230
+ "prompt = hub.pull(\"hwchase17/openai-functions-agent\")\n",
231
+ "prompt.messages"
232
+ ]
233
+ },
234
+ {
235
+ "cell_type": "code",
236
+ "execution_count": 23,
237
+ "metadata": {},
238
+ "outputs": [],
239
+ "source": [
240
+ "### Agents\n",
241
+ "from langchain.agents import create_openai_tools_agent\n",
242
+ "agent=create_openai_tools_agent(llm,tools,prompt)"
243
+ ]
244
+ },
245
+ {
246
+ "cell_type": "code",
247
+ "execution_count": 26,
248
+ "metadata": {},
249
+ "outputs": [
250
+ {
251
+ "data": {
252
+ "text/plain": [
253
+ "AgentExecutor(verbose=True, agent=RunnableMultiActionAgent(runnable=RunnableAssign(mapper={\n",
254
+ " agent_scratchpad: RunnableLambda(lambda x: format_to_openai_tool_messages(x['intermediate_steps']))\n",
255
+ "})\n",
256
+ "| ChatPromptTemplate(input_variables=['agent_scratchpad', 'input'], input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]], 'agent_scratchpad': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, metadata={'lc_hub_owner': 'hwchase17', 'lc_hub_repo': 'openai-functions-agent', 'lc_hub_commit_hash': 'a1655024b06afbd95d17449f21316291e0726f13dcfaf990cc0d18087ad689a5'}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant')), MessagesPlaceholder(variable_name='chat_history', optional=True), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')), MessagesPlaceholder(variable_name='agent_scratchpad')])\n",
257
+ "| RunnableBinding(bound=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x0000023FC90ED990>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x0000023FC90EEE30>, model_name='gpt-3.5-turbo-0125', temperature=0.0, openai_api_key=SecretStr('**********'), openai_proxy=''), kwargs={'tools': [{'type': 'function', 'function': {'name': 'wikipedia', 'description': 'A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.', 'parameters': {'properties': {'__arg1': {'title': '__arg1', 'type': 'string'}}, 'required': ['__arg1'], 'type': 'object'}}}, {'type': 'function', 'function': {'name': 'arxiv', 'description': 'A wrapper around Arxiv.org Useful for when you need to answer questions about Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance, Statistics, Electrical Engineering, and Economics from scientific articles on arxiv.org. Input should be a search query.', 'parameters': {'type': 'object', 'properties': {'query': {'description': 'search query to look up', 'type': 'string'}}, 'required': ['query']}}}, {'type': 'function', 'function': {'name': 'langsmith_search', 'description': 'Search for information about LangSmith. For any questions about LangSmith, you must use this tool!', 'parameters': {'type': 'object', 'properties': {'query': {'description': 'query to look up in retriever', 'type': 'string'}}, 'required': ['query']}}}]})\n",
258
+ "| OpenAIToolsAgentOutputParser(), input_keys_arg=[], return_keys_arg=[], stream_runnable=True), tools=[WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(wiki_client=<module 'wikipedia' from 'e:\\\\Langchain\\\\venv\\\\lib\\\\site-packages\\\\wikipedia\\\\__init__.py'>, top_k_results=1, lang='en', load_all_available_meta=False, doc_content_chars_max=200)), ArxivQueryRun(api_wrapper=ArxivAPIWrapper(arxiv_search=<class 'arxiv.Search'>, arxiv_exceptions=(<class 'arxiv.ArxivError'>, <class 'arxiv.UnexpectedEmptyPageError'>, <class 'arxiv.HTTPError'>), top_k_results=1, ARXIV_MAX_QUERY_LENGTH=300, load_max_docs=100, load_all_available_meta=False, doc_content_chars_max=200, arxiv_result=<class 'arxiv.Result'>)), Tool(name='langsmith_search', description='Search for information about LangSmith. For any questions about LangSmith, you must use this tool!', args_schema=<class 'langchain.tools.retriever.RetrieverInput'>, func=functools.partial(<function _get_relevant_documents at 0x0000023F870F6290>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000023FFF852E60>), document_prompt=PromptTemplate(input_variables=['page_content'], template='{page_content}'), document_separator='\\n\\n'), coroutine=functools.partial(<function _aget_relevant_documents at 0x0000023F870F6440>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000023FFF852E60>), document_prompt=PromptTemplate(input_variables=['page_content'], template='{page_content}'), document_separator='\\n\\n'))])"
259
+ ]
260
+ },
261
+ "execution_count": 26,
262
+ "metadata": {},
263
+ "output_type": "execute_result"
264
+ }
265
+ ],
266
+ "source": [
267
+ "## Agent Executer\n",
268
+ "from langchain.agents import AgentExecutor\n",
269
+ "agent_executor=AgentExecutor(agent=agent,tools=tools,verbose=True)\n",
270
+ "agent_executor"
271
+ ]
272
+ },
273
+ {
274
+ "cell_type": "code",
275
+ "execution_count": 27,
276
+ "metadata": {},
277
+ "outputs": [
278
+ {
279
+ "name": "stdout",
280
+ "output_type": "stream",
281
+ "text": [
282
+ "\n",
283
+ "\n",
284
+ "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
285
+ "\u001b[32;1m\u001b[1;3m\n",
286
+ "Invoking: `langsmith_search` with `{'query': 'Langsmith'}`\n",
287
+ "\n",
288
+ "\n",
289
+ "\u001b[0m\u001b[38;5;200m\u001b[1;3mGetting started with LangSmith | 🦜️🛠️ LangSmith\n",
290
+ "\n",
291
+ "Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookQuick StartOn this pageGetting started with LangSmithIntroduction​LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!Install LangSmith​We offer Python and Typescript SDKs for all your LangSmith needs.PythonTypeScriptpip install -U langsmithyarn add langchain langsmithCreate an API key​To create an API key head to the setting pages. Then click Create API Key.Setup your environment​Shellexport LANGCHAIN_TRACING_V2=trueexport LANGCHAIN_API_KEY=<your-api-key># The below examples use the OpenAI API, though it's not necessary in generalexport OPENAI_API_KEY=<your-openai-api-key>Log your first trace​We provide multiple ways to log traces\n",
292
+ "\n",
293
+ "Learn about the workflows LangSmith supports at each stage of the LLM application lifecycle.Pricing: Learn about the pricing model for LangSmith.Self-Hosting: Learn about self-hosting options for LangSmith.Proxy: Learn about the proxy capabilities of LangSmith.Tracing: Learn about the tracing capabilities of LangSmith.Evaluation: Learn about the evaluation capabilities of LangSmith.Prompt Hub Learn about the Prompt Hub, a prompt management tool built into LangSmith.Additional Resources​LangSmith Cookbook: A collection of tutorials and end-to-end walkthroughs using LangSmith.LangChain Python: Docs for the Python LangChain library.LangChain Python API Reference: documentation to review the core APIs of LangChain.LangChain JS: Docs for the TypeScript LangChain libraryDiscord: Join us on our Discord to discuss all things LangChain!FAQ​How do I migrate projects between organizations?​Currently we do not support project migration betwen organizations. While you can manually imitate this by\n",
294
+ "\n",
295
+ "team deals with sensitive data that cannot be logged. How can I ensure that only my team can access it?​If you are interested in a private deployment of LangSmith or if you need to self-host, please reach out to us at sales@langchain.dev. Self-hosting LangSmith requires an annual enterprise license that also comes with support and formalized access to the LangChain team.Was this page helpful?NextUser GuideIntroductionInstall LangSmithCreate an API keySetup your environmentLog your first traceCreate your first evaluationNext StepsAdditional ResourcesFAQHow do I migrate projects between organizations?Why aren't my runs aren't showing up in my project?My team deals with sensitive data that cannot be logged. How can I ensure that only my team can access it?CommunityDiscordTwitterGitHubDocs CodeLangSmith SDKPythonJS/TSMoreHomepageBlogLangChain Python DocsLangChain JS/TS DocsCopyright © 2024 LangChain, Inc.\u001b[0m\u001b[32;1m\u001b[1;3mLangSmith is a platform for building production-grade LLM (Large Language Model) applications. It allows you to closely monitor and evaluate your application, ensuring you can ship quickly and with confidence. LangSmith offers Python and Typescript SDKs for all your development needs. You can create an API key, set up your environment, and log your first trace using LangSmith.\n",
296
+ "\n",
297
+ "LangSmith supports various workflows at each stage of the LLM application lifecycle, and you can learn more about its pricing model, self-hosting options, proxy capabilities, tracing capabilities, evaluation capabilities, and the Prompt Hub tool. Additionally, LangSmith provides a Cookbook with tutorials and walkthroughs, along with documentation for the Python and TypeScript LangChain libraries.\n",
298
+ "\n",
299
+ "If you have sensitive data that cannot be logged or if you need a private deployment of LangSmith, you can reach out to the LangChain team at sales@langchain.dev for self-hosting options that require an annual enterprise license.\n",
300
+ "\n",
301
+ "For more information, you can visit the LangSmith website or join their Discord community to discuss all things LangChain.\u001b[0m\n",
302
+ "\n",
303
+ "\u001b[1m> Finished chain.\u001b[0m\n"
304
+ ]
305
+ },
306
+ {
307
+ "data": {
308
+ "text/plain": [
309
+ "{'input': 'Tell me about Langsmith',\n",
310
+ " 'output': 'LangSmith is a platform for building production-grade LLM (Large Language Model) applications. It allows you to closely monitor and evaluate your application, ensuring you can ship quickly and with confidence. LangSmith offers Python and Typescript SDKs for all your development needs. You can create an API key, set up your environment, and log your first trace using LangSmith.\\n\\nLangSmith supports various workflows at each stage of the LLM application lifecycle, and you can learn more about its pricing model, self-hosting options, proxy capabilities, tracing capabilities, evaluation capabilities, and the Prompt Hub tool. Additionally, LangSmith provides a Cookbook with tutorials and walkthroughs, along with documentation for the Python and TypeScript LangChain libraries.\\n\\nIf you have sensitive data that cannot be logged or if you need a private deployment of LangSmith, you can reach out to the LangChain team at sales@langchain.dev for self-hosting options that require an annual enterprise license.\\n\\nFor more information, you can visit the LangSmith website or join their Discord community to discuss all things LangChain.'}"
311
+ ]
312
+ },
313
+ "execution_count": 27,
314
+ "metadata": {},
315
+ "output_type": "execute_result"
316
+ }
317
+ ],
318
+ "source": [
319
+ "agent_executor.invoke({\"input\":\"Tell me about Langsmith\"})"
320
+ ]
321
+ },
322
+ {
323
+ "cell_type": "code",
324
+ "execution_count": 29,
325
+ "metadata": {},
326
+ "outputs": [
327
+ {
328
+ "name": "stdout",
329
+ "output_type": "stream",
330
+ "text": [
331
+ "\n",
332
+ "\n",
333
+ "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
334
+ "\u001b[32;1m\u001b[1;3m\n",
335
+ "Invoking: `arxiv` with `{'query': '1605.08386'}`\n",
336
+ "\n",
337
+ "\n",
338
+ "\u001b[0m\u001b[33;1m\u001b[1;3mPublished: 2016-05-26\n",
339
+ "Title: Heat-bath random walks with Markov bases\n",
340
+ "Authors: Caprice Stanley, Tobias Windisch\n",
341
+ "Summary: Graphs on lattice points are studied whose edges come from a finite set of\n",
342
+ "allo\u001b[0m\u001b[32;1m\u001b[1;3mThe paper with the identifier 1605.08386 is titled \"Heat-bath random walks with Markov bases\" by Caprice Stanley and Tobias Windisch. It discusses the study of graphs on lattice points where the edges are derived from a finite set of allocations.\u001b[0m\n",
343
+ "\n",
344
+ "\u001b[1m> Finished chain.\u001b[0m\n"
345
+ ]
346
+ },
347
+ {
348
+ "data": {
349
+ "text/plain": [
350
+ "{'input': \"What's the paper 1605.08386 about?\",\n",
351
+ " 'output': 'The paper with the identifier 1605.08386 is titled \"Heat-bath random walks with Markov bases\" by Caprice Stanley and Tobias Windisch. It discusses the study of graphs on lattice points where the edges are derived from a finite set of allocations.'}"
352
+ ]
353
+ },
354
+ "execution_count": 29,
355
+ "metadata": {},
356
+ "output_type": "execute_result"
357
+ }
358
+ ],
359
+ "source": [
360
+ "agent_executor.invoke({\"input\":\"What's the paper 1605.08386 about?\"})"
361
+ ]
362
+ },
363
+ {
364
+ "cell_type": "code",
365
+ "execution_count": null,
366
+ "metadata": {},
367
+ "outputs": [],
368
+ "source": []
369
+ }
370
+ ],
371
+ "metadata": {
372
+ "kernelspec": {
373
+ "display_name": "Python 3",
374
+ "language": "python",
375
+ "name": "python3"
376
+ },
377
+ "language_info": {
378
+ "codemirror_mode": {
379
+ "name": "ipython",
380
+ "version": 3
381
+ },
382
+ "file_extension": ".py",
383
+ "mimetype": "text/x-python",
384
+ "name": "python",
385
+ "nbconvert_exporter": "python",
386
+ "pygments_lexer": "ipython3",
387
+ "version": "3.10.0"
388
+ }
389
+ },
390
+ "nbformat": 4,
391
+ "nbformat_minor": 2
392
+ }
app.py ADDED
@@ -0,0 +1,52 @@
1
+ from fastapi import FastAPI
2
+ from langchain.prompts import ChatPromptTemplate
3
+ from langchain.chat_models import ChatOpenAI
4
+ from langserve import add_routes
5
+ import uvicorn
6
+ import os
7
+ from langchain_community.llms import Ollama
8
+ from dotenv import load_dotenv
9
+
10
+ load_dotenv()
11
+
12
+ os.environ['OPENAI_API_KEY']=os.getenv("OPENAI_API_KEY")
13
+
14
+ app=FastAPI(
15
+ title="Langchain Server",
16
+ version="1.0",
17
+ description="A simple API Server"
18
+
19
+ )
20
+
21
+ add_routes(
22
+ app,
23
+ ChatOpenAI(),
24
+ path="/openai"
25
+ )
26
+ model=ChatOpenAI()
27
+ ##ollama llama2
28
+ llm=Ollama(model="llama2")
29
+
30
+ prompt1=ChatPromptTemplate.from_template("Write me an essay about {topic} with 100 words")
31
+ prompt2=ChatPromptTemplate.from_template("Write me a poem about {topic} for a 5-year-old child with 100 words")
32
+
33
+ add_routes(
34
+ app,
35
+ prompt1|model,
36
+ path="/essay"
37
+
38
+
39
+ )
40
+
41
+ add_routes(
42
+ app,
43
+ prompt2|llm,
44
+ path="/poem"
45
+
46
+
47
+ )
48
+
49
+
50
+ if __name__=="__main__":
51
+ uvicorn.run(app,host="localhost",port=8000)
52
+
attention.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b7d72988fd8107d07f7d278bf0ba6621adb6ed47df74be4014fa4a01f03aff6a
3
+ size 2215244
client.py ADDED
@@ -0,0 +1,27 @@
1
+ import requests
2
+ import streamlit as st
3
+
4
+ def get_openai_response(input_text):
5
+ response=requests.post("http://localhost:8000/essay/invoke",
6
+ json={'input':{'topic':input_text}})
7
+
8
+ return response.json()['output']['content']
9
+
10
+ def get_ollama_response(input_text):
11
+ response=requests.post(
12
+ "http://localhost:8000/poem/invoke",
13
+ json={'input':{'topic':input_text}})
14
+
15
+ return response.json()['output']
16
+
17
+ ## streamlit framework
18
+
19
+ st.title('Langchain Demo With LLAMA2 API')
20
+ input_text=st.text_input("Write an essay on")
21
+ input_text1=st.text_input("Write a poem on")
22
+
23
+ if input_text:
24
+ st.write(get_openai_response(input_text))
25
+
26
+ if input_text1:
27
+ st.write(get_ollama_response(input_text1))
groq.ipynb ADDED
@@ -0,0 +1,278 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 32,
6
+ "metadata": {},
7
+ "outputs": [
8
+ {
9
+ "data": {
10
+ "text/plain": [
11
+ "True"
12
+ ]
13
+ },
14
+ "execution_count": 32,
15
+ "metadata": {},
16
+ "output_type": "execute_result"
17
+ }
18
+ ],
19
+ "source": [
20
+ "import os\n",
21
+ "from langchain_groq import ChatGroq\n",
22
+ "from langchain_community.document_loaders import WebBaseLoader\n",
23
+ "from langchain_community.embeddings import OllamaEmbeddings\n",
24
+ "from langchain_community.embeddings import OpenAIEmbeddings\n",
25
+ "from langchain.vectorstores.cassandra import Cassandra\n",
26
+ "import cassio\n",
27
+ "from dotenv import load_dotenv\n",
28
+ "load_dotenv()"
29
+ ]
30
+ },
31
+ {
32
+ "cell_type": "code",
33
+ "execution_count": 33,
34
+ "metadata": {},
35
+ "outputs": [],
36
+ "source": [
37
+ "groq_api_key=os.environ['GROQ_API_KEY']\n",
38
+ "\n",
39
+ "## connection of the ASTRA DB\n",
40
+ "ASTRA_DB_APPLICATION_TOKEN=\"AstraCS:mifTZmAApXkXyN:8ae5390b5181dbf4ed575bbc3ccddbb9845300cb7071b9f98bf3ba72cbca837c\" # enter the \"AstraCS:...\" string found in in your Token JSON file\"\n",
41
+ "ASTRA_DB_ID=\"31d5fd09-8c1f-c-aee0bda20405\"\n",
42
+ "cassio.init(token=ASTRA_DB_APPLICATION_TOKEN,database_id=ASTRA_DB_ID)"
43
+ ]
44
+ },
45
+ {
46
+ "cell_type": "code",
47
+ "execution_count": 34,
48
+ "metadata": {},
49
+ "outputs": [],
50
+ "source": [
51
+ "from langchain_community.document_loaders import WebBaseLoader\n",
52
+ "import bs4\n",
53
+ "loader=WebBaseLoader(web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n",
54
+ " bs_kwargs=dict(parse_only=bs4.SoupStrainer(\n",
55
+ " class_=(\"post-title\",\"post-content\",\"post-header\")\n",
56
+ "\n",
57
+ " )))\n",
58
+ "\n",
59
+ "text_documents=loader.load()"
60
+ ]
61
+ },
62
+ {
63
+ "cell_type": "code",
64
+ "execution_count": 35,
65
+ "metadata": {},
66
+ "outputs": [
67
+ {
68
+ "data": {
69
+ "text/plain": [
70
+ "[Document(page_content='\\n\\n LLM Powered Autonomous Agents\\n \\nDate: June 23, 2023 | Estimated Reading Time: 31 min | Author: Lilian Weng\\n\\n\\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\\nAgent System Overview#\\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\\n\\nPlanning\\n\\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\\n\\n\\nMemory\\n\\nShort-term memory: I would consider all the in-context learning (See Prompt Engineering) as utilizing short-term memory of the model to learn.\\nLong-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval.\\n\\n\\nTool use\\n\\nThe agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more.\\n\\n\\n\\n\\nFig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\nAnother quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. 
Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\\nSelf-Reflection#\\nSelf-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable.\\nReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.\\nThe ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:\\nThought: ...\\nAction: ...\\nObservation: ...\\n... (Repeated many times)\\n\\nFig. 2. Examples of reasoning trajectories for knowledge-intensive tasks (e.g. HotpotQA, FEVER) and decision-making tasks (e.g. AlfWorld Env, WebShop). (Image source: Yao et al. 2023).\\nIn both experiments on knowledge-intensive tasks and decision-making tasks, ReAct works better than the Act-only baseline where Thought: … step is removed.\\nReflexion (Shinn & Labash 2023) is a framework to equips agents with dynamic memory and self-reflection capabilities to improve reasoning skills. Reflexion has a standard RL setup, in which the reward model provides a simple binary reward and the action space follows the setup in ReAct where the task-specific action space is augmented with language to enable complex reasoning steps. After each action $a_t$, the agent computes a heuristic $h_t$ and optionally may decide to reset the environment to start a new trial depending on the self-reflection results.\\n\\nFig. 3. Illustration of the Reflexion framework. (Image source: Shinn & Labash, 2023)\\nThe heuristic function determines when the trajectory is inefficient or contains hallucination and should be stopped. Inefficient planning refers to trajectories that take too long without success. Hallucination is defined as encountering a sequence of consecutive identical actions that lead to the same observation in the environment.\\nSelf-reflection is created by showing two-shot examples to LLM and each example is a pair of (failed trajectory, ideal reflection for guiding future changes in the plan). Then reflections are added into the agent’s working memory, up to three, to be used as context for querying LLM.\\n\\nFig. 4. Experiments on AlfWorld Env and HotpotQA. Hallucination is a more common failure than inefficient planning in AlfWorld. (Image source: Shinn & Labash, 2023)\\nChain of Hindsight (CoH; Liu et al. 2023) encourages the model to improve on its own outputs by explicitly presenting it with a sequence of past outputs, each annotated with feedback. Human feedback data is a collection of $D_h = \\\\{(x, y_i , r_i , z_i)\\\\}_{i=1}^n$, where $x$ is the prompt, each $y_i$ is a model completion, $r_i$ is the human rating of $y_i$, and $z_i$ is the corresponding human-provided hindsight feedback. Assume the feedback tuples are ranked by reward, $r_n \\\\geq r_{n-1} \\\\geq \\\\dots \\\\geq r_1$ The process is supervised fine-tuning where the data is a sequence in the form of $\\\\tau_h = (x, z_i, y_i, z_j, y_j, \\\\dots, z_n, y_n)$, where $\\\\leq i \\\\leq j \\\\leq n$. 
The model is finetuned to only predict $y_n$ where conditioned on the sequence prefix, such that the model can self-reflect to produce better output based on the feedback sequence. The model can optionally receive multiple rounds of instructions with human annotators at test time.\\nTo avoid overfitting, CoH adds a regularization term to maximize the log-likelihood of the pre-training dataset. To avoid shortcutting and copying (because there are many common words in feedback sequences), they randomly mask 0% - 5% of past tokens during training.\\nThe training dataset in their experiments is a combination of WebGPT comparisons, summarization from human feedback and human preference dataset.\\n\\nFig. 5. After fine-tuning with CoH, the model can follow instructions to produce outputs with incremental improvement in a sequence. (Image source: Liu et al. 2023)\\nThe idea of CoH is to present a history of sequentially improved outputs in context and train the model to take on the trend to produce better outputs. Algorithm Distillation (AD; Laskin et al. 2023) applies the same idea to cross-episode trajectories in reinforcement learning tasks, where an algorithm is encapsulated in a long history-conditioned policy. Considering that an agent interacts with the environment many times and in each episode the agent gets a little better, AD concatenates this learning history and feeds that into the model. Hence we should expect the next predicted action to lead to better performance than previous trials. The goal is to learn the process of RL instead of training a task-specific policy itself.\\n\\nFig. 6. Illustration of how Algorithm Distillation (AD) works. (Image source: Laskin et al. 2023).\\nThe paper hypothesizes that any algorithm that generates a set of learning histories can be distilled into a neural network by performing behavioral cloning over actions. The history data is generated by a set of source policies, each trained for a specific task. At the training stage, during each RL run, a random task is sampled and a subsequence of multi-episode history is used for training, such that the learned policy is task-agnostic.\\nIn reality, the model has limited context window length, so episodes should be short enough to construct multi-episode history. Multi-episodic contexts of 2-4 episodes are necessary to learn a near-optimal in-context RL algorithm. The emergence of in-context RL requires long enough context.\\nIn comparison with three baselines, including ED (expert distillation, behavior cloning with expert trajectories instead of learning history), source policy (used for generating trajectories for distillation by UCB), RL^2 (Duan et al. 2017; used as upper bound since it needs online RL), AD demonstrates in-context RL with performance getting close to RL^2 despite only using offline RL and learns much faster than other baselines. When conditioned on partial training history of the source policy, AD also improves much faster than ED baseline.\\n\\nFig. 7. Comparison of AD, ED, source policy and RL^2 on environments that require memory and exploration. Only binary reward is assigned. The source policies are trained with A3C for \"dark\" environments and DQN for watermaze.(Image source: Laskin et al. 2023)\\nComponent Two: Memory#\\n(Big thank you to ChatGPT for helping me draft this section. 
I’ve learned a lot about the human brain and data structure for fast MIPS in my conversations with ChatGPT.)\\nTypes of Memory#\\nMemory can be defined as the processes used to acquire, store, retain, and later retrieve information. There are several types of memory in human brains.\\n\\n\\nSensory Memory: This is the earliest stage of memory, providing the ability to retain impressions of sensory information (visual, auditory, etc) after the original stimuli have ended. Sensory memory typically only lasts for up to a few seconds. Subcategories include iconic memory (visual), echoic memory (auditory), and haptic memory (touch).\\n\\n\\nShort-Term Memory (STM) or Working Memory: It stores information that we are currently aware of and needed to carry out complex cognitive tasks such as learning and reasoning. Short-term memory is believed to have the capacity of about 7 items (Miller 1956) and lasts for 20-30 seconds.\\n\\n\\nLong-Term Memory (LTM): Long-term memory can store information for a remarkably long time, ranging from a few days to decades, with an essentially unlimited storage capacity. There are two subtypes of LTM:\\n\\nExplicit / declarative memory: This is memory of facts and events, and refers to those memories that can be consciously recalled, including episodic memory (events and experiences) and semantic memory (facts and concepts).\\nImplicit / procedural memory: This type of memory is unconscious and involves skills and routines that are performed automatically, like riding a bike or typing on a keyboard.\\n\\n\\n\\n\\nFig. 8. Categorization of human memory.\\nWe can roughly consider the following mappings:\\n\\nSensory memory as learning embedding representations for raw inputs, including text, image or other modalities;\\nShort-term memory as in-context learning. It is short and finite, as it is restricted by the finite context window length of Transformer.\\nLong-term memory as the external vector store that the agent can attend to at query time, accessible via fast retrieval.\\n\\nMaximum Inner Product Search (MIPS)#\\nThe external memory can alleviate the restriction of finite attention span. A standard practice is to save the embedding representation of information into a vector store database that can support fast maximum inner-product search (MIPS). To optimize the retrieval speed, the common choice is the approximate nearest neighbors (ANN)\\u200b algorithm to return approximately top k nearest neighbors to trade off a little accuracy lost for a huge speedup.\\nA couple common choices of ANN algorithms for fast MIPS:\\n\\nLSH (Locality-Sensitive Hashing): It introduces a hashing function such that similar input items are mapped to the same buckets with high probability, where the number of buckets is much smaller than the number of inputs.\\nANNOY (Approximate Nearest Neighbors Oh Yeah): The core data structure are random projection trees, a set of binary trees where each non-leaf node represents a hyperplane splitting the input space into half and each leaf stores one data point. Trees are built independently and at random, so to some extent, it mimics a hashing function. ANNOY search happens in all the trees to iteratively search through the half that is closest to the query and then aggregates the results. The idea is quite related to KD tree but a lot more scalable.\\nHNSW (Hierarchical Navigable Small World): It is inspired by the idea of small world networks where most nodes can be reached by any other nodes within a small number of steps; e.g. 
“six degrees of separation” feature of social networks. HNSW builds hierarchical layers of these small-world graphs, where the bottom layers contain the actual data points. The layers in the middle create shortcuts to speed up search. When performing a search, HNSW starts from a random node in the top layer and navigates towards the target. When it can’t get any closer, it moves down to the next layer, until it reaches the bottom layer. Each move in the upper layers can potentially cover a large distance in the data space, and each move in the lower layers refines the search quality.\\nFAISS (Facebook AI Similarity Search): It operates on the assumption that in high dimensional space, distances between nodes follow a Gaussian distribution and thus there should exist clustering of data points. FAISS applies vector quantization by partitioning the vector space into clusters and then refining the quantization within clusters. Search first looks for cluster candidates with coarse quantization and then further looks into each cluster with finer quantization.\\nScaNN (Scalable Nearest Neighbors): The main innovation in ScaNN is anisotropic vector quantization. It quantizes a data point $x_i$ to $\\\\tilde{x}_i$ such that the inner product $\\\\langle q, x_i \\\\rangle$ is as similar to the original distance of $\\\\angle q, \\\\tilde{x}_i$ as possible, instead of picking the closet quantization centroid points.\\n\\n\\nFig. 9. Comparison of MIPS algorithms, measured in recall@10. (Image source: Google Blog, 2020)\\nCheck more MIPS algorithms and performance comparison in ann-benchmarks.com.\\nComponent Three: Tool Use#\\nTool use is a remarkable and distinguishing characteristic of human beings. We create, modify and utilize external objects to do things that go beyond our physical and cognitive limits. Equipping LLMs with external tools can significantly extend the model capabilities.\\n\\nFig. 10. A picture of a sea otter using rock to crack open a seashell, while floating in the water. While some other animals can use tools, the complexity is not comparable with humans. (Image source: Animals using tools)\\nMRKL (Karpas et al. 2022), short for “Modular Reasoning, Knowledge and Language”, is a neuro-symbolic architecture for autonomous agents. A MRKL system is proposed to contain a collection of “expert” modules and the general-purpose LLM works as a router to route inquiries to the best suitable expert module. These modules can be neural (e.g. deep learning models) or symbolic (e.g. math calculator, currency converter, weather API).\\nThey did an experiment on fine-tuning LLM to call a calculator, using arithmetic as a test case. Their experiments showed that it was harder to solve verbal math problems than explicitly stated math problems because LLMs (7B Jurassic1-large model) failed to extract the right arguments for the basic arithmetic reliably. The results highlight when the external symbolic tools can work reliably, knowing when to and how to use the tools are crucial, determined by the LLM capability.\\nBoth TALM (Tool Augmented Language Models; Parisi et al. 2022) and Toolformer (Schick et al. 2023) fine-tune a LM to learn to use external tool APIs. The dataset is expanded based on whether a newly added API call annotation can improve the quality of model outputs. See more details in the “External APIs” section of Prompt Engineering.\\nChatGPT Plugins and OpenAI API function calling are good examples of LLMs augmented with tool use capability working in practice. 
The collection of tool APIs can be provided by other developers (as in Plugins) or self-defined (as in function calls).\\nHuggingGPT (Shen et al. 2023) is a framework to use ChatGPT as the task planner to select models available in HuggingFace platform according to the model descriptions and summarize the response based on the execution results.\\n\\nFig. 11. Illustration of how HuggingGPT works. (Image source: Shen et al. 2023)\\nThe system comprises of 4 stages:\\n(1) Task planning: LLM works as the brain and parses the user requests into multiple tasks. There are four attributes associated with each task: task type, ID, dependencies, and arguments. They use few-shot examples to guide LLM to do task parsing and planning.\\nInstruction:\\n\\nThe AI assistant can parse user input to several tasks: [{\"task\": task, \"id\", task_id, \"dep\": dependency_task_ids, \"args\": {\"text\": text, \"image\": URL, \"audio\": URL, \"video\": URL}}]. The \"dep\" field denotes the id of the previous task which generates a new resource that the current task relies on. A special tag \"-task_id\" refers to the generated text image, audio and video in the dependency task with id as task_id. The task MUST be selected from the following options: {{ Available Task List }}. There is a logical relationship between tasks, please note their order. If the user input can\\'t be parsed, you need to reply empty JSON. Here are several cases for your reference: {{ Demonstrations }}. The chat history is recorded as {{ Chat History }}. From this chat history, you can find the path of the user-mentioned resources for your task planning.\\n\\n(2) Model selection: LLM distributes the tasks to expert models, where the request is framed as a multiple-choice question. LLM is presented with a list of models to choose from. Due to the limited context length, task type based filtration is needed.\\nInstruction:\\n\\nGiven the user request and the call command, the AI assistant helps the user to select a suitable model from a list of models to process the user request. The AI assistant merely outputs the model id of the most appropriate model. The output must be in a strict JSON format: \"id\": \"id\", \"reason\": \"your detail reason for the choice\". We have a list of models for you to choose from {{ Candidate Models }}. Please select one model from the list.\\n\\n(3) Task execution: Expert models execute on the specific tasks and log results.\\nInstruction:\\n\\nWith the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user\\'s request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.\\n\\n(4) Response generation: LLM receives the execution results and provides summarized results to users.\\nTo put HuggingGPT into real world usage, a couple challenges need to solve: (1) Efficiency improvement is needed as both LLM inference rounds and interactions with other models slow down the process; (2) It relies on a long context window to communicate over complicated task content; (3) Stability improvement of LLM outputs and external model services.\\nAPI-Bank (Li et al. 
2023) is a benchmark for evaluating the performance of tool-augmented LLMs. It contains 53 commonly used API tools, a complete tool-augmented LLM workflow, and 264 annotated dialogues that involve 568 API calls. The selection of APIs is quite diverse, including search engines, calculator, calendar queries, smart home control, schedule management, health data management, account authentication workflow and more. Because there are a large number of APIs, LLM first has access to API search engine to find the right API to call and then uses the corresponding documentation to make a call.\\n\\nFig. 12. Pseudo code of how LLM makes an API call in API-Bank. (Image source: Li et al. 2023)\\nIn the API-Bank workflow, LLMs need to make a couple of decisions and at each step we can evaluate how accurate that decision is. Decisions include:\\n\\nWhether an API call is needed.\\nIdentify the right API to call: if not good enough, LLMs need to iteratively modify the API inputs (e.g. deciding search keywords for Search Engine API).\\nResponse based on the API results: the model can choose to refine and call again if results are not satisfied.\\n\\nThis benchmark evaluates the agent’s tool use capabilities at three levels:\\n\\nLevel-1 evaluates the ability to call the API. Given an API’s description, the model needs to determine whether to call a given API, call it correctly, and respond properly to API returns.\\nLevel-2 examines the ability to retrieve the API. The model needs to search for possible APIs that may solve the user’s requirement and learn how to use them by reading documentation.\\nLevel-3 assesses the ability to plan API beyond retrieve and call. Given unclear user requests (e.g. schedule group meetings, book flight/hotel/restaurant for a trip), the model may have to conduct multiple API calls to solve it.\\n\\nCase Studies#\\nScientific Discovery Agent#\\nChemCrow (Bran et al. 2023) is a domain-specific example in which LLM is augmented with 13 expert-designed tools to accomplish tasks across organic synthesis, drug discovery, and materials design. The workflow, implemented in LangChain, reflects what was previously described in the ReAct and MRKLs and combines CoT reasoning with tools relevant to the tasks:\\n\\nThe LLM is provided with a list of tool names, descriptions of their utility, and details about the expected input/output.\\nIt is then instructed to answer a user-given prompt using the tools provided when necessary. The instruction suggests the model to follow the ReAct format - Thought, Action, Action Input, Observation.\\n\\nOne interesting observation is that while the LLM-based evaluation concluded that GPT-4 and ChemCrow perform nearly equivalently, human evaluations with experts oriented towards the completion and chemical correctness of the solutions showed that ChemCrow outperforms GPT-4 by a large margin. This indicates a potential problem with using LLM to evaluate its own performance on domains that requires deep expertise. The lack of expertise may cause LLMs not knowing its flaws and thus cannot well judge the correctness of task results.\\nBoiko et al. (2023) also looked into LLM-empowered agents for scientific discovery, to handle autonomous design, planning, and performance of complex scientific experiments. 
This agent can use tools to browse the Internet, read documentation, execute code, call robotics experimentation APIs and leverage other LLMs.\\nFor example, when requested to \"develop a novel anticancer drug\", the model came up with the following reasoning steps:\\n\\ninquired about current trends in anticancer drug discovery;\\nselected a target;\\nrequested a scaffold targeting these compounds;\\nOnce the compound was identified, the model attempted its synthesis.\\n\\nThey also discussed the risks, especially with illicit drugs and bioweapons. They developed a test set containing a list of known chemical weapon agents and asked the agent to synthesize them. 4 out of 11 requests (36%) were accepted to obtain a synthesis solution and the agent attempted to consult documentation to execute the procedure. 7 out of 11 were rejected and among these 7 rejected cases, 5 happened after a Web search while 2 were rejected based on prompt only.\\nGenerative Agents Simulation#\\nGenerative Agents (Park, et al. 2023) is super fun experiment where 25 virtual characters, each controlled by a LLM-powered agent, are living and interacting in a sandbox environment, inspired by The Sims. Generative agents create believable simulacra of human behavior for interactive applications.\\nThe design of generative agents combines LLM with memory, planning and reflection mechanisms to enable agents to behave conditioned on past experience, as well as to interact with other agents.\\n\\nMemory stream: is a long-term memory module (external database) that records a comprehensive list of agents’ experience in natural language.\\n\\nEach element is an observation, an event directly provided by the agent.\\n- Inter-agent communication can trigger new natural language statements.\\n\\n\\nRetrieval model: surfaces the context to inform the agent’s behavior, according to relevance, recency and importance.\\n\\nRecency: recent events have higher scores\\nImportance: distinguish mundane from core memories. Ask LM directly.\\nRelevance: based on how related it is to the current situation / query.\\n\\n\\nReflection mechanism: synthesizes memories into higher level inferences over time and guides the agent’s future behavior. They are higher-level summaries of past events (<- note that this is a bit different from self-reflection above)\\n\\nPrompt LM with 100 most recent observations and to generate 3 most salient high-level questions given a set of observations/statements. Then ask LM to answer those questions.\\n\\n\\nPlanning & Reacting: translate the reflections and the environment information into actions\\n\\nPlanning is essentially in order to optimize believability at the moment vs in time.\\nPrompt template: {Intro of an agent X}. Here is X\\'s plan today in broad strokes: 1)\\nRelationships between agents and observations of one agent by another are all taken into consideration for planning and reacting.\\nEnvironment information is present in a tree structure.\\n\\n\\n\\n\\nFig. 13. The generative agent architecture. (Image source: Park et al. 2023)\\nThis fun simulation results in emergent social behavior, such as information diffusion, relationship memory (e.g. two agents continuing the conversation topic) and coordination of social events (e.g. host a party and invite many others).\\nProof-of-Concept Examples#\\nAutoGPT has drawn a lot of attention into the possibility of setting up autonomous agents with LLM as the main controller. 
It has quite a lot of reliability issues given the natural language interface, but nevertheless a cool proof-of-concept demo. A lot of code in AutoGPT is about format parsing.\\nHere is the system message used by AutoGPT, where {{...}} are user inputs:\\nYou are {{ai-name}}, {{user-provided AI bot description}}.\\nYour decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.\\n\\nGOALS:\\n\\n1. {{user-provided goal 1}}\\n2. {{user-provided goal 2}}\\n3. ...\\n4. ...\\n5. ...\\n\\nConstraints:\\n1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.\\n2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.\\n3. No user assistance\\n4. Exclusively use the commands listed in double quotes e.g. \"command name\"\\n5. Use subprocesses for commands that will not terminate within a few minutes\\n\\nCommands:\\n1. Google Search: \"google\", args: \"input\": \"<search>\"\\n2. Browse Website: \"browse_website\", args: \"url\": \"<url>\", \"question\": \"<what_you_want_to_find_on_website>\"\\n3. Start GPT Agent: \"start_agent\", args: \"name\": \"<name>\", \"task\": \"<short_task_desc>\", \"prompt\": \"<prompt>\"\\n4. Message GPT Agent: \"message_agent\", args: \"key\": \"<key>\", \"message\": \"<message>\"\\n5. List GPT Agents: \"list_agents\", args:\\n6. Delete GPT Agent: \"delete_agent\", args: \"key\": \"<key>\"\\n7. Clone Repository: \"clone_repository\", args: \"repository_url\": \"<url>\", \"clone_path\": \"<directory>\"\\n8. Write to file: \"write_to_file\", args: \"file\": \"<file>\", \"text\": \"<text>\"\\n9. Read file: \"read_file\", args: \"file\": \"<file>\"\\n10. Append to file: \"append_to_file\", args: \"file\": \"<file>\", \"text\": \"<text>\"\\n11. Delete file: \"delete_file\", args: \"file\": \"<file>\"\\n12. Search Files: \"search_files\", args: \"directory\": \"<directory>\"\\n13. Analyze Code: \"analyze_code\", args: \"code\": \"<full_code_string>\"\\n14. Get Improved Code: \"improve_code\", args: \"suggestions\": \"<list_of_suggestions>\", \"code\": \"<full_code_string>\"\\n15. Write Tests: \"write_tests\", args: \"code\": \"<full_code_string>\", \"focus\": \"<list_of_focus_areas>\"\\n16. Execute Python File: \"execute_python_file\", args: \"file\": \"<file>\"\\n17. Generate Image: \"generate_image\", args: \"prompt\": \"<prompt>\"\\n18. Send Tweet: \"send_tweet\", args: \"text\": \"<text>\"\\n19. Do Nothing: \"do_nothing\", args:\\n20. Task Complete (Shutdown): \"task_complete\", args: \"reason\": \"<reason>\"\\n\\nResources:\\n1. Internet access for searches and information gathering.\\n2. Long Term memory management.\\n3. GPT-3.5 powered Agents for delegation of simple tasks.\\n4. File output.\\n\\nPerformance Evaluation:\\n1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\\n2. Constructively self-criticize your big-picture behavior constantly.\\n3. Reflect on past decisions and strategies to refine your approach.\\n4. Every command has a cost, so be smart and efficient. 
Aim to complete tasks in the least number of steps.\\n\\nYou should only respond in JSON format as described below\\nResponse Format:\\n{\\n \"thoughts\": {\\n \"text\": \"thought\",\\n \"reasoning\": \"reasoning\",\\n \"plan\": \"- short bulleted\\\\n- list that conveys\\\\n- long-term plan\",\\n \"criticism\": \"constructive self-criticism\",\\n \"speak\": \"thoughts summary to say to user\"\\n },\\n \"command\": {\\n \"name\": \"command name\",\\n \"args\": {\\n \"arg name\": \"value\"\\n }\\n }\\n}\\nEnsure the response can be parsed by Python json.loads\\nGPT-Engineer is another project to create a whole repository of code given a task specified in natural language. The GPT-Engineer is instructed to think over a list of smaller components to build and ask for user input to clarify questions as needed.\\nHere are a sample conversation for task clarification sent to OpenAI ChatCompletion endpoint used by GPT-Engineer. The user inputs are wrapped in {{user input text}}.\\n[\\n {\\n \"role\": \"system\",\\n \"content\": \"You will read instructions and not carry them out, only seek to clarify them.\\\\nSpecifically you will first summarise a list of super short bullets of areas that need clarification.\\\\nThen you will pick one clarifying question, and wait for an answer from the user.\\\\n\"\\n },\\n {\\n \"role\": \"user\",\\n \"content\": \"We are writing {{a Super Mario game in python. MVC components split in separate files. Keyboard control.}}\\\\n\"\\n },\\n {\\n \"role\": \"assistant\",\\n \"content\": \"Summary of areas that need clarification:\\\\n1. Specifics of the Super Mario game (e.g. level design, characters, gameplay mechanics)\\\\n2. Details about the MVC components (e.g. which components are in each file)\\\\n3. Keyboard control implementation (e.g. which keys to use, how to handle input)\\\\n\\\\nClarifying question:\\\\nCan you provide more details about the Super Mario game, such as level design, characters, and gameplay mechanics?\"\\n },\\n {\\n \"role\": \"user\",\\n \"content\": \"{{There are 10 levels in total. The main character is a plumber named Mario, who can walk and jump. It is a classical platform game just like Super Mario. The main character moves from left to right, trying to get to the destination, where there are many obstacles and attacks from enemies in the process.}}\\\\n\\\\nIs anything else unclear? If yes, only answer in the form:\\\\n{remaining unclear areas} remaining questions.\\\\n{Next question}\\\\nIf everything is sufficiently clear, only answer \\\\\"Nothing more to clarify.\\\\\".\"\\n },\\n {\\n \"role\": \"assistant\",\\n \"content\": \"Remaining unclear areas: 2 remaining questions.\\\\nCan you provide more information about how the MVC components are split into separate files?\"\\n },\\n {\\n \"role\": \"user\",\\n \"content\": \"{{Make your own assumptions and state them explicitly before starting}}\"\\n }\\n]\\nThen after these clarification, the agent moved into the code writing mode with a different system message.\\nSystem message:\\n\\nYou will get instructions for code to write.\\nYou will write a very long answer. 
Make sure that every detail of the architecture is, in the end, implemented as code.\\nMake sure that every detail of the architecture is, in the end, implemented as code.\\nThink step by step and reason yourself to the right decisions to make sure we get it right.\\nYou will first lay out the names of the core classes, functions, methods that will be necessary, as well as a quick comment on their purpose.\\nThen you will output the content of each file including ALL code.\\nEach file must strictly follow a markdown code block format, where the following tokens must be replaced such that\\nFILENAME is the lowercase file name including the file extension,\\nLANG is the markup code block language for the code’s language, and CODE is the code:\\nFILENAME\\nCODE\\nYou will start with the “entrypoint” file, then go to the ones that are imported by that file, and so on.\\nPlease note that the code should be fully functional. No placeholders.\\nFollow a language and framework appropriate best practice file naming convention.\\nMake sure that files contain all imports, types etc. Make sure that code in different files are compatible with each other.\\nEnsure to implement all code, if you are unsure, write a plausible implementation.\\nInclude module dependency or package manager dependency definition file.\\nBefore you finish, double check that all parts of the architecture is present in the files.\\nUseful to know:\\nYou almost always put different classes in different files.\\nFor Python, you always create an appropriate requirements.txt file.\\nFor NodeJS, you always create an appropriate package.json file.\\nYou always add a comment briefly describing the purpose of the function definition.\\nYou try to add comments explaining very complex bits of logic.\\nYou always follow the best practices for the requested languages in terms of describing the code written as a defined\\npackage/project.\\nPython toolbelt preferences:\\n\\npytest\\ndataclasses\\n\\n\\nConversatin samples:\\n[\\n {\\n \"role\": \"system\",\\n \"content\": \"You will get instructions for code to write.\\\\nYou will write a very long answer. Make sure that every detail of the architecture is, in the end, implemented as code.\\\\nMake sure that every detail of the architecture is, in the end, implemented as code.\\\\n\\\\nThink step by step and reason yourself to the right decisions to make sure we get it right.\\\\nYou will first lay out the names of the core classes, functions, methods that will be necessary, as well as a quick comment on their purpose.\\\\n\\\\nThen you will output the content of each file including ALL code.\\\\nEach file must strictly follow a markdown code block format, where the following tokens must be replaced such that\\\\nFILENAME is the lowercase file name including the file extension,\\\\nLANG is the markup code block language for the code\\'s language, and CODE is the code:\\\\n\\\\nFILENAME\\\\n```LANG\\\\nCODE\\\\n```\\\\n\\\\nYou will start with the \\\\\"entrypoint\\\\\" file, then go to the ones that are imported by that file, and so on.\\\\nPlease note that the code should be fully functional. No placeholders.\\\\n\\\\nFollow a language and framework appropriate best practice file naming convention.\\\\nMake sure that files contain all imports, types etc. 
Make sure that code in different files are compatible with each other.\\\\nEnsure to implement all code, if you are unsure, write a plausible implementation.\\\\nInclude module dependency or package manager dependency definition file.\\\\nBefore you finish, double check that all parts of the architecture is present in the files.\\\\n\\\\nUseful to know:\\\\nYou almost always put different classes in different files.\\\\nFor Python, you always create an appropriate requirements.txt file.\\\\nFor NodeJS, you always create an appropriate package.json file.\\\\nYou always add a comment briefly describing the purpose of the function definition.\\\\nYou try to add comments explaining very complex bits of logic.\\\\nYou always follow the best practices for the requested languages in terms of describing the code written as a defined\\\\npackage/project.\\\\n\\\\n\\\\nPython toolbelt preferences:\\\\n- pytest\\\\n- dataclasses\\\\n\"\\n },\\n # … same conversation as earlier, ended with \"Make your own assumptions and state them explicitly before starting\".\\n {\\n \"role\": \"assistant\",\\n \"content\": \"Assumptions:\\\\n1. Model: The model will contain the game\\'s data, such as level information, character states, and enemy positions.\\\\n2. View: The view will handle the game\\'s visuals, including rendering the game objects, backgrounds, and updating the display.\\\\n3. Controller: The controller will manage user input, such as keyboard controls, and update the model accordingly.\\\\n\\\\nWith these assumptions in mind, is there any specific aspect of the keyboard control implementation you would like me to clarify?\"\\n },\\n {\\n \"role\": \"user\",\\n \"content\": \"Please now remember the steps:\\\\n\\\\nThink step by step and reason yourself to the right decisions to make sure we get it right.\\\\nFirst lay out the names of the core classes, functions, methods that will be necessary, As well as a quick comment on their purpose.\\\\n\\\\nThen you will output the content of each file including ALL code.\\\\nEach file must strictly follow a markdown code block format, where the following tokens must be replaced such that\\\\nFILENAME is the lowercase file name including the file extension,\\\\nLANG is the markup code block language for the code\\'s language, and CODE is the code:\\\\n\\\\nFILENAME\\\\n```LANG\\\\nCODE\\\\n```\\\\n\\\\nPlease note that the code should be fully functional. No placeholders.\\\\n\\\\nYou will start with the \\\\\"entrypoint\\\\\" file, then go to the ones that are imported by that file, and so on.\\\\nFollow a language and framework appropriate best practice file naming convention.\\\\nMake sure that files contain all imports, types etc. The code should be fully functional. Make sure that code in different files are compatible with each other.\\\\nBefore you finish, double check that all parts of the architecture is present in the files.\\\\n\"\\n }\\n]\\nChallenges#\\nAfter going through key ideas and demos of building LLM-centered agents, I start to see a couple common limitations:\\n\\n\\nFinite context length: The restricted context capacity limits the inclusion of historical information, detailed instructions, API call context, and responses. The design of the system has to work with this limited communication bandwidth, while mechanisms like self-reflection to learn from past mistakes would benefit a lot from long or infinite context windows. 
Although vector stores and retrieval can provide access to a larger knowledge pool, their representation power is not as powerful as full attention.\\n\\n\\nChallenges in long-term planning and task decomposition: Planning over a lengthy history and effectively exploring the solution space remain challenging. LLMs struggle to adjust plans when faced with unexpected errors, making them less robust compared to humans who learn from trial and error.\\n\\n\\nReliability of natural language interface: Current agent system relies on natural language as an interface between LLMs and external components such as memory and tools. However, the reliability of model outputs is questionable, as LLMs may make formatting errors and occasionally exhibit rebellious behavior (e.g. refuse to follow an instruction). Consequently, much of the agent demo code focuses on parsing model output.\\n\\n\\nCitation#\\nCited as:\\n\\nWeng, Lilian. (Jun 2023). “LLM-powered Autonomous Agents”. Lil’Log. https://lilianweng.github.io/posts/2023-06-23-agent/.\\n\\nOr\\n@article{weng2023agent,\\n title = \"LLM-powered Autonomous Agents\",\\n author = \"Weng, Lilian\",\\n journal = \"lilianweng.github.io\",\\n year = \"2023\",\\n month = \"Jun\",\\n url = \"https://lilianweng.github.io/posts/2023-06-23-agent/\"\\n}\\nReferences#\\n[1] Wei et al. “Chain of thought prompting elicits reasoning in large language models.” NeurIPS 2022\\n[2] Yao et al. “Tree of Thoughts: Dliberate Problem Solving with Large Language Models.” arXiv preprint arXiv:2305.10601 (2023).\\n[3] Liu et al. “Chain of Hindsight Aligns Language Models with Feedback\\n“ arXiv preprint arXiv:2302.02676 (2023).\\n[4] Liu et al. “LLM+P: Empowering Large Language Models with Optimal Planning Proficiency” arXiv preprint arXiv:2304.11477 (2023).\\n[5] Yao et al. “ReAct: Synergizing reasoning and acting in language models.” ICLR 2023.\\n[6] Google Blog. “Announcing ScaNN: Efficient Vector Similarity Search” July 28, 2020.\\n[7] https://chat.openai.com/share/46ff149e-a4c7-4dd7-a800-fc4a642ea389\\n[8] Shinn & Labash. “Reflexion: an autonomous agent with dynamic memory and self-reflection” arXiv preprint arXiv:2303.11366 (2023).\\n[9] Laskin et al. “In-context Reinforcement Learning with Algorithm Distillation” ICLR 2023.\\n[10] Karpas et al. “MRKL Systems A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning.” arXiv preprint arXiv:2205.00445 (2022).\\n[11] Weaviate Blog. Why is Vector Search so fast? Sep 13, 2022.\\n[12] Li et al. “API-Bank: A Benchmark for Tool-Augmented LLMs” arXiv preprint arXiv:2304.08244 (2023).\\n[13] Shen et al. “HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace” arXiv preprint arXiv:2303.17580 (2023).\\n[14] Bran et al. “ChemCrow: Augmenting large-language models with chemistry tools.” arXiv preprint arXiv:2304.05376 (2023).\\n[15] Boiko et al. “Emergent autonomous scientific research capabilities of large language models.” arXiv preprint arXiv:2304.05332 (2023).\\n[16] Joon Sung Park, et al. “Generative Agents: Interactive Simulacra of Human Behavior.” arXiv preprint arXiv:2304.03442 (2023).\\n[17] AutoGPT. https://github.com/Significant-Gravitas/Auto-GPT\\n[18] GPT-Engineer. https://github.com/AntonOsika/gpt-engineer\\n', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'})]"
71
+ ]
72
+ },
73
+ "execution_count": 35,
74
+ "metadata": {},
75
+ "output_type": "execute_result"
76
+ }
77
+ ],
78
+ "source": [
79
+ "text_documents"
80
+ ]
81
+ },
82
+ {
83
+ "cell_type": "code",
84
+ "execution_count": 36,
85
+ "metadata": {},
86
+ "outputs": [
87
+ {
88
+ "data": {
89
+ "text/plain": [
90
+ "[Document(page_content='LLM Powered Autonomous Agents\\n \\nDate: June 23, 2023 | Estimated Reading Time: 31 min | Author: Lilian Weng\\n\\n\\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\\nAgent System Overview#\\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\\n\\nPlanning\\n\\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\\n\\n\\nMemory', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}),\n",
91
+ " Document(page_content='Memory\\n\\nShort-term memory: I would consider all the in-context learning (See Prompt Engineering) as utilizing short-term memory of the model to learn.\\nLong-term memory: This provides the agent with the capability to retain and recall (infinite) information over extended periods, often by leveraging an external vector store and fast retrieval.\\n\\n\\nTool use\\n\\nThe agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}),\n",
92
+ " Document(page_content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}),\n",
93
+ " Document(page_content='Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}),\n",
94
+ " Document(page_content='Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.\\nSelf-Reflection#\\nSelf-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'})]"
95
+ ]
96
+ },
97
+ "execution_count": 36,
98
+ "metadata": {},
99
+ "output_type": "execute_result"
100
+ }
101
+ ],
102
+ "source": [
103
+ "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
104
+ "text_splitter=RecursiveCharacterTextSplitter(chunk_size=1000,chunk_overlap=200)\n",
105
+ "docs=text_splitter.split_documents(text_documents)\n",
106
+ "docs[:5]"
107
+ ]
108
+ },
109
+ {
110
+ "cell_type": "code",
111
+ "execution_count": 37,
112
+ "metadata": {},
113
+ "outputs": [],
114
+ "source": [
115
+ "## Convert Data Into Vectors and store in AstraDB\n",
116
+ "os.environ[\"OPENAI_API_KEY\"]=os.getenv(\"OPENAI_API_KEY\")\n",
117
+ "embeddings=OpenAIEmbeddings()\n",
118
+ "astra_vector_store=Cassandra(\n",
119
+ " embedding=embeddings,\n",
120
+ " table_name=\"qa_mini_demo\",\n",
121
+ " session=None,\n",
122
+ " keyspace=None\n",
123
+ "\n",
124
+ ")"
125
+ ]
126
+ },
127
+ {
128
+ "cell_type": "code",
129
+ "execution_count": 38,
130
+ "metadata": {},
131
+ "outputs": [
132
+ {
133
+ "name": "stdout",
134
+ "output_type": "stream",
135
+ "text": [
136
+ "Inserted 66 headlines.\n"
137
+ ]
138
+ }
139
+ ],
140
+ "source": [
141
+ "from langchain.indexes.vectorstore import VectorStoreIndexWrapper\n",
142
+ "astra_vector_store.add_documents(docs)\n",
143
+ "print(\"Inserted %i headlines.\" % len(docs))\n",
144
+ "\n",
145
+ "astra_vector_index = VectorStoreIndexWrapper(vectorstore=astra_vector_store)"
146
+ ]
147
+ },
148
+ {
149
+ "cell_type": "code",
150
+ "execution_count": 39,
151
+ "metadata": {},
152
+ "outputs": [],
153
+ "source": [
154
+ "llm=ChatGroq(groq_api_key=groq_api_key,\n",
155
+ " model_name=\"mixtral-8x7b-32768\")\n",
156
+ "\n",
157
+ "from langchain_core.prompts import ChatPromptTemplate\n",
158
+ "prompt = ChatPromptTemplate.from_template(\"\"\"\n",
159
+ "Answer the following question based only on the provided context. \n",
160
+ "Think step by step before providing a detailed answer. \n",
161
+ "I will tip you $1000 if the user finds the answer helpful. \n",
162
+ "<context>\n",
163
+ "{context}\n",
164
+ "</context>\n",
165
+ "\n",
166
+ "Question: {input}\"\"\")"
167
+ ]
168
+ },
169
+ {
170
+ "cell_type": "code",
171
+ "execution_count": 40,
172
+ "metadata": {},
173
+ "outputs": [
174
+ {
175
+ "data": {
176
+ "text/plain": [
177
+ "'Chain of Thought (CoT) is a prompting technique introduced by Wei et al. in 2022 that has become a standard method for enhancing model performance on complex tasks. The main idea behind CoT is to instruct the model to \"think step by step,\" thereby utilizing more computational resources during test time. This approach decomposes complex tasks into smaller, more manageable steps, providing insights into the model\\'s thinking process. By breaking down tasks and generating step-by-step solutions, CoT has proven to be an effective technique for addressing complex problems within the limitations of the model\\'s capabilities.'"
178
+ ]
179
+ },
180
+ "execution_count": 40,
181
+ "metadata": {},
182
+ "output_type": "execute_result"
183
+ }
184
+ ],
185
+ "source": [
186
+ "astra_vector_index.query(\"Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique\",llm=llm)"
187
+ ]
188
+ },
189
+ {
190
+ "cell_type": "code",
191
+ "execution_count": 41,
192
+ "metadata": {},
193
+ "outputs": [],
194
+ "source": [
195
+ "from langchain.chains import create_retrieval_chain\n",
196
+ "from langchain.chains.combine_documents import create_stuff_documents_chain\n",
197
+ "\n",
198
+ "retriever=astra_vector_store.as_retriever()\n",
199
+ "document_chain=create_stuff_documents_chain(llm,prompt)\n",
200
+ "retrieval_chain=create_retrieval_chain(retriever,document_chain)"
201
+ ]
202
+ },
203
+ {
204
+ "cell_type": "code",
205
+ "execution_count": 43,
206
+ "metadata": {},
207
+ "outputs": [
208
+ {
209
+ "data": {
210
+ "text/plain": [
211
+ "{'input': 'Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique',\n",
212
+ " 'context': [Document(page_content='Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}),\n",
213
+ " Document(page_content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}),\n",
214
+ " Document(page_content='Chain of Hindsight (CoH; Liu et al. 2023) encourages the model to improve on its own outputs by explicitly presenting it with a sequence of past outputs, each annotated with feedback. Human feedback data is a collection of $D_h = \\\\{(x, y_i , r_i , z_i)\\\\}_{i=1}^n$, where $x$ is the prompt, each $y_i$ is a model completion, $r_i$ is the human rating of $y_i$, and $z_i$ is the corresponding human-provided hindsight feedback. Assume the feedback tuples are ranked by reward, $r_n \\\\geq r_{n-1} \\\\geq \\\\dots \\\\geq r_1$ The process is supervised fine-tuning where the data is a sequence in the form of $\\\\tau_h = (x, z_i, y_i, z_j, y_j, \\\\dots, z_n, y_n)$, where $\\\\leq i \\\\leq j \\\\leq n$. The model is finetuned to only predict $y_n$ where conditioned on the sequence prefix, such that the model can self-reflect to produce better output based on the feedback sequence. The model can optionally receive multiple rounds of instructions with human annotators at test time.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}),\n",
215
+ " Document(page_content='ReAct (Yao et al. 2023) integrates reasoning and acting within LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The former enables LLM to interact with the environment (e.g. use Wikipedia search API), while the latter prompting LLM to generate reasoning traces in natural language.\\nThe ReAct prompt template incorporates explicit steps for LLM to think, roughly formatted as:\\nThought: ...\\nAction: ...\\nObservation: ...\\n... (Repeated many times)', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'})],\n",
216
+ " 'answer': 'The Chain of Thought (CoT) is a standard prompting technique introduced by Wei et al. in 2022, which enhances model performance on complex tasks. The main idea behind CoT is to instruct the model to \"think step by step,\" thereby utilizing more test-time computation to decompose hard tasks into smaller and more manageable tasks. This approach sheds light on the model\\'s thinking process and transforms complex tasks into a series of simpler, more approachable sub-tasks.\\n\\nIn the context of AI and language models, the CoT technique helps to break down complex problems into a chain of smaller, interconnected thoughts, allowing the model to tackle each step more effectively. This method can be applied to various tasks, such as mathematical problem-solving, text generation, or understanding complex instructions. By following the chain of thought, the model can better navigate complex problem-solving scenarios, making it a valuable tool for improving language model performance and understanding.'}"
217
+ ]
218
+ },
219
+ "execution_count": 43,
220
+ "metadata": {},
221
+ "output_type": "execute_result"
222
+ }
223
+ ],
224
+ "source": [
225
+ "response=retrieval_chain.invoke({\"input\":\"Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique\"})\n",
226
+ "response"
227
+ ]
228
+ },
229
+ {
230
+ "cell_type": "code",
231
+ "execution_count": 44,
232
+ "metadata": {},
233
+ "outputs": [
234
+ {
235
+ "data": {
236
+ "text/plain": [
237
+ "'The Chain of Thought (CoT) is a standard prompting technique introduced by Wei et al. in 2022, which enhances model performance on complex tasks. The main idea behind CoT is to instruct the model to \"think step by step,\" thereby utilizing more test-time computation to decompose hard tasks into smaller and more manageable tasks. This approach sheds light on the model\\'s thinking process and transforms complex tasks into a series of simpler, more approachable sub-tasks.\\n\\nIn the context of AI and language models, the CoT technique helps to break down complex problems into a chain of smaller, interconnected thoughts, allowing the model to tackle each step more effectively. This method can be applied to various tasks, such as mathematical problem-solving, text generation, or understanding complex instructions. By following the chain of thought, the model can better navigate complex problem-solving scenarios, making it a valuable tool for improving language model performance and understanding.'"
238
+ ]
239
+ },
240
+ "execution_count": 44,
241
+ "metadata": {},
242
+ "output_type": "execute_result"
243
+ }
244
+ ],
245
+ "source": [
246
+ "response[\"answer\"]"
247
+ ]
248
+ },
249
+ {
250
+ "cell_type": "code",
251
+ "execution_count": null,
252
+ "metadata": {},
253
+ "outputs": [],
254
+ "source": []
255
+ }
256
+ ],
257
+ "metadata": {
258
+ "kernelspec": {
259
+ "display_name": "Python 3",
260
+ "language": "python",
261
+ "name": "python3"
262
+ },
263
+ "language_info": {
264
+ "codemirror_mode": {
265
+ "name": "ipython",
266
+ "version": 3
267
+ },
268
+ "file_extension": ".py",
269
+ "mimetype": "text/x-python",
270
+ "name": "python",
271
+ "nbconvert_exporter": "python",
272
+ "pygments_lexer": "ipython3",
273
+ "version": "3.10.0"
274
+ }
275
+ },
276
+ "nbformat": 4,
277
+ "nbformat_minor": 2
278
+ }
llama3.py ADDED
@@ -0,0 +1,82 @@
1
+ import streamlit as st
2
+ import os
3
+ from langchain_groq import ChatGroq
4
+ from langchain_openai import OpenAIEmbeddings
5
+ from langchain.text_splitter import RecursiveCharacterTextSplitter
6
+ from langchain.chains.combine_documents import create_stuff_documents_chain
7
+ from langchain_core.prompts import ChatPromptTemplate
8
+ from langchain.chains import create_retrieval_chain
9
+ from langchain_community.vectorstores import FAISS
10
+ from langchain_community.document_loaders import PyPDFDirectoryLoader
11
+
12
+ from dotenv import load_dotenv
13
+
14
+ load_dotenv()
15
+
16
+ ## load the GROQ And OpenAI API KEY
17
+ os.environ['OPENAI_API_KEY']=os.getenv("OPENAI_API_KEY")
18
+ groq_api_key=os.getenv('GROQ_API_KEY')
19
+
20
+ st.title("Chatgroq With Llama3 Demo")
21
+
22
+ llm=ChatGroq(groq_api_key=groq_api_key,
23
+ model_name="Llama3-8b-8192")
24
+
25
+ prompt=ChatPromptTemplate.from_template(
26
+ """
27
+ Answer the questions based on the provided context only.
28
+ Please provide the most accurate response based on the question
29
+ <context>
30
+ {context}
31
+ <context>
32
+ Questions:{input}
33
+
34
+ """
35
+ )
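+ ## Note: create_stuff_documents_chain fills {context} with the retrieved document
+ ## chunks and create_retrieval_chain supplies the user question as {input}
+ ## (see the chain construction further below).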
36
+
37
+ def vector_embedding():
38
+
39
+ if "vectors" not in st.session_state:
40
+
41
+ st.session_state.embeddings=OpenAIEmbeddings()
42
+ st.session_state.loader=PyPDFDirectoryLoader("./us_census") ## Data Ingestion
43
+ st.session_state.docs=st.session_state.loader.load() ## Document Loading
44
+ st.session_state.text_splitter=RecursiveCharacterTextSplitter(chunk_size=1000,chunk_overlap=200) ## Chunk Creation
45
+ st.session_state.final_documents=st.session_state.text_splitter.split_documents(st.session_state.docs[:20]) #splitting
46
+ st.session_state.vectors=FAISS.from_documents(st.session_state.final_documents,st.session_state.embeddings) #vector OpenAI embeddings
47
+
48
+
49
+
50
+
51
+
52
+ prompt1=st.text_input("Enter Your Question From Doduments")
53
+
54
+
55
+ if st.button("Documents Embedding"):
56
+ vector_embedding()
57
+ st.write("Vector Store DB Is Ready")
58
+
59
+ import time
60
+
61
+
62
+
63
+ if prompt1:
64
+ document_chain=create_stuff_documents_chain(llm,prompt)
65
+ retriever=st.session_state.vectors.as_retriever()
66
+ retrieval_chain=create_retrieval_chain(retriever,document_chain)
67
+ start=time.process_time()
68
+ response=retrieval_chain.invoke({'input':prompt1})
69
+ print("Response time :",time.process_time()-start)
70
+ st.write(response['answer'])
71
+
72
+ # With a streamlit expander
73
+ with st.expander("Document Similarity Search"):
74
+ # Find the relevant chunks
75
+ for i, doc in enumerate(response["context"]):
76
+ st.write(doc.page_content)
77
+ st.write("--------------------------------")
78
+
79
+
80
+
81
+
82
+
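+ ## Usage note: expects OPENAI_API_KEY and GROQ_API_KEY in the .env file and the
+ ## PDF documents in the ./us_census folder referenced above; run with:
+ ##   streamlit run llama3.py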
localama.py ADDED
@@ -0,0 +1,33 @@
1
+ from langchain_openai import ChatOpenAI
2
+ from langchain_core.prompts import ChatPromptTemplate
3
+ from langchain_core.output_parsers import StrOutputParser
4
+ from langchain_community.llms import Ollama
5
+ import streamlit as st
6
+ import os
7
+ from dotenv import load_dotenv
8
+
9
+ load_dotenv()
10
+
11
+ os.environ["LANGCHAIN_TRACING_V2"]="true"
12
+ os.environ["LANGCHAIN_API_KEY"]=os.getenv("LANGCHAIN_API_KEY")
13
+
14
+ ## Prompt Template
15
+
16
+ prompt=ChatPromptTemplate.from_messages(
17
+     [
18
+         ("system","You are a helpful assistant. Please respond to the user queries"),
19
+         ("user","Question:{question}")
20
+     ]
21
+ )
22
+ ## streamlit framework
23
+
24
+ st.title('Langchain Demo With LLaMA2 (Ollama)')
25
+ input_text=st.text_input("Search the topic you want")
26
+
27
+ # Ollama LLaMA2 LLM
28
+ llm=Ollama(model="llama2")
29
+ output_parser=StrOutputParser()
30
+ chain=prompt|llm|output_parser
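+ # The "|" operator composes the runnables: the prompt formats the question,
+ # the Ollama llama2 model generates a reply, and StrOutputParser returns it as a plain string.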
31
+
32
+ if input_text:
33
+     st.write(chain.invoke({"question":input_text}))
p70-178.pdf ADDED
Binary file (419 kB). View file
 
retriever.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
simplerag.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
speech.txt ADDED
@@ -0,0 +1,13 @@
1
+ The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.
2
+
3
+ Just because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.
4
+
5
+
6
+
7
+ It will be all the easier for us to conduct ourselves as belligerents in a high spirit of right and fairness because we act without animus, not in enmity toward a people or with the desire to bring any injury or disadvantage upon them, but only in armed opposition to an irresponsible government which has thrown aside all considerations of humanity and of right and is running amuck. We are, let me say again, the sincere friends of the German people, and shall desire nothing so much as the early reestablishment of intimate relations of mutual advantage between us—however hard it may be for them, for the time being, to believe that this is spoken from our hearts.
8
+
9
+ We have borne with their present government through all these bitter months because of that friendship—exercising a patience and forbearance which would otherwise have been impossible. We shall, happily, still have an opportunity to prove that friendship in our daily attitude and actions toward the millions of men and women of German birth and native sympathy who live among us and share our life, and we shall be proud to prove it toward all who are in fact loyal to their neighbors and to the government in the hour of test. They are, most of them, as true and loyal Americans as if they had never known any other fealty or allegiance. They will be prompt to stand with us in rebuking and restraining the few who may be of a different mind and purpose. If there should be disloyalty, it will be dealt with with a firm hand of stern repression; but, if it lifts its head at all, it will lift it only here and there and without countenance except from a lawless and malignant few.
10
+
11
+ It is a distressing and oppressive duty, gentlemen of the Congress, which I have performed in thus addressing you. There are, it may be, many months of fiery trial and sacrifice ahead of us. It is a fearful thing to lead this great peaceful people into war, into the most terrible and disastrous of all wars, civilization itself seeming to be in the balance. But the right is more precious than peace, and we shall fight for the things which we have always carried nearest our hearts—for democracy, for the right of those who submit to authority to have a voice in their own governments, for the rights and liberties of small nations, for a universal dominion of right by such a concert of free peoples as shall bring peace and safety to all nations and make the world itself at last free.
12
+
13
+ To such a task we can dedicate our lives and our fortunes, everything that we are and everything that we have, with the pride of those who know that the day has come when America is privileged to spend her blood and her might for the principles that gave her birth and happiness and the peace which she has treasured. God helping her, she can do no other.