momegas committed on
Commit cfc0db8 • 1 Parent(s): 6ec321a

🤖 Enter Megabots!
.github/ISSUE_TEMPLATE/bug_report.md CHANGED
@@ -1,10 +1,9 @@
 ---
 name: Bug report
 about: Create a report to help us improve
-title: ''
+title: ""
 labels: bug
 assignees: momegas
-
 ---
 
 **Describe the bug**
@@ -12,8 +11,9 @@ A clear and concise description of what the bug is.
 
 **To Reproduce**
 Steps to reproduce the behavior:
-1. pip install qnabot
-2. ...
+
+1. pip install megabots
+2. ...
 
 **Code that breaks**
 If applicable, add the code that produced the bug to help explain your problem.
@@ -22,9 +22,10 @@ If applicable, add the code that produced the bug to help explain your problem.
 A clear and concise description of what you expected to happen.
 
 **Python version (please complete the following information):**
-- [ ] <= 3.9
-- [ ] 3.10
-- [ ] 3.11
+
+- [ ] <= 3.9
+- [ ] 3.10
+- [ ] 3.11
 
 **Additional context**
 Add any other context about the problem here.
.gitignore CHANGED
@@ -1,7 +1,7 @@
 .venv
 __pycache__
 .pytest_cache
-qnabot.egg-info
+**.egg-info
 dist
 build
 **.pickle
Makefile CHANGED
@@ -1,7 +1,7 @@
 # Define variables
 PYTHON=python
 PIP=pip
-PACKAGE=qnabot
+PACKAGE=megabots
 
 .PHONY: install test clean build publish
 
README.md CHANGED
@@ -1,34 +1,35 @@
-# 🦾🤖🤳 Mega Bots
+# 🤖 Megabots
 
 [![Tests](https://github.com/momegas/qnabot/actions/workflows/python-package.yml/badge.svg)](https://github.com/momegas/qnabot/actions/workflows/python-package.yml)
 
-Here is an example of what you build with this library: [Demo](https://huggingface.co/spaces/momegas/megas-bot)
+🤖 Megabots provides state-of-the-art, production-ready bots made mega-easy, so you don't have to build them from scratch 🤯 Create a bot, now 🫵
 
-🦾🤖🤳 Megabots provides ready made production ready bots so you don't have to build them from scratch 🤯
-
-Note: This is a work in progress. The API is not stable and will change.
+Note: This is a work in progress. The API might change.
 
 ```bash
 pip install megabots
 ```
 
 ```python
-from qnabot import QnABot
+from megabots import bot
 import os
 
 os.environ["OPENAI_API_KEY"] = "my key"
 
-# Create a bot 👉 with one line of code
-bot = QnABot(directory="./mydata")
+# Create a bot 👉 with one line of code. Automatically loads your data from ./index or index.pkl.
+qnabot = bot("qna-over-docs")
 
 # Ask a question
 answer = bot.ask("How do I use this bot?")
 
 # Save the index to save costs (GPT is used to create the index)
-bot.save_index("index.pickle")
+qnabot.save_index("index.pkl")
 
 # Load the index from a previous run
-bot = QnABot(directory="./mydata", index="index.pickle")
+qnabot = bot("qna-over-docs", index="./index.pkl")
+
+# Or create the index from a directory of documents
+qnabot = bot("qna-over-docs", index="./index")
 ```
 
 You can also create a FastAPI app that will expose the bot as an API using the create_app function.
@@ -36,9 +37,9 @@ Assuming your file is called `main.py` run `uvicorn main:app --reload` to run the
 You should then be able to visit `http://localhost:8000/docs` to see the API documentation.
 
 ```python
-from qnabot import QnABot, create_app
+from megabots import bot, create_api
 
-app = create_app(QnABot("./mydata"))
+app = create_api(bot("qna-over-docs"))
 ```
 
 You can expose a gradio UI for the bot using `create_interface` function.
@@ -46,9 +47,9 @@ Assuming your file is called `ui.py` run `gradio qnabot/ui.py` to run the UI loc
 You should then be able to visit `http://127.0.0.1:7860` to see the API documentation.
 
 ```python
-from qnabot import QnABot, create_interface
+from megabots import bot, create_interface
 
-demo = create_interface(QnABot("./mydata"))
+demo = create_interface(bot("qna-over-docs"))
 ```
 
 ### Features
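The save/load flow in the README above amounts to pickling the index once and reloading it on later runs instead of rebuilding it. A minimal stdlib-only sketch of that round trip, with a plain dict standing in for the FAISS index object:

```python
import os
import pickle
import tempfile

# A plain dict stands in for the FAISS index; the pattern is the same:
# dump the index once, reload it on later runs to avoid rebuilding.
index = {"doc1": [0.1, 0.2], "doc2": [0.3, 0.4]}

path = os.path.join(tempfile.mkdtemp(), "index.pkl")
with open(path, "wb") as f:
    pickle.dump(index, f)

with open(path, "rb") as f:
    loaded = pickle.load(f)

assert loaded == index  # the reloaded index is equivalent to the original
```

The same round trip is what `save_index("index.pkl")` followed by `bot(..., index="./index.pkl")` performs for the real FAISS index.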
example.ipynb CHANGED
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 3,
    "metadata": {},
    "outputs": [
     {
@@ -15,13 +15,12 @@
     }
    ],
    "source": [
-    "from qnabot import QnABot\n",
+    "from megabots import bot\n",
     "from dotenv import load_dotenv\n",
     "\n",
     "load_dotenv()\n",
     "\n",
-    "bot = QnABot(directory=\"./examples/files\", index=\"./index.pkl\", verbose=True)\n",
-    "# bot.save_index(\"./index.pkl\")"
+    "qnabot = bot(\"qna-over-docs\", index=\"./index.pkl\")"
    ]
   },
   {
@@ -30,19 +29,18 @@
    "metadata": {},
    "outputs": [
     {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "The first roster of Avengers in comics included Iron Man, Thor, Hulk, Ant-Man, and the Wasp.\n",
-      "SOURCES: examples/files/facts.txt\n",
-      "Vision is an android superhero who was created by Ultron but ultimately joined the Avengers and became an important member of the team.\n",
-      "SOURCES: examples/files/facts.txt\n"
-     ]
+     "data": {
+      "text/plain": [
+       "'The document does not provide an answer to the question \"What is the meaning of life?\".\\nSOURCES:'"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
     }
    ],
    "source": [
-    "bot.print_answer(\"what was the first roster of avengers in comics?\")\n",
-    "bot.print_answer(\"Who is Vision?\")"
+    "qnabot.ask(\"What is the meaning of life?\")"
    ]
   }
  ],
qnabot/QnABot.py → megabots/__init__.py RENAMED
@@ -4,23 +4,28 @@ from langchain.embeddings import OpenAIEmbeddings
 from langchain.document_loaders import DirectoryLoader, S3DirectoryLoader
 from langchain.chains.qa_with_sources import load_qa_with_sources_chain
 from langchain.vectorstores.faiss import FAISS
+import gradio as gr
+from fastapi import FastAPI
 import pickle
 import os
+from dotenv import load_dotenv
 
+load_dotenv()
 
-class QnABot:
+
+class Bot:
     def __init__(
         self,
-        directory: str | None = None,
-        index: str | None = None,
         model: str | None = None,
+        prompt: str | None = None,
+        memory: str | None = None,
+        index: str | None = None,
+        source: str | None = None,
         verbose: bool = False,
         temperature: int = 0,
     ):
-        # Initialize the QnABot by selecting a model, creating a loader,
-        # and loading or creating an index
        self.select_model(model, temperature)
-        self.create_loader(directory)
+        self.create_loader(index)
        self.load_or_create_index(index)
 
        # Load the question-answering chain for the selected model
@@ -36,40 +41,125 @@ class QnABot:
         print("Using model: text-davinci-003")
         self.llm = OpenAI(temperature=temperature)
 
-    def create_loader(self, directory: str | None):
-        if directory is None:
-            return
+    def create_loader(self, index: str | None):
         # Create a loader based on the provided directory (either local or S3)
-        if directory.startswith("s3://"):
-            self.loader = S3DirectoryLoader(directory)
-        else:
-            self.loader = DirectoryLoader(directory, recursive=True)
+        self.loader = DirectoryLoader(index, recursive=True)
 
-    def load_or_create_index(self, index_path: str | None):
+    def load_or_create_index(self, index_path: str):
         # Load an existing index from disk or create a new one if not available
-        if index_path is not None and os.path.exists(index_path):
+
+        # Is pickle
+        if index_path is not None and ("pkl" in index_path or "pickle" in index_path):
             print("Loading path from disk...")
             with open(index_path, "rb") as f:
                 self.search_index = pickle.load(f)
-        else:
+            return
+
+        # Is directory
+        if index_path is not None and os.path.isdir(index_path):
             print("Creating index...")
             self.search_index = FAISS.from_documents(
                 self.loader.load_and_split(), OpenAIEmbeddings()
             )
+            return
+
+        raise RuntimeError(
+            """
+            Impossible to find a valid index.
+            Either provide a valid path to a pickle file or a directory.
+            """
+        )
 
     def save_index(self, index_path: str):
         # Save the index to the specified path
         with open(index_path, "wb") as f:
             pickle.dump(self.search_index, f)
 
-    def print_answer(self, question: str, k=1):
-        # Retrieve and print the answer to the given question
-        input_documents = self.search_index.similarity_search(question, k=k)
-        a = self.chain.run(input_documents=input_documents, question=question)
-        print(a)
-
     def ask(self, question: str, k=1) -> str:
         # Retrieve the answer to the given question and return it
         input_documents = self.search_index.similarity_search(question, k=k)
         answer = self.chain.run(input_documents=input_documents, question=question)
         return answer
+
+
+SUPPORTED_TASKS = {
+    "qna-over-docs": {
+        "impl": Bot,
+        "default": {
+            "model": "gpt-3.5-turbo",
+            "prompt": "",
+            "temperature": 0,
+            "index": "./files",
+        },
+    }
+}
+
+SUPPORTED_MODELS = {}
+
+
+def bot(
+    task: str | None = None,
+    model: str | None = None,
+    prompt: str | None = None,
+    memory: str | None = None,
+    index: str | None = None,
+    source: str | None = None,
+    verbose: bool = False,
+    temperature: int = 0,
+    **kwargs,
+) -> Bot:
+    """Instantiate a bot based on the provided task. Each supported task has its own sane defaults.
+
+    Args:
+        task (str | None, optional): The given task. Can be one of the SUPPORTED_TASKS.
+        model (str | None, optional): Model to be used. Can be one of the SUPPORTED_MODELS.
+        index (str | None, optional): Data that the model will load and store index info.
+            Can be either a local file path, a pickle file, or a url of a vector database.
+            By default it will look for a local directory called "files" in the current working directory.
+        verbose (bool, optional): Verbosity. Defaults to False.
+
+    Raises:
+        RuntimeError: When no task is provided.
+        ValueError: When the given task is not supported.
+
+    Returns:
+        Bot: Bot instance
+    """
+
+    if task is None:
+        raise RuntimeError("Impossible to instantiate a bot without a task.")
+    if task not in SUPPORTED_TASKS:
+        raise ValueError(f"Task {task} is not supported.")
+
+    task_defaults = SUPPORTED_TASKS[task]["default"]
+    return SUPPORTED_TASKS[task]["impl"](
+        model=model or task_defaults["model"],
+        index=index or task_defaults["index"],
+        verbose=verbose,
+        **kwargs,
+    )
+
+
+def create_api(bot: Bot):
+    app = FastAPI()
+
+    @app.get("/v1/ask/{question}")
+    async def ask(question: str):
+        answer = bot.ask(question)
+        return {"answer": answer}
+
+    return app
+
+
+def create_interface(bot: Bot, examples: list[list[str]] = []):
+    def ask(question: str):
+        return bot.ask(question)
+
+    interface = gr.Interface(
+        fn=ask,
+        inputs=gr.components.Textbox(lines=5, label="Question"),
+        outputs=gr.components.Textbox(lines=5, label="Answer"),
        examples=examples,
+    )
+
+    return interface
{qnabot → megabots}/tests/__init__.py RENAMED
File without changes
{qnabot → megabots}/tests/test_api.py RENAMED
@@ -1,9 +1,9 @@
 import json
 from fastapi.testclient import TestClient
-from qnabot import QnABot, create_app
+from megabots import bot, create_api
 
-bot = QnABot(directory="./examples/files")
-app = create_app(bot)
+qnabot = bot("qna-over-docs", index="./examples/files")
+app = create_api(qnabot)
 
 client = TestClient(app)
 
qnabot/tests/test_QnABot.py → megabots/tests/test_bots.py RENAMED
@@ -1,6 +1,6 @@
 import os
 import tempfile
-from qnabot import QnABot
+from megabots import bot
 import pickle
 from langchain.vectorstores.faiss import FAISS
 from dotenv import load_dotenv
@@ -15,8 +15,8 @@ sources = "SOURCES:"
 
 
 def test_ask():
-    bot = QnABot(directory=test_directory)
-    answer = bot.ask(test_question)
+    qnabot = bot("qna-over-docs", index=test_directory)
+    answer = qnabot.ask(test_question)
 
     # Assert that the answer contains the correct answer
     assert correct_answer in answer
@@ -30,15 +30,15 @@ def test_save_load_index():
     index_path = os.path.join(temp_dir, "test_index.pkl")
 
     # Create a bot and save the index to the temporary file path
-    bot = QnABot(directory=test_directory, index=index_path)
-    bot.save_index(index_path)
+    qnabot = bot("qna-over-docs", index=test_directory)
+    qnabot.save_index(index_path)
 
     # Load the saved index and assert that it is the same as the original index
     with open(index_path, "rb") as f:
         saved_index = pickle.load(f)
     assert isinstance(saved_index, FAISS)
 
-    bot_with_predefined_index = QnABot(directory=test_directory, index=index_path)
+    bot_with_predefined_index = bot("qna-over-docs", index=index_path)
 
     # Assert that the bot returns the correct answer to the test question
     assert correct_answer in bot_with_predefined_index.ask(test_question)
{qnabot → megabots}/tests/test_ui.py RENAMED
@@ -1,9 +1,9 @@
 import gradio as gr
-from qnabot import create_interface
+from megabots import create_interface
 
 
 def test_create_interface():
-    # create a mock QnABot object
+    # create a mock Bot object
     class MockBot:
         def ask(self, question: str):
             return "Answer"
qnabot/__init__.py DELETED
@@ -1,3 +0,0 @@
-from .QnABot import QnABot
-from .api import create_app
-from .ui import create_interface
qnabot/api.py DELETED
@@ -1,13 +0,0 @@
-from fastapi import FastAPI
-from qnabot import QnABot
-
-
-def create_app(bot: QnABot):
-    app = FastAPI()
-
-    @app.get("/v1/ask/{question}")
-    async def ask(question: str):
-        answer = bot.ask(question)
-        return {"answer": answer}
-
-    return app
qnabot/ui.py DELETED
@@ -1,16 +0,0 @@
-import gradio as gr
-from qnabot import QnABot
-
-
-def create_interface(bot: QnABot, examples: list[list[str]] = []):
-    def ask(question: str):
-        return bot.ask(question)
-
-    interface = gr.Interface(
-        fn=ask,
-        inputs=gr.components.Textbox(lines=5, label="Question"),
-        outputs=gr.components.Textbox(lines=5, label="Answer"),
-        examples=examples,
-    )
-
-    return interface