Commit 2636575 by Praneeth Yerrapragada (0 parents)

feat: repo setup
.env.example ADDED
@@ -0,0 +1,45 @@
+ # The Llama Cloud API key.
+ LLAMA_CLOUD_API_KEY=
+
+ # The provider for the AI models to use.
+ MODEL_PROVIDER=openai
+
+ # The name of the LLM model to use.
+ MODEL=gpt-3.5-turbo
+
+ # Name of the embedding model to use.
+ EMBEDDING_MODEL=text-embedding-3-large
+
+ # Dimension of the embedding model to use.
+ EMBEDDING_DIM=1024
+
+ # The OpenAI API key to use.
+ OPENAI_API_KEY=
+
+ # Temperature for sampling from the model.
+ # LLM_TEMPERATURE=
+
+ # Maximum number of tokens to generate.
+ # LLM_MAX_TOKENS=
+
+ # The number of similar embeddings to return when retrieving documents.
+ TOP_K=3
+
+ # Custom system prompt.
+ # Example:
+ # SYSTEM_PROMPT="You are a helpful assistant who helps users with their questions."
+ # SYSTEM_PROMPT=
+
+ # Configuration for the Pinecone vector store.
+ # The Pinecone API key.
+ # PINECONE_API_KEY=
+
+ # PINECONE_ENVIRONMENT=
+
+ # PINECONE_INDEX_NAME=
+
+ # The address to start the backend app.
+ APP_HOST=0.0.0.0
+
+ # The port to start the backend app.
+ APP_PORT=8000
.gitignore ADDED
@@ -0,0 +1,4 @@
+ __pycache__
+ storage
+ .env
+ data/*
Dockerfile ADDED
@@ -0,0 +1,26 @@
+ FROM python:3.11 as build
+
+ WORKDIR /app
+
+ ENV PYTHONPATH=/app
+
+ # Install Poetry
+ RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=/opt/poetry python && \
+     cd /usr/local/bin && \
+     ln -s /opt/poetry/bin/poetry && \
+     poetry config virtualenvs.create false
+
+ # Install Chromium for the web loader
+ # Can be disabled if you don't use the web loader, to reduce the image size
+ RUN apt update && apt install -y chromium chromium-driver
+
+ # Install dependencies
+ COPY ./pyproject.toml ./poetry.lock* /app/
+ RUN poetry install --no-root --no-cache --only main
+
+ # ====================================
+ FROM build as release
+
+ COPY . .
+
+ CMD ["python", "main.py"]
README.md ADDED
@@ -0,0 +1,101 @@
+ This is a [LlamaIndex](https://www.llamaindex.ai/) project using [FastAPI](https://fastapi.tiangolo.com/) bootstrapped with [`create-llama`](https://github.com/run-llama/LlamaIndexTS/tree/main/packages/create-llama).
+
+ ## Getting Started
+
+ First, set up the environment with Poetry:
+
+ > **_Note:_** This step is not needed if you are using the dev-container.
+
+ ```
+ poetry install
+ poetry shell
+ ```
+
+ Then check the parameters that have been pre-configured in the `.env` file in this directory (e.g. you might need to configure an `OPENAI_API_KEY` if you're using OpenAI as the model provider).
+
+ If you are using any tools or data sources, you can update their config files in the `config` folder.
+
+ Second, generate the embeddings of the documents in the `./data` directory (if this folder exists - otherwise, skip this step):
+
+ ```
+ poetry run generate
+ ```
+
+ Third, run the development server:
+
+ ```
+ python main.py
+ ```
+
+ The example provides two different API endpoints:
+
+ 1. `/api/chat` - a streaming chat endpoint
+ 2. `/api/chat/request` - a non-streaming chat endpoint
+
+ You can test the streaming endpoint with the following curl request:
+
+ ```
+ curl --location 'localhost:8000/api/chat' \
+ --header 'Content-Type: application/json' \
+ --data '{ "messages": [{ "role": "user", "content": "Hello" }] }'
+ ```
+
+ And for the non-streaming endpoint run:
+
+ ```
+ curl --location 'localhost:8000/api/chat/request' \
+ --header 'Content-Type: application/json' \
+ --data '{ "messages": [{ "role": "user", "content": "Hello" }] }'
+ ```
+
+ You can start editing the API endpoints by modifying `app/api/routers/chat.py`. The endpoints auto-update as you save the file. You can delete the endpoint you're not using.
+
+ Open [http://localhost:8000/docs](http://localhost:8000/docs) with your browser to see the Swagger UI of the API.
+
+ The API allows CORS for all origins to simplify development. You can change this behavior by setting the `ENVIRONMENT` environment variable to `prod`:
+
+ ```
+ ENVIRONMENT=prod python main.py
+ ```
+
+ ## Using Docker
+
+ 1. Build an image for the FastAPI app:
+
+ ```
+ docker build -t <your_backend_image_name> .
+ ```
+
+ 2. Generate embeddings:
+
+ Parse the data and generate the vector embeddings if the `./data` folder exists - otherwise, skip this step:
+
+ ```
+ # Mount .env and config to reuse your local configuration, data to read the
+ # input documents, and storage to persist the vector database on your file system.
+ docker run \
+   --rm \
+   -v $(pwd)/.env:/app/.env \
+   -v $(pwd)/config:/app/config \
+   -v $(pwd)/data:/app/data \
+   -v $(pwd)/storage:/app/storage \
+   <your_backend_image_name> \
+   poetry run generate
+ ```
+
+ 3. Start the API:
+
+ ```
+ # Mount .env and config to reuse your local configuration and storage to read
+ # the vector database persisted on your file system.
+ docker run \
+   -v $(pwd)/.env:/app/.env \
+   -v $(pwd)/config:/app/config \
+   -v $(pwd)/storage:/app/storage \
+   -p 8000:8000 \
+   <your_backend_image_name>
+ ```
+
+ ## Learn More
+
+ To learn more about LlamaIndex, take a look at the following resources:
+
+ - [LlamaIndex Documentation](https://docs.llamaindex.ai) - learn about LlamaIndex.
+
+ You can check out [the LlamaIndex GitHub repository](https://github.com/run-llama/llama_index) - your feedback and contributions are welcome!
app/__init__.py ADDED
File without changes
app/api/__init__.py ADDED
File without changes
app/api/routers/__init__.py ADDED
File without changes
app/api/routers/chat.py ADDED
@@ -0,0 +1,148 @@
+ from pydantic import BaseModel
+ from typing import List, Any, Optional, Dict, Tuple
+ from fastapi import APIRouter, Depends, HTTPException, Request, status
+ from llama_index.core.chat_engine.types import BaseChatEngine
+ from llama_index.core.schema import NodeWithScore
+ from llama_index.core.llms import ChatMessage, MessageRole
+ from app.engine import get_chat_engine
+ from app.api.routers.vercel_response import VercelStreamResponse
+ from app.api.routers.messaging import EventCallbackHandler
+ from aiostream import stream
+
+ chat_router = r = APIRouter()
+
+
+ class _Message(BaseModel):
+     role: MessageRole
+     content: str
+
+
+ class _ChatData(BaseModel):
+     messages: List[_Message]
+
+     class Config:
+         json_schema_extra = {
+             "example": {
+                 "messages": [
+                     {
+                         "role": "user",
+                         "content": "What standards for letters exist?",
+                     }
+                 ]
+             }
+         }
+
+
+ class _SourceNodes(BaseModel):
+     id: str
+     metadata: Dict[str, Any]
+     score: Optional[float]
+     text: str
+
+     @classmethod
+     def from_source_node(cls, source_node: NodeWithScore):
+         return cls(
+             id=source_node.node.node_id,
+             metadata=source_node.node.metadata,
+             score=source_node.score,
+             text=source_node.node.text,  # type: ignore
+         )
+
+     @classmethod
+     def from_source_nodes(cls, source_nodes: List[NodeWithScore]):
+         return [cls.from_source_node(node) for node in source_nodes]
+
+
+ class _Result(BaseModel):
+     result: _Message
+     nodes: List[_SourceNodes]
+
+
+ async def parse_chat_data(data: _ChatData) -> Tuple[str, List[ChatMessage]]:
+     # check preconditions and get last message
+     if len(data.messages) == 0:
+         raise HTTPException(
+             status_code=status.HTTP_400_BAD_REQUEST,
+             detail="No messages provided",
+         )
+     last_message = data.messages.pop()
+     if last_message.role != MessageRole.USER:
+         raise HTTPException(
+             status_code=status.HTTP_400_BAD_REQUEST,
+             detail="Last message must be from user",
+         )
+     # convert messages coming from the request to type ChatMessage
+     messages = [
+         ChatMessage(
+             role=m.role,
+             content=m.content,
+         )
+         for m in data.messages
+     ]
+     return last_message.content, messages
+
+
+ # streaming endpoint - delete if not needed
+ @r.post("")
+ async def chat(
+     request: Request,
+     data: _ChatData,
+     chat_engine: BaseChatEngine = Depends(get_chat_engine),
+ ):
+     last_message_content, messages = await parse_chat_data(data)
+
+     event_handler = EventCallbackHandler()
+     chat_engine.callback_manager.handlers.append(event_handler)  # type: ignore
+     response = await chat_engine.astream_chat(last_message_content, messages)
+
+     async def content_generator():
+         # Yield the text response
+         async def _text_generator():
+             async for token in response.async_response_gen():
+                 yield VercelStreamResponse.convert_text(token)
+             # the text_generator is the leading stream, once it's finished, also finish the event stream
+             event_handler.is_done = True
+
+         # Yield the events from the event handler
+         async def _event_generator():
+             async for event in event_handler.async_event_gen():
+                 event_response = event.to_response()
+                 if event_response is not None:
+                     yield VercelStreamResponse.convert_data(event_response)
+
+         combine = stream.merge(_text_generator(), _event_generator())
+         async with combine.stream() as streamer:
+             async for item in streamer:
+                 if await request.is_disconnected():
+                     break
+                 yield item
+
+         # Yield the source nodes
+         yield VercelStreamResponse.convert_data(
+             {
+                 "type": "sources",
+                 "data": {
+                     "nodes": [
+                         _SourceNodes.from_source_node(node).dict()
+                         for node in response.source_nodes
+                     ]
+                 },
+             }
+         )
+
+     return VercelStreamResponse(content=content_generator())
+
+
+ # non-streaming endpoint - delete if not needed
+ @r.post("/request")
+ async def chat_request(
+     data: _ChatData,
+     chat_engine: BaseChatEngine = Depends(get_chat_engine),
+ ) -> _Result:
+     last_message_content, messages = await parse_chat_data(data)
+
+     response = await chat_engine.achat(last_message_content, messages)
+     return _Result(
+         result=_Message(role=MessageRole.ASSISTANT, content=response.response),
+         nodes=_SourceNodes.from_source_nodes(response.source_nodes),
+     )
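For reference, a minimal client sketch (an assumption, not part of this commit) for the non-streaming endpoint above; it assumes the `requests` package is available on the client side, and the payload mirrors the `_ChatData` example schema while the response keys follow `_Result`:

```
# Sketch (assumption): call the non-streaming /api/chat/request endpoint from Python.
import requests

resp = requests.post(
    "http://localhost:8000/api/chat/request",
    json={"messages": [{"role": "user", "content": "Hello"}]},
    timeout=60,
)
resp.raise_for_status()
body = resp.json()
print(body["result"]["content"])            # assistant reply (_Result.result)
print(len(body["nodes"]), "source nodes")   # retrieved sources (_SourceNodes)
```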
app/api/routers/messaging.py ADDED
@@ -0,0 +1,141 @@
+ import json
+ import asyncio
+ from typing import AsyncGenerator, Dict, Any, List, Optional
+ from llama_index.core.callbacks.base import BaseCallbackHandler
+ from llama_index.core.callbacks.schema import CBEventType
+ from llama_index.core.tools.types import ToolOutput
+ from pydantic import BaseModel
+
+
+ class CallbackEvent(BaseModel):
+     event_type: CBEventType
+     payload: Optional[Dict[str, Any]] = None
+     event_id: str = ""
+
+     def get_retrieval_message(self) -> dict | None:
+         if self.payload:
+             nodes = self.payload.get("nodes")
+             if nodes:
+                 msg = f"Retrieved {len(nodes)} sources to use as context for the query"
+             else:
+                 msg = f"Retrieving context for query: '{self.payload.get('query_str')}'"
+             return {
+                 "type": "events",
+                 "data": {"title": msg},
+             }
+         else:
+             return None
+
+     def get_tool_message(self) -> dict | None:
+         func_call_args = self.payload.get("function_call")
+         if func_call_args is not None and "tool" in self.payload:
+             tool = self.payload.get("tool")
+             return {
+                 "type": "events",
+                 "data": {
+                     "title": f"Calling tool: {tool.name} with inputs: {func_call_args}",
+                 },
+             }
+
+     def _is_output_serializable(self, output: Any) -> bool:
+         try:
+             json.dumps(output)
+             return True
+         except TypeError:
+             return False
+
+     def get_agent_tool_response(self) -> dict | None:
+         response = self.payload.get("response")
+         if response is not None:
+             sources = response.sources
+             for source in sources:
+                 # Return the tool response here to include the toolCall information
+                 if isinstance(source, ToolOutput):
+                     if self._is_output_serializable(source.raw_output):
+                         output = source.raw_output
+                     else:
+                         output = source.content
+
+                     return {
+                         "type": "tools",
+                         "data": {
+                             "toolOutput": {
+                                 "output": output,
+                                 "isError": source.is_error,
+                             },
+                             "toolCall": {
+                                 "id": None,  # There is no tool id in the ToolOutput
+                                 "name": source.tool_name,
+                                 "input": source.raw_input,
+                             },
+                         },
+                     }
+
+     def to_response(self):
+         match self.event_type:
+             case "retrieve":
+                 return self.get_retrieval_message()
+             case "function_call":
+                 return self.get_tool_message()
+             case "agent_step":
+                 return self.get_agent_tool_response()
+             case _:
+                 return None
+
+
+ class EventCallbackHandler(BaseCallbackHandler):
+     _aqueue: asyncio.Queue
+     is_done: bool = False
+
+     def __init__(
+         self,
+     ):
+         """Initialize the base callback handler."""
+         ignored_events = [
+             CBEventType.CHUNKING,
+             CBEventType.NODE_PARSING,
+             CBEventType.EMBEDDING,
+             CBEventType.LLM,
+             CBEventType.TEMPLATING,
+         ]
+         super().__init__(ignored_events, ignored_events)
+         self._aqueue = asyncio.Queue()
+
+     def on_event_start(
+         self,
+         event_type: CBEventType,
+         payload: Optional[Dict[str, Any]] = None,
+         event_id: str = "",
+         **kwargs: Any,
+     ) -> str:
+         event = CallbackEvent(event_id=event_id, event_type=event_type, payload=payload)
+         if event.to_response() is not None:
+             self._aqueue.put_nowait(event)
+
+     def on_event_end(
+         self,
+         event_type: CBEventType,
+         payload: Optional[Dict[str, Any]] = None,
+         event_id: str = "",
+         **kwargs: Any,
+     ) -> None:
+         event = CallbackEvent(event_id=event_id, event_type=event_type, payload=payload)
+         if event.to_response() is not None:
+             self._aqueue.put_nowait(event)
+
+     def start_trace(self, trace_id: Optional[str] = None) -> None:
+         """No-op."""
+
+     def end_trace(
+         self,
+         trace_id: Optional[str] = None,
+         trace_map: Optional[Dict[str, List[str]]] = None,
+     ) -> None:
+         """No-op."""
+
+     async def async_event_gen(self) -> AsyncGenerator[CallbackEvent, None]:
+         while not self._aqueue.empty() or not self.is_done:
+             try:
+                 yield await asyncio.wait_for(self._aqueue.get(), timeout=0.1)
+             except asyncio.TimeoutError:
+                 pass
app/api/routers/vercel_response.py ADDED
@@ -0,0 +1,29 @@
+ import json
+ from typing import Any
+ from fastapi.responses import StreamingResponse
+
+
+ class VercelStreamResponse(StreamingResponse):
+     """
+     Class to convert the response from the chat engine to the streaming format expected by Vercel
+     """
+
+     TEXT_PREFIX = "0:"
+     DATA_PREFIX = "8:"
+
+     @classmethod
+     def convert_text(cls, token: str):
+         # Escape newlines and double quotes to avoid breaking the stream
+         token = json.dumps(token)
+         return f"{cls.TEXT_PREFIX}{token}\n"
+
+     @classmethod
+     def convert_data(cls, data: dict):
+         data_str = json.dumps(data)
+         return f"{cls.DATA_PREFIX}[{data_str}]\n"
+
+     def __init__(self, content: Any, **kwargs):
+         super().__init__(
+             content=content,
+             **kwargs,
+         )
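As a quick sketch (not part of the commit), this is what the two helpers emit on the wire: text chunks are prefixed with `0:` and data chunks with `8:`. The token and event dict below are made-up example values.

```
# Sketch (assumption): inspect the stream line format produced by VercelStreamResponse.
from app.api.routers.vercel_response import VercelStreamResponse

print(VercelStreamResponse.convert_text("Hello"), end="")
# -> 0:"Hello"
print(
    VercelStreamResponse.convert_data({"type": "events", "data": {"title": "done"}}),
    end="",
)
# -> 8:[{"type": "events", "data": {"title": "done"}}]
```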
app/engine/__init__.py ADDED
@@ -0,0 +1,23 @@
+ import os
+ from app.engine.index import get_index
+ from fastapi import HTTPException
+
+
+ def get_chat_engine():
+     system_prompt = os.getenv("SYSTEM_PROMPT")
+     top_k = os.getenv("TOP_K", 3)
+
+     index = get_index()
+     if index is None:
+         raise HTTPException(
+             status_code=500,
+             detail=str(
+                 "StorageContext is empty - call 'poetry run generate' to generate the storage first"
+             ),
+         )
+
+     return index.as_chat_engine(
+         similarity_top_k=int(top_k),
+         system_prompt=system_prompt,
+         chat_mode="condense_plus_context",
+     )
app/engine/generate.py ADDED
@@ -0,0 +1,80 @@
+ from dotenv import load_dotenv
+
+ load_dotenv()
+
+ import os
+ import logging
+ from llama_index.core.settings import Settings
+ from llama_index.core.ingestion import IngestionPipeline
+ from llama_index.core.node_parser import SentenceSplitter
+ from llama_index.core.storage.docstore import SimpleDocumentStore
+ from llama_index.core.storage import StorageContext
+ from app.settings import init_settings
+ from app.engine.loaders import get_documents
+ from app.engine.vectordb import get_vector_store
+
+
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger()
+
+ STORAGE_DIR = os.getenv("STORAGE_DIR", "storage")
+
+
+ def get_doc_store():
+     # If the storage directory is there, load the document store from it.
+     # If not, set up an in-memory document store since we can't load from a directory that doesn't exist.
+     if os.path.exists(STORAGE_DIR):
+         return SimpleDocumentStore.from_persist_dir(STORAGE_DIR)
+     else:
+         return SimpleDocumentStore()
+
+
+ def run_pipeline(docstore, vector_store, documents):
+     pipeline = IngestionPipeline(
+         transformations=[
+             SentenceSplitter(
+                 chunk_size=Settings.chunk_size,
+                 chunk_overlap=Settings.chunk_overlap,
+             ),
+             Settings.embed_model,
+         ],
+         docstore=docstore,
+         docstore_strategy="upserts_and_delete",
+         vector_store=vector_store,
+     )
+
+     # Run the ingestion pipeline and store the results
+     nodes = pipeline.run(show_progress=True, documents=documents)
+
+     return nodes
+
+
+ def persist_storage(docstore, vector_store):
+     storage_context = StorageContext.from_defaults(
+         docstore=docstore,
+         vector_store=vector_store,
+     )
+     storage_context.persist(STORAGE_DIR)
+
+
+ def generate_datasource():
+     init_settings()
+     logger.info("Generate index for the provided data")
+
+     # Get the stores and documents or create new ones
+     documents = get_documents()
+     docstore = get_doc_store()
+     vector_store = get_vector_store()
+
+     # Run the ingestion pipeline
+     _ = run_pipeline(docstore, vector_store, documents)
+
+     # Build the index and persist storage
+     persist_storage(docstore, vector_store)
+
+     logger.info("Finished generating the index")
+
+
+ if __name__ == "__main__":
+     generate_datasource()
app/engine/index.py ADDED
@@ -0,0 +1,17 @@
+ import logging
+ from llama_index.core.indices import VectorStoreIndex
+ from app.engine.vectordb import get_vector_store
+
+
+ logger = logging.getLogger("uvicorn")
+
+
+ def get_index():
+     logger.info("Connecting vector store...")
+     store = get_vector_store()
+     # Load the index from the vector store
+     # If you are using a vector store that doesn't store text,
+     # you must load the index from both the vector store and the document store
+     index = VectorStoreIndex.from_vector_store(store)
+     logger.info("Finished loading index from vector store.")
+     return index
@@ -0,0 +1,39 @@
+ import os
+ import yaml
+ import importlib
+ import logging
+ from typing import Dict
+ from app.engine.loaders.file import FileLoaderConfig, get_file_documents
+ from app.engine.loaders.web import WebLoaderConfig, get_web_documents
+ from app.engine.loaders.db import DBLoaderConfig, get_db_documents
+
+ logger = logging.getLogger(__name__)
+
+
+ def load_configs():
+     with open("config/loaders.yaml") as f:
+         configs = yaml.safe_load(f)
+     return configs
+
+
+ def get_documents():
+     documents = []
+     config = load_configs()
+     for loader_type, loader_config in config.items():
+         logger.info(
+             f"Loading documents from loader: {loader_type}, config: {loader_config}"
+         )
+         match loader_type:
+             case "file":
+                 document = get_file_documents(FileLoaderConfig(**loader_config))
+             case "web":
+                 document = get_web_documents(WebLoaderConfig(**loader_config))
+             case "db":
+                 document = get_db_documents(
+                     configs=[DBLoaderConfig(**cfg) for cfg in loader_config]
+                 )
+             case _:
+                 raise ValueError(f"Invalid loader type: {loader_type}")
+         documents.extend(document)
+
+     return documents
app/engine/loaders/db.py ADDED
@@ -0,0 +1,26 @@
+ import os
+ import logging
+ from typing import List
+ from pydantic import BaseModel, validator
+ from llama_index.core.indices.vector_store import VectorStoreIndex
+
+ logger = logging.getLogger(__name__)
+
+
+ class DBLoaderConfig(BaseModel):
+     uri: str
+     queries: List[str]
+
+
+ def get_db_documents(configs: list[DBLoaderConfig]):
+     from llama_index.readers.database import DatabaseReader
+
+     docs = []
+     for entry in configs:
+         loader = DatabaseReader(uri=entry.uri)
+         for query in entry.queries:
+             logger.info(f"Loading data from database with query: {query}")
+             documents = loader.load_data(query=query)
+             docs.extend(documents)
+
+     # Return the documents accumulated from all queries, not just the last batch
+     return docs
app/engine/loaders/file.py ADDED
@@ -0,0 +1,57 @@
+ import os
+ import logging
+ from llama_parse import LlamaParse
+ from pydantic import BaseModel, validator
+
+ logger = logging.getLogger(__name__)
+
+
+ class FileLoaderConfig(BaseModel):
+     data_dir: str = "data"
+     use_llama_parse: bool = False
+
+     @validator("data_dir")
+     def data_dir_must_exist(cls, v):
+         if not os.path.isdir(v):
+             raise ValueError(f"Directory '{v}' does not exist")
+         return v
+
+
+ def llama_parse_parser():
+     if os.getenv("LLAMA_CLOUD_API_KEY") is None:
+         raise ValueError(
+             "LLAMA_CLOUD_API_KEY environment variable is not set. "
+             "Please set it in .env file or in your shell environment then run again!"
+         )
+     parser = LlamaParse(result_type="markdown", verbose=True, language="en")
+     return parser
+
+
+ def get_file_documents(config: FileLoaderConfig):
+     from llama_index.core.readers import SimpleDirectoryReader
+
+     try:
+         reader = SimpleDirectoryReader(
+             config.data_dir,
+             recursive=True,
+             filename_as_id=True,
+         )
+         if config.use_llama_parse:
+             parser = llama_parse_parser()
+             reader.file_extractor = {".pdf": parser}
+         return reader.load_data()
+     except ValueError as e:
+         import sys, traceback
+
+         # Catch the error if the data dir is empty
+         # and return as empty document list
+         _, _, exc_traceback = sys.exc_info()
+         function_name = traceback.extract_tb(exc_traceback)[-1].name
+         if function_name == "_add_files":
+             logger.warning(
+                 f"Failed to load file documents, error message: {e} . Return as empty document list."
+             )
+             return []
+         else:
+             # Raise the error if it is not the case of empty data dir
+             raise e
app/engine/loaders/web.py ADDED
@@ -0,0 +1,36 @@
+ import os
+ import json
+ from pydantic import BaseModel, Field
+
+
+ class CrawlUrl(BaseModel):
+     base_url: str
+     prefix: str
+     max_depth: int = Field(default=1, ge=0)
+
+
+ class WebLoaderConfig(BaseModel):
+     driver_arguments: list[str] = Field(default=None)
+     urls: list[CrawlUrl]
+
+
+ def get_web_documents(config: WebLoaderConfig):
+     from llama_index.readers.web import WholeSiteReader
+     from selenium import webdriver
+     from selenium.webdriver.chrome.options import Options
+
+     options = Options()
+     driver_arguments = config.driver_arguments or []
+     for arg in driver_arguments:
+         options.add_argument(arg)
+
+     docs = []
+     for url in config.urls:
+         scraper = WholeSiteReader(
+             prefix=url.prefix,
+             max_depth=url.max_depth,
+             driver=webdriver.Chrome(options=options),
+         )
+         docs.extend(scraper.load_data(url.base_url))
+
+     return docs
app/engine/vectordb.py ADDED
@@ -0,0 +1,19 @@
+ import os
+ from llama_index.vector_stores.pinecone import PineconeVectorStore
+
+
+ def get_vector_store():
+     api_key = os.getenv("PINECONE_API_KEY")
+     index_name = os.getenv("PINECONE_INDEX_NAME")
+     environment = os.getenv("PINECONE_ENVIRONMENT")
+     if not api_key or not index_name or not environment:
+         raise ValueError(
+             "Please set PINECONE_API_KEY, PINECONE_INDEX_NAME, and PINECONE_ENVIRONMENT"
+             " in your environment variables or configure them in the .env file"
+         )
+     store = PineconeVectorStore(
+         api_key=api_key,
+         index_name=index_name,
+         environment=environment,
+     )
+     return store
app/observability.py ADDED
@@ -0,0 +1,5 @@
+ from traceloop.sdk import Traceloop
+
+
+ def init_observability():
+     Traceloop.init()
app/settings.py ADDED
@@ -0,0 +1,96 @@
+ import os
+ from typing import Dict
+ from llama_index.core.settings import Settings
+
+
+ def init_settings():
+     model_provider = os.getenv("MODEL_PROVIDER")
+     if model_provider == "openai":
+         init_openai()
+     elif model_provider == "ollama":
+         init_ollama()
+     elif model_provider == "anthropic":
+         init_anthropic()
+     elif model_provider == "gemini":
+         init_gemini()
+     else:
+         raise ValueError(f"Invalid model provider: {model_provider}")
+     Settings.chunk_size = int(os.getenv("CHUNK_SIZE", "1024"))
+     Settings.chunk_overlap = int(os.getenv("CHUNK_OVERLAP", "20"))
+
+
+ def init_ollama():
+     from llama_index.llms.ollama import Ollama
+     from llama_index.embeddings.ollama import OllamaEmbedding
+
+     base_url = os.getenv("OLLAMA_BASE_URL") or "http://127.0.0.1:11434"
+     Settings.embed_model = OllamaEmbedding(
+         base_url=base_url,
+         model_name=os.getenv("EMBEDDING_MODEL"),
+     )
+     Settings.llm = Ollama(base_url=base_url, model=os.getenv("MODEL"))
+
+
+ def init_openai():
+     from llama_index.llms.openai import OpenAI
+     from llama_index.embeddings.openai import OpenAIEmbedding
+     from llama_index.core.constants import DEFAULT_TEMPERATURE
+
+     max_tokens = os.getenv("LLM_MAX_TOKENS")
+     config = {
+         "model": os.getenv("MODEL"),
+         "temperature": float(os.getenv("LLM_TEMPERATURE", DEFAULT_TEMPERATURE)),
+         "max_tokens": int(max_tokens) if max_tokens is not None else None,
+     }
+     Settings.llm = OpenAI(**config)
+
+     dimensions = os.getenv("EMBEDDING_DIM")
+     config = {
+         "model": os.getenv("EMBEDDING_MODEL"),
+         "dimensions": int(dimensions) if dimensions is not None else None,
+     }
+     Settings.embed_model = OpenAIEmbedding(**config)
+
+
+ def init_anthropic():
+     from llama_index.llms.anthropic import Anthropic
+     from llama_index.embeddings.huggingface import HuggingFaceEmbedding
+
+     model_map: Dict[str, str] = {
+         "claude-3-opus": "claude-3-opus-20240229",
+         "claude-3-sonnet": "claude-3-sonnet-20240229",
+         "claude-3-haiku": "claude-3-haiku-20240307",
+         "claude-2.1": "claude-2.1",
+         "claude-instant-1.2": "claude-instant-1.2",
+     }
+
+     embed_model_map: Dict[str, str] = {
+         "all-MiniLM-L6-v2": "sentence-transformers/all-MiniLM-L6-v2",
+         "all-mpnet-base-v2": "sentence-transformers/all-mpnet-base-v2",
+     }
+
+     Settings.llm = Anthropic(model=model_map[os.getenv("MODEL")])
+     Settings.embed_model = HuggingFaceEmbedding(
+         model_name=embed_model_map[os.getenv("EMBEDDING_MODEL")]
+     )
+
+
+ def init_gemini():
+     from llama_index.llms.gemini import Gemini
+     from llama_index.embeddings.gemini import GeminiEmbedding
+
+     model_map: Dict[str, str] = {
+         "gemini-1.5-pro-latest": "models/gemini-1.5-pro-latest",
+         "gemini-pro": "models/gemini-pro",
+         "gemini-pro-vision": "models/gemini-pro-vision",
+     }
+
+     embed_model_map: Dict[str, str] = {
+         "embedding-001": "models/embedding-001",
+         "text-embedding-004": "models/text-embedding-004",
+     }
+
+     Settings.llm = Gemini(model=model_map[os.getenv("MODEL")])
+     Settings.embed_model = GeminiEmbedding(
+         model_name=embed_model_map[os.getenv("EMBEDDING_MODEL")]
+     )
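Switching providers only requires changing the provider-specific variables read above. A sketch (assumption, not part of the commit) of `.env` entries for the `init_ollama` path; the model names are illustrative placeholders:

```
# Sketch (assumption): example .env values for the Ollama provider path in init_ollama().
MODEL_PROVIDER=ollama
MODEL=llama3
EMBEDDING_MODEL=nomic-embed-text
# OLLAMA_BASE_URL falls back to http://127.0.0.1:11434 if unset
# OLLAMA_BASE_URL=http://127.0.0.1:11434
```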
config/loaders.yaml ADDED
@@ -0,0 +1,3 @@
+ file:
+   # use_llama_parse: Use LlamaParse if `true`. Needs a `LLAMA_CLOUD_API_KEY` from https://cloud.llamaindex.ai set as environment variable
+   use_llama_parse: true
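A sketch (assumption, not part of the commit) of what enabling the other loaders from `app/engine/loaders` could look like in this file; the field names follow `WebLoaderConfig`, `CrawlUrl`, and `DBLoaderConfig`, while the URLs, connection string, and query are placeholder examples:

```
# Sketch (assumption): web and db loaders configured alongside the file loader.
file:
  use_llama_parse: false
web:
  driver_arguments:
    - "--headless"
  urls:
    - base_url: "https://docs.llamaindex.ai/en/stable/"
      prefix: "https://docs.llamaindex.ai/en/stable/"
      max_depth: 1
db:
  - uri: "sqlite:///example.db"
    queries:
      - "SELECT title, body FROM documents"
```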
main.py ADDED
@@ -0,0 +1,51 @@
+ from dotenv import load_dotenv
+
+ load_dotenv()
+
+ import logging
+ import os
+ import uvicorn
+ from fastapi import FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.responses import RedirectResponse
+ from app.api.routers.chat import chat_router
+ from app.settings import init_settings
+ from app.observability import init_observability
+ from fastapi.staticfiles import StaticFiles
+
+
+ app = FastAPI()
+
+ init_settings()
+ init_observability()
+
+ environment = os.getenv("ENVIRONMENT", "dev")  # Default to 'dev' if not set
+
+ if environment == "dev":
+     logger = logging.getLogger("uvicorn")
+     logger.warning("Running in development mode - allowing CORS for all origins")
+     app.add_middleware(
+         CORSMiddleware,
+         allow_origins=["*"],
+         allow_credentials=True,
+         allow_methods=["*"],
+         allow_headers=["*"],
+     )
+
+     # Redirect to documentation page when accessing base URL
+     @app.get("/")
+     async def redirect_to_docs():
+         return RedirectResponse(url="/docs")
+
+
+ if os.path.exists("data"):
+     app.mount("/api/data", StaticFiles(directory="data"), name="static")
+ app.include_router(chat_router, prefix="/api/chat")
+
+
+ if __name__ == "__main__":
+     app_host = os.getenv("APP_HOST", "0.0.0.0")
+     app_port = int(os.getenv("APP_PORT", "8000"))
+     reload = True if environment == "dev" else False
+
+     uvicorn.run(app="main:app", host=app_host, port=app_port, reload=reload)
poetry.lock ADDED
The diff for this file is too large to render. See raw diff
 
pyproject.toml ADDED
@@ -0,0 +1,39 @@
+ [tool]
+ [tool.poetry]
+ name = "app"
+ version = "0.1.0"
+ description = ""
+ authors = [ "Marcus Schiesser <mail@marcusschiesser.de>" ]
+ readme = "README.md"
+
+ [tool.poetry.scripts]
+ generate = "app.engine.generate:generate_datasource"
+
+ [tool.poetry.dependencies]
+ python = "^3.11,<3.12"
+ fastapi = "^0.109.1"
+ python-dotenv = "^1.0.0"
+ aiostream = "^0.5.2"
+ llama-index = "0.10.28"
+ llama-index-core = "0.10.28"
+ cachetools = "^5.3.3"
+
+ [tool.poetry.dependencies.uvicorn]
+ extras = [ "standard" ]
+ version = "^0.23.2"
+
+ [tool.poetry.dependencies.llama-index-vector-stores-pinecone]
+ version = "^0.1.3"
+
+ [tool.poetry.dependencies.docx2txt]
+ version = "^0.8"
+
+ [tool.poetry.dependencies.llama-index-agent-openai]
+ version = "0.2.2"
+
+ [tool.poetry.dependencies.traceloop-sdk]
+ version = "^0.15.11"
+
+ [build-system]
+ requires = [ "poetry-core" ]
+ build-backend = "poetry.core.masonry.api"
tests/__init__.py ADDED
File without changes