SurajTechAI committed
Commit ef072ca · 0 Parent(s)

Initial commit: Multi-Agent RAG System
.env.example ADDED
@@ -0,0 +1,27 @@
+ # Multi-Agent RAG System Environment Configuration
+ # =================================================
+ # Copy this file to .env and fill in your values
+
+ # OpenAI API Key (Required)
+ # Get yours at: https://platform.openai.com/api-keys
+ OPENAI_API_KEY=sk-your-key-here
+
+ # Model Configuration
+ # Options: gpt-4-turbo-preview, gpt-4, gpt-3.5-turbo
+ LLM_MODEL=gpt-4-turbo-preview
+
+ # Embedding Model
+ # Options: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
+ EMBEDDING_MODEL=text-embedding-3-small
+
+ # Vector Store Configuration
+ FAISS_INDEX_PATH=./data/faiss_index
+ DOCUMENTS_PATH=./data/documents
+
+ # API Configuration
+ API_HOST=0.0.0.0
+ API_PORT=8000
+ DEBUG_MODE=true
+
+ # Logging
+ LOG_LEVEL=INFO
.gitignore ADDED
@@ -0,0 +1,31 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ .env
+ .venv
+ venv/
+ ENV/
+
+ # IDE
+ .idea/
+ .vscode/
+ *.swp
+ *.swo
+
+ # Data (optional - remove if you want to include sample docs)
+ data/faiss_index/
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Logs
+ *.log
+
+ # Testing
+ .pytest_cache/
+ .coverage
+ htmlcov/
Dockerfile ADDED
@@ -0,0 +1,36 @@
+ # Dockerfile for Hugging Face Spaces
+ FROM python:3.11-slim
+
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     build-essential \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements first for caching
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy application code (this already includes any sample documents
+ # under data/documents/; note COPY runs without a shell, so redirections
+ # like "2>/dev/null || true" are not valid in a COPY instruction)
+ COPY . .
+
+ # Create data directories in case they were not in the build context
+ RUN mkdir -p data/documents data/faiss_index
+
+ # Set environment variables
+ ENV LLM_PROVIDER=huggingface
+ ENV EMBEDDING_PROVIDER=huggingface
+ ENV HUGGINGFACE_MODEL=mistralai/Mistral-7B-Instruct-v0.2
+ ENV HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
+ ENV API_HOST=0.0.0.0
+ ENV API_PORT=8000
+
+ # Expose port
+ EXPOSE 8000
+
+ # Run the application
+ CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
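With this Dockerfile, a local build-and-run uses standard Docker commands; the image tag below is illustrative:

```bash
docker build -t multi-agent-rag .
docker run -p 8000:8000 --env-file .env multi-agent-rag
```

`--env-file .env` lets the same configuration file drive both local and container runs.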
README.md ADDED
@@ -0,0 +1,295 @@
+ # Multi-Agent RAG System
+
+ A production-grade Retrieval-Augmented Generation (RAG) system using multiple specialized agents built with LangChain and FastAPI.
+
+ ## Architecture Overview
+
+ ```
+ ┌──────────────┐
+ │  User Query  │
+ └──────┬───────┘
+        │
+ ┌──────▼───────┐
+ │ Router Agent │ ◄─── Classifies query intent
+ └──────┬───────┘
+        │
+        ┌───────────────┼───────────────┐
+        │               │               │
+ ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
+ │  Retriever  │ │  Reasoning  │ │   Action    │
+ │    Agent    │ │    Agent    │ │    Agent    │
+ └─────────────┘ └─────────────┘ └─────────────┘
+        │               ▲               │
+        └───────────────┘               │
+  (context flows to reasoning)          │
+               ┌────────────────────────┘
+               │
+        ┌──────▼───────┐
+        │   Response   │
+        └──────────────┘
+ ```
+
+ ### Agents
+
+ 1. **Router Agent** - Classifies query intent and routes to appropriate agents
+ 2. **Retriever Agent** - Searches FAISS vector store for relevant documents
+ 3. **Reasoning Agent** - Generates grounded responses from retrieved context
+ 4. **Action Agent** - Executes actions like creating tickets or escalating
+
+ ## Features
+
+ - Multi-agent architecture with single responsibility principle
+ - Semantic document search using FAISS and OpenAI embeddings
+ - Grounded responses with source citations
+ - Conversation memory for multi-turn interactions
+ - Action execution (tickets, escalation, notifications)
+ - RESTful API with FastAPI
+ - Automatic API documentation (Swagger/OpenAPI)
+
+ ## Quick Start
+
+ ### 1. Install Dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### 2. Configure Environment
+
+ ```bash
+ # Copy example environment file
+ cp .env.example .env
+
+ # Edit .env and add your OpenAI API key
+ # OPENAI_API_KEY=your-key-here
+ ```
+
+ ### 3. Run the Server
+
+ ```bash
+ # Development mode (with auto-reload)
+ uvicorn app.main:app --reload
+
+ # Or run directly
+ python -m app.main
+ ```
+
+ ### 4. Access the API
+
+ - **Swagger UI**: http://localhost:8000/docs
+ - **ReDoc**: http://localhost:8000/redoc
+ - **Health Check**: http://localhost:8000/api/v1/health
+
+ ## API Endpoints
+
+ ### Query Endpoint
+ ```bash
+ POST /api/v1/query
+ Content-Type: application/json
+
+ {
+   "query": "How do I reset my password?",
+   "conversation_id": "optional-session-id",
+   "include_sources": true
+ }
+ ```
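A successful reply carries the grounded answer plus metadata. The `answer`, `sources`, and `agent_trace` field names match the Python client example later in this README; the sample values and the shape of each source entry are illustrative, not the API's exact schema:

```json
{
  "answer": "You can reset your password from the account settings page...",
  "sources": [
    {"document": "account_faq.md", "score": 0.87}
  ],
  "agent_trace": ["router", "retriever", "reasoning"]
}
```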
+
+ ### Document Ingestion
+ ```bash
+ # Ingest all documents from configured directory
+ POST /api/v1/ingest
+ Content-Type: application/json
+
+ {
+   "force_reindex": false
+ }
+
+ # Ingest specific files
+ POST /api/v1/ingest
+ Content-Type: application/json
+
+ {
+   "file_paths": ["/path/to/document.pdf"],
+   "force_reindex": false
+ }
+ ```
+
+ ### Health Check
+ ```bash
+ GET /api/v1/health
+ ```
+
+ ## Project Structure
+
+ ```
+ multi-agent-rag/
+ ├── app/
+ │   ├── __init__.py
+ │   ├── main.py              # FastAPI application entry
+ │   ├── config.py            # Configuration management
+ │   ├── agents/              # Agent implementations
+ │   │   ├── base_agent.py    # Abstract base class
+ │   │   ├── router_agent.py  # Query routing
+ │   │   ├── retriever_agent.py
+ │   │   ├── reasoning_agent.py
+ │   │   └── action_agent.py
+ │   ├── tools/               # LangChain tools
+ │   │   ├── search_tool.py
+ │   │   ├── document_tool.py
+ │   │   └── action_tools.py
+ │   ├── memory/              # Conversation memory
+ │   │   └── conversation_memory.py
+ │   ├── vectorstore/         # Vector database
+ │   │   ├── embeddings.py
+ │   │   └── faiss_store.py
+ │   ├── schemas/             # Pydantic models
+ │   │   └── models.py
+ │   ├── services/            # Business logic
+ │   │   ├── orchestrator.py
+ │   │   └── document_service.py
+ │   └── api/                 # API routes
+ │       └── routes.py
+ ├── data/
+ │   ├── documents/           # Source documents
+ │   └── faiss_index/         # Vector index storage
+ ├── tests/
+ ├── requirements.txt
+ ├── .env.example
+ └── README.md
+ ```
+
+ ## Configuration
+
+ All configuration is done via environment variables or `.env` file:
+
+ | Variable | Description | Default |
+ |----------|-------------|---------|
+ | `OPENAI_API_KEY` | OpenAI API key | Required |
+ | `LLM_MODEL` | Model for agents | gpt-4-turbo-preview |
+ | `EMBEDDING_MODEL` | Embedding model | text-embedding-3-small |
+ | `FAISS_INDEX_PATH` | Vector index location | ./data/faiss_index |
+ | `DOCUMENTS_PATH` | Source documents | ./data/documents |
+ | `CHUNK_SIZE` | Document chunk size | 1000 |
+ | `CHUNK_OVERLAP` | Chunk overlap | 200 |
+ | `RETRIEVAL_TOP_K` | Documents to retrieve | 5 |
+ | `API_PORT` | Server port | 8000 |
+ | `LOG_LEVEL` | Logging level | INFO |
+
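These variables are read through `app/config.py`'s `get_settings()`, which every agent shares. As a rough, stdlib-only sketch of that pattern (the real project presumably uses a pydantic settings class; the field selection here is illustrative), env-var-backed defaults can be modeled like this:

```python
import os
from dataclasses import dataclass, field

def _env(name: str, default: str) -> str:
    # Read an environment variable, falling back to the documented default.
    return os.getenv(name, default)

@dataclass(frozen=True)
class Settings:
    # Defaults mirror the configuration table above.
    llm_model: str = field(default_factory=lambda: _env("LLM_MODEL", "gpt-4-turbo-preview"))
    embedding_model: str = field(default_factory=lambda: _env("EMBEDDING_MODEL", "text-embedding-3-small"))
    retrieval_top_k: int = field(default_factory=lambda: int(_env("RETRIEVAL_TOP_K", "5")))
    api_port: int = field(default_factory=lambda: int(_env("API_PORT", "8000")))
```

Centralizing configuration in one object like this keeps every agent reading the same values the same way.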
+ ## Usage Examples
+
+ ### Python Client
+
+ ```python
+ import httpx
+ import asyncio
+
+ async def query_rag():
+     async with httpx.AsyncClient() as client:
+         # Ingest documents first
+         await client.post(
+             "http://localhost:8000/api/v1/ingest",
+             json={"force_reindex": True}
+         )
+
+         # Query the system
+         response = await client.post(
+             "http://localhost:8000/api/v1/query",
+             json={
+                 "query": "How do I reset my password?",
+                 "include_sources": True
+             }
+         )
+
+         result = response.json()
+         print(f"Answer: {result['answer']}")
+         print(f"Sources: {len(result['sources'])} documents")
+         print(f"Agents used: {' -> '.join(result['agent_trace'])}")
+
+ asyncio.run(query_rag())
+ ```
+
+ ### cURL Examples
+
+ ```bash
+ # Health check
+ curl http://localhost:8000/api/v1/health
+
+ # Ingest documents
+ curl -X POST http://localhost:8000/api/v1/ingest \
+   -H "Content-Type: application/json" \
+   -d '{"force_reindex": true}'
+
+ # Query
+ curl -X POST http://localhost:8000/api/v1/query \
+   -H "Content-Type: application/json" \
+   -d '{"query": "How do I change my email address?"}'
+ ```
+
+ ## Key Design Decisions
+
+ ### Why Multi-Agent Architecture?
+ - **Single Responsibility**: Each agent does one thing well
+ - **Testability**: Agents can be tested in isolation
+ - **Flexibility**: Easy to add/remove agents
+ - **Scalability**: Agents could run on different services
+
+ ### Why FAISS?
+ - **No External Dependencies**: Runs locally, no API costs
+ - **Fast**: Optimized C++ with Python bindings
+ - **Scalable**: Handles millions of vectors
+ - **Persistent**: Index saved to disk
+
+ ### Why LangChain?
+ - **Standardized Interfaces**: Common patterns for agents, tools, memory
+ - **Flexibility**: Easy to swap components
+ - **Community**: Large ecosystem of integrations
+
+ ### Grounding Principle
+ Every response is grounded in retrieved documents. The Reasoning Agent:
+ 1. ONLY uses information from retrieved context
+ 2. Admits when information is not available
+ 3. Never makes up information
+ 4. Cites sources when possible
+
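Rules 1 and 2 can be enforced mechanically before the model is even called; a minimal sketch (the template wording is illustrative, not the project's actual Reasoning Agent prompt):

```python
from typing import Optional

GROUNDED_TEMPLATE = """Answer ONLY from the context below.
If the context does not contain the answer, say you don't know.

Context:
{context}

Question: {question}"""

def build_grounded_prompt(context: str, question: str) -> Optional[str]:
    # Rule 2 up front: with no retrieved context there is nothing to ground
    # an answer in, so the caller should reply "I don't know" directly.
    if not context.strip():
        return None
    return GROUNDED_TEMPLATE.format(context=context, question=question)
```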
+ ## Production Considerations
+
+ ### Security
+ - Store API keys in environment variables, never in code
+ - Use HTTPS in production
+ - Implement rate limiting
+ - Add authentication for sensitive endpoints
+
+ ### Scalability
+ - Use Redis for distributed conversation memory
+ - Consider Pinecone/Weaviate for larger document collections
+ - Run multiple uvicorn workers: `uvicorn app.main:app --workers 4`
+ - Add caching for frequently asked questions
+
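The caching point can start as small as an in-process LRU keyed on a normalized query; a stdlib-only sketch (the cache size, the normalization rule, and `run_pipeline` are all placeholder choices, not part of this repo):

```python
from functools import lru_cache

def run_pipeline(query: str) -> str:
    # Stand-in for the full multi-agent pipeline (hypothetical).
    return f"answer for: {query}"

def normalize(query: str) -> str:
    # Collapse case and whitespace so near-identical phrasings share a cache slot.
    return " ".join(query.lower().split())

@lru_cache(maxsize=1024)
def cached_answer(normalized_query: str) -> str:
    # Only the normalized form is used as the cache key.
    return run_pipeline(normalized_query)
```

For multiple workers or hosts this would move to a shared store such as Redis, keyed the same way.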
+ ### Monitoring
+ - Integrate with logging services (Datadog, CloudWatch)
+ - Add tracing (OpenTelemetry)
+ - Monitor agent response times
+ - Track retrieval relevance scores
+
+ ## Troubleshooting
+
+ ### "No documents in knowledge base"
+ Run the ingestion endpoint first:
+ ```bash
+ curl -X POST http://localhost:8000/api/v1/ingest
+ ```
+
+ ### "OpenAI API key not set"
+ Ensure your `.env` file exists and contains:
+ ```
+ OPENAI_API_KEY=sk-your-key-here
+ ```
+
+ ### Slow responses
+ - Reduce `RETRIEVAL_TOP_K` for faster retrieval
+ - Use a smaller LLM model (gpt-3.5-turbo)
+ - Check network latency to OpenAI
+
+ ## License
+
+ MIT License - See LICENSE file for details.
README_HF.md ADDED
@@ -0,0 +1,50 @@
+ ---
+ title: Multi-Agent RAG System
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ app_port: 8000
+ pinned: false
+ license: mit
+ ---
+
+ # Multi-Agent RAG System
+
+ A production-grade Retrieval-Augmented Generation system using multiple specialized AI agents.
+
+ ## Features
+
+ - **Router Agent**: Classifies queries and routes to appropriate agents
+ - **Retriever Agent**: Semantic search using FAISS vector store
+ - **Reasoning Agent**: Generates grounded responses from context
+ - **Action Agent**: Executes actions like creating tickets
+
+ ## API Endpoints
+
+ - `POST /api/v1/query` - Submit a question
+ - `POST /api/v1/ingest` - Ingest documents
+ - `GET /api/v1/health` - Health check
+ - `GET /docs` - Swagger UI
+
+ ## Usage
+
+ ```python
+ import requests
+
+ response = requests.post(
+     "https://your-space.hf.space/api/v1/query",
+     json={"query": "How do I reset my password?"}
+ )
+ print(response.json()["answer"])
+ ```
+
+ ## Architecture
+
+ ```
+ User Query → Router Agent → Retriever Agent → Reasoning Agent → Response
+                                    │
+                           FAISS Vector Store
+ ```
+
+ Built with LangChain, FastAPI, and HuggingFace models.
app.py ADDED
@@ -0,0 +1,22 @@
+ """
+ Hugging Face Spaces Entry Point
+ ================================
+
+ This file is the entry point for Hugging Face Spaces deployment.
+ It imports and exposes the FastAPI application.
+ """
+
+ import os
+
+ # Set environment variables for HuggingFace Spaces
+ # These can be overridden by Space secrets
+ os.environ.setdefault("LLM_PROVIDER", "huggingface")
+ os.environ.setdefault("EMBEDDING_PROVIDER", "huggingface")
+ os.environ.setdefault("HUGGINGFACE_MODEL", "mistralai/Mistral-7B-Instruct-v0.2")
+ os.environ.setdefault("HUGGINGFACE_EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
+
+ # Expose the FastAPI app at module level; the Space's Docker CMD
+ # serves it with uvicorn.
+ from app.main import app  # noqa: E402
app/__init__.py ADDED
@@ -0,0 +1,17 @@
+ """
+ Multi-Agent RAG System
+ ======================
+
+ A production-grade Retrieval-Augmented Generation system using multiple
+ specialized agents for query routing, document retrieval, reasoning, and
+ action execution.
+
+ Architecture:
+ - Router Agent: Classifies and routes queries
+ - Retriever Agent: Handles vector search and document retrieval
+ - Reasoning Agent: Generates grounded responses from context
+ - Action Agent: Executes specific actions when needed
+ """
+
+ __version__ = "1.0.0"
+ __author__ = "AI Engineer"
app/agents/__init__.py ADDED
@@ -0,0 +1,44 @@
+ """
+ Agents Module
+ =============
+
+ Multi-agent system with specialized agents for different tasks.
+
+ ARCHITECTURE:
+ ┌──────────────┐
+ │  User Query  │
+ └──────┬───────┘
+        │
+ ┌──────▼───────┐
+ │ Router Agent │ ◄─── Classifies query intent
+ └──────┬───────┘
+        │
+        ┌───────────────┼───────────────┐
+        │               │               │
+ ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
+ │  Retriever  │ │  Reasoning  │ │   Action    │
+ │    Agent    │ │    Agent    │ │    Agent    │
+ └─────────────┘ └─────────────┘ └─────────────┘
+        │               ▲               │
+        └───────────────┘               │
+  (context flows to reasoning)          │
+               ┌────────────────────────┘
+               │
+        ┌──────▼───────┐
+        │   Response   │
+        └──────────────┘
+ """
+
+ from app.agents.base_agent import BaseAgent
+ from app.agents.router_agent import RouterAgent
+ from app.agents.retriever_agent import RetrieverAgent
+ from app.agents.reasoning_agent import ReasoningAgent
+ from app.agents.action_agent import ActionAgent
+
+ __all__ = [
+     "BaseAgent",
+     "RouterAgent",
+     "RetrieverAgent",
+     "ReasoningAgent",
+     "ActionAgent",
+ ]
app/agents/action_agent.py ADDED
@@ -0,0 +1,386 @@
+ """
+ Action Agent
+ ============
+
+ The Action Agent executes specific actions based on user requests.
+
+ RESPONSIBILITIES:
+ 1. Parse action requests from the query
+ 2. Execute the appropriate action
+ 3. Return confirmation and next steps
+ 4. Handle action failures gracefully
+
+ SUPPORTED ACTIONS:
+ - create_ticket: Create a support ticket
+ - escalate: Escalate to human agent
+ - send_email: Send email notification
+ - search_kb: Deep search in knowledge base
+
+ WHY AN ACTION AGENT?
+ - Separates "thinking" from "doing"
+ - Actions can be audited and logged
+ - Easy to add new actions
+ - Can integrate with external systems (ticketing, email, etc.)
+
+ ARCHITECTURE:
+ ┌─────────────────────────────────────────┐
+ │              Action Agent               │
+ │   ┌──────────────────────────────────┐  │
+ │   │    Action Router (LLM-based)     │  │
+ │   └─────────────┬────────────────────┘  │
+ │                 │                       │
+ │    ┌────────────┼────────────┐          │
+ │    ▼            ▼            ▼          │
+ │ ┌────────┐ ┌──────────┐ ┌───────┐       │
+ │ │ Ticket │ │ Escalate │ │ Email │       │
+ │ │  Tool  │ │   Tool   │ │ Tool  │       │
+ │ └────────┘ └──────────┘ └───────┘       │
+ └─────────────────────────────────────────┘
+ """
+
+ import json
+ import logging
+ import uuid
+ from datetime import datetime
+ from typing import Any, Optional
+
+ from langchain_core.prompts import ChatPromptTemplate
+ from pydantic import BaseModel, Field
+
+ from app.agents.base_agent import BaseAgent
+ from app.schemas.models import AgentResponse, AgentType, ActionType
+
+ logger = logging.getLogger(__name__)
+
+
+ # In a production system, these would integrate with real services
+ # For now, we simulate the actions and return structured results
+
+ class TicketData(BaseModel):
+     """Data for a created support ticket."""
+     ticket_id: str
+     title: str
+     description: str
+     priority: str
+     created_at: str
+     status: str = "open"
+
+
+ class EscalationData(BaseModel):
+     """Data for an escalation."""
+     escalation_id: str
+     reason: str
+     priority: str
+     queue: str
+     estimated_wait: str
+
+
+ class ActionResult(BaseModel):
+     """Result of an action execution."""
+     action_type: ActionType
+     success: bool
+     message: str
+     data: Optional[dict] = None
+
+
+ ACTION_PROMPT = """You are an action executor for a customer support system.
+
+ Based on the conversation, you need to execute the requested action.
+
+ AVAILABLE ACTIONS:
+ 1. create_ticket - Create a support ticket for the issue
+ 2. escalate - Escalate to a human agent
+ 3. send_email - Send an email notification
+ 4. search_knowledge_base - Perform a deeper search
+ 5. none - No action needed
+
+ USER QUERY: {query}
+
+ CONTEXT: {context}
+
+ REQUESTED ACTION: {action_type}
+
+ Generate appropriate details for this action:
+ - For tickets: Generate a clear title and description
+ - For escalation: Determine priority and reason
+ - For email: Determine recipient type and content summary
+
+ Respond in this JSON format:
+ {{
+     "title": "Brief title of the issue",
+     "description": "Detailed description",
+     "priority": "low|medium|high|urgent",
+     "reason": "Why this action is being taken"
+ }}"""
+
+
+ class ActionAgent(BaseAgent):
+     """
+     Executes actions based on user requests and reasoning output.
+
+     This agent handles the "doing" part of the system - creating tickets,
+     escalating issues, and other concrete actions.
+
+     In production, this would integrate with:
+     - Ticketing systems (Zendesk, Jira, ServiceNow)
+     - Email services (SendGrid, SES)
+     - Communication platforms (Slack, Teams)
+     """
+
+     def __init__(self, **kwargs):
+         """Initialize the Action Agent."""
+         super().__init__(**kwargs)
+         self._prompt = ChatPromptTemplate.from_template(ACTION_PROMPT)
+
+     @property
+     def agent_type(self) -> AgentType:
+         """Return the agent type."""
+         return AgentType.ACTION
+
+     async def execute(
+         self,
+         input_data: dict[str, Any],
+         **kwargs
+     ) -> AgentResponse:
+         """
+         Execute the requested action.
+
+         Args:
+             input_data:
+                 - query: User's original query
+                 - context: Retrieved context
+                 - action_type: Which action to execute
+             **kwargs: Additional options
+
+         Returns:
+             AgentResponse with action result
+         """
+         query = input_data.get("query", "")
+         context = input_data.get("context", "")
+         action_type_str = input_data.get("action_type", "none")
+
+         # Parse action type
+         try:
+             action_type = ActionType(action_type_str)
+         except ValueError:
+             action_type = ActionType.NONE
+
+         # If no action needed, return early
+         if action_type == ActionType.NONE:
+             return AgentResponse(
+                 agent_type=self.agent_type,
+                 output="No action required.",
+                 confidence=1.0,
+                 metadata={"action_type": "none", "action_taken": False}
+             )
+
+         # Execute the appropriate action
+         result = await self._execute_action(
+             action_type=action_type,
+             query=query,
+             context=context
+         )
+
+         return AgentResponse(
+             agent_type=self.agent_type,
+             output=result.message,
+             confidence=1.0 if result.success else 0.5,
+             metadata={
+                 "action_type": result.action_type.value,
+                 "action_taken": result.success,
+                 "action_data": result.data,
+             }
+         )
+
+     async def _execute_action(
+         self,
+         action_type: ActionType,
+         query: str,
+         context: str
+     ) -> ActionResult:
+         """
+         Execute a specific action.
+
+         This method routes to the appropriate action handler.
+
+         Args:
+             action_type: Which action to execute
+             query: User's query
+             context: Retrieved context
+
+         Returns:
+             ActionResult with success/failure and details
+         """
+         # Map action types to handlers
+         action_handlers = {
+             ActionType.CREATE_TICKET: self._create_ticket,
+             ActionType.ESCALATE: self._escalate,
+             ActionType.SEND_EMAIL: self._send_email,
+             ActionType.SEARCH_KB: self._search_kb,
+         }
+
+         handler = action_handlers.get(action_type)
+         if handler is None:
+             return ActionResult(
+                 action_type=action_type,
+                 success=False,
+                 message=f"Unknown action type: {action_type.value}",
+             )
+
+         try:
+             return await handler(query, context)
+         except Exception as e:
+             logger.error(f"Action {action_type.value} failed: {e}")
+             return ActionResult(
+                 action_type=action_type,
+                 success=False,
+                 message=f"Action failed: {str(e)}",
+             )
+
+     async def _create_ticket(self, query: str, context: str) -> ActionResult:
+         """
+         Create a support ticket.
+
+         In production, this would call a ticketing API.
+         Here we simulate the ticket creation.
+         """
+         # Generate ticket details using LLM
+         details = await self._generate_action_details(
+             ActionType.CREATE_TICKET, query, context
+         )
+
+         # Simulate ticket creation
+         ticket = TicketData(
+             ticket_id=f"TKT-{uuid.uuid4().hex[:8].upper()}",
+             title=details.get("title", "Support Request"),
+             description=details.get("description", query),
+             priority=details.get("priority", "medium"),
+             created_at=datetime.utcnow().isoformat(),
+         )
+
+         logger.info(f"Created ticket: {ticket.ticket_id}")
+
+         return ActionResult(
+             action_type=ActionType.CREATE_TICKET,
+             success=True,
+             message=f"I've created support ticket {ticket.ticket_id} for your issue. "
+                     f"Our team will review it shortly. Priority: {ticket.priority}.",
+             data=ticket.model_dump(),
+         )
+
+     async def _escalate(self, query: str, context: str) -> ActionResult:
+         """
+         Escalate to a human agent.
+
+         In production, this would add to a queue or notify agents.
+         """
+         details = await self._generate_action_details(
+             ActionType.ESCALATE, query, context
+         )
+
+         escalation = EscalationData(
+             escalation_id=f"ESC-{uuid.uuid4().hex[:8].upper()}",
+             reason=details.get("reason", "Customer requested human assistance"),
+             priority=details.get("priority", "medium"),
+             queue="general_support",
+             estimated_wait="5-10 minutes",
+         )
+
+         logger.info(f"Created escalation: {escalation.escalation_id}")
+
+         return ActionResult(
+             action_type=ActionType.ESCALATE,
+             success=True,
+             message=f"I've escalated your request to a human agent. "
+                     f"Reference: {escalation.escalation_id}. "
+                     f"Estimated wait time: {escalation.estimated_wait}. "
+                     f"A support representative will assist you shortly.",
+             data=escalation.model_dump(),
+         )
+
+     async def _send_email(self, query: str, context: str) -> ActionResult:
+         """
+         Send an email notification.
+
+         In production, this would call an email service.
+         """
+         details = await self._generate_action_details(
+             ActionType.SEND_EMAIL, query, context
+         )
+
+         # Simulate email sending
+         email_id = f"EMAIL-{uuid.uuid4().hex[:8].upper()}"
+
+         logger.info(f"Sent email: {email_id}")
+
+         return ActionResult(
+             action_type=ActionType.SEND_EMAIL,
+             success=True,
+             message="I've sent a confirmation email to your registered email address. "
+                     "Please check your inbox (and spam folder) shortly.",
+             data={
+                 "email_id": email_id,
+                 "subject": details.get("title", "Support Update"),
+             },
+         )
+
+     async def _search_kb(self, query: str, context: str) -> ActionResult:
+         """
+         Perform a deeper knowledge base search.
+
+         This could trigger a more thorough search with different parameters.
+         """
+         # In production, this might search with different strategies
+         return ActionResult(
+             action_type=ActionType.SEARCH_KB,
+             success=True,
+             message="I've initiated a deeper search of our knowledge base. "
+                     "This may take a moment for complex queries.",
+             data={"search_query": query},
+         )
+
+     async def _generate_action_details(
+         self,
+         action_type: ActionType,
+         query: str,
+         context: str
+     ) -> dict:
+         """
+         Use LLM to generate appropriate details for an action.
+
+         Args:
+             action_type: Type of action
+             query: User's query
+             context: Retrieved context
+
+         Returns:
+             Dictionary with action-specific details
+         """
+         try:
+             formatted = self._prompt.format_messages(
+                 query=query,
+                 context=context[:2000],  # Limit context length
+                 action_type=action_type.value,
+             )
+             response = await self._llm.ainvoke(formatted)
+             content = response.content
+
+             # Try to extract JSON from the response
+             if "{" in content:
+                 json_start = content.index("{")
+                 json_end = content.rindex("}") + 1
+                 json_str = content[json_start:json_end]
+                 return json.loads(json_str)
+
+         except Exception as e:
+             logger.warning(f"Failed to generate action details: {e}")
+
+         # Return defaults
+         return {
+             "title": "Support Request",
+             "description": query,
+             "priority": "medium",
+             "reason": "User request",
+         }
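The slice-and-parse fallback in `_generate_action_details` is easy to exercise on its own; a self-contained sketch of the same approach (the function name and the guard for a missing closing brace are ours, not the module's):

```python
import json

def extract_json(content: str, defaults: dict) -> dict:
    # Take the outermost {...} span of an LLM reply; fall back to defaults
    # when no well-formed JSON object can be found.
    try:
        if "{" in content and "}" in content:
            return json.loads(content[content.index("{"): content.rindex("}") + 1])
    except json.JSONDecodeError:
        pass
    return defaults
```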
app/agents/base_agent.py ADDED
@@ -0,0 +1,190 @@
+ """
+ Base Agent
+ ==========
+
+ Abstract base class for all agents in the multi-agent system.
+ Supports multiple LLM providers including free options.
+
+ SUPPORTED PROVIDERS:
+ 1. ollama - Local LLMs (FREE, requires Ollama installed)
+ 2. huggingface - HuggingFace API (FREE tier available)
+ 3. groq - Groq Cloud (FREE tier, very fast)
+ 4. google - Google Gemini (FREE tier)
+ 5. openai - OpenAI (PAID)
+ """
+
+ import logging
+ from abc import ABC, abstractmethod
+ from typing import Any, Optional, Union
+
+ from langchain_core.language_models.chat_models import BaseChatModel
+
+ from app.config import get_settings
+ from app.schemas.models import AgentResponse, AgentType
+
+ logger = logging.getLogger(__name__)
+
+
+ def create_llm(
+     provider: Optional[str] = None,
+     temperature: Optional[float] = None,
+ ) -> BaseChatModel:
+     """
+     Create an LLM instance based on the configured provider.
+
+     Args:
+         provider: Override the default provider from settings
+         temperature: Override the default temperature
+
+     Returns:
+         A LangChain chat model instance
+
+     Raises:
+         ValueError: If provider is not supported
+     """
+     settings = get_settings()
+     provider = provider or settings.llm_provider
+     temp = temperature if temperature is not None else settings.llm_temperature
+
+     logger.info(f"Creating LLM with provider: {provider}")
+
+     if provider == "ollama":
+         from langchain_community.chat_models import ChatOllama
+         return ChatOllama(
+             model=settings.ollama_model,
+             base_url=settings.ollama_base_url,
+             temperature=temp,
+         )
+
+     elif provider == "huggingface":
+         from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
+         # Use HuggingFace Inference API
+         llm = HuggingFaceEndpoint(
+             repo_id=settings.huggingface_model,
+             huggingfacehub_api_token=settings.huggingface_api_key,
+             temperature=temp,
+             max_new_tokens=1024,
+         )
+         return ChatHuggingFace(llm=llm)
+
+     elif provider == "groq":
+         from langchain_groq import ChatGroq
+         return ChatGroq(
+             model=settings.groq_model,
+             api_key=settings.groq_api_key,
+             temperature=temp,
+         )
+
+     elif provider == "google":
+         from langchain_google_genai import ChatGoogleGenerativeAI
+         return ChatGoogleGenerativeAI(
+             model=settings.google_model,
+             google_api_key=settings.google_api_key,
+             temperature=temp,
+         )
+
+     elif provider == "openai":
+         from langchain_openai import ChatOpenAI
+         return ChatOpenAI(
+             model=settings.openai_model,
+             openai_api_key=settings.openai_api_key,
+             temperature=temp,
+         )
+
+     else:
+         raise ValueError(f"Unsupported LLM provider: {provider}")
+
+
+ class BaseAgent(ABC):
+     """
+     Abstract base class for all agents.
+
+     Each agent must implement:
+     - agent_type: What kind of agent this is
+     - execute(): Main logic for the agent
+
+     Provides:
+     - Multi-provider LLM initialization
+     - Consistent error handling
+     - Logging infrastructure
+     """
+
+     def __init__(
+         self,
+         llm: Optional[BaseChatModel] = None,
+         provider: Optional[str] = None,
+         temperature: Optional[float] = None,
+     ):
+         """
+         Initialize the base agent.
+
+         Args:
+             llm: Pre-configured LLM (optional, for testing/customization)
+             provider: Override LLM provider from settings
+             temperature: Override default temperature
+         """
+         self._settings = get_settings()
+
+         # Use provided LLM or create based on provider
+         if llm is not None:
+             self._llm = llm
+         else:
+             self._llm = create_llm(provider, temperature)
+
+         logger.info(f"Initialized {self.agent_type.value} agent")
+
+     @property
+     @abstractmethod
+     def agent_type(self) -> AgentType:
+         """Return the type of this agent."""
+         pass
+
+     @abstractmethod
+     async def execute(
+         self,
+         input_data: dict[str, Any],
+         **kwargs
+     ) -> AgentResponse:
+         """Execute the agent's main logic."""
+         pass
+
+     async def safe_execute(
+         self,
+         input_data: dict[str, Any],
+         **kwargs
+     ) -> AgentResponse:
+         """
+         Execute with error handling wrapper.
+
+         This ensures agents always return a valid response,
+         even if errors occur.
+         """
+         try:
+             logger.debug(f"{self.agent_type.value} starting execution")
+             response = await self.execute(input_data, **kwargs)
+             logger.debug(f"{self.agent_type.value} completed successfully")
+             return response
+
+         except Exception as e:
+             logger.error(
+                 f"{self.agent_type.value} failed: {type(e).__name__}: {e}",
+                 exc_info=True
+             )
+             return AgentResponse(
+                 agent_type=self.agent_type,
175
+ output=f"Agent error: {str(e)}",
176
+ confidence=0.0,
177
+ metadata={"error": str(e), "error_type": type(e).__name__}
178
+ )
179
+
180
+ def _format_prompt(self, template: str, **kwargs) -> str:
181
+ """Format a prompt template with variables."""
182
+ try:
183
+ return template.format(**kwargs)
184
+ except KeyError as e:
185
+ logger.warning(f"Missing prompt variable: {e}")
186
+ return template
187
+
188
+ def __repr__(self) -> str:
189
+ """String representation for debugging."""
190
+ return f"{self.__class__.__name__}(type={self.agent_type.value})"
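The `safe_execute` contract above (never raise; convert any exception into a degraded-but-valid response) can be illustrated with a minimal standalone sketch. `MiniAgent` is a hypothetical stand-in, not part of this repo, and plain dicts replace `AgentResponse`:

```python
import asyncio

class MiniAgent:
    """Minimal stand-in for BaseAgent's error-handling contract."""

    async def execute(self, data: dict) -> dict:
        # Deliberately indexes 'query' so missing input raises KeyError.
        return {"output": data["query"].upper(), "confidence": 0.9}

    async def safe_execute(self, data: dict) -> dict:
        # Mirror of safe_execute: catch everything, return a valid response.
        try:
            return await self.execute(data)
        except Exception as e:
            return {
                "output": f"Agent error: {e}",
                "confidence": 0.0,
                "metadata": {"error_type": type(e).__name__},
            }

ok = asyncio.run(MiniAgent().safe_execute({"query": "hello"}))
bad = asyncio.run(MiniAgent().safe_execute({}))  # KeyError is caught, not raised
```

The key property is that callers (here, the orchestrator) never need a try/except around agent calls; a failure surfaces as a zero-confidence response instead.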
app/agents/reasoning_agent.py ADDED
@@ -0,0 +1,237 @@
+"""
+Reasoning Agent
+===============
+
+The Reasoning Agent generates grounded responses based on retrieved context.
+
+RESPONSIBILITIES:
+1. Take the user query and retrieved context
+2. Generate an accurate, helpful response
+3. Ground all claims in the provided context
+4. Admit when information is not available
+
+KEY PRINCIPLE: GROUNDING
+- Every statement must be traceable to source documents
+- Never hallucinate or make up information
+- If the context doesn't contain the answer, say so
+- This builds user trust and prevents misinformation
+
+WHY SEPARATE FROM THE RETRIEVER?
+- Clear separation of concerns
+- Retrieval can be optimized independently
+- Reasoning prompts can be tuned without affecting retrieval
+- Makes the system more testable
+"""
+
+import logging
+from typing import Any, Optional
+
+from langchain_core.messages import HumanMessage, SystemMessage
+
+from app.agents.base_agent import BaseAgent
+from app.schemas.models import AgentResponse, AgentType
+from app.memory.conversation_memory import ConversationMemoryManager
+
+logger = logging.getLogger(__name__)
+
+
+# The reasoning prompt is critical for grounding.
+# Key elements:
+# 1. Clear role definition
+# 2. Explicit grounding instructions
+# 3. How to handle missing information
+# 4. Format guidelines
+REASONING_SYSTEM_PROMPT = """You are a helpful customer support assistant.
+
+YOUR ROLE:
+- Answer user questions accurately based on the provided context
+- Be helpful, professional, and concise
+- Guide users through solutions step-by-step when appropriate
+
+CRITICAL GROUNDING RULES:
+1. ONLY use information from the provided context documents
+2. If the context doesn't contain the answer, say: "I don't have specific information about that in my knowledge base. Let me connect you with a human agent who can help."
+3. Do NOT make up information, policies, or procedures
+4. When referencing information, you may cite the source document
+5. If context is partially relevant, use what's applicable and note limitations
+
+RESPONSE FORMAT:
+- Be concise but complete
+- Use bullet points for lists or steps
+- If providing steps, number them
+- End with a helpful follow-up question if appropriate
+
+CONVERSATION HISTORY:
+{chat_history}
+
+RETRIEVED CONTEXT:
+{context}
+
+Remember: It's better to admit you don't know than to provide incorrect information."""
+
+USER_PROMPT = """User Question: {query}
+
+Please provide a helpful, accurate response based on the context above."""
+
+
+class ReasoningAgent(BaseAgent):
+    """
+    Generates grounded responses from retrieved context.
+
+    This agent is the "brain" that synthesizes information
+    from the retriever into coherent, accurate responses.
+    """
+
+    def __init__(
+        self,
+        memory_manager: Optional[ConversationMemoryManager] = None,
+        **kwargs
+    ):
+        """
+        Initialize the Reasoning Agent.
+
+        Args:
+            memory_manager: For conversation context (optional)
+            **kwargs: Passed to BaseAgent
+        """
+        super().__init__(**kwargs)
+        self._memory_manager = memory_manager or ConversationMemoryManager()
+
+    @property
+    def agent_type(self) -> AgentType:
+        """Return the agent type."""
+        return AgentType.REASONING
+
+    async def execute(
+        self,
+        input_data: dict[str, Any],
+        **kwargs
+    ) -> AgentResponse:
+        """
+        Generate a grounded response.
+
+        Args:
+            input_data:
+                - query: User's question
+                - context: Retrieved documents (from the Retriever Agent)
+                - conversation_id: For memory (optional)
+            **kwargs: Additional options
+
+        Returns:
+            AgentResponse with the generated answer
+        """
+        query = input_data.get("query", "")
+        context = input_data.get("context", "No context provided.")
+        conversation_id = input_data.get("conversation_id", "")
+
+        if not query:
+            return AgentResponse(
+                agent_type=self.agent_type,
+                output="No query provided",
+                confidence=0.0,
+                metadata={"error": "empty_query"}
+            )
+
+        # Get conversation history if available
+        chat_history = ""
+        if conversation_id:
+            chat_history = self._memory_manager.get_context_string(conversation_id)
+
+        # Format the prompts
+        system_prompt = REASONING_SYSTEM_PROMPT.format(
+            chat_history=chat_history or "No previous conversation.",
+            context=context
+        )
+
+        user_prompt = USER_PROMPT.format(query=query)
+
+        # Create messages for the LLM
+        messages = [
+            SystemMessage(content=system_prompt),
+            HumanMessage(content=user_prompt)
+        ]
+
+        # Generate the response
+        try:
+            response = await self._llm.ainvoke(messages)
+            answer = response.content
+        except Exception as e:
+            logger.error(f"Reasoning failed: {e}")
+            return AgentResponse(
+                agent_type=self.agent_type,
+                output=f"I encountered an error generating a response: {str(e)}",
+                confidence=0.0,
+                metadata={"error": str(e)}
+            )
+
+        # Estimate confidence based on context relevance.
+        # This is a simple heuristic - it could be improved with more sophisticated methods.
+        confidence = self._estimate_confidence(context, answer)
+
+        # Update conversation memory
+        if conversation_id:
+            self._memory_manager.add_user_message(conversation_id, query)
+            self._memory_manager.add_ai_message(conversation_id, answer)
+
+        logger.info(f"Generated response with confidence: {confidence:.2f}")
+
+        return AgentResponse(
+            agent_type=self.agent_type,
+            output=answer,
+            confidence=confidence,
+            metadata={
+                "has_context": bool(context and context != "No context provided."),
+                "conversation_id": conversation_id,
+            }
+        )
+
+    def _estimate_confidence(self, context: str, answer: str) -> float:
+        """
+        Estimate confidence in the generated response.
+
+        This is a simple heuristic based on:
+        1. Whether context was provided
+        2. Whether the answer admits uncertainty
+
+        A production system might use:
+        - NLI (Natural Language Inference) to check grounding
+        - Semantic similarity between answer and context
+        - LLM self-evaluation
+
+        Args:
+            context: The retrieved context
+            answer: The generated answer
+
+        Returns:
+            Confidence score between 0 and 1
+        """
+        # Start with a base confidence
+        confidence = 0.7
+
+        # No context = lower confidence
+        if not context or context == "No context provided.":
+            confidence = 0.3
+
+        # Phrases indicating uncertainty
+        uncertainty_phrases = [
+            "i don't have",
+            "not sure",
+            "can't find",
+            "no information",
+            "don't know",
+            "couldn't find",
+            "not in my knowledge",
+        ]
+
+        answer_lower = answer.lower()
+        for phrase in uncertainty_phrases:
+            if phrase in answer_lower:
+                confidence = min(confidence, 0.4)
+                break
+
+        # Very short answers might indicate issues
+        if len(answer) < 50:
+            confidence = min(confidence, 0.5)
+
+        return confidence
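The confidence heuristic in `_estimate_confidence` is simple enough to verify in isolation. This standalone copy (same rules and thresholds as the method above, detached from the agent class) shows each of the three rules firing:

```python
def estimate_confidence(context: str, answer: str) -> float:
    """Standalone copy of the ReasoningAgent confidence heuristic."""
    confidence = 0.7  # base confidence

    # Rule 1: no context means the answer is likely ungrounded
    if not context or context == "No context provided.":
        confidence = 0.3

    # Rule 2: the model admitting uncertainty caps confidence at 0.4
    uncertainty_phrases = [
        "i don't have", "not sure", "can't find", "no information",
        "don't know", "couldn't find", "not in my knowledge",
    ]
    answer_lower = answer.lower()
    for phrase in uncertainty_phrases:
        if phrase in answer_lower:
            confidence = min(confidence, 0.4)
            break

    # Rule 3: very short answers cap confidence at 0.5
    if len(answer) < 50:
        confidence = min(confidence, 0.5)

    return confidence

long_answer = "The refund window is 30 days from the delivery date per policy."
print(estimate_confidence("Policy doc: refunds within 30 days.", long_answer))
```

Note the rules only ever lower the score; a grounded, confident, sufficiently long answer keeps the 0.7 baseline.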
app/agents/retriever_agent.py ADDED
@@ -0,0 +1,160 @@
+"""
+Retriever Agent
+===============
+
+The Retriever Agent is responsible for finding relevant documents
+in the vector store based on the user's query.
+
+RESPONSIBILITIES:
+1. Take the user query
+2. Search the FAISS vector store
+3. Return relevant document chunks with scores
+4. Optionally re-rank results for better accuracy
+
+WHY A SEPARATE RETRIEVER AGENT?
+- Single responsibility: only handles retrieval
+- Can be enhanced independently (re-ranking, hybrid search)
+- Makes testing easier
+- Allows for retrieval-specific optimizations
+
+RETRIEVAL STRATEGY:
+1. Embed the query using the same model as the documents
+2. Find the top-k nearest neighbors in vector space
+3. Return documents with relevance scores
+4. (Optional) Re-rank using a cross-encoder for better accuracy
+"""
+
+import logging
+from typing import Any, Optional
+
+from app.agents.base_agent import BaseAgent
+from app.schemas.models import AgentResponse, AgentType, RetrievedDocument
+from app.vectorstore.faiss_store import FAISSVectorStore
+from app.config import get_settings
+
+logger = logging.getLogger(__name__)
+
+
+class RetrieverAgent(BaseAgent):
+    """
+    Retrieves relevant documents from the vector store.
+
+    This agent encapsulates the retrieval logic, making it easy to:
+    - Swap vector stores (FAISS -> Pinecone)
+    - Add re-ranking
+    - Implement hybrid search (semantic + keyword)
+    """
+
+    def __init__(self, vector_store: Optional[FAISSVectorStore] = None, **kwargs):
+        """
+        Initialize the Retriever Agent.
+
+        Args:
+            vector_store: FAISS store instance (uses the singleton if not provided)
+            **kwargs: Passed to BaseAgent
+        """
+        super().__init__(**kwargs)
+        self._vector_store = vector_store or FAISSVectorStore()
+        self._settings = get_settings()
+
+    @property
+    def agent_type(self) -> AgentType:
+        """Return the agent type."""
+        return AgentType.RETRIEVER
+
+    async def execute(
+        self,
+        input_data: dict[str, Any],
+        **kwargs
+    ) -> AgentResponse:
+        """
+        Retrieve relevant documents for the query.
+
+        Args:
+            input_data: Must contain a 'query' key
+            **kwargs: Optional 'top_k' to override the default
+
+        Returns:
+            AgentResponse with the retrieved documents in metadata
+        """
+        query = input_data.get("query", "")
+        top_k = kwargs.get("top_k", self._settings.retrieval_top_k)
+
+        if not query:
+            return AgentResponse(
+                agent_type=self.agent_type,
+                output="No query provided for retrieval",
+                confidence=0.0,
+                metadata={"documents": [], "error": "empty_query"}
+            )
+
+        # Check if the vector store is ready
+        if not self._vector_store.is_ready:
+            return AgentResponse(
+                agent_type=self.agent_type,
+                output="No documents in knowledge base. Please ingest documents first.",
+                confidence=0.0,
+                metadata={"documents": [], "error": "no_documents"}
+            )
+
+        # Perform the similarity search
+        try:
+            results = self._vector_store.similarity_search(query, k=top_k)
+        except Exception as e:
+            logger.error(f"Retrieval failed: {e}")
+            return AgentResponse(
+                agent_type=self.agent_type,
+                output=f"Retrieval error: {str(e)}",
+                confidence=0.0,
+                metadata={"documents": [], "error": str(e)}
+            )
+
+        # Convert results to RetrievedDocument objects
+        documents = []
+        for doc, score in results:
+            retrieved_doc = RetrievedDocument(
+                content=doc.page_content,
+                source=doc.metadata.get("source", "unknown"),
+                relevance_score=score,
+                chunk_index=doc.metadata.get("chunk_index")
+            )
+            documents.append(retrieved_doc)
+
+        # Calculate overall confidence based on the best match
+        best_score = documents[0].relevance_score if documents else 0.0
+
+        # Format the documents as a context string
+        context_parts = []
+        for i, doc in enumerate(documents, 1):
+            context_parts.append(
+                f"[Document {i}] (Source: {doc.source}, Relevance: {doc.relevance_score:.2f})\n"
+                f"{doc.content}"
+            )
+
+        context_string = "\n\n---\n\n".join(context_parts) if context_parts else "No relevant documents found."
+
+        logger.info(
+            f"Retrieved {len(documents)} documents, best score: {best_score:.2f}"
+        )
+
+        return AgentResponse(
+            agent_type=self.agent_type,
+            output=context_string,
+            confidence=best_score,
+            metadata={
+                "documents": [doc.model_dump() for doc in documents],
+                "document_count": len(documents),
+                "best_relevance_score": best_score,
+            }
+        )
+
+    def get_retriever(self):
+        """
+        Get a LangChain-compatible retriever interface.
+
+        This allows the agent to be used directly in LangChain chains.
+
+        Returns:
+            LangChain Retriever instance
+        """
+        return self._vector_store.as_retriever()
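The context-assembly step above (numbered documents with source and relevance, separated by `---`) can be reproduced standalone. Here `RetrievedDocument` is mimicked with plain `(content, source, score)` tuples purely for illustration:

```python
def format_context(docs: list[tuple[str, str, float]]) -> str:
    """Join retrieved chunks into the context string the ReasoningAgent consumes.

    docs: (content, source, relevance_score) triples, best match first.
    """
    parts = [
        f"[Document {i}] (Source: {source}, Relevance: {score:.2f})\n{content}"
        for i, (content, source, score) in enumerate(docs, 1)
    ]
    # The "---" separator gives the LLM an unambiguous document boundary.
    return "\n\n---\n\n".join(parts) if parts else "No relevant documents found."

ctx = format_context([
    ("Reset your password via Settings > Security.", "faq.md", 0.91),
    ("Contact support for locked accounts.", "help.md", 0.62),
])
```

Labeling each chunk with its source is what lets the reasoning prompt's rule 4 ("you may cite the source document") actually work.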
app/agents/router_agent.py ADDED
@@ -0,0 +1,211 @@
+"""
+Router Agent
+============
+
+The Router Agent is the "traffic controller" of the multi-agent system.
+It analyzes incoming queries and decides which agents should handle them.
+
+RESPONSIBILITIES:
+1. Classify query intent (question, action request, or both)
+2. Determine if retrieval is needed
+3. Decide if actions should be executed
+4. Return routing instructions for the orchestrator
+
+WHY A ROUTER?
+- Not all queries need all agents
+- Saves compute by skipping unnecessary agents
+- Enables conditional logic in the pipeline
+- Makes the system more efficient
+
+ROUTING LOGIC:
+┌─────────────────────┬───────────┬───────────┬────────┐
+│ Query Type          │ Retrieval │ Reasoning │ Action │
+├─────────────────────┼───────────┼───────────┼────────┤
+│ Factual question    │ YES       │ YES       │ NO     │
+│ How-to question     │ YES       │ YES       │ NO     │
+│ Action request      │ MAYBE     │ YES       │ YES    │
+│ Small talk          │ NO        │ YES       │ NO     │
+└─────────────────────┴───────────┴───────────┴────────┘
+"""
+
+import logging
+from typing import Any
+
+from langchain_core.prompts import ChatPromptTemplate
+from langchain_core.output_parsers import PydanticOutputParser
+from pydantic import BaseModel, Field
+
+from app.agents.base_agent import BaseAgent
+from app.schemas.models import AgentResponse, AgentType, ActionType
+
+logger = logging.getLogger(__name__)
+
+
+class RoutingDecision(BaseModel):
+    """
+    Structured output from the Router Agent.
+
+    This Pydantic model ensures the LLM returns valid routing decisions.
+    """
+    needs_retrieval: bool = Field(
+        description="Whether to search the knowledge base for context"
+    )
+    needs_reasoning: bool = Field(
+        default=True,
+        description="Whether to generate a reasoned response"
+    )
+    needs_action: bool = Field(
+        description="Whether to execute an action (create ticket, etc.)"
+    )
+    suggested_action: str = Field(
+        default="none",
+        description="Which action to execute if needs_action is True"
+    )
+    query_type: str = Field(
+        description="Classification: 'factual', 'how_to', 'action', 'general'"
+    )
+    confidence: float = Field(
+        ge=0.0,
+        le=1.0,
+        description="Confidence in the routing decision"
+    )
+    reasoning: str = Field(
+        description="Brief explanation of the routing decision"
+    )
+
+
+# The router prompt is carefully designed to:
+# 1. Give clear context about the system
+# 2. Provide examples of different query types
+# 3. Request structured JSON output
+# 4. Ask for reasoning (helps catch errors)
+ROUTER_PROMPT = """You are a query router for a customer support system.
+Your job is to analyze the user's query and decide how to handle it.
+
+AVAILABLE AGENTS:
+1. Retriever: Searches the knowledge base for relevant information
+2. Reasoning: Generates responses based on context
+3. Action: Executes actions like creating tickets or escalating
+
+ROUTING RULES:
+- Factual questions (what, who, when, where) -> Retriever + Reasoning
+- How-to questions (how do I, steps to) -> Retriever + Reasoning
+- Action requests (create ticket, escalate, reset password) -> Retriever + Reasoning + Action
+- General conversation (hello, thanks, goodbye) -> Reasoning only
+- Complaints or urgent issues -> Retriever + Reasoning + Action (escalate)
+
+AVAILABLE ACTIONS:
+- create_ticket: Create a support ticket
+- escalate: Escalate to a human agent
+- send_email: Send an email notification
+- search_knowledge_base: Deep search in the KB
+- none: No action needed
+
+USER QUERY: {query}
+
+CONVERSATION CONTEXT: {context}
+
+Analyze the query and provide your routing decision.
+
+{format_instructions}"""
+
+
+class RouterAgent(BaseAgent):
+    """
+    Routes queries to appropriate agents based on intent classification.
+
+    The Router Agent uses an LLM to understand the query intent and
+    determine which agents should process it. This enables:
+
+    1. Efficiency: Skip unnecessary agents
+    2. Accuracy: Match queries to appropriate handlers
+    3. Flexibility: Easy to add new routing logic
+    """
+
+    def __init__(self, **kwargs):
+        """Initialize the Router Agent."""
+        super().__init__(**kwargs)
+        # The parser ensures LLM output matches the RoutingDecision schema
+        self._parser = PydanticOutputParser(pydantic_object=RoutingDecision)
+        self._prompt = ChatPromptTemplate.from_template(ROUTER_PROMPT)
+
+    @property
+    def agent_type(self) -> AgentType:
+        """Return the agent type."""
+        return AgentType.ROUTER
+
+    async def execute(
+        self,
+        input_data: dict[str, Any],
+        **kwargs
+    ) -> AgentResponse:
+        """
+        Analyze the query and determine routing.
+
+        Args:
+            input_data: Must contain a 'query' key, optionally 'context'
+
+        Returns:
+            AgentResponse with the routing decision in metadata
+        """
+        query = input_data.get("query", "")
+        context = input_data.get("context", "No previous context")
+
+        if not query:
+            return AgentResponse(
+                agent_type=self.agent_type,
+                output="No query provided",
+                confidence=0.0,
+                metadata={"error": "empty_query"}
+            )
+
+        # Format the prompt with the query and context
+        formatted_prompt = self._prompt.format_messages(
+            query=query,
+            context=context,
+            format_instructions=self._parser.get_format_instructions()
+        )
+
+        # Get the routing decision from the LLM
+        response = await self._llm.ainvoke(formatted_prompt)
+
+        # Parse the structured output
+        try:
+            decision = self._parser.parse(response.content)
+        except Exception as e:
+            logger.warning(f"Failed to parse routing decision: {e}")
+            # Default to the full pipeline on a parse error
+            decision = RoutingDecision(
+                needs_retrieval=True,
+                needs_reasoning=True,
+                needs_action=False,
+                suggested_action="none",
+                query_type="unknown",
+                confidence=0.5,
+                reasoning="Failed to parse, using default routing"
+            )
+
+        # Convert the suggested_action string to an ActionType enum
+        try:
+            action_type = ActionType(decision.suggested_action)
+        except ValueError:
+            action_type = ActionType.NONE
+
+        logger.info(
+            f"Routed query: retrieval={decision.needs_retrieval}, "
+            f"action={decision.needs_action} ({action_type.value})"
+        )
+
+        return AgentResponse(
+            agent_type=self.agent_type,
+            output=decision.reasoning,
+            confidence=decision.confidence,
+            metadata={
+                "needs_retrieval": decision.needs_retrieval,
+                "needs_reasoning": decision.needs_reasoning,
+                "needs_action": decision.needs_action,
+                "suggested_action": action_type.value,
+                "query_type": decision.query_type,
+            }
+        )
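The parse-then-fallback pattern in `RouterAgent.execute` (default to the full pipeline whenever the LLM's structured output is malformed) is worth isolating. This stdlib-only sketch uses `json` and a dataclass in place of the Pydantic parser, so names like `Decision` and `parse_decision` are illustrative, not from the repo:

```python
import json
from dataclasses import dataclass

@dataclass
class Decision:
    # The defaults ARE the fallback: full pipeline, no action.
    needs_retrieval: bool = True
    needs_reasoning: bool = True
    needs_action: bool = False

def parse_decision(raw: str) -> Decision:
    """Parse an LLM routing reply, defaulting to the full pipeline on failure."""
    try:
        data = json.loads(raw)
        return Decision(
            needs_retrieval=bool(data.get("needs_retrieval", True)),
            needs_reasoning=bool(data.get("needs_reasoning", True)),
            needs_action=bool(data.get("needs_action", False)),
        )
    except (json.JSONDecodeError, AttributeError, TypeError):
        # Malformed output: fail open to retrieval + reasoning rather than crash.
        return Decision()

good = parse_decision('{"needs_retrieval": false, "needs_action": true}')
bad = parse_decision("not json at all")
```

Failing open (running more agents than strictly needed) trades a little compute for robustness: a flaky router never takes the whole pipeline down.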
app/api/__init__.py ADDED
@@ -0,0 +1,10 @@
+"""
+API Module
+==========
+
+FastAPI routes and endpoint definitions.
+"""
+
+from app.api.routes import router
+
+__all__ = ["router"]
app/api/routes.py ADDED
@@ -0,0 +1,306 @@
1
+ """
2
+ API Routes
3
+ ==========
4
+
5
+ FastAPI endpoints for the Multi-Agent RAG system.
6
+
7
+ ENDPOINTS:
8
+ - POST /query: Submit a query to the RAG system
9
+ - POST /ingest: Ingest documents into the knowledge base
10
+ - GET /health: Health check endpoint
11
+ - DELETE /documents: Clear all documents
12
+
13
+ WHY FastAPI?
14
+ - Automatic OpenAPI documentation
15
+ - Type validation via Pydantic
16
+ - Async support for scalability
17
+ - Easy to test
18
+ """
19
+
20
+ import logging
21
+ from typing import Optional
22
+
23
+ from fastapi import APIRouter, HTTPException, BackgroundTasks
24
+ from fastapi.responses import JSONResponse
25
+
26
+ from app.schemas.models import (
27
+ QueryRequest,
28
+ QueryResponse,
29
+ IngestionRequest,
30
+ IngestionResponse,
31
+ HealthResponse,
32
+ )
33
+ from app.services.orchestrator import MultiAgentOrchestrator
34
+ from app.services.document_service import DocumentService
35
+ from app import __version__
36
+
37
+ logger = logging.getLogger(__name__)
38
+
39
+ # Create router with prefix and tags for OpenAPI docs
40
+ router = APIRouter(prefix="/api/v1", tags=["rag"])
41
+
42
+ # Lazy initialization of services
43
+ # These are created on first request to avoid startup delays
44
+ _orchestrator: Optional[MultiAgentOrchestrator] = None
45
+ _document_service: Optional[DocumentService] = None
46
+
47
+
48
+ def get_orchestrator() -> MultiAgentOrchestrator:
49
+ """Get or create the orchestrator instance."""
50
+ global _orchestrator
51
+ if _orchestrator is None:
52
+ _orchestrator = MultiAgentOrchestrator()
53
+ return _orchestrator
54
+
55
+
56
+ def get_document_service() -> DocumentService:
57
+ """Get or create the document service instance."""
58
+ global _document_service
59
+ if _document_service is None:
60
+ _document_service = DocumentService()
61
+ return _document_service
62
+
63
+
64
+ # =============================================================================
65
+ # Health Check
66
+ # =============================================================================
67
+
68
+ @router.get(
69
+ "/health",
70
+ response_model=HealthResponse,
71
+ summary="Health Check",
72
+ description="Check if the API is running and the vector store is ready",
73
+ )
74
+ async def health_check() -> HealthResponse:
75
+ """
76
+ Health check endpoint.
77
+
78
+ Returns the API status and whether the knowledge base is ready.
79
+ """
80
+ orchestrator = get_orchestrator()
81
+
82
+ return HealthResponse(
83
+ status="healthy",
84
+ version=__version__,
85
+ vector_store_ready=orchestrator.is_ready,
86
+ document_count=orchestrator.document_count,
87
+ )
88
+
89
+
90
+ # =============================================================================
91
+ # Query Endpoint
92
+ # =============================================================================
93
+
94
+ @router.post(
95
+ "/query",
96
+ response_model=QueryResponse,
97
+ summary="Submit Query",
98
+ description="Submit a question to the Multi-Agent RAG system",
99
+ )
100
+ async def submit_query(request: QueryRequest) -> QueryResponse:
101
+ """
102
+ Process a user query through the multi-agent RAG pipeline.
103
+
104
+ The query flows through:
105
+ 1. Router Agent: Classifies intent and routes
106
+ 2. Retriever Agent: Searches knowledge base
107
+ 3. Reasoning Agent: Generates grounded response
108
+ 4. Action Agent: Executes actions if needed
109
+
110
+ Args:
111
+ request: QueryRequest with the user's question
112
+
113
+ Returns:
114
+ QueryResponse with answer and sources
115
+
116
+ Raises:
117
+ HTTPException: If processing fails
118
+ """
119
+ try:
120
+ orchestrator = get_orchestrator()
121
+
122
+ # Check if we have documents
123
+ if not orchestrator.is_ready:
124
+ logger.warning("Query received but no documents in knowledge base")
125
+ # We still process - reasoning agent will handle this gracefully
126
+
127
+ response = await orchestrator.process_query(request)
128
+
129
+ logger.info(
130
+ f"Query processed: {request.query[:50]}... -> "
131
+ f"{len(response.answer)} chars, {len(response.sources)} sources"
132
+ )
133
+
134
+ return response
135
+
136
+ except Exception as e:
137
+ logger.error(f"Query processing failed: {e}", exc_info=True)
138
+ raise HTTPException(
139
+ status_code=500,
140
+ detail=f"Failed to process query: {str(e)}"
141
+ )
142
+
143
+
144
+ # =============================================================================
145
+ # Document Ingestion
146
+ # =============================================================================
147
+
148
+ @router.post(
149
+ "/ingest",
150
+ response_model=IngestionResponse,
151
+ summary="Ingest Documents",
152
+ description="Ingest documents into the knowledge base",
153
+ )
154
+ async def ingest_documents(request: IngestionRequest) -> IngestionResponse:
155
+ """
156
+ Ingest documents into the vector store.
157
+
158
+ If file_paths are specified, only those files are ingested.
159
+ Otherwise, all documents in the configured directory are ingested.
160
+
161
+ Args:
162
+ request: IngestionRequest specifying what to ingest
163
+
164
+ Returns:
165
+ IngestionResponse with processing results
166
+ """
167
+ try:
168
+ service = get_document_service()
169
+ response = await service.ingest_documents(request)
170
+
171
+ if response.errors:
172
+ logger.warning(f"Ingestion completed with errors: {response.errors}")
173
+ else:
174
+ logger.info(
175
+ f"Ingestion complete: {response.documents_processed} documents, "
176
+ f"{response.chunks_created} chunks"
177
+ )
178
+
179
+ return response
180
+
181
+ except Exception as e:
182
+         logger.error(f"Ingestion failed: {e}", exc_info=True)
+         raise HTTPException(
+             status_code=500,
+             detail=f"Failed to ingest documents: {str(e)}"
+         )
+
+
+ @router.post(
+     "/ingest/text",
+     summary="Ingest Text",
+     description="Ingest raw text directly into the knowledge base",
+ )
+ async def ingest_text(
+     text: str,
+     source_name: str = "direct_input"
+ ) -> dict:
+     """
+     Ingest raw text directly without creating a file.
+
+     Args:
+         text: The text content to ingest
+         source_name: Name to use as the source
+
+     Returns:
+         Dictionary with chunks created count
+     """
+     try:
+         service = get_document_service()
+         chunks = await service.ingest_text(text, source_name)
+
+         return {
+             "success": True,
+             "chunks_created": chunks,
+             "source": source_name,
+         }
+
+     except Exception as e:
+         logger.error(f"Text ingestion failed: {e}", exc_info=True)
+         raise HTTPException(
+             status_code=500,
+             detail=f"Failed to ingest text: {str(e)}"
+         )
+
+
+ # =============================================================================
+ # Document Management
+ # =============================================================================
+
+ @router.delete(
+     "/documents",
+     summary="Clear Documents",
+     description="Delete all documents from the knowledge base",
+ )
+ async def clear_documents() -> dict:
+     """
+     Clear all documents from the vector store.
+
+     WARNING: This is destructive and cannot be undone.
+
+     Returns:
+         Confirmation message
+     """
+     try:
+         service = get_document_service()
+         service.clear_all_documents()
+
+         return {
+             "success": True,
+             "message": "All documents have been cleared from the knowledge base",
+         }
+
+     except Exception as e:
+         logger.error(f"Failed to clear documents: {e}", exc_info=True)
+         raise HTTPException(
+             status_code=500,
+             detail=f"Failed to clear documents: {str(e)}"
+         )
+
+
+ @router.get(
+     "/documents/count",
+     summary="Document Count",
+     description="Get the number of document chunks in the knowledge base",
+ )
+ async def get_document_count() -> dict:
+     """
+     Get the current document chunk count.
+
+     Returns:
+         Dictionary with the count
+     """
+     service = get_document_service()
+
+     return {
+         "count": service.get_document_count(),
+         "ready": service.is_ready(),
+     }
+
+
+ # =============================================================================
+ # Conversation Management
+ # =============================================================================
+
+ @router.delete(
+     "/conversations/{conversation_id}",
+     summary="Clear Conversation",
+     description="Clear memory for a specific conversation",
+ )
+ async def clear_conversation(conversation_id: str) -> dict:
+     """
+     Clear memory for a specific conversation.
+
+     Args:
+         conversation_id: The conversation to clear
+
+     Returns:
+         Confirmation message
+     """
+     orchestrator = get_orchestrator()
+     orchestrator._memory_manager.clear_conversation(conversation_id)
+
+     return {
+         "success": True,
+         "message": f"Cleared memory for conversation: {conversation_id}",
+     }
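The handlers above all follow the same error-translation pattern: do the work, and on any failure log it and re-raise as an HTTP 500 with a readable detail string, so raw tracebacks never reach the client. A minimal sketch of that pattern, with a hypothetical `APIError` class and `ingest_text_safely` function standing in for FastAPI's `HTTPException` and the real `/ingest/text` handler so it runs without the framework:

```python
class APIError(Exception):
    """Stand-in for fastapi.HTTPException in this sketch."""
    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def ingest_text_safely(ingest, text: str) -> dict:
    """Mirror of the handlers' success/failure envelope (illustrative only)."""
    try:
        chunks = ingest(text)  # the real handler awaits service.ingest_text(...)
        return {"success": True, "chunks_created": chunks}
    except Exception as e:
        # Internal errors are translated into a structured 500 response.
        raise APIError(500, f"Failed to ingest text: {e}")
```
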
app/config.py ADDED
@@ -0,0 +1,212 @@
+ """
+ Configuration Management
+ ========================
+
+ This module centralizes all configuration using pydantic-settings.
+ It loads environment variables and provides type-safe access to config values.
+
+ SUPPORTED LLM PROVIDERS (all have free tiers):
+     1. ollama      - Local LLMs (Llama, Mistral) - completely free
+     2. huggingface - HuggingFace Inference API - free tier available
+     3. groq        - Groq Cloud - free tier with fast inference
+     4. google      - Google Gemini - free tier available
+     5. openai      - OpenAI - paid (for reference)
+ """
+
+ from pathlib import Path
+ from functools import lru_cache
+ from typing import Literal
+
+ from pydantic_settings import BaseSettings
+ from pydantic import Field
+
+
+ class Settings(BaseSettings):
+     """
+     Application settings loaded from environment variables.
+
+     All settings have sensible defaults for development.
+     In production, override via environment variables or .env file.
+     """
+
+     # =========================================================================
+     # LLM Provider Selection
+     # =========================================================================
+     llm_provider: Literal["ollama", "huggingface", "groq", "google", "openai"] = Field(
+         default="ollama",
+         description="Which LLM provider to use (ollama is free and local)"
+     )
+
+     # =========================================================================
+     # API Keys (only needed for cloud providers)
+     # =========================================================================
+     openai_api_key: str = Field(
+         default="",
+         description="OpenAI API key (only if using openai provider)"
+     )
+
+     huggingface_api_key: str = Field(
+         default="",
+         description="HuggingFace API key (free at huggingface.co)"
+     )
+
+     groq_api_key: str = Field(
+         default="",
+         description="Groq API key (free at console.groq.com)"
+     )
+
+     google_api_key: str = Field(
+         default="",
+         description="Google API key (free at makersuite.google.com)"
+     )
+
+     # =========================================================================
+     # Model Configuration
+     # =========================================================================
+     # Models for each provider
+     ollama_model: str = Field(
+         default="llama3.2",
+         description="Ollama model (llama3.2, mistral, phi3, etc.)"
+     )
+
+     huggingface_model: str = Field(
+         default="mistralai/Mistral-7B-Instruct-v0.2",
+         description="HuggingFace model ID"
+     )
+
+     groq_model: str = Field(
+         default="llama-3.1-8b-instant",
+         description="Groq model (llama-3.1-8b-instant, mixtral-8x7b-32768)"
+     )
+
+     google_model: str = Field(
+         default="gemini-1.5-flash",
+         description="Google Gemini model"
+     )
+
+     openai_model: str = Field(
+         default="gpt-3.5-turbo",
+         description="OpenAI model"
+     )
+
+     # Temperature controls randomness: 0 = deterministic, 1 = creative
+     llm_temperature: float = Field(
+         default=0.1,
+         ge=0.0,
+         le=1.0,
+         description="LLM temperature (lower = more focused)"
+     )
+
+     # =========================================================================
+     # Embedding Configuration
+     # =========================================================================
+     embedding_provider: Literal["huggingface", "openai"] = Field(
+         default="huggingface",
+         description="Embedding provider (huggingface is free and local)"
+     )
+
+     huggingface_embedding_model: str = Field(
+         default="sentence-transformers/all-MiniLM-L6-v2",
+         description="Free local embedding model"
+     )
+
+     openai_embedding_model: str = Field(
+         default="text-embedding-3-small",
+         description="OpenAI embedding model (paid)"
+     )
+
+     # =========================================================================
+     # Ollama Configuration
+     # =========================================================================
+     ollama_base_url: str = Field(
+         default="http://localhost:11434",
+         description="Ollama server URL"
+     )
+
+     # =========================================================================
+     # Vector Store Configuration
+     # =========================================================================
+     faiss_index_path: Path = Field(
+         default=Path("./data/faiss_index"),
+         description="Directory to store FAISS index"
+     )
+
+     documents_path: Path = Field(
+         default=Path("./data/documents"),
+         description="Directory containing source documents"
+     )
+
+     chunk_size: int = Field(
+         default=1000,
+         description="Size of document chunks for embedding"
+     )
+
+     chunk_overlap: int = Field(
+         default=200,
+         description="Overlap between consecutive chunks"
+     )
+
+     retrieval_top_k: int = Field(
+         default=5,
+         description="Number of documents to retrieve"
+     )
+
+     # =========================================================================
+     # API Configuration
+     # =========================================================================
+     api_host: str = Field(
+         default="0.0.0.0",
+         description="Host to bind the API server"
+     )
+
+     api_port: int = Field(
+         default=8000,
+         description="Port for the API server"
+     )
+
+     debug_mode: bool = Field(
+         default=True,
+         description="Enable debug mode for development"
+     )
+
+     # =========================================================================
+     # Logging Configuration
+     # =========================================================================
+     log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = Field(
+         default="INFO",
+         description="Logging level"
+     )
+
+     class Config:
+         """Pydantic configuration for settings."""
+         env_file = ".env"
+         env_file_encoding = "utf-8"
+         extra = "ignore"
+
+     def ensure_directories(self) -> None:
+         """Create necessary directories if they don't exist."""
+         self.faiss_index_path.mkdir(parents=True, exist_ok=True)
+         self.documents_path.mkdir(parents=True, exist_ok=True)
+
+     def get_model_name(self) -> str:
+         """Get the model name for the selected provider."""
+         model_map = {
+             "ollama": self.ollama_model,
+             "huggingface": self.huggingface_model,
+             "groq": self.groq_model,
+             "google": self.google_model,
+             "openai": self.openai_model,
+         }
+         return model_map.get(self.llm_provider, self.ollama_model)
+
+
+ @lru_cache()
+ def get_settings() -> Settings:
+     """
+     Get cached settings instance.
+
+     Returns:
+         Settings: Application configuration instance
+     """
+     return Settings()
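Because `get_settings()` is wrapped in `@lru_cache`, the `Settings` object is built once per process and every caller shares the same instance. A minimal sketch of that caching pattern without pydantic-settings (the `SimpleSettings` class and its two fields are illustrative, not part of the repo):

```python
import os
from functools import lru_cache


class SimpleSettings:
    """Toy stand-in for the pydantic Settings class above."""
    def __init__(self) -> None:
        # Environment variables override defaults, mirroring BaseSettings.
        self.llm_provider = os.environ.get("LLM_PROVIDER", "ollama")
        self.api_port = int(os.environ.get("API_PORT", "8000"))


@lru_cache
def get_simple_settings() -> SimpleSettings:
    # Built on first call, then served from the cache on every later call.
    return SimpleSettings()
```

One consequence of the cache: changing an environment variable after the first call has no effect until the cache is cleared (`get_simple_settings.cache_clear()`), which applies equally to the real `get_settings()`.
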
app/main.py ADDED
@@ -0,0 +1,168 @@
+ """
+ Multi-Agent RAG System - Main Application
+ ==========================================
+
+ FastAPI application entry point for the Multi-Agent RAG system.
+
+ This module:
+ - Creates the FastAPI application
+ - Configures middleware and CORS
+ - Includes API routes
+ - Sets up logging
+ - Provides startup/shutdown hooks
+
+ RUNNING THE APP:
+     Development: uvicorn app.main:app --reload
+     Production: uvicorn app.main:app --host 0.0.0.0 --port 8000
+
+ API DOCUMENTATION:
+     - Swagger UI: http://localhost:8000/docs
+     - ReDoc: http://localhost:8000/redoc
+ """
+
+ import logging
+ import sys
+ from contextlib import asynccontextmanager
+
+ from fastapi import FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.responses import RedirectResponse
+
+ from app.api.routes import router
+ from app.config import get_settings
+ from app import __version__
+
+ # =============================================================================
+ # Logging Setup
+ # =============================================================================
+
+ def setup_logging() -> None:
+     """
+     Configure logging for the application.
+
+     Sets up structured logging with appropriate levels and formatting.
+     """
+     settings = get_settings()
+
+     # Create formatter
+     formatter = logging.Formatter(
+         fmt="%(asctime)s | %(levelname)-8s | %(name)s | %(message)s",
+         datefmt="%Y-%m-%d %H:%M:%S",
+     )
+
+     # Configure root logger
+     root_logger = logging.getLogger()
+     root_logger.setLevel(settings.log_level)
+
+     # Console handler
+     console_handler = logging.StreamHandler(sys.stdout)
+     console_handler.setFormatter(formatter)
+     root_logger.addHandler(console_handler)
+
+     # Reduce noise from third-party libraries
+     logging.getLogger("httpx").setLevel(logging.WARNING)
+     logging.getLogger("openai").setLevel(logging.WARNING)
+     logging.getLogger("langchain").setLevel(logging.WARNING)
+
+
+ # =============================================================================
+ # Application Lifecycle
+ # =============================================================================
+
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
+     """
+     Application lifespan manager.
+
+     Handles startup and shutdown events:
+     - Startup: Initialize logging, ensure directories exist
+     - Shutdown: Cleanup resources
+
+     Args:
+         app: FastAPI application instance
+     """
+     # Startup
+     setup_logging()
+     logger = logging.getLogger(__name__)
+
+     settings = get_settings()
+     settings.ensure_directories()
+
+     logger.info(f"Starting Multi-Agent RAG System v{__version__}")
+     logger.info(f"Debug mode: {settings.debug_mode}")
+     logger.info(f"Documents path: {settings.documents_path}")
+     logger.info(f"FAISS index path: {settings.faiss_index_path}")
+
+     yield  # Application runs here
+
+     # Shutdown
+     logger.info("Shutting down Multi-Agent RAG System")
+
+
+ # =============================================================================
+ # FastAPI Application
+ # =============================================================================
+
+ def create_app() -> FastAPI:
+     """
+     Create and configure the FastAPI application.
+
+     Returns:
+         Configured FastAPI instance
+     """
+     settings = get_settings()
+
+     app = FastAPI(
+         title="Multi-Agent RAG System",
+         description=(
+             "A production-grade Retrieval-Augmented Generation system "
+             "using multiple specialized agents for query routing, "
+             "document retrieval, reasoning, and action execution."
+         ),
+         version=__version__,
+         docs_url="/docs",
+         redoc_url="/redoc",
+         lifespan=lifespan,
+     )
+
+     # Configure CORS for frontend access
+     app.add_middleware(
+         CORSMiddleware,
+         allow_origins=["*"],  # In production, specify allowed origins
+         allow_credentials=True,
+         allow_methods=["*"],
+         allow_headers=["*"],
+     )
+
+     # Include API routes
+     app.include_router(router)
+
+     # Root redirect to docs
+     @app.get("/", include_in_schema=False)
+     async def root():
+         """Redirect root to API documentation."""
+         return RedirectResponse(url="/docs")
+
+     return app
+
+
+ # Create the application instance
+ app = create_app()
+
+
+ # =============================================================================
+ # Development Server
+ # =============================================================================
+
+ if __name__ == "__main__":
+     import uvicorn
+
+     settings = get_settings()
+
+     uvicorn.run(
+         "app.main:app",
+         host=settings.api_host,
+         port=settings.api_port,
+         reload=settings.debug_mode,
+         log_level=settings.log_level.lower(),
+     )
app/memory/__init__.py ADDED
@@ -0,0 +1,10 @@
+ """
+ Memory Module
+ =============
+
+ Handles conversation memory for multi-turn interactions.
+ """
+
+ from app.memory.conversation_memory import ConversationMemoryManager
+
+ __all__ = ["ConversationMemoryManager"]
app/memory/conversation_memory.py ADDED
@@ -0,0 +1,201 @@
+ """
+ Conversation Memory Manager
+ ===========================
+
+ This module manages conversation history for multi-turn interactions.
+
+ WHY MEMORY?
+ - Users expect context: "What about the second option?" requires memory
+ - Follow-up questions need previous context
+ - Maintains coherent, contextual conversations
+
+ We implement a simple window-based memory that keeps the last N messages.
+ This avoids deprecated LangChain memory modules and gives us full control.
+ """
+
+ import logging
+ from collections import deque
+ from dataclasses import dataclass, field
+ from datetime import datetime
+ import uuid
+
+ from app.schemas.models import ConversationMessage
+
+ logger = logging.getLogger(__name__)
+
+ # Maximum messages to keep in memory (per conversation)
+ # 10 exchanges = 20 messages (user + assistant)
+ DEFAULT_WINDOW_SIZE = 10
+
+
+ @dataclass
+ class ConversationHistory:
+     """Holds the message history for a single conversation."""
+     messages: deque = field(default_factory=lambda: deque(maxlen=DEFAULT_WINDOW_SIZE * 2))
+     created_at: datetime = field(default_factory=datetime.utcnow)
+     last_updated: datetime = field(default_factory=datetime.utcnow)
+
+
+ class ConversationMemoryManager:
+     """
+     Manages conversation memory across multiple user sessions.
+
+     Each conversation_id gets its own memory instance.
+     This allows multiple concurrent users without mixing context.
+
+     Usage:
+         manager = ConversationMemoryManager()
+
+         # Add messages
+         manager.add_user_message("session-123", "Hello")
+         manager.add_ai_message("session-123", "Hi there!")
+
+         # Get context for prompts
+         context = manager.get_context_string("session-123")
+     """
+
+     def __init__(self, window_size: int = DEFAULT_WINDOW_SIZE):
+         """
+         Initialize the memory manager.
+
+         Args:
+             window_size: Number of conversation turns to remember
+         """
+         self._window_size = window_size
+         # Dictionary mapping conversation_id -> ConversationHistory
+         self._conversations: dict[str, ConversationHistory] = {}
+
+     def _get_or_create_history(
+         self,
+         conversation_id: str
+     ) -> ConversationHistory:
+         """
+         Get existing history or create a new one for the conversation.
+
+         Args:
+             conversation_id: Unique identifier for the conversation
+
+         Returns:
+             ConversationHistory for this conversation
+         """
+         if conversation_id not in self._conversations:
+             self._conversations[conversation_id] = ConversationHistory(
+                 messages=deque(maxlen=self._window_size * 2)
+             )
+             logger.debug(f"Created new memory for conversation: {conversation_id}")
+
+         return self._conversations[conversation_id]
+
+     def add_user_message(self, conversation_id: str, content: str) -> None:
+         """
+         Add a user message to conversation history.
+
+         Args:
+             conversation_id: Conversation identifier
+             content: The user's message text
+         """
+         history = self._get_or_create_history(conversation_id)
+
+         message = ConversationMessage(role="user", content=content)
+         history.messages.append(message)
+         history.last_updated = datetime.utcnow()
+
+         logger.debug(f"Added user message to {conversation_id}: {content[:50]}...")
+
+     def add_ai_message(self, conversation_id: str, content: str) -> None:
+         """
+         Add an AI response to conversation history.
+
+         Args:
+             conversation_id: Conversation identifier
+             content: The AI's response text
+         """
+         history = self._get_or_create_history(conversation_id)
+
+         message = ConversationMessage(role="assistant", content=content)
+         history.messages.append(message)
+         history.last_updated = datetime.utcnow()
+
+         logger.debug(f"Added AI message to {conversation_id}: {content[:50]}...")
+
+     def get_messages(self, conversation_id: str) -> list[ConversationMessage]:
+         """
+         Get all messages in a conversation.
+
+         Useful for API responses that want to show conversation history.
+
+         Args:
+             conversation_id: Conversation identifier
+
+         Returns:
+             List of messages in chronological order
+         """
+         if conversation_id not in self._conversations:
+             return []
+         return list(self._conversations[conversation_id].messages)
+
+     def get_context_string(self, conversation_id: str) -> str:
+         """
+         Get conversation history as a formatted string.
+
+         Useful for including in prompts.
+
+         Args:
+             conversation_id: Conversation identifier
+
+         Returns:
+             Formatted string of conversation history
+         """
+         messages = self.get_messages(conversation_id)
+
+         if not messages:
+             return "No previous conversation."
+
+         lines = []
+         for msg in messages:
+             role = "User" if msg.role == "user" else "Assistant"
+             lines.append(f"{role}: {msg.content}")
+
+         return "\n".join(lines)
+
+     def get_messages_for_llm(self, conversation_id: str) -> list[dict]:
+         """
+         Get messages formatted for LLM consumption.
+
+         Returns messages in the format expected by chat models:
+             [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]
+
+         Args:
+             conversation_id: Conversation identifier
+
+         Returns:
+             List of message dictionaries
+         """
+         messages = self.get_messages(conversation_id)
+         return [{"role": msg.role, "content": msg.content} for msg in messages]
+
+     def clear_conversation(self, conversation_id: str) -> None:
+         """
+         Clear all memory for a conversation.
+
+         Args:
+             conversation_id: Conversation to clear
+         """
+         if conversation_id in self._conversations:
+             del self._conversations[conversation_id]
+             logger.info(f"Cleared memory for conversation: {conversation_id}")
+
+     def generate_conversation_id(self) -> str:
+         """
+         Generate a new unique conversation ID.
+
+         Returns:
+             UUID string for a new conversation
+         """
+         return str(uuid.uuid4())
+
+     @property
+     def active_conversations(self) -> int:
+         """Get count of active conversations in memory."""
+         return len(self._conversations)
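The windowing behaviour above rests entirely on `deque(maxlen=...)`: once the deque is full, each `append` silently evicts the oldest entry, so the last `window_size` turns (two messages per turn) are kept with no manual bookkeeping. A small sketch of that eviction:

```python
from collections import deque

# A 2-turn window, i.e. 4 messages, mirroring ConversationHistory above.
window = deque(maxlen=4)
for text in ["q1", "a1", "q2", "a2", "q3"]:
    window.append(text)  # appending to a full deque drops the oldest item

# "q1" has been evicted; only the four most recent messages remain.
recent = list(window)
```
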
app/schemas/__init__.py ADDED
@@ -0,0 +1,32 @@
+ """
+ Pydantic Schemas
+ ================
+
+ Data models for request/response validation and internal data structures.
+ """
+
+ from app.schemas.models import (
+     QueryRequest,
+     QueryResponse,
+     DocumentInfo,
+     AgentType,
+     AgentResponse,
+     RetrievedDocument,
+     ConversationMessage,
+     IngestionRequest,
+     IngestionResponse,
+     HealthResponse,
+ )
+
+ __all__ = [
+     "QueryRequest",
+     "QueryResponse",
+     "DocumentInfo",
+     "AgentType",
+     "AgentResponse",
+     "RetrievedDocument",
+     "ConversationMessage",
+     "IngestionRequest",
+     "IngestionResponse",
+     "HealthResponse",
+ ]
app/schemas/models.py ADDED
@@ -0,0 +1,264 @@
+ """
+ Data Models
+ ===========
+
+ This module defines all Pydantic models used throughout the application.
+ Models provide:
+ - Request/response validation for API endpoints
+ - Type safety for internal data flow
+ - Automatic API documentation via OpenAPI
+
+ WHY Pydantic?
+ - Runtime type validation (catches errors early)
+ - Automatic JSON serialization
+ - Integration with FastAPI for auto-docs
+ """
+
+ from datetime import datetime
+ from enum import Enum
+ from typing import Optional
+ from pydantic import BaseModel, Field
+
+
+ # =============================================================================
+ # Enums
+ # =============================================================================
+
+ class AgentType(str, Enum):
+     """
+     Types of agents in the multi-agent system.
+
+     Each agent has a single responsibility:
+     - ROUTER: Classifies query intent and routes to appropriate agent
+     - RETRIEVER: Searches vector store for relevant documents
+     - REASONING: Generates grounded responses from context
+     - ACTION: Executes specific actions (e.g., create ticket, send email)
+     """
+     ROUTER = "router"
+     RETRIEVER = "retriever"
+     REASONING = "reasoning"
+     ACTION = "action"
+
+
+ class ActionType(str, Enum):
+     """
+     Supported actions the Action Agent can execute.
+
+     These represent real business operations in a support system.
+     """
+     CREATE_TICKET = "create_ticket"
+     ESCALATE = "escalate"
+     SEND_EMAIL = "send_email"
+     SEARCH_KB = "search_knowledge_base"
+     NONE = "none"  # No action needed, just informational response
+
+
+ # =============================================================================
+ # Document Models
+ # =============================================================================
+
+ class DocumentInfo(BaseModel):
+     """Metadata about a source document."""
+     filename: str = Field(..., description="Original filename")
+     file_type: str = Field(..., description="File extension (pdf, txt, etc.)")
+     chunk_count: int = Field(..., description="Number of chunks created")
+     ingested_at: datetime = Field(
+         default_factory=datetime.utcnow,
+         description="When document was processed"
+     )
+
+
+ class RetrievedDocument(BaseModel):
+     """
+     A document chunk retrieved from the vector store.
+
+     Contains both the content and metadata for traceability.
+     This allows users to verify the source of information.
+     """
+     content: str = Field(..., description="The text content of the chunk")
+     source: str = Field(..., description="Source file path")
+     relevance_score: float = Field(
+         ...,
+         ge=0.0,
+         le=1.0,
+         description="Similarity score (1.0 = perfect match)"
+     )
+     chunk_index: Optional[int] = Field(
+         default=None,
+         description="Position of chunk in original document"
+     )
+
+
+ # =============================================================================
+ # Conversation Models
+ # =============================================================================
+
+ class ConversationMessage(BaseModel):
+     """
+     A single message in the conversation history.
+
+     Used by memory module to maintain context across turns.
+     """
+     role: str = Field(..., description="'user' or 'assistant'")
+     content: str = Field(..., description="Message text")
+     timestamp: datetime = Field(
+         default_factory=datetime.utcnow,
+         description="When message was sent"
+     )
+
+
+ # =============================================================================
+ # Agent Response Models
+ # =============================================================================
+
+ class AgentResponse(BaseModel):
+     """
+     Response from an individual agent.
+
+     Each agent returns this structure for consistent handling.
+     """
+     agent_type: AgentType = Field(..., description="Which agent responded")
+     output: str = Field(..., description="Agent's output text")
+     confidence: float = Field(
+         default=1.0,
+         ge=0.0,
+         le=1.0,
+         description="Agent's confidence in response"
+     )
+     metadata: dict = Field(
+         default_factory=dict,
+         description="Additional agent-specific data"
+     )
+
+
+ # =============================================================================
+ # API Request/Response Models
+ # =============================================================================
+
+ class QueryRequest(BaseModel):
+     """
+     User query request to the RAG system.
+
+     Example:
+         {
+             "query": "How do I reset my password?",
+             "conversation_id": "abc123",
+             "include_sources": true
+         }
+     """
+     query: str = Field(
+         ...,
+         min_length=1,
+         max_length=2000,
+         description="The user's question or request"
+     )
+     conversation_id: Optional[str] = Field(
+         default=None,
+         description="ID for conversation continuity"
+     )
+     include_sources: bool = Field(
+         default=True,
+         description="Whether to return source documents"
+     )
+
+     class Config:
+         json_schema_extra = {
+             "example": {
+                 "query": "How do I reset my password?",
+                 "conversation_id": "user-session-123",
+                 "include_sources": True
+             }
+         }
+
+
+ class QueryResponse(BaseModel):
+     """
+     Response from the RAG system.
+
+     Contains the answer and optionally the source documents
+     that were used to generate it.
+     """
+     answer: str = Field(..., description="The generated answer")
+     sources: list[RetrievedDocument] = Field(
+         default_factory=list,
+         description="Documents used to generate answer"
+     )
+     conversation_id: str = Field(..., description="Conversation identifier")
+     agent_trace: list[str] = Field(
+         default_factory=list,
+         description="Sequence of agents that processed the query"
+     )
+     action_taken: Optional[ActionType] = Field(
+         default=None,
+         description="Action executed, if any"
+     )
+     processing_time_ms: float = Field(
+         ...,
+         description="Total processing time in milliseconds"
+     )
+
+     class Config:
+         json_schema_extra = {
+             "example": {
+                 "answer": "To reset your password, go to Settings > Security > Reset Password...",
+                 "sources": [
+                     {
+                         "content": "Password reset instructions...",
+                         "source": "docs/security.pdf",
+                         "relevance_score": 0.92
+                     }
+                 ],
+                 "conversation_id": "user-session-123",
+                 "agent_trace": ["router", "retriever", "reasoning"],
+                 "action_taken": None,
+                 "processing_time_ms": 1250.5
+             }
+         }
+
+
+ # =============================================================================
+ # Document Ingestion Models
+ # =============================================================================
+
+ class IngestionRequest(BaseModel):
+     """Request to ingest documents into the vector store."""
+     file_paths: list[str] = Field(
+         default_factory=list,
+         description="Specific files to ingest (empty = all in documents_path)"
+     )
+     force_reindex: bool = Field(
+         default=False,
+         description="Re-index even if already processed"
+     )
+
+
+ class IngestionResponse(BaseModel):
+     """Response after document ingestion."""
+     documents_processed: int = Field(..., description="Number of files processed")
+     chunks_created: int = Field(..., description="Total chunks created")
+     documents: list[DocumentInfo] = Field(
+         default_factory=list,
+         description="Details of each processed document"
+     )
+     errors: list[str] = Field(
+         default_factory=list,
+         description="Any errors encountered"
+     )
+
+
+ # =============================================================================
+ # Health Check Model
+ # =============================================================================
+
+ class HealthResponse(BaseModel):
+     """API health check response."""
+     status: str = Field(..., description="'healthy' or 'unhealthy'")
+     version: str = Field(..., description="API version")
+     vector_store_ready: bool = Field(
+         ...,
+         description="Whether vector store is initialized"
+     )
+     document_count: int = Field(
+         ...,
+         description="Number of documents in vector store"
+     )
app/services/__init__.py ADDED
@@ -0,0 +1,11 @@
+ """
+ Services Module
+ ===============
+
+ Business logic services that orchestrate agents and handle document processing.
+ """
+
+ from app.services.orchestrator import MultiAgentOrchestrator
+ from app.services.document_service import DocumentService
+
+ __all__ = ["MultiAgentOrchestrator", "DocumentService"]
app/services/document_service.py ADDED
@@ -0,0 +1,294 @@
+ """
+ Document Service
+ ================
+
+ Service for ingesting and managing documents in the knowledge base.
+
+ This service handles:
+ - Loading documents from various file formats
+ - Splitting documents into chunks
+ - Indexing in the vector store
+ - Tracking document metadata
+
+ WHY A SEPARATE SERVICE?
+ - Separates ingestion from query processing
+ - Can be run as a batch job
+ - Easy to add new document types
+ - Provides clear API for document management
+ """
+
+ import logging
+ from pathlib import Path
+ from typing import Optional
+ from datetime import datetime
+
+ from langchain_core.documents import Document
+ from langchain_community.document_loaders import (
+     TextLoader,
+     PyPDFLoader,
+     UnstructuredWordDocumentLoader,
+     DirectoryLoader,
+ )
+
+ from app.vectorstore.faiss_store import FAISSVectorStore
+ from app.schemas.models import (
+     IngestionRequest,
+     IngestionResponse,
+     DocumentInfo,
+ )
+ from app.config import get_settings
+
+ logger = logging.getLogger(__name__)
+
+
+ class DocumentService:
+     """
+     Service for document ingestion and management.
+
+     Handles loading documents from files, chunking them,
+     and storing in the vector database.
+
+     Usage:
+         service = DocumentService()
+         result = await service.ingest_directory("/path/to/docs")
+ result = await service.ingest_directory("/path/to/docs")
54
+ """
55
+
56
+ # Supported file extensions and their loaders
57
+ SUPPORTED_EXTENSIONS = {
58
+ ".txt": TextLoader,
59
+ ".md": TextLoader,
60
+ ".pdf": PyPDFLoader,
61
+ ".docx": UnstructuredWordDocumentLoader,
62
+ }
63
+
64
+ def __init__(self, vector_store: Optional[FAISSVectorStore] = None):
65
+ """
66
+ Initialize the document service.
67
+
68
+ Args:
69
+ vector_store: FAISS store instance (uses singleton if not provided)
70
+ """
71
+ self._vector_store = vector_store or FAISSVectorStore()
72
+ self._settings = get_settings()
73
+
74
+ async def ingest_documents(
75
+ self,
76
+ request: IngestionRequest
77
+ ) -> IngestionResponse:
78
+ """
79
+ Ingest documents based on request.
80
+
81
+ Args:
82
+ request: IngestionRequest specifying what to ingest
83
+
84
+ Returns:
85
+ IngestionResponse with results
86
+ """
87
+ if request.file_paths:
88
+ # Ingest specific files
89
+ return await self._ingest_files(
90
+ request.file_paths,
91
+ request.force_reindex
92
+ )
93
+ else:
94
+ # Ingest all documents in the configured directory
95
+ return await self.ingest_directory(
96
+ str(self._settings.documents_path),
97
+ request.force_reindex
98
+ )
99
+
100
+ async def ingest_directory(
101
+ self,
102
+ directory_path: str,
103
+ force_reindex: bool = False
104
+ ) -> IngestionResponse:
105
+ """
106
+ Ingest all supported documents from a directory.
107
+
108
+ Recursively finds and indexes all supported file types.
109
+
110
+ Args:
111
+ directory_path: Path to directory
112
+ force_reindex: If True, clear existing index first
113
+
114
+ Returns:
115
+ IngestionResponse with details
116
+ """
117
+ path = Path(directory_path)
118
+
119
+ if not path.exists():
120
+ return IngestionResponse(
121
+ documents_processed=0,
122
+ chunks_created=0,
123
+ errors=[f"Directory not found: {directory_path}"],
124
+ )
125
+
126
+ if not path.is_dir():
127
+ return IngestionResponse(
128
+ documents_processed=0,
129
+ chunks_created=0,
130
+ errors=[f"Not a directory: {directory_path}"],
131
+ )
132
+
133
+ # Clear existing index if requested
134
+ if force_reindex:
135
+ logger.info("Force reindex requested - clearing existing index")
136
+ self._vector_store.delete_all()
137
+
138
+ # Find all supported files
139
+ all_files = []
140
+ for ext in self.SUPPORTED_EXTENSIONS:
141
+ all_files.extend(path.glob(f"**/*{ext}"))
142
+
143
+ if not all_files:
144
+ return IngestionResponse(
145
+ documents_processed=0,
146
+ chunks_created=0,
147
+ errors=[f"No supported files found in {directory_path}"],
148
+ )
149
+
150
+ # Ingest each file
151
+ return await self._ingest_files(
152
+ [str(f) for f in all_files],
153
+ force_reindex=False # Already handled above
154
+ )
155
+
156
+ async def _ingest_files(
157
+ self,
158
+ file_paths: list[str],
159
+ force_reindex: bool = False
160
+ ) -> IngestionResponse:
161
+ """
162
+ Ingest a list of specific files.
163
+
164
+ Args:
165
+ file_paths: List of file paths to ingest
166
+ force_reindex: Clear existing index first
167
+
168
+ Returns:
169
+ IngestionResponse with details
170
+ """
171
+ if force_reindex:
172
+ self._vector_store.delete_all()
173
+
174
+ documents_info = []
175
+ errors = []
176
+ total_chunks = 0
177
+
178
+ for file_path in file_paths:
179
+ try:
180
+ result = await self._ingest_single_file(file_path)
181
+ documents_info.append(result)
182
+ total_chunks += result.chunk_count
183
+ except Exception as e:
184
+ error_msg = f"Failed to process {file_path}: {str(e)}"
185
+ logger.error(error_msg)
186
+ errors.append(error_msg)
187
+
188
+ return IngestionResponse(
189
+ documents_processed=len(documents_info),
190
+ chunks_created=total_chunks,
191
+ documents=documents_info,
192
+ errors=errors,
193
+ )
194
+
195
+ async def _ingest_single_file(self, file_path: str) -> DocumentInfo:
196
+ """
197
+ Ingest a single file into the vector store.
198
+
199
+ Args:
200
+ file_path: Path to the file
201
+
202
+ Returns:
203
+ DocumentInfo about the processed file
204
+
205
+ Raises:
206
+ ValueError: If file type not supported
207
+ FileNotFoundError: If file doesn't exist
208
+ """
209
+ path = Path(file_path)
210
+
211
+ if not path.exists():
212
+ raise FileNotFoundError(f"File not found: {file_path}")
213
+
214
+ extension = path.suffix.lower()
215
+
216
+ if extension not in self.SUPPORTED_EXTENSIONS:
217
+ raise ValueError(
218
+ f"Unsupported file type: {extension}. "
219
+ f"Supported: {list(self.SUPPORTED_EXTENSIONS.keys())}"
220
+ )
221
+
222
+ # Load the document
223
+ loader_class = self.SUPPORTED_EXTENSIONS[extension]
224
+ loader = loader_class(str(path))
225
+ documents = loader.load()
226
+
227
+ # Add metadata
228
+ for i, doc in enumerate(documents):
229
+ doc.metadata.update({
230
+ "source": str(path),
231
+ "file_name": path.name,
232
+ "file_type": extension,
233
+ "chunk_index": i,
234
+ "ingested_at": datetime.utcnow().isoformat(),
235
+ })
236
+
237
+ # Add to vector store (handles chunking)
238
+ chunks_created = self._vector_store.add_documents(documents)
239
+
240
+ logger.info(f"Ingested {path.name}: {chunks_created} chunks")
241
+
242
+ return DocumentInfo(
243
+ filename=path.name,
244
+ file_type=extension,
245
+ chunk_count=chunks_created,
246
+ )
247
+
248
+ async def ingest_text(
249
+ self,
250
+ text: str,
251
+ source_name: str = "direct_input",
252
+ metadata: Optional[dict] = None
253
+ ) -> int:
254
+ """
255
+ Ingest raw text directly.
256
+
257
+ Useful for adding content without creating files.
258
+
259
+ Args:
260
+ text: Text content to ingest
261
+ source_name: Name to use as source
262
+ metadata: Additional metadata
263
+
264
+ Returns:
265
+ Number of chunks created
266
+ """
267
+ doc = Document(
268
+ page_content=text,
269
+ metadata={
270
+ "source": source_name,
271
+ "file_type": "text",
272
+ "ingested_at": datetime.utcnow().isoformat(),
273
+ **(metadata or {}),
274
+ }
275
+ )
276
+
277
+ return self._vector_store.add_documents([doc])
278
+
279
+ def get_document_count(self) -> int:
280
+ """Get the total number of document chunks in the store."""
281
+ return self._vector_store.document_count
282
+
283
+ def is_ready(self) -> bool:
284
+ """Check if the document store has any documents."""
285
+ return self._vector_store.is_ready
286
+
287
+ def clear_all_documents(self) -> None:
288
+ """
289
+ Clear all documents from the vector store.
290
+
291
+ WARNING: This is destructive and cannot be undone.
292
+ """
293
+ self._vector_store.delete_all()
294
+ logger.info("All documents cleared from vector store")
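The key design point in `_ingest_files` is per-file error isolation: one bad file is recorded in `errors` without aborting the batch. A self-contained sketch of that loop (the `index_file` "store" here is a hypothetical stand-in that counts lines as chunks, not the real FAISS store):

```python
import tempfile
from pathlib import Path


def index_file(path: Path) -> int:
    """Hypothetical indexer: accepts .txt/.md and counts lines as 'chunks'."""
    if path.suffix not in {".txt", ".md"}:
        raise ValueError(f"Unsupported file type: {path.suffix}")
    return len(path.read_text().splitlines())


def ingest(paths) -> tuple[int, int, list[str]]:
    processed, chunks, errors = 0, 0, []
    for p in paths:
        try:
            chunks += index_file(Path(p))
            processed += 1
        except Exception as e:
            # One bad file must not abort the whole batch.
            errors.append(f"Failed to process {p}: {e}")
    return processed, chunks, errors


with tempfile.TemporaryDirectory() as d:
    good = Path(d) / "a.txt"
    good.write_text("one\ntwo\n")
    bad = Path(d) / "b.pdf"  # unsupported extension on purpose
    bad.write_text("%PDF")
    result = ingest([good, bad])
```

As in the service, the caller gets both the successes and the failure messages back in one response instead of an exception.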
app/services/orchestrator.py ADDED
@@ -0,0 +1,272 @@
+ """
+ Multi-Agent Orchestrator
+ ========================
+
+ The orchestrator is the "conductor" of the multi-agent system.
+ It coordinates the flow between agents and manages the overall pipeline.
+
+ PIPELINE FLOW:
+
+     ┌─────────┐
+     │  Query  │
+     └────┬────┘
+          │
+     ┌────▼────┐
+     │ Router  │ ──► Classifies query intent
+     │  Agent  │
+     └────┬────┘
+          │
+     ┌────▼──────────────┐
+     │  Needs Retrieval? │
+     └────┬──────────────┘
+          │ Yes
+     ┌────▼────┐
+     │Retriever│ ──► Searches vector store
+     │  Agent  │
+     └────┬────┘
+          │
+     ┌────▼────┐
+     │Reasoning│ ──► Generates grounded response
+     │  Agent  │
+     └────┬────┘
+          │
+     ┌────▼──────────┐
+     │ Needs Action? │
+     └────┬──────────┘
+          │ Yes
+     ┌────▼────┐
+     │ Action  │ ──► Executes action (ticket, escalate)
+     │  Agent  │
+     └────┬────┘
+          │
+     ┌────▼────┐
+     │Response │
+     └─────────┘
+
+ WHY AN ORCHESTRATOR?
+ - Decouples agents from each other
+ - Easy to modify pipeline without changing agents
+ - Provides central logging and monitoring
+ - Handles error recovery at pipeline level
+ """
+
+ import logging
+ import time
+ from typing import Optional
+
+ from app.agents.router_agent import RouterAgent
+ from app.agents.retriever_agent import RetrieverAgent
+ from app.agents.reasoning_agent import ReasoningAgent
+ from app.agents.action_agent import ActionAgent
+ from app.schemas.models import (
+     QueryRequest,
+     QueryResponse,
+     RetrievedDocument,
+     ActionType,
+ )
+ from app.memory.conversation_memory import ConversationMemoryManager
+ from app.vectorstore.faiss_store import FAISSVectorStore
+
+ logger = logging.getLogger(__name__)
+
+
+ class MultiAgentOrchestrator:
+     """
+     Orchestrates the multi-agent RAG pipeline.
+
+     This class:
+     1. Receives user queries
+     2. Routes them through appropriate agents
+     3. Manages state between agents
+     4. Returns consolidated responses
+
+     Usage:
+         orchestrator = MultiAgentOrchestrator()
+         response = await orchestrator.process_query(
+             QueryRequest(query="How do I reset my password?")
+         )
+     """
+
+     def __init__(
+         self,
+         vector_store: Optional[FAISSVectorStore] = None,
+         memory_manager: Optional[ConversationMemoryManager] = None,
+     ):
+         """
+         Initialize the orchestrator with all agents.
+
+         Args:
+             vector_store: FAISS store (uses singleton if not provided)
+             memory_manager: Conversation memory (creates new if not provided)
+         """
+         # Shared dependencies
+         self._vector_store = vector_store or FAISSVectorStore()
+         self._memory_manager = memory_manager or ConversationMemoryManager()
+
+         # Initialize all agents
+         self._router = RouterAgent()
+         self._retriever = RetrieverAgent(vector_store=self._vector_store)
+         self._reasoning = ReasoningAgent(memory_manager=self._memory_manager)
+         self._action = ActionAgent()
+
+         logger.info("Multi-agent orchestrator initialized")
+
+     async def process_query(self, request: QueryRequest) -> QueryResponse:
+         """
+         Process a user query through the multi-agent pipeline.
+
+         This is the main entry point for the RAG system.
+
+         Args:
+             request: QueryRequest with user's question
+
+         Returns:
+             QueryResponse with answer, sources, and metadata
+         """
+         start_time = time.time()
+
+         # Generate or use existing conversation ID
+         conversation_id = request.conversation_id or self._memory_manager.generate_conversation_id()
+
+         # Track which agents process this query
+         agent_trace = []
+
+         logger.info(f"Processing query: {request.query[:100]}...")
+
+         try:
+             # Step 1: Route the query
+             routing_response = await self._router.safe_execute({
+                 "query": request.query,
+                 "context": self._memory_manager.get_context_string(conversation_id),
+             })
+             agent_trace.append("router")
+
+             routing_meta = routing_response.metadata
+             needs_retrieval = routing_meta.get("needs_retrieval", True)
+             needs_action = routing_meta.get("needs_action", False)
+             suggested_action = routing_meta.get("suggested_action", "none")
+
+             logger.info(
+                 f"Routing: retrieval={needs_retrieval}, action={needs_action}"
+             )
+
+             # Step 2: Retrieve documents if needed
+             context = ""
+             retrieved_docs = []
+
+             if needs_retrieval:
+                 retrieval_response = await self._retriever.safe_execute({
+                     "query": request.query,
+                 })
+                 agent_trace.append("retriever")
+
+                 context = retrieval_response.output
+                 retrieved_docs = retrieval_response.metadata.get("documents", [])
+
+             # Step 3: Generate reasoning response
+             reasoning_response = await self._reasoning.safe_execute({
+                 "query": request.query,
+                 "context": context or "No context available.",
+                 "conversation_id": conversation_id,
+             })
+             agent_trace.append("reasoning")
+
+             answer = reasoning_response.output
+
+             # Step 4: Execute action if needed
+             action_taken = None
+
+             if needs_action:
+                 action_response = await self._action.safe_execute({
+                     "query": request.query,
+                     "context": context,
+                     "action_type": suggested_action,
+                 })
+                 agent_trace.append("action")
+
+                 # Append action result to answer
+                 if action_response.metadata.get("action_taken"):
+                     answer += f"\n\n---\n**Action Taken:**\n{action_response.output}"
+                     action_taken = ActionType(
+                         action_response.metadata.get("action_type", "none")
+                     )
+
+             # Calculate processing time
+             processing_time_ms = (time.time() - start_time) * 1000
+
+             # Build response
+             sources = []
+             if request.include_sources and retrieved_docs:
+                 for doc_dict in retrieved_docs:
+                     sources.append(RetrievedDocument(**doc_dict))
+
+             logger.info(
+                 f"Query processed in {processing_time_ms:.2f}ms, "
+                 f"agents: {' -> '.join(agent_trace)}"
+             )
+
+             return QueryResponse(
+                 answer=answer,
+                 sources=sources,
+                 conversation_id=conversation_id,
+                 agent_trace=agent_trace,
+                 action_taken=action_taken,
+                 processing_time_ms=processing_time_ms,
+             )
+
+         except Exception as e:
+             logger.error(f"Pipeline error: {e}", exc_info=True)
+
+             processing_time_ms = (time.time() - start_time) * 1000
+
+             return QueryResponse(
+                 answer=(
+                     "I apologize, but I encountered an error processing your request. "
+                     "Please try again or contact support if the issue persists."
+                 ),
+                 sources=[],
+                 conversation_id=conversation_id,
+                 agent_trace=agent_trace,
+                 action_taken=None,
+                 processing_time_ms=processing_time_ms,
+             )
+
+     async def process_simple_query(self, query: str) -> str:
+         """
+         Simple interface for quick queries.
+
+         Skips the full pipeline and just does retrieval + reasoning.
+
+         Args:
+             query: User's question
+
+         Returns:
+             Answer string
+         """
+         # Retrieve context
+         if self._vector_store.is_ready:
+             retrieval_response = await self._retriever.safe_execute({
+                 "query": query,
+             })
+             context = retrieval_response.output
+         else:
+             context = "No documents in knowledge base."
+
+         # Generate response
+         reasoning_response = await self._reasoning.safe_execute({
+             "query": query,
+             "context": context,
+         })
+
+         return reasoning_response.output
+
+     @property
+     def is_ready(self) -> bool:
+         """Check if the orchestrator is ready to process queries."""
+         return self._vector_store.is_ready
+
+     @property
+     def document_count(self) -> int:
+         """Get number of documents in the knowledge base."""
+         return self._vector_store.document_count
app/tools/__init__.py ADDED
@@ -0,0 +1,25 @@
+ """
+ Tools Module
+ ============
+
+ LangChain tools for use by agents.
+
+ Tools extend agent capabilities with specific actions.
+ Each tool is a function the agent can call.
+ """
+
+ from app.tools.search_tool import create_search_tool
+ from app.tools.document_tool import create_document_loader_tool
+ from app.tools.action_tools import (
+     create_ticket_tool,
+     create_escalation_tool,
+     get_all_action_tools,
+ )
+
+ __all__ = [
+     "create_search_tool",
+     "create_document_loader_tool",
+     "create_ticket_tool",
+     "create_escalation_tool",
+     "get_all_action_tools",
+ ]
app/tools/action_tools.py ADDED
@@ -0,0 +1,279 @@
+ """
+ Action Tools
+ ============
+
+ LangChain tools for executing actions like creating tickets,
+ escalating issues, and sending notifications.
+
+ These tools wrap the action logic to make it available to agents.
+ In production, they would integrate with real external systems.
+ """
+
+ import logging
+ from typing import Optional
+ import uuid
+
+ from langchain_core.tools import StructuredTool
+ from pydantic import BaseModel, Field
+
+ logger = logging.getLogger(__name__)
+
+
+ # =============================================================================
+ # Input Schemas
+ # =============================================================================
+
+ class TicketInput(BaseModel):
+     """Input schema for ticket creation."""
+     title: str = Field(description="Brief title for the support ticket")
+     description: str = Field(description="Detailed description of the issue")
+     priority: str = Field(
+         default="medium",
+         description="Priority level: low, medium, high, urgent"
+     )
+     customer_id: Optional[str] = Field(
+         default=None,
+         description="Customer ID if known"
+     )
+
+
+ class EscalationInput(BaseModel):
+     """Input schema for escalation."""
+     reason: str = Field(description="Reason for escalation to human agent")
+     priority: str = Field(
+         default="medium",
+         description="Priority level: low, medium, high, urgent"
+     )
+     context: Optional[str] = Field(
+         default=None,
+         description="Relevant context for the human agent"
+     )
+
+
+ class EmailInput(BaseModel):
+     """Input schema for email notifications."""
+     subject: str = Field(description="Email subject line")
+     body_summary: str = Field(description="Summary of email content")
+     recipient_type: str = Field(
+         default="customer",
+         description="Recipient type: customer, support_team, manager"
+     )
+
+
+ # =============================================================================
+ # Tool Functions
+ # =============================================================================
+
+ def create_support_ticket(
+     title: str,
+     description: str,
+     priority: str = "medium",
+     customer_id: Optional[str] = None,
+ ) -> str:
+     """
+     Create a support ticket in the system.
+
+     In production, this would call a ticketing API (Zendesk, Jira, etc.).
+     For now, we simulate the creation.
+
+     Args:
+         title: Ticket title
+         description: Detailed description
+         priority: low, medium, high, urgent
+         customer_id: Customer identifier
+
+     Returns:
+         Confirmation message with ticket ID
+     """
+     # Generate ticket ID
+     ticket_id = f"TKT-{uuid.uuid4().hex[:8].upper()}"
+
+     # Validate priority
+     valid_priorities = ["low", "medium", "high", "urgent"]
+     if priority.lower() not in valid_priorities:
+         priority = "medium"
+
+     # In production: API call to ticketing system
+     # ticketing_client.create(title=title, description=description, ...)
+
+     logger.info(f"Created ticket {ticket_id}: {title} (Priority: {priority})")
+
+     return (
+         f"Support ticket created successfully.\n"
+         f"Ticket ID: {ticket_id}\n"
+         f"Title: {title}\n"
+         f"Priority: {priority.capitalize()}\n"
+         f"Status: Open\n"
+         f"Our team will review and respond within the SLA for {priority} priority tickets."
+     )
+
+
+ def escalate_to_human(
+     reason: str,
+     priority: str = "medium",
+     context: Optional[str] = None,
+ ) -> str:
+     """
+     Escalate the conversation to a human agent.
+
+     In production, this would:
+     - Add to a queue in the support platform
+     - Notify available agents
+     - Transfer the chat session
+
+     Args:
+         reason: Why escalation is needed
+         priority: Urgency level
+         context: Context to pass to human agent
+
+     Returns:
+         Confirmation with escalation details
+     """
+     escalation_id = f"ESC-{uuid.uuid4().hex[:8].upper()}"
+
+     # Estimated wait times by priority (in production, query actual queue)
+     wait_times = {
+         "low": "15-30 minutes",
+         "medium": "5-15 minutes",
+         "high": "2-5 minutes",
+         "urgent": "Under 2 minutes",
+     }
+
+     wait_time = wait_times.get(priority.lower(), "5-15 minutes")
+
+     logger.info(f"Created escalation {escalation_id}: {reason}")
+
+     return (
+         f"Escalation initiated successfully.\n"
+         f"Reference ID: {escalation_id}\n"
+         f"Reason: {reason}\n"
+         f"Priority: {priority.capitalize()}\n"
+         f"Estimated wait time: {wait_time}\n\n"
+         f"A human support agent will join this conversation shortly. "
+         f"Please stay on this chat."
+     )
+
+
+ def send_email_notification(
+     subject: str,
+     body_summary: str,
+     recipient_type: str = "customer",
+ ) -> str:
+     """
+     Send an email notification.
+
+     In production, this would integrate with email services
+     like SendGrid, AWS SES, or similar.
+
+     Args:
+         subject: Email subject
+         body_summary: Summary of what the email contains
+         recipient_type: Who receives the email
+
+     Returns:
+         Confirmation message
+     """
+     email_id = f"EMAIL-{uuid.uuid4().hex[:8].upper()}"
+
+     # Map recipient types to descriptions
+     recipient_descriptions = {
+         "customer": "your registered email address",
+         "support_team": "the support team",
+         "manager": "the appropriate manager",
+     }
+
+     recipient_desc = recipient_descriptions.get(
+         recipient_type.lower(),
+         "the specified recipient"
+     )
+
+     logger.info(f"Sent email {email_id}: {subject} to {recipient_type}")
+
+     return (
+         f"Email notification sent.\n"
+         f"Email ID: {email_id}\n"
+         f"Subject: {subject}\n"
+         f"Sent to: {recipient_desc}\n"
+         f"Please check the inbox (and spam folder) within the next few minutes."
+     )
+
+
+ # =============================================================================
+ # Tool Creation Functions
+ # =============================================================================
+
+ def create_ticket_tool() -> StructuredTool:
+     """
+     Create the ticket creation tool.
+
+     Returns:
+         StructuredTool for creating support tickets
+     """
+     return StructuredTool.from_function(
+         func=create_support_ticket,
+         name="create_support_ticket",
+         description=(
+             "Create a support ticket for issues that need follow-up. "
+             "Use when the issue can't be resolved immediately, "
+             "requires investigation, or needs to be tracked. "
+             "Provide a clear title and detailed description."
+         ),
+         args_schema=TicketInput,
+     )
+
+
+ def create_escalation_tool() -> StructuredTool:
+     """
+     Create the escalation tool.
+
+     Returns:
+         StructuredTool for escalating to human agents
+     """
+     return StructuredTool.from_function(
+         func=escalate_to_human,
+         name="escalate_to_human",
+         description=(
+             "Escalate the conversation to a human support agent. "
+             "Use when: the customer explicitly requests a human, "
+             "the issue is too complex for automated handling, "
+             "the customer is frustrated or upset, "
+             "or there's a sensitive matter requiring human judgment."
+         ),
+         args_schema=EscalationInput,
+     )
+
+
+ def create_email_tool() -> StructuredTool:
+     """
+     Create the email notification tool.
+
+     Returns:
+         StructuredTool for sending emails
+     """
+     return StructuredTool.from_function(
+         func=send_email_notification,
+         name="send_email",
+         description=(
+             "Send an email notification about the support interaction. "
+             "Use for: sending confirmation of actions taken, "
+             "providing written documentation of solutions, "
+             "or notifying relevant parties about issues."
+         ),
+         args_schema=EmailInput,
+     )
+
+
+ def get_all_action_tools() -> list[StructuredTool]:
+     """
+     Get all action tools for agent use.
+
+     Returns:
+         List of all action tools
+     """
+     return [
+         create_ticket_tool(),
+         create_escalation_tool(),
+         create_email_tool(),
+     ]
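The ticket tool's two pieces of pure logic — a readable reference ID and priority normalization — are easy to isolate and test without any LangChain machinery. A stdlib sketch (names here are illustrative, not the tool's public API):

```python
import uuid

VALID_PRIORITIES = {"low", "medium", "high", "urgent"}


def make_ticket(title: str, priority: str = "medium") -> dict:
    """Build a ticket record with a short uppercase reference ID."""
    p = priority.lower()
    if p not in VALID_PRIORITIES:
        # Unknown priorities fall back to "medium", as in the tool above.
        p = "medium"
    return {
        "id": f"TKT-{uuid.uuid4().hex[:8].upper()}",  # e.g. TKT-3FA81C2B
        "title": title,
        "priority": p,
        "status": "open",
    }


t = make_ticket("Login broken", "CRITICAL")  # invalid priority on purpose
```

Truncating `uuid4().hex` to 8 characters trades global uniqueness for readability, which is fine for human-facing references; a production ticketing system would issue its own IDs.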
app/tools/document_tool.py ADDED
@@ -0,0 +1,200 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Document Loader Tool
3
+ ====================
4
+
5
+ Tool for loading and processing documents into the vector store.
6
+
7
+ This tool handles:
8
+ - Loading documents from files
9
+ - Splitting into chunks
10
+ - Adding to the vector store
11
+
12
+ SUPPORTED FORMATS:
13
+ - .txt: Plain text files
14
+ - .pdf: PDF documents
15
+ - .md: Markdown files
16
+ - .docx: Word documents
17
+ """
18
+
19
+ import logging
20
+ from pathlib import Path
21
+ from typing import Optional
22
+
23
+ from langchain.tools import Tool
24
+ from langchain_core.tools import StructuredTool
25
+ from langchain_core.documents import Document
26
+ from langchain_community.document_loaders import (
27
+ TextLoader,
28
+ PyPDFLoader,
29
+ UnstructuredWordDocumentLoader,
30
+ )
31
+ from pydantic import BaseModel, Field
32
+
33
+ from app.vectorstore.faiss_store import FAISSVectorStore
34
+ from app.config import get_settings
35
+
36
+ logger = logging.getLogger(__name__)
37
+
38
+
39
+ class DocumentLoadInput(BaseModel):
40
+ """Input for document loading tool."""
41
+ file_path: str = Field(description="Path to the document file to load")
42
+
43
+
44
+ def load_document(file_path: str) -> list[Document]:
45
+ """
46
+ Load a document from file path.
47
+
48
+ Automatically selects the appropriate loader based on file extension.
49
+
50
+ Args:
51
+ file_path: Path to the document
52
+
53
+ Returns:
54
+ List of Document objects (may be multiple for PDFs)
55
+
56
+ Raises:
57
+ ValueError: If file type not supported
58
+ """
59
+ path = Path(file_path)
60
+
61
+ if not path.exists():
62
+ raise FileNotFoundError(f"File not found: {file_path}")
63
+
64
+ extension = path.suffix.lower()
65
+
66
+ # Select loader based on extension
67
+ loaders = {
68
+ ".txt": TextLoader,
69
+ ".md": TextLoader,
70
+ ".pdf": PyPDFLoader,
71
+ ".docx": UnstructuredWordDocumentLoader,
72
+ }
73
+
74
+ loader_class = loaders.get(extension)
75
+ if loader_class is None:
76
+ raise ValueError(
77
+ f"Unsupported file type: {extension}. "
78
+ f"Supported: {list(loaders.keys())}"
79
+ )
80
+
81
+ # Load the document
82
+ loader = loader_class(str(path))
83
+ documents = loader.load()
84
+
85
+ # Add source metadata
86
+ for doc in documents:
87
+ doc.metadata["source"] = str(path)
88
+ doc.metadata["file_type"] = extension
89
+
90
+ logger.info(f"Loaded {len(documents)} documents from {file_path}")
91
+
92
+ return documents
93
+
94
+
95
+ def create_document_loader_tool(
96
+ vector_store: FAISSVectorStore = None
97
+ ) -> StructuredTool:
98
+ """
99
+ Create a tool for loading documents into the vector store.
100
+
101
+ This tool is useful for agents that need to ingest new documents
102
+ into the knowledge base.
103
+
104
+ Args:
105
+ vector_store: FAISS store instance
106
+
107
+ Returns:
108
+ StructuredTool for document loading
109
+ """
110
+ store = vector_store or FAISSVectorStore()
111
+
112
+ def load_and_index(file_path: str) -> str:
113
+ """Load a document and add it to the vector store."""
114
+ try:
115
+ documents = load_document(file_path)
116
+ chunks_created = store.add_documents(documents)
117
+ return (
118
+ f"Successfully loaded and indexed document: {file_path}\n"
119
+ f"Created {chunks_created} searchable chunks."
120
+ )
121
+ except FileNotFoundError as e:
122
+ return f"Error: {str(e)}"
123
+ except ValueError as e:
124
+ return f"Error: {str(e)}"
125
+ except Exception as e:
126
+ logger.error(f"Failed to load document: {e}")
127
+ return f"Failed to load document: {str(e)}"
128
+
129
+ return StructuredTool.from_function(
130
+ func=load_and_index,
131
+ name="load_document",
132
+ description=(
133
+ "Load a document file and add it to the knowledge base. "
134
+ "Supports .txt, .md, .pdf, and .docx files. "
135
+ "The document will be automatically chunked and indexed for search."
136
+ ),
137
+ args_schema=DocumentLoadInput,
138
+ )
139
+
140
+
141
+ def load_directory(
142
+ directory_path: str,
143
+ vector_store: FAISSVectorStore = None,
144
+ extensions: list[str] = None
145
+ ) -> dict:
146
+ """
147
+ Load all documents from a directory.
148
+
149
+ Args:
150
+ directory_path: Path to directory containing documents
151
+ vector_store: FAISS store instance
152
+ extensions: List of extensions to include (default: all supported)
153
+
154
+ Returns:
155
+ Dictionary with loading results
156
+ """
157
+ store = vector_store or FAISSVectorStore()
158
+ path = Path(directory_path)
159
+
160
+ if not path.exists():
161
+ raise FileNotFoundError(f"Directory not found: {directory_path}")
162
+
163
+ if not path.is_dir():
164
+ raise ValueError(f"Not a directory: {directory_path}")
165
+
166
+ # Default extensions
167
+ supported = extensions or [".txt", ".md", ".pdf", ".docx"]
168
+
169
+ results = {
170
+ "files_processed": 0,
171
+ "chunks_created": 0,
172
+ "errors": [],
173
+ "files": [],
174
+ }
175
+
176
+ # Find all matching files
177
+ for ext in supported:
178
+ for file_path in path.glob(f"**/*{ext}"):
179
+ try:
180
+ documents = load_document(str(file_path))
181
+ chunks = store.add_documents(documents)
182
+ results["files_processed"] += 1
183
+ results["chunks_created"] += chunks
184
+ results["files"].append({
185
+ "path": str(file_path),
186
+ "chunks": chunks,
187
+ })
188
+ except Exception as e:
189
+ results["errors"].append({
190
+ "path": str(file_path),
191
+ "error": str(e),
192
+ })
193
+ logger.error(f"Failed to load {file_path}: {e}")
194
+
195
+ logger.info(
196
+ f"Loaded {results['files_processed']} files, "
197
+ f"created {results['chunks_created']} chunks"
198
+ )
199
+
200
+ return results
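The scan-and-aggregate pattern in `load_directory` above can be sketched standalone; here the FAISS store and `load_document` are replaced by a stub and a line splitter (`StubStore` and the temp files are illustrative stand-ins, not project code):

```python
# Minimal sketch of the directory-ingestion pattern used by load_directory,
# with the vector store stubbed out so it runs without any embedding model.
from pathlib import Path
import tempfile


class StubStore:
    """Counts chunks instead of embedding them."""
    def add_documents(self, documents):
        return len(documents)


def ingest_directory(directory_path, store, extensions=(".txt", ".md")):
    path = Path(directory_path)
    results = {"files_processed": 0, "chunks_created": 0, "errors": []}
    for ext in extensions:
        for file_path in path.glob(f"**/*{ext}"):
            try:
                # One "document" per line, standing in for load_document()
                docs = file_path.read_text().splitlines()
                results["chunks_created"] += store.add_documents(docs)
                results["files_processed"] += 1
            except OSError as e:
                results["errors"].append({"path": str(file_path), "error": str(e)})
    return results


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.txt").write_text("line one\nline two")
    (Path(tmp) / "b.md").write_text("# heading")
    summary = ingest_directory(tmp, StubStore())

print(summary)  # 2 files processed, 3 chunks, no errors
```

Collecting per-file errors instead of raising keeps one unreadable file from aborting a whole batch, which is the same design choice the real function makes.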
app/tools/search_tool.py ADDED
@@ -0,0 +1,135 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Search Tool
3
+ ===========
4
+
5
+ LangChain tool for semantic search in the vector store.
6
+
7
+ This tool wraps the FAISS vector store to provide a clean interface
8
+ for agents to search documents.
9
+
10
+ WHY A TOOL?
11
+ - LangChain agents work with tools as their action primitives
12
+ - Tools have clear input/output schemas
13
+ - Makes the search capability composable
14
+ """
15
+
16
+ from langchain_core.tools import StructuredTool, Tool
18
+ from pydantic import BaseModel, Field
19
+
20
+ from app.vectorstore.faiss_store import FAISSVectorStore
21
+
22
+
23
+ class SearchInput(BaseModel):
24
+ """Input schema for the search tool."""
25
+ query: str = Field(description="The search query to find relevant documents")
26
+ num_results: int = Field(
27
+ default=5,
28
+ ge=1,
29
+ le=20,
30
+ description="Number of results to return"
31
+ )
32
+
33
+
34
+ def create_search_tool(vector_store: FAISSVectorStore | None = None) -> StructuredTool:
35
+ """
36
+ Create a search tool for semantic document search.
37
+
38
+ The tool searches the FAISS vector store and returns
39
+ relevant document chunks with their sources.
40
+
41
+ Args:
42
+ vector_store: FAISS store instance (uses singleton if not provided)
43
+
44
+ Returns:
45
+ StructuredTool that can be used by LangChain agents
46
+
47
+ Example:
48
+ tool = create_search_tool()
49
+ result = tool.invoke({"query": "password reset", "num_results": 3})
50
+ """
51
+ store = vector_store or FAISSVectorStore()
52
+
53
+ def search_documents(query: str, num_results: int = 5) -> str:
54
+ """
55
+ Search for documents matching the query.
56
+
57
+ Returns formatted string of results for agent consumption.
58
+ """
59
+ if not store.is_ready:
60
+ return "Error: No documents in knowledge base. Please ingest documents first."
61
+
62
+ try:
63
+ results = store.similarity_search(query, k=num_results)
64
+ except Exception as e:
65
+ return f"Search error: {str(e)}"
66
+
67
+ if not results:
68
+ return "No relevant documents found for the query."
69
+
70
+ # Format results for agent consumption
71
+ output_parts = [f"Found {len(results)} relevant documents:\n"]
72
+
73
+ for i, (doc, score) in enumerate(results, 1):
74
+ source = doc.metadata.get("source", "unknown")
75
+ content = doc.page_content[:500] # Limit length
76
+ if len(doc.page_content) > 500:
77
+ content += "..."
78
+
79
+ output_parts.append(
80
+ f"\n[Result {i}] (Relevance: {score:.2f}, Source: {source})\n"
81
+ f"{content}"
82
+ )
83
+
84
+ return "\n".join(output_parts)
85
+
86
+ return StructuredTool.from_function(
87
+ func=search_documents,
88
+ name="search_knowledge_base",
89
+ description=(
90
+ "Search the knowledge base for information relevant to the query. "
91
+ "Use this to find documentation, policies, procedures, and FAQs. "
92
+ "Returns relevant document excerpts with their sources."
93
+ ),
94
+ args_schema=SearchInput,
95
+ )
96
+
97
+
98
+ def create_simple_search_tool(vector_store: FAISSVectorStore | None = None) -> Tool:
99
+ """
100
+ Create a simple search tool with just a query string.
101
+
102
+ This is an alternative for agents that work better with
103
+ simple string inputs rather than structured inputs.
104
+
105
+ Args:
106
+ vector_store: FAISS store instance
107
+
108
+ Returns:
109
+ Simple Tool with string input
110
+ """
111
+ store = vector_store or FAISSVectorStore()
112
+
113
+ def search(query: str) -> str:
114
+ """Search documents with a query string."""
115
+ if not store.is_ready:
116
+ return "No documents in knowledge base."
117
+
118
+ try:
119
+ results = store.similarity_search(query, k=5)
120
+ if not results:
121
+ return "No relevant documents found."
122
+
123
+ output = []
124
+ for doc, score in results:
125
+ output.append(f"[{score:.2f}] {doc.page_content[:300]}...")
126
+
127
+ return "\n\n".join(output)
128
+ except Exception as e:
129
+ return f"Search error: {str(e)}"
130
+
131
+ return Tool(
132
+ name="search",
133
+ func=search,
134
+ description="Search the knowledge base for relevant information",
135
+ )
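The result-formatting logic inside `search_documents` above (numbered results, relevance score, source, 500-character truncation) can be exercised without a vector store; the `(content, score, source)` tuples below are made-up sample data:

```python
# Standalone sketch of the formatting done in search_documents above.

def format_results(results, max_chars=500):
    """Format (content, score, source) tuples the way the search tool does."""
    if not results:
        return "No relevant documents found for the query."
    parts = [f"Found {len(results)} relevant documents:\n"]
    for i, (content, score, source) in enumerate(results, 1):
        snippet = content[:max_chars]
        if len(content) > max_chars:
            snippet += "..."  # signal truncation to the agent
        parts.append(
            f"\n[Result {i}] (Relevance: {score:.2f}, Source: {source})\n{snippet}"
        )
    return "\n".join(parts)


sample = [
    ("Click 'Forgot Password?' on the login page.", 0.82, "password_reset.txt"),
    ("Charges occur on your monthly signup date.", 0.41, "billing_faq.txt"),
]
print(format_results(sample))
```

Returning a formatted string rather than raw objects matters here: tool outputs are fed back into the LLM's context, so they need to be readable text, not Python reprs.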
app/vectorstore/__init__.py ADDED
@@ -0,0 +1,11 @@
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Vector Store Module
3
+ ===================
4
+
5
+ Handles document embeddings and FAISS vector storage for semantic search.
6
+ """
7
+
8
+ from app.vectorstore.embeddings import EmbeddingManager
9
+ from app.vectorstore.faiss_store import FAISSVectorStore
10
+
11
+ __all__ = ["EmbeddingManager", "FAISSVectorStore"]
app/vectorstore/embeddings.py ADDED
@@ -0,0 +1,172 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Embedding Manager
3
+ =================
4
+
5
+ This module handles the conversion of text to vector embeddings.
6
+ Supports both FREE (HuggingFace) and PAID (OpenAI) embeddings.
7
+
8
+ FREE OPTION: HuggingFace sentence-transformers
9
+ - Runs locally, no API costs
10
+ - Good quality embeddings
11
+ - Model: all-MiniLM-L6-v2 (384 dimensions, fast)
12
+
13
+ PAID OPTION: OpenAI
14
+ - Cloud-based
15
+ - Higher quality for some tasks
16
+ - Requires API key and costs money
17
+ """
18
+
19
+ import logging
20
+ from typing import Optional
21
+
22
+ from langchain_core.embeddings import Embeddings
23
+
24
+ from app.config import get_settings
25
+
26
+ logger = logging.getLogger(__name__)
27
+
28
+
29
+ def create_embeddings(provider: Optional[str] = None) -> Embeddings:
30
+ """
31
+ Create embeddings instance based on provider.
32
+
33
+ Args:
34
+ provider: Override provider from settings ("huggingface" or "openai")
35
+
36
+ Returns:
37
+ LangChain Embeddings instance
38
+ """
39
+ settings = get_settings()
40
+ provider = provider or settings.embedding_provider
41
+
42
+ logger.info(f"Creating embeddings with provider: {provider}")
43
+
44
+ if provider == "huggingface":
45
+ from langchain_huggingface import HuggingFaceEmbeddings
46
+ return HuggingFaceEmbeddings(
47
+ model_name=settings.huggingface_embedding_model,
48
+ model_kwargs={"device": "cpu"}, # Use "cuda" if GPU available
49
+ encode_kwargs={"normalize_embeddings": True},
50
+ )
51
+
52
+ elif provider == "openai":
53
+ from langchain_openai import OpenAIEmbeddings
54
+ return OpenAIEmbeddings(
55
+ model=settings.openai_embedding_model,
56
+ openai_api_key=settings.openai_api_key,
57
+ )
58
+
59
+ else:
60
+ raise ValueError(f"Unsupported embedding provider: {provider}")
61
+
62
+
63
+ class EmbeddingManager:
64
+ """
65
+ Manages text embedding generation.
66
+
67
+ By default uses FREE HuggingFace embeddings that run locally.
68
+ Can be configured to use OpenAI for higher quality (paid).
69
+
70
+ Usage:
71
+ manager = EmbeddingManager()
72
+ embeddings = manager.get_embeddings()
73
+ vector = embeddings.embed_query("Hello world")
74
+ """
75
+
76
+ _instance: Optional["EmbeddingManager"] = None
77
+ _embeddings: Optional[Embeddings] = None
78
+
79
+ def __new__(cls) -> "EmbeddingManager":
80
+ """Singleton pattern ensures we only create one embedding client."""
81
+ if cls._instance is None:
82
+ cls._instance = super().__new__(cls)
83
+ return cls._instance
84
+
85
+ def __init__(self) -> None:
86
+ """Initialize the embedding manager with settings."""
87
+ if self._embeddings is None:
88
+ self._initialize_embeddings()
89
+
90
+ def _initialize_embeddings(self) -> None:
91
+ """Create the embeddings client."""
92
+ settings = get_settings()
93
+
94
+ try:
95
+ self._embeddings = create_embeddings(settings.embedding_provider)
96
+ logger.info(
97
+ f"Initialized embeddings with provider: {settings.embedding_provider}"
98
+ )
99
+ except Exception as e:
100
+ logger.error(f"Failed to initialize embeddings: {e}")
101
+ raise
102
+
103
+ def get_embeddings(self) -> Embeddings:
104
+ """
105
+ Get the embeddings instance for use with vector stores.
106
+
107
+ Returns:
108
+ Embeddings instance
109
+
110
+ Raises:
111
+ RuntimeError: If embeddings not initialized
112
+ """
113
+ if self._embeddings is None:
114
+ raise RuntimeError("Embeddings not initialized")
115
+ return self._embeddings
116
+
117
+ def embed_text(self, text: str) -> list[float]:
118
+ """
119
+ Embed a single text string.
120
+
121
+ Args:
122
+ text: Text to embed
123
+
124
+ Returns:
125
+ List of floats representing the embedding vector
126
+ """
127
+ embeddings = self.get_embeddings()
128
+ return embeddings.embed_query(text)
129
+
130
+ def embed_documents(self, texts: list[str]) -> list[list[float]]:
131
+ """
132
+ Embed multiple documents in batch.
133
+
134
+ Args:
135
+ texts: List of texts to embed
136
+
137
+ Returns:
138
+ List of embedding vectors
139
+ """
140
+ embeddings = self.get_embeddings()
141
+ return embeddings.embed_documents(texts)
142
+
143
+ @property
144
+ def dimension(self) -> int:
145
+ """
146
+ Get the embedding dimension for the current model.
147
+
148
+ Returns:
149
+ int: Number of dimensions in embedding vector
150
+ """
151
+ settings = get_settings()
152
+
153
+ # Dimensions for common models
154
+ dimensions = {
155
+ # HuggingFace models
156
+ "sentence-transformers/all-MiniLM-L6-v2": 384,
157
+ "sentence-transformers/all-mpnet-base-v2": 768,
158
+ "BAAI/bge-small-en-v1.5": 384,
159
+ "BAAI/bge-base-en-v1.5": 768,
160
+ # OpenAI models
161
+ "text-embedding-3-small": 1536,
162
+ "text-embedding-3-large": 3072,
163
+ "text-embedding-ada-002": 1536,
164
+ }
165
+
166
+ model = (
167
+ settings.huggingface_embedding_model
168
+ if settings.embedding_provider == "huggingface"
169
+ else settings.openai_embedding_model
170
+ )
171
+
172
+ return dimensions.get(model, 384)
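The `__new__`-based singleton used by `EmbeddingManager` above can be demonstrated in isolation; `FakeClient` below is a stand-in for the real (expensive-to-construct) embeddings client:

```python
# Minimal sketch of the singleton pattern from EmbeddingManager:
# repeated construction returns the same object, and the client
# is built exactly once.

class FakeClient:
    instances = 0

    def __init__(self):
        FakeClient.instances += 1


class Manager:
    _instance = None
    _client = None

    def __new__(cls):
        # First call creates the instance; later calls reuse it
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on every Manager() call, so guard the expensive part
        if self._client is None:
            Manager._client = FakeClient()


a, b = Manager(), Manager()
print(a is b, FakeClient.instances)  # True 1
```

The guard inside `__init__` is the subtle part: Python calls `__init__` on every `Manager()` even when `__new__` returns a cached instance, so without the `None` check the client would be rebuilt each time.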
app/vectorstore/faiss_store.py ADDED
@@ -0,0 +1,294 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ FAISS Vector Store
3
+ ==================
4
+
5
+ This module manages the FAISS vector database for semantic document search.
6
+
7
+ WHAT IS FAISS?
8
+ - Facebook AI Similarity Search
9
+ - Efficient library for similarity search in high-dimensional vectors
10
+ - Stores document embeddings and enables fast nearest-neighbor search
11
+
12
+ WHY FAISS?
13
+ - Fast: Optimized C++ with Python bindings
14
+ - Scalable: Handles millions of vectors
15
+ - Free: No external service needed (unlike Pinecone)
16
+ - Persistent: Can save/load index to disk
17
+
18
+ HOW IT WORKS:
19
+ 1. Documents are split into chunks
20
+ 2. Each chunk is embedded into a vector
21
+ 3. Vectors are indexed in FAISS
22
+ 4. Query is embedded and compared to all vectors
23
+ 5. Most similar vectors (and their chunks) are returned
24
+ """
25
+
26
+ import logging
27
+ from pathlib import Path
28
+ from typing import Optional
29
+
30
+ from langchain_community.vectorstores import FAISS
31
+ from langchain_core.documents import Document
32
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
33
+
34
+ from app.config import get_settings
35
+ from app.vectorstore.embeddings import EmbeddingManager
36
+
37
+ logger = logging.getLogger(__name__)
38
+
39
+
40
+ class FAISSVectorStore:
41
+ """
42
+ Manages FAISS vector store for document retrieval.
43
+
44
+ This class provides:
45
+ - Document indexing with automatic chunking
46
+ - Semantic similarity search
47
+ - Persistent storage (save/load from disk)
48
+ - Singleton pattern for memory efficiency
49
+
50
+ Usage:
51
+ store = FAISSVectorStore()
52
+ store.add_documents([Document(page_content="...", metadata={...})])
53
+ results = store.similarity_search("query", k=5)
54
+ """
55
+
56
+ _instance: Optional["FAISSVectorStore"] = None
57
+ _store: Optional[FAISS] = None
58
+ _initialized: bool = False
59
+
60
+ def __new__(cls) -> "FAISSVectorStore":
61
+ """Singleton pattern - one vector store instance."""
62
+ if cls._instance is None:
63
+ cls._instance = super().__new__(cls)
64
+ return cls._instance
65
+
66
+ def __init__(self) -> None:
67
+ """Initialize the vector store."""
68
+ # Only initialize once
69
+ if not self._initialized:
70
+ self._settings = get_settings()
71
+ self._embedding_manager = EmbeddingManager()
72
+ self._text_splitter = self._create_text_splitter()
73
+ self._try_load_existing_index()
74
+ FAISSVectorStore._initialized = True
75
+
76
+ def _create_text_splitter(self) -> RecursiveCharacterTextSplitter:
77
+ """
78
+ Create text splitter for chunking documents.
79
+
80
+ WHY RecursiveCharacterTextSplitter?
81
+ - Tries to split on natural boundaries (paragraphs, sentences)
82
+ - Falls back to characters if needed
83
+ - Maintains context within chunks
84
+
85
+ Chunk size of 1000 chars (~250 tokens) is a good balance:
86
+ - Small enough to be specific
87
+ - Large enough to maintain context
88
+ """
89
+ return RecursiveCharacterTextSplitter(
90
+ chunk_size=self._settings.chunk_size,
91
+ chunk_overlap=self._settings.chunk_overlap,
92
+ length_function=len,
93
+ # Split hierarchy: paragraphs -> sentences -> words -> chars
94
+ separators=["\n\n", "\n", ". ", " ", ""],
95
+ )
96
+
97
+ def _try_load_existing_index(self) -> None:
98
+ """
99
+ Try to load an existing FAISS index from disk.
100
+
101
+ If no index exists, the store remains uninitialized
102
+ until documents are added.
103
+ """
104
+ index_path = self._settings.faiss_index_path
105
+ if (index_path / "index.faiss").exists():
106
+ try:
107
+ self._store = FAISS.load_local(
108
+ str(index_path),
109
+ self._embedding_manager.get_embeddings(),
110
+ allow_dangerous_deserialization=True,
111
+ )
112
+ logger.info(f"Loaded existing FAISS index from {index_path}")
113
+ except Exception as e:
114
+ logger.warning(f"Could not load existing index: {e}")
115
+ self._store = None
116
+ else:
117
+ logger.info("No existing FAISS index found. Ready for indexing.")
118
+
119
+ def add_documents(
120
+ self,
121
+ documents: list[Document],
122
+ chunk: bool = True
123
+ ) -> int:
124
+ """
125
+ Add documents to the vector store.
126
+
127
+ Args:
128
+ documents: List of LangChain Document objects
129
+ chunk: Whether to split documents into chunks (default: True)
130
+
131
+ Returns:
132
+ Number of chunks created and indexed
133
+
134
+ Example:
135
+ docs = [Document(page_content="Long text...", metadata={"source": "file.pdf"})]
136
+ chunks_created = store.add_documents(docs)
137
+ """
138
+ if not documents:
139
+ logger.warning("No documents provided to index")
140
+ return 0
141
+
142
+ # Split documents into chunks if requested
143
+ if chunk:
144
+ chunks = self._text_splitter.split_documents(documents)
145
+ logger.info(
146
+ f"Split {len(documents)} documents into {len(chunks)} chunks"
147
+ )
148
+ else:
149
+ chunks = documents
150
+
151
+ # Create or update the FAISS index
152
+ embeddings = self._embedding_manager.get_embeddings()
153
+
154
+ if self._store is None:
155
+ # Create new index
156
+ self._store = FAISS.from_documents(chunks, embeddings)
157
+ logger.info(f"Created new FAISS index with {len(chunks)} chunks")
158
+ else:
159
+ # Add to existing index
160
+ self._store.add_documents(chunks)
161
+ logger.info(f"Added {len(chunks)} chunks to existing index")
162
+
163
+ # Persist to disk
164
+ self._save_index()
165
+
166
+ return len(chunks)
167
+
168
+ def _save_index(self) -> None:
169
+ """Save the FAISS index to disk for persistence."""
170
+ if self._store is None:
171
+ return
172
+
173
+ index_path = self._settings.faiss_index_path
174
+ index_path.mkdir(parents=True, exist_ok=True)
175
+
176
+ self._store.save_local(str(index_path))
177
+ logger.info(f"Saved FAISS index to {index_path}")
178
+
179
+ def similarity_search(
180
+ self,
181
+ query: str,
182
+ k: Optional[int] = None,
183
+ ) -> list[tuple[Document, float]]:
184
+ """
185
+ Search for documents similar to the query.
186
+
187
+ This is the core retrieval function used by the Retriever Agent.
188
+
189
+ Args:
190
+ query: The search query text
191
+ k: Number of results to return (default from settings)
192
+
193
+ Returns:
194
+ List of (Document, score) tuples, sorted by relevance
195
+ Score is between 0 and 1, where 1 is most similar
196
+
197
+ Raises:
198
+ RuntimeError: If no documents have been indexed
199
+
200
+ Example:
201
+ results = store.similarity_search("password reset", k=3)
202
+ for doc, score in results:
203
+ print(f"Score: {score:.2f}, Content: {doc.page_content[:100]}")
204
+ """
205
+ if self._store is None:
206
+ raise RuntimeError(
207
+ "No documents indexed. Please add documents first."
208
+ )
209
+
210
+ k = k or self._settings.retrieval_top_k
211
+
212
+ # FAISS returns (Document, score) tuples
213
+ # Score is L2 distance; we convert to similarity (0-1 range)
214
+ results = self._store.similarity_search_with_score(query, k=k)
215
+
216
+ # Convert L2 distance to similarity score
217
+ # L2 distance: 0 = identical, higher = less similar
218
+ # We normalize using: similarity = 1 / (1 + distance)
219
+ normalized_results = []
220
+ for doc, distance in results:
221
+ # Convert distance to similarity (0-1 range)
222
+ similarity = 1 / (1 + distance)
223
+ normalized_results.append((doc, similarity))
224
+
225
+ return normalized_results
226
+
227
+ def similarity_search_simple(
228
+ self,
229
+ query: str,
230
+ k: Optional[int] = None,
231
+ ) -> list[Document]:
232
+ """
233
+ Simple search that returns just documents (no scores).
234
+
235
+ Convenience method for when you just need the documents.
236
+
237
+ Args:
238
+ query: The search query
239
+ k: Number of results
240
+
241
+ Returns:
242
+ List of Document objects
243
+ """
244
+ results = self.similarity_search(query, k)
245
+ return [doc for doc, _ in results]
246
+
247
+ def delete_all(self) -> None:
248
+ """
249
+ Delete all documents from the vector store.
250
+
251
+ WARNING: This is destructive and cannot be undone.
252
+ """
253
+ self._store = None
254
+ # Remove saved index files
255
+ index_path = self._settings.faiss_index_path
256
+ if index_path.exists():
257
+ import shutil
258
+ shutil.rmtree(index_path)
259
+ index_path.mkdir(parents=True, exist_ok=True)
260
+ logger.info("Deleted all documents from vector store")
261
+
262
+ @property
263
+ def is_ready(self) -> bool:
264
+ """Check if the vector store has documents indexed."""
265
+ return self._store is not None
266
+
267
+ @property
268
+ def document_count(self) -> int:
269
+ """Get the number of document chunks in the store."""
270
+ if self._store is None:
271
+ return 0
272
+ # FAISS doesn't expose this directly, so we use the index
273
+ return self._store.index.ntotal
274
+
275
+ def as_retriever(self, **kwargs):
276
+ """
277
+ Get a LangChain Retriever interface.
278
+
279
+ This allows the vector store to be used directly
280
+ in LangChain chains and agents.
281
+
282
+ Args:
283
+ **kwargs: Passed to FAISS.as_retriever()
284
+
285
+ Returns:
286
+ LangChain Retriever instance
287
+ """
288
+ if self._store is None:
289
+ raise RuntimeError("No documents indexed")
290
+
291
+ return self._store.as_retriever(
292
+ search_kwargs={"k": self._settings.retrieval_top_k},
293
+ **kwargs
294
+ )
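The distance-to-similarity conversion in `similarity_search` above is worth seeing numerically: FAISS returns L2 distances (0 = identical, larger = less similar), which `1 / (1 + distance)` maps monotonically into the (0, 1] range. The sample distances are made up:

```python
# The normalization used by FAISSVectorStore.similarity_search above.

def l2_to_similarity(distance: float) -> float:
    """Map an L2 distance (>= 0) to a similarity in (0, 1]."""
    return 1 / (1 + distance)


for d in (0.0, 0.5, 1.0, 4.0):
    print(f"distance={d:.1f} -> similarity={l2_to_similarity(d):.2f}")
# distance=0.0 -> similarity=1.00
# distance=0.5 -> similarity=0.67
# distance=1.0 -> similarity=0.50
# distance=4.0 -> similarity=0.20
```

Note this is a convenient rescaling, not a calibrated probability: scores are comparable within one query's result list, but absolute thresholds (e.g. "only keep results above 0.5") depend on the embedding model's typical distance scale.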
claude.md ADDED
@@ -0,0 +1,39 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Project: Multi-Agent RAG System with LangChain
2
+
3
+ ## Role
4
+ You are acting as a **Senior AI Engineer** building a production-grade multi-agent Retrieval-Augmented Generation (RAG) system.
5
+
6
+ ## Core Skills You Must Use
7
+ - Agentic AI design
8
+ - LangChain agents, tools, and memory
9
+ - Retrieval-Augmented Generation (RAG)
10
+ - Vector databases (FAISS)
11
+ - Clean Python architecture
12
+ - FastAPI backend design
13
+
14
+ ## Architectural Rules
15
+ 1. Use a **multi-agent architecture**
16
+ - Router Agent: Routes queries to appropriate agents
17
+ - Retriever Agent: Handles document retrieval and vector search
18
+ - Reasoning Agent: Processes context and generates reasoning chains
19
+ - Action Agent: Executes actions based on reasoning
20
+ 2. Each agent must have **single responsibility**
21
+ 3. Retrieval must happen **before** generation
22
+ 4. Answers MUST be grounded in retrieved context
23
+ 5. No logic should be hard-coded into prompts
24
+ 6. Code must be modular and extensible
25
+
26
+ ## Non-Negotiables
27
+ - No monolithic files
28
+ - No hallucination-prone prompting
29
+ - No magic numbers without explanation
30
+ - Comment WHY, not just WHAT
31
+
32
+ ## Style Guidelines
33
+ - Beginner-friendly explanations
34
+ - Production-quality code
35
+ - Explicit error handling
36
+ - Clear naming conventions
37
+
38
+ ## Outcome Goal
39
+ Build a system suitable for a **Senior AI Engineer role** in a real SaaS company (e.g., GoDaddy-style customer support automation).
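The Router Agent rule above (each query dispatched to exactly one specialist agent) can be sketched as a plain function; the keyword rules below are hypothetical placeholders for the real LLM-based router, kept only to make the dispatch shape concrete:

```python
# Illustrative sketch of the Router Agent's single responsibility:
# classify a query and hand it to one downstream agent.

def route(query: str) -> str:
    """Return the name of the agent that should handle this query."""
    q = query.lower()
    if any(w in q for w in ("refund", "invoice", "charge", "billing")):
        return "retriever_agent:billing"
    if any(w in q for w in ("error", "crash", "slow", "login")):
        return "retriever_agent:technical"
    return "reasoning_agent"


print(route("Why was I charged twice?"))    # retriever_agent:billing
print(route("The app crashes on startup"))  # retriever_agent:technical
```

In the real system this decision is made by an LLM call rather than keyword matching, but the contract is the same: the router only chooses a destination; retrieval and answering happen downstream.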
data/documents/account_settings.txt ADDED
@@ -0,0 +1,100 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Account Settings Guide
2
+ ======================
3
+
4
+ Profile Settings
5
+ ----------------
6
+
7
+ Updating Your Profile
8
+ 1. Navigate to Settings > Profile
9
+ 2. Click "Edit Profile"
10
+ 3. Update your information:
11
+ - Display name
12
+ - Profile picture (max 5MB, JPG/PNG format)
13
+ - Bio (up to 500 characters)
14
+ - Time zone
15
+ - Language preference
16
+ 4. Click "Save Changes"
17
+
18
+ Changing Your Email Address
19
+ 1. Go to Settings > Account > Email
20
+ 2. Enter your new email address
21
+ 3. Enter your current password for verification
22
+ 4. Click "Update Email"
23
+ 5. Check your new email for a verification link
24
+ 6. Click the verification link within 48 hours
25
+ Note: Your old email will receive a notification about the change.
26
+
27
+ Security Settings
28
+ -----------------
29
+
30
+ Two-Factor Authentication (2FA)
31
+ We strongly recommend enabling 2FA for additional security.
32
+
33
+ To enable 2FA:
34
+ 1. Go to Settings > Security > Two-Factor Authentication
35
+ 2. Choose your 2FA method:
36
+ - Authenticator app (recommended): Google Authenticator, Authy, etc.
37
+ - SMS: Receive codes via text message
38
+ 3. Follow the setup instructions
39
+ 4. Save your backup codes in a secure location
40
+
41
+ Session Management
42
+ - View all active sessions at Settings > Security > Active Sessions
43
+ - Click "Sign Out" next to any session to end it
44
+ - Use "Sign Out All Devices" for security emergencies
45
+
46
+ Login History
47
+ - View your login history at Settings > Security > Login History
48
+ - Shows date, time, location, and device for each login
49
+ - Suspicious logins are flagged automatically
50
+
51
+ Notification Settings
52
+ ---------------------
53
+
54
+ Email Notifications
55
+ Customize which emails you receive at Settings > Notifications > Email:
56
+ - Account alerts (security, billing) - Always enabled for security
57
+ - Product updates and news
58
+ - Tips and tutorials
59
+ - Marketing and promotions
60
+
61
+ Push Notifications
62
+ For mobile app users, manage push notifications at:
63
+ Settings > Notifications > Push Notifications
64
+ - Instant messages
65
+ - Activity updates
66
+ - Reminders
67
+
68
+ Privacy Settings
69
+ ----------------
70
+
71
+ Data Visibility
72
+ Control who can see your information:
73
+ - Profile visibility: Public, Private, or Contacts Only
74
+ - Activity status: Show when you're online
75
+ - Read receipts: Show when you've read messages
76
+
77
+ Data Export
78
+ Download your data at Settings > Privacy > Download My Data
79
+ - Includes all your content and account information
80
+ - Available in JSON or CSV format
81
+ - Processing takes up to 48 hours
82
+
83
+ Account Deletion
84
+ To permanently delete your account:
85
+ 1. Go to Settings > Account > Delete Account
86
+ 2. Read the information about what will be deleted
87
+ 3. Enter your password
88
+ 4. Type "DELETE" to confirm
89
+ 5. Click "Permanently Delete Account"
90
+ Note: This action cannot be undone. All data is deleted within 30 days.
91
+
92
+ Connected Apps
93
+ --------------
94
+
95
+ Managing Third-Party Access
96
+ View and manage connected applications at Settings > Connected Apps:
97
+ - See which apps have access to your account
98
+ - Review permissions for each app
99
+ - Revoke access by clicking "Disconnect"
100
+ - Connected apps lose access immediately when disconnected
data/documents/billing_faq.txt ADDED
@@ -0,0 +1,75 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Billing and Payments FAQ
2
+ ========================
3
+
4
+ General Billing Questions
5
+ -------------------------
6
+
7
+ Q: When will I be charged?
8
+ A: Charges occur on the same date each month that you signed up. For example, if you subscribed on the 15th, you'll be charged on the 15th of each subsequent month.
9
+
10
+ Q: What payment methods do you accept?
11
+ A: We accept the following payment methods:
12
+ - Credit cards (Visa, MasterCard, American Express, Discover)
13
+ - Debit cards with Visa/MasterCard logo
14
+ - PayPal
15
+ - Bank transfers (for annual enterprise plans)
16
+
17
+ Q: How do I update my payment method?
18
+ A: To update your payment method:
19
+ 1. Log into your account
20
+ 2. Go to Settings > Billing
21
+ 3. Click "Payment Methods"
22
+ 4. Add a new payment method or edit existing ones
23
+ 5. Set your preferred default payment method
24
+
25
+ Q: Can I get a refund?
26
+ A: We offer refunds under the following conditions:
27
+ - Within 14 days of initial purchase for new customers
28
+ - Within 7 days for plan upgrades
29
+ - No refunds for plan downgrades (credited towards future billing)
30
+ - Pro-rated refunds for annual plans canceled mid-term
31
+
32
+ Subscription Management
33
+ -----------------------
34
+
35
+ Q: How do I cancel my subscription?
36
+ A: To cancel your subscription:
37
+ 1. Go to Settings > Billing > Subscription
38
+ 2. Click "Cancel Subscription"
39
+ 3. Select your cancellation reason
40
+ 4. Confirm cancellation
41
+ Your access continues until the end of the current billing period.
42
+
43
+ Q: How do I upgrade or downgrade my plan?
44
+ A: Plan changes take effect immediately:
45
+ - Upgrades: You're charged a pro-rated amount for the remainder of the billing cycle
46
+ - Downgrades: Credit is applied to your next billing cycle
47
+
48
+ Q: What happens if my payment fails?
49
+ A: If a payment fails:
50
+ 1. You'll receive an email notification
51
+ 2. We'll retry the payment in 3 days
52
+ 3. If it fails again, we'll retry in 7 days
53
+ 4. After 14 days of failed payments, your account may be suspended
54
+ 5. Contact support to resolve payment issues
55
+
56
+ Invoice and Receipts
57
+ --------------------
58
+
59
+ Q: How do I get an invoice?
60
+ A: Invoices are automatically sent to your email after each payment. You can also:
61
+ 1. Go to Settings > Billing > Invoice History
62
+ 2. Click on any invoice to view or download
63
+ 3. Invoices are available in PDF format
64
+
65
+ Q: How do I add my company details to invoices?
66
+ A: To add company billing information:
67
+ 1. Go to Settings > Billing > Billing Information
68
+ 2. Enter your company name, address, and tax ID
69
+ 3. Save changes - this will apply to future invoices
70
+
71
+ Contact Billing Support
72
+ -----------------------
73
+ Email: billing@example.com
74
+ Phone: 1-800-555-0123 (Mon-Fri, 9 AM - 5 PM EST)
75
+ Chat: Available 24/7 for Premium and Enterprise customers
data/documents/password_reset.txt ADDED
@@ -0,0 +1,45 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Password Reset Guide
2
+ ====================
3
+
4
+ How to Reset Your Password
5
+ --------------------------
6
+
7
+ If you've forgotten your password or need to reset it for security reasons, follow these steps:
8
+
9
+ 1. Go to the login page at https://example.com/login
10
+ 2. Click on "Forgot Password?" link below the login form
11
+ 3. Enter your registered email address
12
+ 4. Check your email inbox for a password reset link (check spam folder if not visible)
13
+ 5. Click the reset link within 24 hours (links expire after 24 hours)
14
+ 6. Create a new password following our password requirements
15
+ 7. Log in with your new password
16
+
17
+ Password Requirements
18
+ ---------------------
19
+ - Minimum 8 characters
20
+ - At least one uppercase letter (A-Z)
21
+ - At least one lowercase letter (a-z)
22
+ - At least one number (0-9)
23
+ - At least one special character (!@#$%^&*)
24
+ - Cannot be any of your last 5 passwords
25
+
26
+ Troubleshooting
27
+ ---------------
28
+ If you don't receive the reset email:
29
+ - Wait 5-10 minutes and check again
30
+ - Check your spam/junk folder
31
+ - Make sure you entered the correct email address
32
+ - Contact support if issues persist
33
+
34
+ If the reset link doesn't work:
35
+ - Request a new reset link
36
+ - Make sure you're clicking the most recent link if you requested multiple
37
+ - Try copying and pasting the link instead of clicking
38
+ - Clear your browser cache and try again
39
+
40
+ Security Notes
41
+ --------------
42
+ - Never share your password with anyone
43
+ - We will never ask for your password via email or phone
44
+ - Enable two-factor authentication for additional security
45
+ - Report any suspicious activity to security@example.com
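The password requirements listed in `password_reset.txt` above translate directly into a small checker; this sketch covers the five character-class rules (the "last 5 passwords" rule requires account history and is out of scope here):

```python
# Checker for the password policy stated in the sample document above.
import re

RULES = [
    (r".{8,}", "at least 8 characters"),
    (r"[A-Z]", "an uppercase letter"),
    (r"[a-z]", "a lowercase letter"),
    (r"[0-9]", "a number"),
    (r"[!@#$%^&*]", "a special character"),
]


def check_password(password: str) -> list[str]:
    """Return the list of unmet requirements (empty list means valid)."""
    return [msg for pattern, msg in RULES if not re.search(pattern, password)]


print(check_password("Secur3!pass"))  # []
print(check_password("short"))        # four unmet requirements
```

Returning the full list of unmet rules (rather than failing on the first) matches how such requirements are usually surfaced in a UI: the user sees everything to fix at once.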
data/documents/technical_support.txt ADDED
@@ -0,0 +1,127 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Technical Support Guide
+ =======================
+
+ Common Technical Issues
+ -----------------------
+
+ Login Problems
+ --------------
+ Problem: Cannot log in despite correct credentials
+ Solutions:
+ 1. Clear your browser cache and cookies
+ 2. Try a different browser (Chrome, Firefox, Safari, Edge)
+ 3. Temporarily disable browser extensions
+ 4. Check whether Caps Lock is on
+ 5. Reset your password if the issue persists
+
+ Problem: "Session Expired" error
+ Solutions:
+ 1. This occurs after 30 minutes of inactivity
+ 2. Simply log in again
+ 3. Enable "Remember Me" for extended sessions
+ 4. Check that cookies are enabled in your browser
+
+ Problem: Account locked
+ Solutions:
+ 1. Wait 30 minutes - accounts unlock automatically
+ 2. If urgent, contact support for a manual unlock
+ 3. Use the password reset flow to regain access immediately
+
+ Performance Issues
+ ------------------
+
+ Problem: Slow loading times
+ Solutions:
+ 1. Check your internet connection speed (minimum 5 Mbps recommended)
+ 2. Clear your browser cache (Settings > Clear Browsing Data)
+ 3. Disable unnecessary browser extensions
+ 4. Try accessing during off-peak hours
+ 5. Check our status page for any ongoing issues
+
+ Problem: Features not working
+ Solutions:
+ 1. Ensure JavaScript is enabled
+ 2. Update your browser to the latest version
+ 3. Disable ad blockers for our domain
+ 4. Check the browser console for errors (F12 > Console)
+ 5. Try incognito/private browsing mode
+
+ Mobile App Issues
+ -----------------
+
+ Problem: App crashes on startup
+ Solutions:
+ 1. Force close the app and reopen it
+ 2. Restart your device
+ 3. Update the app to the latest version
+ 4. Uninstall and reinstall the app
+ 5. Check that your OS is supported (iOS 13+ / Android 8+)
+
+ Problem: Push notifications not working
+ Solutions:
+ 1. Check the app's notification settings
+ 2. Check your device's notification settings
+ 3. Ensure battery saver mode is off
+ 4. Log out and log back in
+ 5. Reinstall the app
+
+ Integration Issues
+ ------------------
+
+ Problem: Third-party integration not syncing
+ Solutions:
+ 1. Disconnect and reconnect the integration
+ 2. Check whether the third-party service is operational
+ 3. Verify that API permissions are correctly set
+ 4. Wait 15 minutes for the sync to complete
+ 5. Contact support with your sync logs
+
+ Problem: Webhook not receiving data
+ Solutions:
+ 1. Verify the webhook URL is correct and publicly accessible
+ 2. Check that your server returns a 200 OK response
+ 3. Review webhook logs in Settings > Integrations > Webhooks
+ 4. Ensure your SSL certificate is valid (HTTPS is required)
+ 5. Check the firewall settings on your server
+
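The webhook checks above (a reachable URL that answers 200 OK) can be verified locally with a minimal receiver. This standard-library sketch is illustrative: the handler name, port, and response shape are assumptions, and a production endpoint must additionally sit behind HTTPS.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the delivery body (an empty body is treated as an empty JSON object).
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # Acknowledge immediately with 200 OK so the sender does not mark
        # the delivery as failed; defer heavy processing elsewhere.
        body = json.dumps({"received": True, "keys": sorted(payload)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To listen for real deliveries (illustrative port):
#     HTTPServer(("0.0.0.0", 9000), WebhookHandler).serve_forever()
```

Pointing the integration's webhook URL at this receiver and watching the request log is a quick way to separate firewall/SSL problems from problems in your own handler code.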
+ File Upload Issues
+ ------------------
+
+ Problem: File upload fails
+ Solutions:
+ 1. Check the file size (max 100MB per file)
+ 2. Verify that the file format is supported
+ 3. Try a different browser
+ 4. Temporarily disable your VPN
+ 5. Check whether your storage quota has been reached
+
+ Supported File Formats:
+ - Documents: PDF, DOC, DOCX, TXT, RTF
+ - Images: JPG, PNG, GIF, SVG, WebP
+ - Videos: MP4, MOV, AVI (max 500MB)
+ - Archives: ZIP, RAR (max 200MB)
+
+ Getting Technical Support
+ -------------------------
+
+ Before Contacting Support:
+ 1. Check our Help Center at help.example.com
+ 2. Search our community forums
+ 3. Review the status page for outages
+ 4. Gather relevant information:
+    - Error messages (screenshots if possible)
+    - Browser and version
+    - Operating system
+    - Steps to reproduce the issue
+
+ Support Channels:
+ - Email: support@example.com
+ - Live Chat: Available 24/7 for Premium customers
+ - Phone: 1-800-555-0199 (Enterprise customers)
+ - Community Forums: community.example.com
+
+ Response Times:
+ - Critical issues: 1-4 hours
+ - High priority: 4-12 hours
+ - Normal priority: 24-48 hours
+ - Low priority: 48-72 hours
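The size limits and supported formats above can be expressed as a small pre-flight check. This is a sketch: `check_upload` is an illustrative name, and the general 100MB cap is assumed to apply to documents and images, which the guide does not state explicitly.

```python
from pathlib import Path

# Limits and formats copied from the guide above.
MAX_SIZE_MB = {"document": 100, "image": 100, "video": 500, "archive": 200}
FORMATS = {
    "document": {".pdf", ".doc", ".docx", ".txt", ".rtf"},
    "image": {".jpg", ".png", ".gif", ".svg", ".webp"},
    "video": {".mp4", ".mov", ".avi"},
    "archive": {".zip", ".rar"},
}

def check_upload(filename, size_bytes):
    """Return 'ok', or a message explaining why the upload would be rejected."""
    ext = Path(filename).suffix.lower()
    for kind, extensions in FORMATS.items():
        if ext in extensions:
            if size_bytes > MAX_SIZE_MB[kind] * 1024 * 1024:
                return f"too large: {kind} limit is {MAX_SIZE_MB[kind]}MB"
            return "ok"
    return f"unsupported format: {ext or '(none)'}"
```

Running this check client-side before starting an upload avoids wasting bandwidth on files the server would reject anyway.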
requirements.txt ADDED
@@ -0,0 +1,46 @@
+ # Multi-Agent RAG System Dependencies
+ # =====================================
+
+ # Core AI Framework
+ langchain>=0.1.0
+ langchain-core>=0.1.0
+ langchain-openai>=0.0.5
+ langchain-community>=0.0.20
+ langchain-text-splitters>=0.0.1
+
+ # Free LLM Providers
+ langchain-huggingface>=0.0.1
+ langchain-google-genai>=0.0.5
+ langchain-groq>=0.0.1
+
+ # For local embeddings (free)
+ sentence-transformers>=2.2.0
+
+ # Vector Store
+ faiss-cpu>=1.7.4
+
+ # Embeddings (using OpenAI)
+ openai>=1.10.0
+
+ # Web Framework
+ fastapi>=0.109.0
+ uvicorn[standard]>=0.27.0
+
+ # Data Validation
+ pydantic>=2.5.0
+ pydantic-settings>=2.1.0
+
+ # Environment & Config
+ python-dotenv>=1.0.0
+
+ # Document Processing
+ pypdf>=4.0.0
+ python-docx>=1.1.0
+
+ # Utilities
+ tiktoken>=0.5.0
+ httpx>=0.26.0
+
+ # Testing (optional)
+ pytest>=7.4.0
+ pytest-asyncio>=0.23.0
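After `pip install -r requirements.txt`, a quick sanity check can confirm the pins resolved. This sketch uses only the standard library; `check_deps` is an illustrative helper name, and the package names passed in should match the distribution names pinned above.

```python
from importlib import metadata

def check_deps(packages):
    """Map each distribution name to its installed version, or 'missing'."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "missing"
    return versions
```

For example, `check_deps(["fastapi", "faiss-cpu", "httpx"])` returns a dict you can scan for `"missing"` entries before starting the server.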
scripts/test_api.py ADDED
@@ -0,0 +1,157 @@
+ """
2
+ API Test Script
3
+ ===============
4
+
5
+ Simple script to test the Multi-Agent RAG API endpoints.
6
+
7
+ Run the API server first:
8
+ uvicorn app.main:app --reload
9
+
10
+ Then run this script:
11
+ python scripts/test_api.py
12
+ """
13
+
14
+ import asyncio
15
+ import httpx
16
+ import json
17
+ from typing import Optional
18
+
19
+ BASE_URL = "http://localhost:8000/api/v1"
20
+
21
+
22
+ async def test_health():
23
+ """Test health endpoint."""
24
+ print("\n" + "=" * 50)
25
+ print("Testing Health Endpoint")
26
+ print("=" * 50)
27
+
28
+ async with httpx.AsyncClient() as client:
29
+ response = await client.get(f"{BASE_URL}/health")
30
+ print(f"Status: {response.status_code}")
31
+ print(f"Response: {json.dumps(response.json(), indent=2)}")
32
+ return response.json()
33
+
34
+
35
+ async def test_ingest():
36
+ """Test document ingestion."""
37
+ print("\n" + "=" * 50)
38
+ print("Testing Document Ingestion")
39
+ print("=" * 50)
40
+
41
+ async with httpx.AsyncClient(timeout=120.0) as client:
42
+ response = await client.post(
43
+ f"{BASE_URL}/ingest",
44
+ json={"force_reindex": True}
45
+ )
46
+ print(f"Status: {response.status_code}")
47
+ print(f"Response: {json.dumps(response.json(), indent=2)}")
48
+ return response.json()
49
+
50
+
51
+ async def test_query(query: str, conversation_id: Optional[str] = None):
52
+ """Test query endpoint."""
53
+ print("\n" + "=" * 50)
54
+ print(f"Testing Query: {query}")
55
+ print("=" * 50)
56
+
57
+ async with httpx.AsyncClient(timeout=120.0) as client:
58
+ payload = {
59
+ "query": query,
60
+ "include_sources": True
61
+ }
62
+ if conversation_id:
63
+ payload["conversation_id"] = conversation_id
64
+
65
+ response = await client.post(
66
+ f"{BASE_URL}/query",
67
+ json=payload
68
+ )
69
+ print(f"Status: {response.status_code}")
70
+
71
+ if response.status_code == 200:
72
+ result = response.json()
73
+ print(f"\nAnswer:\n{result['answer']}")
74
+ print(f"\nAgent Trace: {' -> '.join(result['agent_trace'])}")
75
+ print(f"Processing Time: {result['processing_time_ms']:.2f}ms")
76
+ print(f"Sources: {len(result['sources'])} documents")
77
+
78
+ if result['sources']:
79
+ print("\nTop Source:")
80
+ source = result['sources'][0]
81
+ print(f" - Score: {source['relevance_score']:.2f}")
82
+ print(f" - Source: {source['source']}")
83
+ print(f" - Content: {source['content'][:200]}...")
84
+
85
+ return result
86
+ else:
87
+ print(f"Error: {response.text}")
88
+ return None
89
+
90
+
91
+ async def test_document_count():
92
+ """Test document count endpoint."""
93
+ print("\n" + "=" * 50)
94
+ print("Testing Document Count")
95
+ print("=" * 50)
96
+
97
+ async with httpx.AsyncClient() as client:
98
+ response = await client.get(f"{BASE_URL}/documents/count")
99
+ print(f"Status: {response.status_code}")
100
+ print(f"Response: {json.dumps(response.json(), indent=2)}")
101
+ return response.json()
102
+
103
+
104
+ async def run_all_tests():
105
+ """Run all API tests."""
106
+ print("\n" + "#" * 60)
107
+ print("# Multi-Agent RAG System - API Tests")
108
+ print("#" * 60)
109
+
110
+ # Test 1: Health check
111
+ health = await test_health()
112
+ if health['status'] != 'healthy':
113
+ print("ERROR: API is not healthy!")
114
+ return
115
+
116
+ # Test 2: Ingest documents
117
+ ingest_result = await test_ingest()
118
+ if ingest_result.get('documents_processed', 0) == 0:
119
+ print("WARNING: No documents were ingested!")
120
+
121
+ # Test 3: Check document count
122
+ await test_document_count()
123
+
124
+ # Test 4: Run queries
125
+ test_queries = [
126
+ "How do I reset my password?",
127
+ "What payment methods do you accept?",
128
+ "How can I enable two-factor authentication?",
129
+ "The app is running slow, what should I do?",
130
+ "I want to talk to a human agent", # Should trigger escalation
131
+ ]
132
+
133
+ for query in test_queries:
134
+ await test_query(query)
135
+
136
+ # Test 5: Multi-turn conversation
137
+ print("\n" + "=" * 50)
138
+ print("Testing Multi-turn Conversation")
139
+ print("=" * 50)
140
+
141
+ result1 = await test_query("What are your password requirements?")
142
+ if result1:
143
+ conversation_id = result1['conversation_id']
144
+ await test_query(
145
+ "What if I don't receive the reset email?",
146
+ conversation_id=conversation_id
147
+ )
148
+
149
+ print("\n" + "#" * 60)
150
+ print("# All Tests Complete!")
151
+ print("#" * 60)
152
+
153
+
154
+ if __name__ == "__main__":
155
+ print("Starting API Tests...")
156
+ print("Make sure the API server is running at http://localhost:8000")
157
+ asyncio.run(run_all_tests())
skills.md ADDED
@@ -0,0 +1,27 @@
+ # Skills Manifest
+
+ Claude must actively apply the following skills:
+
+ ## AI & ML
+ - Retrieval-Augmented Generation (RAG)
+ - Embeddings & vector similarity
+ - Multi-agent reasoning
+ - Tool calling & autonomy
+ - Prompt grounding & safety
+
+ ## Engineering
+ - Modular Python architecture
+ - API-first design
+ - Separation of concerns
+ - Error handling & logging
+
+ ## System Design
+ - Stateless vs stateful services
+ - Memory management in agents
+ - Scalability considerations
+ - Enterprise AI patterns
+
+ ## Teaching Mode
+ - Explain all concepts step-by-step
+ - Use real-world analogies
+ - Assume the user is learning
tools.md ADDED
@@ -0,0 +1,20 @@
+ # Allowed Tools & Libraries
+
+ ## AI Frameworks
+ - LangChain (agents, tools, memory)
+ - OpenAI or Anthropic models
+ - FAISS for vector storage
+
+ ## Backend
+ - FastAPI
+ - Uvicorn
+
+ ## Utilities
+ - python-dotenv
+ - logging
+ - pathlib
+
+ ## Rules
+ - Do NOT introduce new frameworks without explanation
+ - Prefer standard LangChain abstractions
+ - Avoid experimental APIs unless necessary