orchestratorText = """ 
                # Chabo Orchestrator Documentation 

                ## Table of Contents
                1. Overview
                2. System Architecture
                3. Components
                4. Configuration
                5. Deployment Guide
                6. API Reference
                7. Usage Examples
                8. Troubleshooting

                ## Overview

                The Chabo Orchestrator is the central coordination module of the Chabo RAG system. \
                It orchestrates the flow between multiple microservices to provide intelligent \
                document processing and question-answering capabilities. The system is designed for deployment on HuggingFace Spaces.

                ### Key Features:
                - **Workflow Orchestration**: Uses LangGraph to manage complex processing pipelines
                - **Multi-Modal Support**: Handles files depending on the ChatUI and Ingestor configuration (e.g. PDF, DOCX, GeoJSON, and JSON)
                - **Streaming Responses**: Real-time response generation with Server-Sent Events (SSE)
                - **Dual Processing Modes**:
                  - **Direct Output Mode**: Returns ingestor results immediately (e.g. EUDR use case)
                  - **Standard RAG Mode**: Full retrieval-augmented generation pipeline
                - **Intelligent Caching**: Prevents redundant file processing (e.g. EUDR use case)
                - **Multiple Interfaces**: FastAPI endpoints for modules; LangServe endpoints for ChatUI; Gradio UI for testing

                ## System Architecture

                ### High-Level Architecture

                ```

                        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                        β”‚   ChatUI        β”‚
                        β”‚   Frontend      β”‚
                        β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                 β”‚ HTTP/SSE
                                 β–Ό
                β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                β”‚      Chabo Orchestrator         β”‚
                β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
                β”‚   β”‚   LangGraph Workflow    β”‚   β”‚
                β”‚   β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚   β”‚
                β”‚   β”‚   β”‚ Detect File     β”‚   β”‚   β”‚
                β”‚   β”‚   β”‚ Type            β”‚   β”‚   β”‚
                β”‚   β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚   β”‚
                β”‚   β”‚            β”‚            β”‚   β”‚
                β”‚   β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚   β”‚
                β”‚   β”‚   β”‚ Ingest File     β”‚   β”‚   β”‚
                β”‚   β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚   β”‚
                β”‚   β”‚            β”‚            β”‚   β”‚
                β”‚   β”‚      β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”     β”‚   β”‚
                β”‚   β”‚      β”‚            β”‚     β”‚   β”‚
                β”‚   β”‚   β”Œβ”€β”€β–Όβ”€β”€β”€β”   β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β” β”‚   β”‚
                β”‚   β”‚   β”‚Directβ”‚   β”‚Retrieveβ”‚ β”‚   β”‚
                β”‚   β”‚   β”‚Outputβ”‚   β”‚Context β”‚ β”‚   β”‚
                β”‚   β”‚   β””β”€β”€β”¬β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜ β”‚   β”‚
                β”‚   β”‚      β”‚            β”‚     β”‚   β”‚
                β”‚   β”‚      β”‚       β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β” β”‚   β”‚
                β”‚   β”‚      β”‚       β”‚Generateβ”‚ β”‚   β”‚
                β”‚   β”‚      β”‚       β”‚Responseβ”‚ β”‚   β”‚
                β”‚   β”‚      β”‚       β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚   β”‚
                β”‚   β””β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
                β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”˜
                       β”‚           β”‚          β”‚
                    β”Œβ”€β”€β”€β–Όβ”€β”€β”   β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”   β”Œβ”€β”€β–Όβ”€β”€β”€β”€β”
                    β”‚Ingestβ”‚   β”‚Retrie-β”‚   β”‚Genera-β”‚
                    β”‚or    β”‚   β”‚ver    β”‚   β”‚tor    β”‚
                    β””β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”˜
                
                ```
                ### Component Communication

                All communication between modules happens over HTTP:
                - **Orchestrator ↔ Ingestor**: Gradio Client (file upload, processing)
                - **Orchestrator ↔ Retriever**: Gradio Client (semantic search)
                - **Orchestrator ↔ Generator**: HTTP streaming (SSE for real-time responses)
                - **ChatUI ↔ Orchestrator**: LangServe streaming endpoints

                ### Workflow Logic

                The orchestrator implements two distinct workflows:

                **Direct Output Workflow** (when `DIRECT_OUTPUT=True` and the file is new):
                ```
                    File Upload β†’ Detect Type β†’ Ingest β†’ Direct Output β†’ Return Results
                ```
                **Standard RAG Workflow** (default, or when the file is already cached):
                ```
                    Query β†’ Retrieve Context β†’ Generate Response β†’ Stream to User
                ```
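
                The routing decision between these two paths can be sketched as a small conditional. This is an illustrative sketch only, assuming a LangGraph-style state dict with `direct_output` and `file_is_new` flags; the actual field and node names in `nodes.py` may differ:

                ```python
                def route_workflow(state: dict) -> str:
                    """Choose the next node after ingestion (illustrative sketch)."""
                    # New file in direct output mode: skip retrieval/generation entirely.
                    if state.get("direct_output") and state.get("file_is_new"):
                        return "direct_output"
                    # Otherwise fall through to the standard RAG path.
                    return "retrieve"

                print(route_workflow({"direct_output": True, "file_is_new": True}))   # direct_output
                print(route_workflow({"direct_output": True, "file_is_new": False}))  # retrieve
                ```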


                ## Components

                ### 1. Main Application (`main.py`)
                - LangServe endpoints for ChatUI integration
                - Gradio web interface for testing
                - FastAPI endpoints for diagnostics and future use (e.g. /health)
                - Cache management endpoint (for direct output use cases)


                **Key Functions:**
                - `chatui_adapter()`: Handles text-only queries
                - `chatui_file_adapter()`: Handles file uploads with queries
                - `create_gradio_interface()`: Test UI

                ### 2. Workflow Nodes (`nodes.py`)

                LangGraph nodes that implement the processing pipeline:

                **Node Functions:**

                - `detect_file_type_node()`: Identifies file type and determines routing
                - `ingest_node()`: Processes files through appropriate ingestor
                - `direct_output_node()`: Returns raw ingestor results
                - `retrieve_node()`: Fetches relevant context from vector store
                - `generate_node_streaming()`: Streams LLM responses
                - `route_workflow()`: Conditional routing logic

                **Helper Functions:**

                - `process_query_streaming()`: Unified streaming interface
                - `compute_file_hash()`: SHA256 hashing for deduplication
                - `clear_direct_output_cache()`: Cache management
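
                Of these helpers, the deduplication hash is simple enough to sketch. A minimal version, assuming it operates on the raw file bytes (the actual implementation in `nodes.py` may differ):

                ```python
                import hashlib

                def compute_file_hash(content: bytes) -> str:
                    """SHA-256 hex digest of the raw file bytes, used as a cache key."""
                    return hashlib.sha256(content).hexdigest()

                # Re-uploading identical bytes yields the same key, so the file is served from cache.
                key = compute_file_hash(b'{"type": "FeatureCollection", "features": []}')
                ```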

                ### 3. Data Models (`models.py`)

                Pydantic models for type validation

                ### 4. Retriever Adapter (`retriever_adapter.py`)

                Abstraction layer for managing different retriever configurations:
                - Handles authentication
                - Formats queries and filters

                ### 5. Utilities (`utils.py`)

                Helper functions

                #### Conversation Context Management

                The `build_conversation_context()` function manages conversation history to provide relevant context to the generator while respecting token limits and conversation flow.

                **Key Features:**

                - **Context Selection**: Always includes the first user and assistant messages to maintain conversation context
                - **Recent Turn Limiting**: Includes only the last N complete turns (user + assistant pairs) to focus on recent conversation (default: 3)
                - **Character Limit Management**: Truncates to maximum character limits to prevent context overflow


                **Function Parameters:**

                ```python
                    def build_conversation_context(
                        messages,           # List of Message objects from conversation
                        max_turns: int = 3, # Maximum number of recent turns to include
                        max_chars: int = 8000  # Maximum total characters in context
                    ) -> str
                ```
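
                A minimal sketch of the selection logic described above, assuming messages are dicts with `role` and `content` keys (the real implementation operates on the Pydantic `Message` objects from `models.py` and may differ in detail):

                ```python
                def build_conversation_context(messages, max_turns=3, max_chars=8000):
                    # Always keep the opening user/assistant exchange so the original
                    # topic survives truncation of a long conversation.
                    head = messages[:2]
                    # Keep only the last `max_turns` complete turns (user + assistant pairs).
                    tail = messages[2:][-(max_turns * 2):]
                    lines = [f"{m['role']}: {m['content']}" for m in head + tail]
                    context = "\n".join(lines)
                    # Enforce the overall character budget, dropping the oldest text first.
                    return context[-max_chars:] if len(context) > max_chars else context
                ```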

                
                ## Configuration

                ### Configuration File (`params.cfg`)

                ```ini
                    [file_processing]
                    # Enable direct output mode: when True, ingestor results are returned directly
                    # without going through the generator. When False, all files go through full RAG pipeline.
                    # This also prevents ChatUI from resending the file in the conversation history with each turn
                    # Note: File type validation is handled by the ChatUI frontend
                    DIRECT_OUTPUT = True

                    [conversation_history]
                    # Limit the context window for the conversation history
                    MAX_TURNS = 3
                    MAX_CHARS = 12000

                    [retriever]
                    RETRIEVER = https://giz-chatfed-retriever0-3.hf.space/
                    # Optional
                    COLLECTION_NAME = EUDR  

                    [generator]
                    GENERATOR = https://giz-eudr-chabo-generator.hf.space

                    [ingestor]
                    INGESTOR = https://giz-eudr-chabo-ingestor.hf.space

                    [general]
                    # need to include this for HF inference endpoint limits
                    MAX_CONTEXT_CHARS = 15000 
                ```
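
                These values are plain INI, so they can be read with the standard library. The snippet below parses a fragment of the config in-memory for illustration; whether the orchestrator itself uses `configparser` is an assumption:

                ```python
                import configparser

                SAMPLE = "[file_processing]\nDIRECT_OUTPUT = True\n\n[conversation_history]\nMAX_TURNS = 3\nMAX_CHARS = 12000\n"

                # In the app this would be config.read("params.cfg").
                config = configparser.ConfigParser()
                config.read_string(SAMPLE)

                # INI values come back as strings; use the typed getters for booleans and ints.
                direct_output = config.getboolean("file_processing", "DIRECT_OUTPUT")
                max_turns = config.getint("conversation_history", "MAX_TURNS")
                ```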

                ### Environment Variables

                Create a `.env` file with:

                ```bash
                    # Required for private HuggingFace Spaces
                    HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxx
                ```

                ### ChatUI Configuration

                ChatUI `DOTENV_LOCAL` example deployment configuration:

                ```javascript
                    MODELS=`[
                    {
                        "name": "asistente_eudr",
                        "displayName": "Asistente EUDR",
                        "description": "Retrieval-augmented generation on EUDR Whisp API powered by ChatFed modules.",
                        "instructions": {
                            "title": "EUDR Asistente: Instructiones",
                            "content": "Hola, soy Asistente EUDR, un asistente conversacional basado en inteligencia artificial diseΓ±ado para ayudarle a comprender el cumplimiento y el anΓ‘lisis del Reglamento de la UE sobre la deforestaciΓ³n. ResponderΓ© a sus preguntas utilizando los informes EUDR y los archivos GeoJSON cargados.\\n\\nπŸ’‘ **CΓ³mo utilizarlo (panel a la derecha)**\\n\\n**Modo de uso:** elija entre subir un archivo GeoJSON para su anΓ‘lisis o consultar los informes EUDR filtrados por paΓ­s.\\n\\n**Ejemplos:** seleccione entre preguntas de ejemplo seleccionadas de diferentes categorΓ­as.\\n\\n**Referencias:** consulte las fuentes de contenido utilizadas para la verificaciΓ³n de datos.\\n\\n⚠️ Para conocer las limitaciones y la informaciΓ³n sobre la recopilaciΓ³n de datos, consulte la pestaΓ±a Β«ExenciΓ³n de responsabilidadΒ»\\n\\n⚠️ Al utilizar esta aplicaciΓ³n, usted acepta que recopilemos estadΓ­sticas de uso (como preguntas formuladas, comentarios realizados, duraciΓ³n de la sesiΓ³n, tipo de dispositivo e informaciΓ³n geogrΓ‘fica anΓ³nima) para comprender el rendimiento y mejorar continuamente la herramienta, basΓ‘ndonos en nuestro interΓ©s legΓ­timo por mejorar nuestros servicios."
                        },
                        "multimodal": true,
                        "multimodalAcceptedMimetypes": [
                        "application/geojson"
                        ],
                        "chatPromptTemplate": "{{#each messages}}{{#ifUser}}{{content}}{{/ifUser}}{{#ifAssistant}}{{content}}{{/ifAssistant}}{{/each}}",
                        "parameters": {
                        "temperature": 0.0,
                        "max_new_tokens": 2048
                        },
                        "endpoints": [{
                        "type": "langserve-streaming",
                        "url": "https://giz-eudr-chabo-orchestrator.hf.space/chatfed-ui-stream",
                        "streamingFileUploadUrl": "https://giz-eudr-chabo-orchestrator.hf.space/chatfed-with-file-stream",
                        "inputKey": "text",
                        "fileInputKey": "files"
                        }]
                    }
                ]`
                
                PUBLIC_ANNOUNCEMENT_BANNERS=`[
                    {
                    "title": "This is Chat Prototype for DSC users",
                    "linkTitle": "Keep it Clean"
                    }
                ]`

                PUBLIC_APP_DISCLAIMER_MESSAGE="Disclaimer: AI is an area of active research with known problems such as biased generation and misinformation. Do not use this application for high-stakes decisions or advice. Do not insert your personal data, especially sensitive, like health data."
                PUBLIC_APP_DESCRIPTION="Internal Chat-tool for DSC users for testing"

                PUBLIC_APP_NAME="EUDR ChatUI" 
                ENABLE_ASSISTANTS=false
                ENABLE_ASSISTANTS_RAG=false
                COMMUNITY_TOOLS=false
                MONGODB_URL=mongodb://localhost:27017

                # Disable LLM-based title generation to prevent template queries
                LLM_SUMMARIZATION=false
                ```

                Key things to verify here:
                - `multimodalAcceptedMimetypes`: the file types accepted for upload via ChatUI
                - `endpoints`: the orchestrator URL and its streaming endpoint paths

                ## Deployment Guide

                ### Local Development

                **Prerequisites:**
                - Python 3.10+
                - pip

                **Steps:**

                1. Clone the repository:
                ```bash
                git clone <your-repo-url>
                cd chabo-orchestrator
                ```

                2. Install dependencies:
                ```bash
                pip install -r requirements.txt
                ```

                3. Configure the system:
                ```bash
                # Create .env file
                echo "HF_TOKEN=your_token_here" > .env

                # Edit params.cfg with your service URLs
                nano params.cfg
                ```

                4. Run the application:
                ```bash
                python app/main.py
                ```

                5. Access interfaces:
                - Gradio UI: http://localhost:7860/gradio
                - API Docs: http://localhost:7860/docs
                - Health Check: http://localhost:7860/health
            
                ### Docker Deployment

                **Build the image:**

                ```bash
                docker build -t chabo-orchestrator .
                ```

                **Run the container:**

                ```bash
                docker run -d --name chabo-orchestrator -p 7860:7860 chabo-orchestrator
                ```

                ### HuggingFace Spaces Deployment


                **Repository Structure:**
                ```
                your-space/
                β”œβ”€β”€ app/
                β”‚   β”œβ”€β”€ main.py
                β”‚   β”œβ”€β”€ nodes.py
                β”‚   β”œβ”€β”€ models.py
                β”‚   β”œβ”€β”€ retriever_adapter.py
                β”‚   └── utils.py
                β”œβ”€β”€ Dockerfile
                β”œβ”€β”€ requirements.txt
                β”œβ”€β”€ params.cfg
                └── README.md
                ```
                **Steps:**

                1. Create a new Space on HuggingFace
                2. Select "Docker" as the SDK
                3. Push your code to the Space repository
                4. Add secrets in Space settings:
                - `HF_TOKEN`: Your HuggingFace token
                5. The Space will automatically build and deploy

                **Important:** Ensure all service URLs in `params.cfg` are publicly accessible.

                ### Docker Compose (Multi-Service)

                Example orchestrated deployment for the entire Chabo stack (*NOTE: docker-compose will not run on HuggingFace Spaces*):

                ```yaml
                version: '3.8'

                services:
                  orchestrator:
                    build: ./orchestrator
                    ports:
                      - "7860:7860"
                    environment:
                      - HF_TOKEN=${HF_TOKEN}
                      - RETRIEVER=http://retriever:7861
                      - GENERATOR=http://generator:7862
                      - INGESTOR=http://ingestor:7863
                    depends_on:
                      - retriever
                      - generator
                      - ingestor

                  retriever:
                    build: ./retriever
                    ports:
                      - "7861:7861"
                    environment:
                      - QDRANT_API_KEY=${QDRANT_API_KEY}

                  generator:
                    build: ./generator
                    ports:
                      - "7862:7862"
                    environment:
                      - HF_TOKEN=${HF_TOKEN}

                  ingestor:
                    build: ./ingestor
                    ports:
                      - "7863:7863"
                ```
                ## API Reference

                ### Endpoints

                #### Health Check
                ```
                GET /health
                ```
                Returns service health status.

                **Response:**
                ```json
                {
                    "status": "healthy"
                }
                ```

                #### Root Information
                ```
                GET /
                ```
                Returns API metadata and available endpoints.

                #### Text Query (Streaming)
                ```
                POST /chatfed-ui-stream/stream
                Content-Type: application/json
                ```

                **Request Body:**
                ```json
                {
                    "input": {
                        "text": "What are EUDR requirements?"
                    }
                }
                ```

                **Response:** Server-Sent Events stream
                ```
                event: data
                data: "The EUDR requires..."

                event: sources
                data: {"sources": [...]}

                event: end
                data: ""
                ```
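                A client has to split this stream into events before it can use them. A minimal parser for the framing shown above (illustrative only; production clients should use a proper SSE library):

                ```python
                def parse_sse(raw: str):
                    """Split a raw SSE payload into (event, data) pairs."""
                    events = []
                    for block in raw.strip().split("\n\n"):
                        event, data = "message", ""
                        for line in block.splitlines():
                            if line.startswith("event:"):
                                event = line[6:].strip()
                            elif line.startswith("data:"):
                                data = line[5:].strip()
                        events.append((event, data))
                    return events

                stream = 'event: data\ndata: "The EUDR requires..."\n\nevent: end\ndata: ""\n'
                events = parse_sse(stream)
                ```
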
                #### File Upload Query (Streaming)
                ```
                POST /chatfed-with-file-stream/stream
                Content-Type: application/json
                ```

                **Request Body:**
                ```json
                {
                    "input": {
                        "text": "Analyze this GeoJSON",
                        "files": [
                            {
                                "name": "boundaries.geojson",
                                "type": "base64",
                                "content": "base64_encoded_content"
                            }
                        ]
                    }
                }
                ```
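
                The `content` field carries the base64-encoded file bytes. Building this body in Python might look like the following (the helper name is hypothetical):

                ```python
                import base64
                import json

                def make_file_request(text: str, filename: str, file_bytes: bytes) -> str:
                    """Build the JSON request body above, base64-encoding the file content."""
                    body = {
                        "input": {
                            "text": text,
                            "files": [{
                                "name": filename,
                                "type": "base64",
                                "content": base64.b64encode(file_bytes).decode("ascii"),
                            }],
                        }
                    }
                    return json.dumps(body)

                payload = make_file_request("Analyze this GeoJSON", "boundaries.geojson",
                                            b'{"type": "FeatureCollection", "features": []}')
                ```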

                #### Clear Cache
                ```
                POST /clear-cache
                ```
                Clears the direct output file cache.

                **Response:**
                ```json
                {
                    "status": "cache cleared"
                }
                ```

                ### Gradio Interface

                #### Interactive Query

                Gradio's default API endpoint for UI interactions. If running on HuggingFace Spaces, access it via: `https://[ORG_NAME]-[SPACE_NAME].hf.space/gradio/`

                ## Troubleshooting

                ### Common Issues

                #### 1. File Upload Fails

                **Symptoms:** "Error reading file" or "Failed to decode uploaded file"

                **Solutions:**
                - Verify file is properly base64 encoded
                - Check file size limits (default: varies by deployment)
                - Ensure MIME type is in `multimodalAcceptedMimetypes`

                #### 2. Slow Responses

                **Symptoms:** Long wait times for responses

                **Solutions:**
                - Check network latency to external services
                - Verify `MAX_CONTEXT_CHARS` isn't too high
                - Consider enabling `DIRECT_OUTPUT` for suitable file types
                - Check logs for retrieval/generation bottlenecks

                #### 3. Cache Not Clearing

                **Symptoms:** Same file shows cached results when it shouldn't

                **Solutions:**
                - Call `/clear-cache` endpoint
                - Restart the service (clears in-memory cache)
                - Check if `DIRECT_OUTPUT=True` in config

                #### 4. Service Connection Errors

                **Symptoms:** "Connection refused" or timeout errors

                **Solutions:**
                - Verify all service URLs in `params.cfg` are accessible
                - Check HF_TOKEN is valid and has access to private spaces (*NOTE - THE ORCHESTRATOR CURRENTLY MUST BE PUBLIC*)
                - Test each service independently with health checks
                - Review firewall/network policies


                ### Version History

                - **v1.0.0**: Initial release with LangGraph orchestration
                - Current implementation supports streaming, caching, and dual-mode processing

                ---

                **Documentation Last Updated:** 2025-10-01  
                **Compatible With:** Python 3.10+, LangGraph 0.2+, FastAPI 0.100+



            """