rag_agent / README.md
Cheh Kit Hong
changed rag method flags
94e0eef

A newer version of the Gradio SDK is available: 6.1.0

Upgrade
metadata
title: RAG Agent
emoji: πŸ•΅πŸ»β€β™‚οΈ
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 6.0.1
app_file: main.py
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480

πŸ“ Project Structure

mai-rag-agent/
β”‚
β”œβ”€β”€ πŸ“‚ agent/                      # Core agent logic
β”‚   β”œβ”€β”€ graph.py                   # LangGraph workflow definition
β”‚   β”œβ”€β”€ nodes.py                   # Agent nodes (router, vectordb, web_search, generate)
β”‚   β”œβ”€β”€ prompts.py                 # System prompts and templates
β”‚   β”œβ”€β”€ state.py                   # Agent state management (AgentState, RAG_method)
β”‚   └── tools.py                   # Tool definitions (Tavily, Wikipedia, ArXiv, ChromaDB)
β”‚
β”œβ”€β”€ πŸ“‚ core/                       # Business logic layer
β”‚   β”œβ”€β”€ llm.py                     # LLM initialization (Anthropic Claude)
β”‚   └── rag_agent.py               # Main RAGAgent class with graph orchestration
β”‚
β”œβ”€β”€ πŸ“‚ ui/                         # User interface
β”‚   └── gradio_components.py       # Gradio web interface components
β”‚
β”œβ”€β”€ πŸ“‚ knowledge_base/             # scripts for setting up Chroma
β”‚
β”œβ”€β”€ πŸ“‚ chroma_data/                # Artifacts for Chroma
β”‚
β”œβ”€β”€ πŸ“‚ docs/                       # Source documents (PDFs, text files)
β”‚
β”œβ”€β”€ πŸ“„ main.py                     # Application entry point
β”œβ”€β”€ πŸ“„ config.py                   # Configuration settings
β”œβ”€β”€ πŸ“„ test_scripts.py             # Agent testing script
β”‚
β”œβ”€β”€ πŸ“„ .env                        # Environment variables (API keys)
β”œβ”€β”€ πŸ“„ .gitignore                  # Git ignore rules
β”‚
β”œβ”€β”€ πŸ“„ requirements.txt            # Python dependencies
β”œβ”€β”€ πŸ“„ pyproject.toml              # Project metadata (if using uv)
β”‚
└── πŸ“„ README.md                   # Project documentation (this file)

πŸ“‹ Key Components

πŸ€– Agent Module (agent/)

  • graph.py: Defines the LangGraph workflow with conditional routing
  • nodes.py: Implements agent nodes:
    • router_node: Classifies queries (RAG/WEBSEARCH/GENERAL)
    • vectordb_node: Retrieves from local ChromaDB
    • web_search_agent_node: Executes web searches
    • generate_node: Generates final responses
  • state.py: Defines AgentState with message history, routing method, and context
  • tools.py: Tool implementations for Tavily, Wikipedia, ArXiv, and ChromaDB
  • prompts.py: System prompts for routing and generation

🎯 Core Module (core/)

  • llm.py: Initializes the LLM (Anthropic Claude Sonnet 4.5)
  • rag_agent.py: Main RAGAgent class that orchestrates the graph

πŸ–₯️ UI Module (ui/)

  • gradio_components.py: Gradio web interface with chat functionality

πŸ“Š Data Module (data/)

  • documents/: Raw source documents for ingestion
  • chroma_db/: Persisted vector embeddings

βš™οΈ Configuration

  • config.py: Centralized configuration (model names, paths, API settings)
  • .env: API keys (ANTHROPIC_API_KEY, TAVILY_API_KEY)

πŸš€ Entry Points

  • main.py: Launches the Gradio UI
  • test_scripts.py: Runs agent tests

πŸ”„ Data Flow

User Query
    ↓
[Router Node] β†’ Classifies intent (RAG/WEBSEARCH/GENERAL)
    ↓
    β”œβ”€β†’ [VectorDB Node] β†’ Retrieves from ChromaDB β†’ [Generate Node]
    β”œβ”€β†’ [Web Search Agent] β†’ Calls Tavily/Wikipedia β†’ [Generate Node]
    └─→ [Generate Node] β†’ Uses LLM knowledge only
         ↓
    Response to User

πŸ› οΈ Technology Stack

  • LangChain: Framework for LLM applications
  • LangGraph: Workflow orchestration
  • Anthropic Claude: LLM (Sonnet 4.5)
  • ChromaDB: Vector database
  • Gradio: Web UI framework
  • HuggingFace: Embeddings model
  • Tavily: Web search API
  • UV: Python package manager

πŸš€ Quick Start with UV

Prerequisites

  • Python 3.10+
  • UV package manager (Install UV)
  • API Keys: Anthropic, Tavily

1️⃣ Clone the Repository

git clone https://github.com/yourusername/mai-rag-agent.git
cd mai-rag-agent

2️⃣ Create Virtual Environment with UV

# Create a new virtual environment
uv venv

# Activate the environment
source .venv/bin/activate  # Linux/macOS
# or
.venv\Scripts\activate     # Windows

3️⃣ Install Dependencies

# Install all dependencies from requirements.txt
uv pip install -r requirements.txt

# Or install directly from pyproject.toml (if available)
uv pip install -e .

4️⃣ Set Up Environment Variables

# Copy example environment file
cp .env.example .env

# Edit .env and add your API keys
nano .env  # or use your preferred editor

Required environment variables:

GOOGLE_API_KEY=xxxxxxxxxxxxx      # Gemini API key
TAVILY_API_KEY=tvly-xxxxxxxxxxxxx # Enable web search

5️⃣ Prepare Data

# Create necessary directories
mkdir -p data/documents data/chroma_db

# Add your documents to data/documents/
# Then run ingestion (if you have an ingestion script)
# python ingest_data.py

6️⃣ Run the Application

# Launch the Gradio UI
python main.py

7️⃣ Run Tests (Optional)

# Test the agent functionality
python test_scripts.py

🐳 Quick Start with Dev Container (Alternative)

If you're using VS Code with Dev Containers:

# 1. Open in VS Code
code .

# 2. Reopen in Container
# Command Palette (Ctrl+Shift+P) β†’ "Dev Containers: Reopen in Container"

# 3. Inside container, install dependencies
uv pip install -r requirements.txt

# 4. Set up .env file
cp .env.example .env
# Edit .env with your API keys

# 5. Run the app
python main.py

πŸ“¦ UV-Specific Commands

# Update all dependencies
uv pip install --upgrade -r requirements.txt

# List installed packages
uv pip list

# Freeze current environment
uv pip freeze > requirements.txt

# Install a new package
uv pip install package-name

# Uninstall a package
uv pip uninstall package-name

# Sync environment (removes unused packages)
uv pip sync requirements.txt

πŸ”§ Troubleshooting

Issue: uv command not found

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Add to PATH (if needed)
export PATH="$HOME/.cargo/bin:$PATH"

Issue: API key not loading

# Check if .env exists
cat .env | grep -i api

# Ensure no typos in variable names
# Should be: ANTHROPIC_API_KEY and TAVILY_API_KEY

Issue: ChromaDB not found

# Ensure data directories exist
mkdir -p data/chroma_db

# Check permissions
chmod -R 755 data/

Issue: Port 7860 already in use

# Find and kill the process
lsof -ti:7860 | xargs kill -9

# Or use a different port in main.py
# demo.launch(server_port=7861)

🎯 Next Steps

  1. βœ… Add your documents to data/documents/
  2. βœ… Configure embeddings model in config.py
  3. βœ… Customize prompts in agent/prompts.py
  4. βœ… Test with sample queries in the Gradio UI
  5. βœ… Deploy to production (see deployment docs)

πŸ“š Additional Resources

πŸ“š Reference (Used for demo as document in vector store)

  1. Wei, H., Sun, Y., & Li, Y. (2025). Deepseek-ocr: Contexts optical compression. arXiv preprint arXiv:2510.18234.
  2. Chen, X., Chu, F. J., Gleize, P., Liang, K. J., Sax, A., Tang, H., ... & SAM 3D Team. (2025). SAM 3D: 3Dfy Anything in Images. arXiv preprint arXiv:2511.16624.
  3. Carion, N., Gustafson, L., Hu, Y. T., Debnath, S., Hu, R., Suris, D., ... & Feichtenhofer, C. (2025). SAM 3: Segment Anything with Concepts. arXiv preprint arXiv:2511.16719.
  4. Yan, B. Y., Li, C., Qian, H., Lu, S., & Liu, Z. (2025). General Agentic Memory Via Deep Research. arXiv preprint arXiv:2511.18423.
  5. Zhang, S., Fan, J., Fan, M., Li, G., & Du, X. (2025). Deepanalyze: Agentic large language models for autonomous data science. arXiv preprint arXiv:2510.16872.