# RAG Pipeline API Documentation

## Overview

A FastAPI-based RAG (Retrieval-Augmented Generation) pipeline with OpenRouter GLM integration for intelligent tool calling.
## Base URL

```
http://localhost:8000
```
## Endpoints

### `/chat` - Main Chat Endpoint

**Method:** POST

**Description:** Intelligent chat with RAG tool calling. GLM automatically determines when to use RAG vs. general conversation.
#### Request Body

```json
{
  "messages": [
    {
      "role": "user | assistant | system",
      "content": "string"
    }
  ]
}
```
#### Response Format

```json
{
  "response": "string",
  "tool_calls": [
    {
      "name": "rag_qa",
      "arguments": "{\"question\": \"string\", \"dataset\": \"string\"}"
    }
  ]
}
```

`tool_calls` is `null` when the model answered without invoking RAG.
#### Examples

**1. General Greeting (No RAG):**

```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hi"}]}'
```

Response:

```json
{
  "response": "Hi! I'm Rohit's AI assistant. I can help you learn about his professional background, skills, and experience. What would you like to know about Rohit?",
  "tool_calls": null
}
```
**2. Portfolio Question (RAG Enabled):**

```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"What is your current role?"}]}'
```

Response:

```json
{
  "response": "Based on the portfolio information, Rohit is currently working as a Tech Lead at FleetEnable, where he leads UI development for a logistics SaaS product focused on drayage and freight management...",
  "tool_calls": [
    {
      "name": "rag_qa",
      "arguments": "{\"question\": \"What is your current role?\"}"
    }
  ]
}
```
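The curl examples above can also be exercised from Python. The sketch below is a minimal client, assuming the server is running at `localhost:8000`; the `build_payload` and `ask` helper names are ours, not part of the API.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local dev server


def build_payload(content, role="user"):
    """Build the request body expected by POST /chat."""
    return {"messages": [{"role": role, "content": content}]}


def ask(content):
    """Send a single-turn chat request and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat",
        data=json.dumps(build_payload(content)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example (requires a running server):
#   reply = ask("What is your current role?")
#   reply["tool_calls"] is None for general chat, a list for RAG queries
```

For multi-turn conversations, extend the `messages` list with prior `user`/`assistant` turns before posting.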
### `/health` - Health Check

**Method:** GET

**Description:** Check API and dataset loading status.

#### Response

```json
{
  "status": "healthy",
  "datasets_loaded": 1,
  "available_datasets": ["developer-portfolio"]
}
```
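Because datasets may still be loading at startup, scripts often poll `/health` before sending chat requests. This is an illustrative helper, not part of the pipeline: the `fetch_health` callable is injected so the polling logic stays transport-agnostic.

```python
import time


def wait_until_healthy(fetch_health, timeout=30.0, interval=1.0):
    """Poll the /health endpoint until it reports "healthy".

    fetch_health() should GET /health and return the parsed JSON body,
    raising OSError while the server is unreachable.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            body = fetch_health()
            if body.get("status") == "healthy":
                return body
        except OSError:
            pass  # server not up yet; retry after a short pause
        time.sleep(interval)
    raise TimeoutError("API did not report healthy within the timeout")
```

With the standard library, `fetch_health` could be `lambda: json.load(urllib.request.urlopen("http://localhost:8000/health"))`.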
### `/datasets` - List Available Datasets

**Method:** GET

**Description:** Get the list of available datasets.

#### Response

```json
{
  "datasets": ["developer-portfolio"]
}
```
## Features

### 🔧 Intelligent Tool Calling

- **Automatic Detection:** GLM determines when questions need RAG vs. general conversation
- **Context-Aware:** Uses portfolio information for relevant questions
- **Natural Responses:** Synthesizes RAG results into conversational answers
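Automatic detection of this kind is typically driven by a function schema the model sees on every request. The pipeline's actual schema is not shown in these docs; the sketch below is an assumption following the OpenAI-style tool format that OpenRouter accepts, with the `rag_qa` name and `question`/`dataset` arguments inferred from the `/chat` response examples.

```python
# Hypothetical tool schema; field values inferred from the /chat
# response examples, not taken from the pipeline's source.
RAG_TOOL = {
    "type": "function",
    "function": {
        "name": "rag_qa",
        "description": (
            "Answer questions about Rohit's portfolio by retrieving "
            "relevant documents from a loaded dataset."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "question": {"type": "string"},
                "dataset": {
                    "type": "string",
                    "enum": ["developer-portfolio"],
                },
            },
            "required": ["question"],
        },
    },
}
```

The model emits a `rag_qa` call only when the question matches this description; greetings and small talk fall through to ordinary generation.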
### 🎯 Third-Person AI Assistant

- **Portfolio Focus:** Responds about Rohit's experience (not "my" experience)
- **Professional Tone:** Maintains proper third-person references
- **Context Integration:** Combines multiple data points coherently
### ⚡ Performance Optimizations

- **On-Demand Loading:** Datasets load only when RAG is needed
- **Clean Output:** No verbose ML logging for general conversations
- **Fast Responses:** Sub-second for greetings, ~20s for the first RAG query
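On-demand loading explains the latency profile above: the expensive document/embedding work runs once, on the first RAG call, then hits a cache. A minimal memoized-loader sketch (the `_CACHE` and `load_dataset` names are illustrative, not the pipeline's actual internals):

```python
_CACHE = {}  # dataset name -> loaded documents/index


def load_dataset(name):
    """Load (and cache) a dataset only when RAG actually needs it."""
    if name not in _CACHE:
        # Expensive work (reading documents, building embeddings)
        # happens once, on the first RAG query (~20s).
        _CACHE[name] = _build_index(name)
    return _CACHE[name]  # subsequent calls are cache hits


def _build_index(name):
    # Placeholder for the real document-loading/embedding step.
    return {"name": name, "documents": []}
```

Greetings never call `load_dataset`, which is why they stay sub-second even on a cold server.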
## Available Datasets

### developer-portfolio

- **Content:** Work experience, skills, projects, achievements
- **Topics:** FleetEnable, Coditude, technologies, leadership
- **Size:** 19 documents with full metadata
## Error Handling

### Common Responses

- **Datasets Loading:** "RAG Pipeline is running but datasets are still loading..."
- **Dataset Not Found:** "Dataset 'xyz' not available. Available datasets: [...]"
- **API Errors:** HTTP 500 with error details

### Status Codes

- `200` - Success
- `400` - Bad Request (invalid JSON, missing fields)
- `500` - Internal Server Error
## Environment Variables

Create a `.env` file:

```env
OPENROUTER_API_KEY=sk-or-v1-your-key-here
PORT=8000
TOKENIZERS_PARALLELISM=false
```
## Development

### Running Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Start the server
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Or use the convenience script
./start.sh
```
### Testing

```bash
# Health check
curl http://localhost:8000/health

# Chat test
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hi"}]}'
```
## Deployment

### Docker

```bash
# Build the image
docker build -t rag-pipeline .

# Run the container
docker run -p 8000:8000 rag-pipeline
```
### Hugging Face Spaces

- Push code to the repository
- Connect the Space to the repository
- Set environment variables in the Space settings
- Automatic deployment from the `main` branch
## Architecture

```
OpenRouter GLM-4.5-air (Parent AI)
├── Tool Calling Logic
│   ├── Automatically detects RAG-worthy questions
│   └── Falls back to general knowledge
├── RAG Tool Function
│   ├── Dataset selection (developer-portfolio)
│   ├── Document retrieval
│   └── Context formatting
└── Response Generation
    ├── Tool results integration
    └── Natural language responses
```
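The architecture implies a standard two-pass tool-calling loop: ask the model, run `rag_qa` if it requests the tool, then ask again with the tool result appended. The sketch below is a simplified illustration with a pluggable `call_model` function; the real pipeline calls OpenRouter's GLM-4.5-air, and a production loop would also carry tool-call IDs and the tool schema, omitted here.

```python
import json


def chat_with_tools(messages, call_model, run_rag):
    """Two-pass tool-calling loop (simplified).

    call_model(messages) -> {"content": str, "tool_calls": list | None}
    run_rag(question, dataset) -> str (retrieved context / answer)
    """
    first = call_model(messages)
    tool_calls = first.get("tool_calls")
    if not tool_calls:
        # General conversation: answer directly, no retrieval.
        return {"response": first["content"], "tool_calls": None}

    # Execute each requested rag_qa call and feed results back.
    followup = list(messages)
    for call in tool_calls:
        args = json.loads(call["arguments"])
        result = run_rag(args["question"],
                         args.get("dataset", "developer-portfolio"))
        followup.append({"role": "tool", "content": result})

    # Second pass: the model synthesizes a natural-language answer
    # from the retrieved context.
    final = call_model(followup)
    return {"response": final["content"], "tool_calls": tool_calls}
```

This mirrors the `/chat` response shape: `tool_calls` is `None` for the direct path and echoes the executed calls for the RAG path.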
## Changelog

### v2.0 - Current

- ✅ OpenRouter GLM integration with tool calling
- ✅ Intelligent RAG vs. conversation detection
- ✅ Third-person AI assistant for Rohit's portfolio
- ✅ On-demand dataset loading
- ✅ Removed `/answer` endpoint (use `/chat` only)
- ✅ Environment variable configuration
- ✅ Performance optimizations
### v1.0 - Legacy

- Google Gemini integration
- Multiple endpoints (`/answer`, `/chat`)
- Background dataset loading
- First-person responses