
Flask API Documentation

Overview

The Research AI Assistant API provides a RESTful interface for interacting with an AI-powered research assistant. The API uses local GPU models for inference and supports conversational interactions with context management.

Base URL (HF Spaces): https://jatinautonomouslabs-research-ai-assistant-api.hf.space

Alternative Base URL: https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API

API Version: 1.0

Content-Type: application/json

Note: For Hugging Face Spaces Docker deployments, use the .hf.space domain format. The space name is converted to lowercase with hyphens.
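For illustration, the owner/space-name-to-subdomain conversion described above can be sketched in Python (`space_base_url` is a hypothetical helper, not part of this API):

```python
def space_base_url(owner: str, space_name: str) -> str:
    # HF Spaces Docker URLs use "<owner>-<space_name>", lowercased,
    # with underscores replaced by hyphens.
    slug = f"{owner}-{space_name}".lower().replace("_", "-")
    return f"https://{slug}.hf.space"

print(space_base_url("JatinAutonomousLabs", "Research_AI_Assistant_API"))
# https://jatinautonomouslabs-research-ai-assistant-api.hf.space
```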

Features

  • 🤖 AI-Powered Responses - Local GPU model inference (Tesla T4)
  • 💬 Conversational Context - Maintains conversation history and user context
  • 🔒 CORS Enabled - Ready for web integration
  • ⚡ Async Processing - Efficient request handling
  • 📊 Transparent Reasoning - Returns reasoning chains and performance metrics

Authentication

Currently, the API does not require authentication. However, for production use, you should:

  1. Set HF_TOKEN environment variable for Hugging Face model access
  2. Implement API key authentication if needed
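If you add API key authentication in front of this API, the client side could look like the sketch below. Both the `X-API-Key` header name and the `HONESTAI_API_KEY` environment variable are assumptions; the API does not check them today:

```python
import os

def build_headers() -> dict:
    # Hypothetical scheme: attach an API key header only when one is configured.
    # The X-API-Key header name is an assumption, not part of this API.
    headers = {"Content-Type": "application/json"}
    api_key = os.environ.get("HONESTAI_API_KEY")
    if api_key:
        headers["X-API-Key"] = api_key
    return headers
```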

Endpoints

1. Get API Information

Endpoint: GET /

Description: Returns API information, version, and available endpoints.

Request:

GET / HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space

Response:

{
  "name": "AI Assistant Flask API",
  "version": "1.0",
  "status": "running",
  "orchestrator_ready": true,
  "features": {
    "local_gpu_models": true,
    "max_workers": 4,
    "hardware": "NVIDIA T4 Medium"
  },
  "endpoints": {
    "health": "GET /api/health",
    "chat": "POST /api/chat",
    "initialize": "POST /api/initialize",
    "context_mode_get": "GET /api/context/mode",
    "context_mode_set": "POST /api/context/mode"
  }
}

Status Codes:

  • 200 OK - Success

2. Health Check

Endpoint: GET /api/health

Description: Checks if the API and orchestrator are ready to handle requests.

Request:

GET /api/health HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space

Response:

{
  "status": "healthy",
  "orchestrator_ready": true
}

Status Codes:

  • 200 OK - API is healthy
    • orchestrator_ready: true - Ready to process requests
    • orchestrator_ready: false - Still initializing

Example Response (Initializing):

{
  "status": "initializing",
  "orchestrator_ready": false
}
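Because the orchestrator may still be initializing after a cold start, clients can poll this endpoint before sending chat requests. A minimal sketch, where `fetch_health` is any callable returning the parsed health JSON (e.g. `lambda: requests.get(BASE_URL + "/api/health", timeout=5).json()`):

```python
import time

def wait_until_ready(fetch_health, timeout: float = 60.0, interval: float = 2.0) -> bool:
    # Poll the health endpoint until orchestrator_ready is true or we time out.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if fetch_health().get("orchestrator_ready"):
                return True
        except Exception:
            pass  # the Space may still be building; keep polling
        time.sleep(interval)
    return False
```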

3. Chat Endpoint

Endpoint: POST /api/chat

Description: Send a message to the AI assistant and receive a response with reasoning and context.

Request Headers:

Content-Type: application/json

Request Body:

{
  "message": "Explain quantum entanglement in simple terms",
  "history": [
    ["User message 1", "Assistant response 1"],
    ["User message 2", "Assistant response 2"]
  ],
  "session_id": "session-123",
  "user_id": "user-456"
}

Request Fields:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| message | string | ✅ Yes | User's message/question (max 10,000 characters) |
| history | array | ❌ No | Conversation history as array of [user, assistant] pairs |
| session_id | string | ❌ No | Unique session identifier for context continuity |
| user_id | string | ❌ No | User identifier (defaults to "anonymous") |
| context_mode | string | ❌ No | Context retrieval mode: "fresh" (no user context) or "relevant" (only relevant context). Defaults to "fresh" if not set. |

Response (Success):

{
  "success": true,
  "message": "Quantum entanglement is when two particles become linked...",
  "history": [
    ["Explain quantum entanglement", "Quantum entanglement is when two particles become linked..."]
  ],
  "reasoning": {
    "intent": "educational_query",
    "steps": ["Understanding request", "Gathering information", "Synthesizing response"],
    "confidence": 0.95
  },
  "performance": {
    "response_time_ms": 2345,
    "tokens_generated": 156,
    "model_used": "mistralai/Mistral-7B-Instruct-v0.2"
  }
}

Response Fields:

| Field | Type | Description |
|-------|------|-------------|
| success | boolean | Whether the request was successful |
| message | string | AI assistant's response |
| history | array | Updated conversation history including the new exchange |
| reasoning | object | AI reasoning process and confidence metrics |
| performance | object | Performance metrics (response time, tokens, model used) |

Status Codes:

  • 200 OK - Request processed successfully
  • 400 Bad Request - Invalid request (missing message, empty message, too long, wrong type)
  • 500 Internal Server Error - Server error processing request
  • 503 Service Unavailable - Orchestrator not ready (still initializing)

Error Response:

{
  "success": false,
  "error": "Message is required",
  "message": "Error processing your request. Please try again."
}

Context Mode Feature:

The context_mode parameter controls how user context is retrieved and used:

  • "fresh" (default): No user context is included; each conversation starts fresh. Ideal for:
    • General questions requiring no prior context
    • Avoiding context contamination
    • Faster responses (no context retrieval overhead)
  • "relevant": Only relevant user context is included, based on relevance classification. The system:
    • Analyzes all previous interactions for the session
    • Classifies which interactions are relevant to the current query
    • Includes only relevant context summaries
    This mode is ideal for:
    • Follow-up questions that build on previous conversations
    • Maintaining continuity within a research session
    • Personalized responses based on user history

Example with Context Mode:

{
  "message": "Can you remind me what we discussed about quantum computing?",
  "session_id": "session-123",
  "user_id": "user-456",
  "context_mode": "relevant"
}
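Which mode to request can also be decided client-side. The heuristic below is purely illustrative — the cue list is an assumption, not something the API provides:

```python
def choose_context_mode(message: str, history: list) -> str:
    # Use "relevant" only for follow-ups that reference earlier conversation;
    # default to "fresh" otherwise, matching the API's own default.
    followup_cues = ("earlier", "before", "we discussed", "remind me", "last time")
    if history and any(cue in message.lower() for cue in followup_cues):
        return "relevant"
    return "fresh"
```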

4. Initialize Orchestrator

Endpoint: POST /api/initialize

Description: Manually trigger orchestrator initialization (useful if initialization failed on startup).

Request:

POST /api/initialize HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space
Content-Type: application/json

Request Body:

{}

Response (Success):

{
  "success": true,
  "message": "Orchestrator initialized successfully"
}

Response (Failure):

{
  "success": false,
  "message": "Initialization failed. Check logs for details."
}

Status Codes:

  • 200 OK - Initialization successful
  • 500 Internal Server Error - Initialization failed

5. Get Context Mode

Endpoint: GET /api/context/mode

Description: Retrieve the current context retrieval mode for a session.

Request:

GET /api/context/mode?session_id=session-123 HTTP/1.1
Host: jatinautonomouslabs-research-ai-assistant-api.hf.space

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| session_id | string | ✅ Yes | Session identifier |

Response (Success):

{
  "success": true,
  "session_id": "session-123",
  "context_mode": "fresh",
  "description": {
    "fresh": "No user context included - starts fresh each time",
    "relevant": "Only relevant user context included based on relevance classification"
  }
}

Response Fields:

| Field | Type | Description |
|-------|------|-------------|
| success | boolean | Whether the request was successful |
| session_id | string | Session identifier |
| context_mode | string | Current mode: "fresh" or "relevant" |
| description | object | Description of each mode |

Status Codes:

  • 200 OK - Success
  • 400 Bad Request - Missing session_id parameter
  • 500 Internal Server Error - Server error
  • 503 Service Unavailable - Orchestrator not ready or context mode not available

Error Response:

{
  "success": false,
  "error": "session_id query parameter is required"
}

6. Set Context Mode

Endpoint: POST /api/context/mode

Description: Set the context retrieval mode for a session (fresh or relevant).

Request Headers:

Content-Type: application/json

Request Body:

{
  "session_id": "session-123",
  "mode": "relevant",
  "user_id": "user-456"
}

Request Fields:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| session_id | string | ✅ Yes | Session identifier |
| mode | string | ✅ Yes | Context mode: "fresh" or "relevant" |
| user_id | string | ❌ No | User identifier (defaults to "anonymous") |

Response (Success):

{
  "success": true,
  "session_id": "session-123",
  "context_mode": "relevant",
  "message": "Context mode set successfully"
}

Response Fields:

| Field | Type | Description |
|-------|------|-------------|
| success | boolean | Whether the request was successful |
| session_id | string | Session identifier |
| context_mode | string | The mode that was set |
| message | string | Success message |

Status Codes:

  • 200 OK - Context mode set successfully
  • 400 Bad Request - Invalid request (missing fields, invalid mode)
  • 500 Internal Server Error - Server error or failed to set mode
  • 503 Service Unavailable - Orchestrator not ready or context mode not available

Error Response:

{
  "success": false,
  "error": "mode must be 'fresh' or 'relevant'"
}

Usage Notes:

  • The context mode persists for the session until changed
  • Setting mode to "relevant" enables relevance classification, which analyzes all previous interactions to include only relevant context
  • Setting mode to "fresh" disables context retrieval, providing faster responses without user history
  • The mode can also be set per-request via the context_mode parameter in /api/chat

Code Examples

Python

import requests
import json

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

# Check health
def check_health():
    response = requests.get(f"{BASE_URL}/api/health")
    return response.json()

# Send chat message
def send_message(message, session_id=None, user_id=None, history=None, context_mode=None):
    payload = {
        "message": message,
        "session_id": session_id,
        "user_id": user_id or "anonymous",
        "history": history or []
    }
    if context_mode:
        payload["context_mode"] = context_mode
    
    response = requests.post(
        f"{BASE_URL}/api/chat",
        json=payload,
        headers={"Content-Type": "application/json"}
    )
    
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage
if __name__ == "__main__":
    # Check if API is ready
    health = check_health()
    print(f"API Status: {health}")
    
    if health.get("orchestrator_ready"):
        # Send a message
        result = send_message(
            message="What is machine learning?",
            session_id="my-session-123",
            user_id="user-456"
        )
        
        print(f"Response: {result['message']}")
        print(f"Reasoning: {result.get('reasoning', {})}")
        
        # Set context mode to relevant for follow-up
        requests.post(
            f"{BASE_URL}/api/context/mode",
            json={
                "session_id": "my-session-123",
                "mode": "relevant",
                "user_id": "user-456"
            }
        )
        
        # Continue conversation with relevant context
        history = result['history']
        result2 = send_message(
            message="Can you explain neural networks?",
            session_id="my-session-123",
            user_id="user-456",
            history=history,
            context_mode="relevant"
        )
        print(f"Follow-up Response: {result2['message']}")

JavaScript (Fetch API)

const BASE_URL = 'https://jatinautonomouslabs-research-ai-assistant-api.hf.space';

// Check health
async function checkHealth() {
    const response = await fetch(`${BASE_URL}/api/health`);
    return await response.json();
}

// Get context mode for a session
async function getContextMode(sessionId) {
    const response = await fetch(`${BASE_URL}/api/context/mode?session_id=${sessionId}`);
    if (!response.ok) {
        throw new Error(`API Error: ${response.status}`);
    }
    return await response.json();
}

// Set context mode for a session
async function setContextMode(sessionId, mode, userId = null) {
    const payload = {
        session_id: sessionId,
        mode: mode
    };
    if (userId) {
        payload.user_id = userId;
    }
    
    const response = await fetch(`${BASE_URL}/api/context/mode`, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify(payload)
    });
    
    if (!response.ok) {
        const error = await response.json();
        throw new Error(`API Error: ${response.status} - ${error.error || error.message}`);
    }
    
    return await response.json();
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = [], contextMode = null) {
    const payload = {
        message: message,
        session_id: sessionId,
        user_id: userId || 'anonymous',
        history: history
    };
    if (contextMode) {
        payload.context_mode = contextMode;
    }
    
    const response = await fetch(`${BASE_URL}/api/chat`, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify(payload)
    });
    
    if (!response.ok) {
        const error = await response.json();
        throw new Error(`API Error: ${response.status} - ${error.error || error.message}`);
    }
    
    return await response.json();
}

// Example usage
async function main() {
    try {
        // Check if API is ready
        const health = await checkHealth();
        console.log('API Status:', health);
        
        if (health.orchestrator_ready) {
            // Send a message
            const result = await sendMessage(
                'What is machine learning?',
                'my-session-123',
                'user-456'
            );
            
            console.log('Response:', result.message);
            console.log('Reasoning:', result.reasoning);
            
            // Continue conversation with relevant context
            await setContextMode('my-session-123', 'relevant', 'user-456');
            const result2 = await sendMessage(
                'Can you explain neural networks?',
                'my-session-123',
                'user-456',
                result.history,
                'relevant'
            );
            console.log('Follow-up Response:', result2.message);
            
            // Check current context mode
            const modeInfo = await getContextMode('my-session-123');
            console.log('Current context mode:', modeInfo.context_mode);
        }
    } catch (error) {
        console.error('Error:', error);
    }
}

main();

cURL

# Check health
curl -X GET "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/health"

# Get context mode
curl -X GET "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/context/mode?session_id=my-session-123"

# Set context mode to relevant
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/context/mode" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "my-session-123",
    "mode": "relevant",
    "user_id": "user-456"
  }'

# Send chat message
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is machine learning?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "context_mode": "relevant",
    "history": []
  }'

# Continue conversation
curl -X POST "https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Can you explain neural networks?",
    "session_id": "my-session-123",
    "user_id": "user-456",
    "history": [
      ["What is machine learning?", "Machine learning is a subset of artificial intelligence..."]
    ]
  }'

Node.js (Axios)

const axios = require('axios');

const BASE_URL = 'https://jatinautonomouslabs-research-ai-assistant-api.hf.space';

// Check health
async function checkHealth() {
    const response = await axios.get(`${BASE_URL}/api/health`);
    return response.data;
}

// Get context mode
async function getContextMode(sessionId) {
    const response = await axios.get(`${BASE_URL}/api/context/mode`, {
        params: { session_id: sessionId }
    });
    return response.data;
}

// Set context mode
async function setContextMode(sessionId, mode, userId = null) {
    const payload = {
        session_id: sessionId,
        mode: mode
    };
    if (userId) payload.user_id = userId;
    
    const response = await axios.post(`${BASE_URL}/api/context/mode`, payload);
    return response.data;
}

// Send chat message
async function sendMessage(message, sessionId = null, userId = null, history = [], contextMode = null) {
    try {
        const payload = {
            message: message,
            session_id: sessionId,
            user_id: userId || 'anonymous',
            history: history
        };
        if (contextMode) payload.context_mode = contextMode;
        
        const response = await axios.post(`${BASE_URL}/api/chat`, payload, {
            headers: {
                'Content-Type': 'application/json'
            }
        });
        
        return response.data;
    } catch (error) {
        if (error.response) {
            throw new Error(`API Error: ${error.response.status} - ${error.response.data.error || error.response.data.message}`);
        }
        throw error;
    }
}

// Example usage
(async () => {
    try {
        const health = await checkHealth();
        console.log('API Status:', health);
        
        if (health.orchestrator_ready) {
            // Set context mode to relevant
            await setContextMode('my-session-123', 'relevant', 'user-456');
            
            const result = await sendMessage(
                'What is machine learning?',
                'my-session-123',
                'user-456',
                [],
                'relevant'
            );
            
            console.log('Response:', result.message);
            
            // Check current mode
            const modeInfo = await getContextMode('my-session-123');
            console.log('Context mode:', modeInfo.context_mode);
        }
    } catch (error) {
        console.error('Error:', error.message);
    }
})();

Error Handling

Common Error Responses

400 Bad Request

Missing Message:

{
  "success": false,
  "error": "Message is required"
}

Empty Message:

{
  "success": false,
  "error": "Message cannot be empty"
}

Message Too Long:

{
  "success": false,
  "error": "Message too long. Maximum length is 10000 characters"
}

Invalid Type:

{
  "success": false,
  "error": "Message must be a string"
}

503 Service Unavailable

Orchestrator Not Ready:

{
  "success": false,
  "error": "Orchestrator not ready",
  "message": "AI system is initializing. Please try again in a moment."
}

Solution: Wait a few seconds and retry, or check the /api/health endpoint.

500 Internal Server Error

Generic Error:

{
  "success": false,
  "error": "Error message here",
  "message": "Error processing your request. Please try again."
}

Best Practices

1. Session Management

  • Use consistent session IDs for maintaining conversation context
  • Generate unique session IDs per user conversation thread
  • Include conversation history in subsequent requests for better context
# Good: Maintains context
session_id = "user-123-session-1"
history = []

# First message
result1 = send_message("What is AI?", session_id=session_id, history=history)
history = result1['history']

# Follow-up message (includes context)
result2 = send_message("Can you explain more?", session_id=session_id, history=history)

2. Error Handling

Always implement retry logic for 503 errors:

import time

def send_message_with_retry(message, max_retries=3, retry_delay=2):
    for attempt in range(max_retries):
        try:
            result = send_message(message)
            return result
        except Exception as e:
            if "503" in str(e) and attempt < max_retries - 1:
                time.sleep(retry_delay)
                continue
            raise

3. Health Checks

Check API health before sending requests:

def is_api_ready():
    try:
        health = check_health()
        return health.get("orchestrator_ready", False)
    except Exception:
        return False

if is_api_ready():
    # Send request
    result = send_message("Hello")
else:
    print("API is not ready yet")

4. Rate Limiting

  • No explicit rate limits are currently enforced
  • Recommended: Implement client-side rate limiting (e.g., 1 request per second)
  • Consider: Implementing request queuing for high-volume applications
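A minimal client-side limiter along those lines (the interval and where you call it are up to you; nothing here is enforced by the API):

```python
import time

class RateLimiter:
    # Allow at most one call per min_interval seconds by sleeping as needed.
    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self._last_call = 0.0

    def wait(self) -> None:
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last_call = time.monotonic()

limiter = RateLimiter(min_interval=1.0)
# limiter.wait()  # call before each request to space them out
```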

5. Message Length

  • Maximum: 10,000 characters per message
  • Recommended: Keep messages concise for faster processing
  • For long content: Split into multiple messages or summarize
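Splitting long content could be sketched as below. The paragraph-preferring strategy is just one option; only the 10,000-character limit comes from the API:

```python
MAX_CHARS = 10_000  # server-side per-message limit

def split_message(text: str, limit: int = MAX_CHARS) -> list:
    # Split on paragraph breaks where possible; hard-split oversized paragraphs.
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= limit:
            current = candidate
            continue
        if current:
            chunks.append(current)
        current = ""
        for i in range(0, len(para), limit):
            piece = para[i:i + limit]
            if len(piece) == limit:
                chunks.append(piece)
            else:
                current = piece
    if current:
        chunks.append(current)
    return chunks
```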

6. Context Management

  • Include history in requests to maintain conversation context
  • Session IDs help track conversations across multiple requests
  • User IDs enable personalization and user-specific context

Integration Examples

React Component

import React, { useState, useEffect } from 'react';

const AIAssistant = () => {
    const [message, setMessage] = useState('');
    const [history, setHistory] = useState([]);
    const [loading, setLoading] = useState(false);
    const [sessionId] = useState(`session-${Date.now()}`);
    
    const sendMessage = async () => {
        if (!message.trim()) return;
        
        setLoading(true);
        try {
            const response = await fetch('https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/chat', {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({
                    message: message,
                    session_id: sessionId,
                    user_id: 'user-123',
                    history: history
                })
            });
            
            const data = await response.json();
            if (data.success) {
                setHistory(data.history);
                setMessage('');
            }
        } catch (error) {
            console.error('Error:', error);
        } finally {
            setLoading(false);
        }
    };
    
    return (
        <div>
            <div className="chat-history">
                {history.map(([user, assistant], idx) => (
                    <div key={idx}>
                        <div><strong>You:</strong> {user}</div>
                        <div><strong>Assistant:</strong> {assistant}</div>
                    </div>
                ))}
            </div>
            <input
                value={message}
                onChange={(e) => setMessage(e.target.value)}
                onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
                disabled={loading}
            />
            <button onClick={sendMessage} disabled={loading}>
                {loading ? 'Sending...' : 'Send'}
            </button>
        </div>
    );
};

Python CLI Tool

#!/usr/bin/env python3
import uuid

import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

class ChatCLI:
    def __init__(self):
        self.session_id = f"cli-session-{uuid.uuid4().hex[:8]}"
        self.history = []
    
    def chat(self, message):
        response = requests.post(
            f"{BASE_URL}/api/chat",
            json={
                "message": message,
                "session_id": self.session_id,
                "user_id": "cli-user",
                "history": self.history
            }
        )
        
        if response.status_code == 200:
            data = response.json()
            self.history = data['history']
            return data['message']
        else:
            return f"Error: {response.status_code} - {response.text}"
    
    def run(self):
        print("AI Assistant CLI (Type 'exit' to quit)")
        print("=" * 50)
        
        while True:
            user_input = input("\nYou: ").strip()
            if user_input.lower() in ['exit', 'quit']:
                break
            
            print("Assistant: ", end="", flush=True)
            response = self.chat(user_input)
            print(response)

if __name__ == "__main__":
    cli = ChatCLI()
    cli.run()

Response Times

  • Typical Response: 2-10 seconds
  • First Request: May take longer due to model loading (10-30 seconds)
  • Subsequent Requests: Faster due to cached models (2-5 seconds)

Factors Affecting Response Time:

  • Message length
  • Model loading (first request)
  • GPU availability
  • Concurrent requests

Troubleshooting

Common Issues

404 Not Found

Problem: Getting 404 when accessing the API

Solutions:

  1. Verify the Space is running:
     • Check the Hugging Face Space page to ensure it's built and running
     • Wait for the initial build to complete (5-10 minutes)
  2. Check the URL format:
     • ✅ Correct: https://jatinautonomouslabs-research-ai-assistant-api.hf.space
     • ❌ Wrong: https://jatinautonomouslabs-research_ai_assistant_api.hf.space (underscores)
     • ✅ Alternative: https://huggingface.co/spaces/JatinAutonomousLabs/Research_AI_Assistant_API
  3. Verify the endpoint paths:
     • Health: GET /api/health
     • Chat: POST /api/chat
     • Root: GET /
  4. Test with the root endpoint first:

     curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/

503 Service Unavailable

Problem: Orchestrator not ready

Solutions:

  1. Wait 30-60 seconds for initialization
  2. Check /api/health endpoint
  3. Use /api/initialize to manually trigger initialization

CORS Errors

Problem: CORS errors in browser

Solutions:

  • The API has CORS enabled for all origins
  • If issues persist, check browser console for specific errors
  • Ensure you're using the correct base URL

Testing API Connectivity

Quick Health Check:

# Test root endpoint
curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/

# Test health endpoint
curl https://jatinautonomouslabs-research-ai-assistant-api.hf.space/api/health

Python Test Script:

import requests

BASE_URL = "https://jatinautonomouslabs-research-ai-assistant-api.hf.space"

# Test root
try:
    response = requests.get(f"{BASE_URL}/", timeout=10)
    print(f"Root endpoint: {response.status_code} - {response.json()}")
except Exception as e:
    print(f"Root endpoint failed: {e}")

# Test health
try:
    response = requests.get(f"{BASE_URL}/api/health", timeout=10)
    print(f"Health endpoint: {response.status_code} - {response.json()}")
except Exception as e:
    print(f"Health endpoint failed: {e}")

Support

For issues, questions, or contributions:


Changelog

Version 1.0 (Current)

  • Initial API release
  • Chat endpoint with context management
  • Health check endpoint
  • Local GPU model inference
  • CORS enabled for web integration

License

This API is provided as-is. Please refer to the main project README for license information.