Getting Started with Unified AI Services

This guide walks you through setting up and running the complete Unified AI Services system.

📋 Quick Overview

The Unified AI Services system consists of:

  • NER Service (Port 8500): Named Entity Recognition with relationship extraction
  • OCR Service (Port 8400): Optical Character Recognition with document processing
  • RAG Service (Port 8401): Retrieval-Augmented Generation with vector search
  • Unified App (Port 8000): Main application coordinating all services

🚀 Quick Start (Recommended)

Step 1: Automated Setup

# Run the automated setup wizard
python setup.py

This will:

  • ✅ Check your Python environment
  • ✅ Create necessary directories
  • ✅ Help configure your .env file
  • ✅ Install dependencies
  • ✅ Validate configuration
  • ✅ Create startup scripts

Step 2: Start the System

# Start all services automatically
python app.py

Or use the generated scripts:

  • Windows: Double-click start_services.bat
  • Linux/Mac: Run ./start_services.sh

Step 3: Test the System

# Run comprehensive tests
python test_unified.py

Or use the generated scripts:

  • Windows: Double-click run_tests.bat
  • Linux/Mac: Run ./run_tests.sh

Step 4: Try the Demo

# Run interactive demo
python demo.py

📁 File Structure

After setup, your directory should look like this:

unified-ai-services/
├── app.py                   # 🌐 Main unified application
├── configs.py               # ⚙️ Configuration management
├── setup.py                 # 🛠️ Automated setup script
├── manage_services.py       # 🔧 Service management tool
├── test_unified.py          # 🧪 Comprehensive test suite
├── demo.py                  # 🎬 Interactive demo
├── requirements.txt         # 📦 Python dependencies
├── .env                     # 🔐 Environment configuration
├── README.md                # 📖 Documentation
├── GETTING_STARTED.md       # 🚀 This file
├── services/                # 📂 Service implementations
│   ├── ner_service.py       # Named Entity Recognition
│   ├── ocr_service.py       # Optical Character Recognition
│   └── rag_service.py       # Retrieval-Augmented Generation
├── exports/                 # 📁 Generated export files
├── logs/                    # 📝 Application logs
└── temp/                    # 🗂️ Temporary files

⚙️ Manual Setup (Alternative)

If you prefer manual setup:

Prerequisites

  • Python 3.8 or higher
  • PostgreSQL with a vector extension (for example pgvector)
  • Azure OpenAI account
  • Azure Document Intelligence account
  • DeepSeek API account

1. Install Dependencies

pip install -r requirements.txt

2. Configure Environment

Create a .env file with your configuration:

# Server Configuration
HOST=0.0.0.0
MAIN_PORT=8000
NER_PORT=8500
OCR_PORT=8400
RAG_PORT=8401

# PostgreSQL Configuration
POSTGRES_HOST=your-postgres-server.com
POSTGRES_PORT=5432
POSTGRES_USER=your-username
POSTGRES_PASSWORD=your-password
POSTGRES_DATABASE=postgres

# Azure OpenAI Configuration
AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key
EMBEDDING_MODEL=text-embedding-3-large

# DeepSeek Configuration (for advanced NER)
DEEPSEEK_ENDPOINT=https://your-deepseek-endpoint/
DEEPSEEK_API_KEY=your-deepseek-key
DEEPSEEK_MODEL=DeepSeek-R1-0528

# Azure Document Intelligence Configuration
AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-di.cognitiveservices.azure.com/
AZURE_DOCUMENT_INTELLIGENCE_KEY=your-di-key

# Azure Storage Configuration
AZURE_STORAGE_ACCOUNT_URL=https://yourstorage.blob.core.windows.net/
AZURE_BLOB_SAS_TOKEN=your-sas-token
BLOB_CONTAINER=historylog
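
Before starting the services, it can help to confirm that the .env file actually contains the values listed above. The following is a minimal sketch using only the standard library. The function names and the choice of which keys count as required are assumptions made for illustration; the project's own configs.py may already perform a similar check.

```python
from pathlib import Path

# Assumption: these keys are treated as mandatory for this sketch only.
# The key names themselves are taken from the .env example above.
REQUIRED_KEYS = [
    "POSTGRES_HOST",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT",
    "AZURE_DOCUMENT_INTELLIGENCE_KEY",
]


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as SAS tokens
        # that contain "=" are kept intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        missing = missing_keys(parse_env(env_path.read_text(encoding="utf-8")))
        print("Missing keys:", ", ".join(missing) if missing else "none")
    else:
        print(".env not found in the current directory")
```

Run it from the project root; an empty result means every key in the assumed required list has a non-empty value.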

3. Create Directory Structure

mkdir -p services exports logs temp tests data

4. Place Service Files

Ensure your service files are in the correct locations:

  • services/ner_service.py
  • services/ocr_service.py
  • services/rag_service.py

🔧 Service Management

Using the Service Manager

The manage_services.py script provides easy service management:

# Start individual services
python manage_services.py start ner
python manage_services.py start ocr
python manage_services.py start rag
python manage_services.py start unified

# Start all services
python manage_services.py start all

# Check status
python manage_services.py status

# Test services
python manage_services.py test ner
python manage_services.py test all

# Stop services
python manage_services.py stop all

# Restart services
python manage_services.py restart all

# List available services
python manage_services.py list

Direct Service Management

Start services individually for development:

# Terminal 1: Start OCR service
cd services && python ocr_service.py

# Terminal 2: Start RAG service
cd services && python rag_service.py

# Terminal 3: Start NER service
cd services && python ner_service.py

# Terminal 4: Start unified application
python app.py

🧪 Testing and Validation

Comprehensive System Tests

# Run all tests
python test_unified.py

# Test output will show:
# ✅ Unified App Health Check
# ✅ Individual Service Health
# ✅ Unified Analysis (Text)
# ✅ Unified Analysis (URL)
# ✅ Combined Search
# ✅ Service Proxies
# ✅ File Upload (Unified)
# ✅ Service Discovery
# ✅ System Performance
# ✅ Error Handling

Individual Service Tests

# Test NER service specifically
python test_ner.py

# Test RAG service specifically
python test_rag.py

Quick Health Checks

# Check unified system
curl http://localhost:8000/health

# Check individual services
curl http://localhost:8500/health  # NER
curl http://localhost:8400/health  # OCR
curl http://localhost:8401/health  # RAG
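
The same checks can be scripted when curl is not available. This is a minimal sketch using only the standard library. The name is_healthy and the 3-second timeout are illustrative choices, and the sketch assumes that a healthy service answers /health with HTTP 200, which this guide does not state explicitly.

```python
import http.client
import urllib.error
import urllib.request

# URLs are taken from the curl commands above.
HEALTH_URLS = {
    "unified": "http://localhost:8000/health",
    "ner": "http://localhost:8500/health",
    "ocr": "http://localhost:8400/health",
    "rag": "http://localhost:8401/health",
}


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, error status, or malformed URL.
        return False


if __name__ == "__main__":
    for name, url in HEALTH_URLS.items():
        print(f"{name:8s} {'OK' if is_healthy(url) else 'DOWN'}")
```

A service reported as DOWN here is either not started, still starting, or listening on a different port than the one configured in .env.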

🎬 Interactive Demo

The demo script showcases all system capabilities:

python demo.py

Demo includes:

  • Multi-language text analysis (Thai + English)
  • Entity and relationship extraction
  • RAG document indexing
  • Combined search functionality
  • Service proxy testing
  • Real-time performance monitoring

🌐 API Usage

API Documentation

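Once the unified app is running, it serves interactive API documentation. Assuming the app is built on FastAPI with its default documentation routes (the dependency check under Troubleshooting suggests FastAPI, but the routes are not confirmed here), the pages are at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc (ReDoc).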

Key Endpoints

Unified Analysis

# Analyze text with automatic RAG indexing
POST /analyze/unified
{
    "text": "Your text here...",
    "extract_relationships": true,
    "enable_rag_indexing": true,
    "rag_title": "Document Title"
}
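
The request above can be sent from Python without extra dependencies. The sketch below builds the POST request with the standard library. The helper names build_json_post and send are hypothetical, the base URL assumes the unified app runs locally on port 8000, and the response is assumed to be JSON; its exact fields are not documented here, so the sketch does not interpret them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: MAIN_PORT=8000 on this machine


def build_json_post(path: str, payload: dict,
                    base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the unified app."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_json_post("/analyze/unified", {
        "text": "Your text here...",
        "extract_relationships": True,
        "enable_rag_indexing": True,
        "rag_title": "Document Title",
    })
    print(request.get_method(), request.full_url)
    # With the unified app running, send it with: print(send(request))
```

The same helper should work for the other JSON endpoints in this section, such as /search/combined, by changing the path and payload.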

Combined Search

# Search with automatic NER enhancement
POST /search/combined
{
    "query": "search terms",
    "include_ner_analysis": true,
    "limit": 10
}

Service Proxies

# Direct access to individual services
POST /ner/analyze/text     # NER analysis
POST /ocr/upload           # OCR processing
POST /rag/search           # RAG search
GET  /rag/documents        # List documents

🔍 Health Monitoring

System Status

# Get overall system health
GET /health

# Get detailed status
GET /status

# Discover available services
GET /services

Service Monitoring

Each service provides health information:

  • Response times
  • Uptime
  • Resource usage
  • Configuration status
  • Error rates

🛠️ Troubleshooting

Common Issues

1. Services Won't Start

Check ports:

netstat -an | grep :8000
netstat -an | grep :8500
netstat -an | grep :8400
netstat -an | grep :8401
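
The netstat and grep pipeline above is specific to Unix-like shells. As a cross-platform alternative, the sketch below tries to open a TCP connection to each port. The function name port_in_use is illustrative, and the check only tells you that something is listening, not which process it is.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Ports are taken from the service list at the top of this guide.
    for name, port in [("unified", 8000), ("ner", 8500), ("ocr", 8400), ("rag", 8401)]:
        state = "in use" if port_in_use(port) else "free"
        print(f"{name:8s} port {port}: {state}")
```

A port reported as in use before you start the services points to a conflicting process; a port reported as free afterwards means the service did not start.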

Verify configuration:

python configs.py

Check dependencies:

pip list | grep fastapi
pip list | grep asyncpg

2. Database Connection Issues

Test connection:

# Use your actual connection details
python -c "
import asyncio
import asyncpg

async def test():
    conn = await asyncpg.connect('postgresql://user:pass@host:5432/db')
    print('Connected successfully')
    await conn.close()

asyncio.run(test())
"

Common fixes:

  • Verify PostgreSQL is running
  • Check firewall rules
  • Confirm SSL requirements
  • Validate credentials
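
The connection test above uses a hard-coded URL. A password containing characters such as @, : or / will break that URL unless it is percent-encoded. The sketch below assembles the URL from the POSTGRES_* variables defined in the .env example. The function name is hypothetical, and the fallback values for port and database are assumptions copied from that example.

```python
import os
from urllib.parse import quote


def build_postgres_dsn(env=os.environ) -> str:
    """Assemble a postgresql:// URL from the POSTGRES_* variables."""
    # Percent-encode user and password so reserved characters
    # (@, :, /) cannot be mistaken for URL delimiters.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST"]
    port = env.get("POSTGRES_PORT", "5432")              # assumed default
    database = env.get("POSTGRES_DATABASE", "postgres")  # assumed default
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


if __name__ == "__main__":
    sample = {
        "POSTGRES_USER": "your-username",
        "POSTGRES_PASSWORD": "p@ss:w/rd",
        "POSTGRES_HOST": "your-postgres-server.com",
    }
    print(build_postgres_dsn(sample))
    # prints postgresql://your-username:p%40ss%3Aw%2Frd@your-postgres-server.com:5432/postgres
```

Avoid printing the real URL in shared logs, since it contains the password.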

3. Azure Service Issues

Check API keys:

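# Test Azure OpenAI (the embeddings endpoint expects a POST with a JSON body;
# YOUR_MODEL is the name of your embedding deployment)
curl -X POST "YOUR_ENDPOINT/openai/deployments/YOUR_MODEL/embeddings?api-version=2024-02-01" \
  -H "api-key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "test"}'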

# Test Document Intelligence
curl -H "Ocp-Apim-Subscription-Key: YOUR_KEY" "YOUR_ENDPOINT/formrecognizer/info?api-version=2023-07-31"

Common fixes:

  • Verify API keys are correct
  • Check service regions
  • Confirm quota limits
  • Validate endpoint URLs

4. Performance Issues

Monitor resources:

# Check system resources
top
htop
python manage_services.py status

Common solutions:

  • Increase system memory
  • Optimize database queries
  • Reduce concurrent requests
  • Check network latency

Getting Help

  1. Check logs: Services log to console
  2. Run health checks: Use /health endpoints
  3. Validate configuration: Run python configs.py
  4. Test individual services: Use service manager
  5. Check database connectivity: Test connection strings
  6. Verify Azure services: Check API endpoints

Debug Mode

Enable debug mode for detailed logging:

# In .env file
DEBUG=True

# Or set environment variable
export DEBUG=true
python app.py
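
The two snippets above spell the value differently (True and true), and this guide does not show how configs.py parses it. If you write your own tooling around the flag, a case-insensitive parser avoids surprises. This is a minimal sketch; env_flag and the accepted spellings are assumptions, not the project's actual behavior.

```python
import logging
import os

# Assumption: these spellings count as "enabled" in this sketch.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False, env=os.environ) -> bool:
    """Interpret an environment variable as a boolean, ignoring case."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


if __name__ == "__main__":
    debug = env_flag("DEBUG")
    # Switch the root logger to DEBUG only when the flag is set.
    logging.basicConfig(level=logging.DEBUG if debug else logging.INFO)
    logging.getLogger(__name__).info("debug mode: %s", debug)
```

Note that export is a Linux/Mac shell command; on Windows the equivalent is set DEBUG=true in cmd or $env:DEBUG = "true" in PowerShell.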

🚀 Production Deployment

Security Considerations

  1. Environment Variables: Use secure secret management
  2. HTTPS: Enable SSL/TLS in production
  3. Authentication: Implement API authentication
  4. Rate Limiting: Add request rate limiting
  5. Input Validation: Validate all input data

Performance Optimization

  1. Caching: Implement Redis caching
  2. Load Balancing: Use reverse proxy (nginx)
  3. Database: Optimize PostgreSQL configuration
  4. Monitoring: Set up application monitoring
  5. Scaling: Consider horizontal scaling

Deployment Options

  1. Docker: Containerize services
  2. Cloud: Deploy to Azure/AWS/GCP
  3. Kubernetes: Orchestrate with k8s
  4. CI/CD: Automate deployments

📞 Next Steps

After successful setup:

  1. Explore the API: Use the interactive documentation
  2. Try the demo: Run python demo.py
  3. Run tests: Execute python test_unified.py
  4. Monitor system: Check health endpoints
  5. Customize: Modify services for your needs
  6. Scale: Consider production deployment

🎯 Success Indicators

You know the system is working when:

  • ✅ All health checks pass
  • ✅ Tests complete successfully
  • ✅ Demo runs without errors
  • ✅ API documentation is accessible
  • ✅ Services respond to requests
  • ✅ Database connections work
  • ✅ Azure integrations function
  • ✅ File uploads process correctly
  • ✅ Search returns results
  • ✅ Export files generate properly

Congratulations! Your Unified AI Services system is ready to use! 🎉