# Getting Started with Unified AI Services

This guide walks you through setting up and running the complete Unified AI Services system.

## Quick Overview

The Unified AI Services system consists of:
- **NER Service** (Port 8500): Named Entity Recognition with relationship extraction
- **OCR Service** (Port 8400): Optical Character Recognition with document processing
- **RAG Service** (Port 8401): Retrieval-Augmented Generation with vector search
- **Unified App** (Port 8000): Main application coordinating all services

## Quick Start (Recommended)

### Step 1: Automated Setup

```bash
# Run the automated setup wizard
python setup.py
```

This will:
- ✅ Check your Python environment
- ✅ Create necessary directories
- ✅ Help configure your `.env` file
- ✅ Install dependencies
- ✅ Validate configuration
- ✅ Create startup scripts

### Step 2: Start the System

```bash
# Start all services automatically
python app.py
```

Or use the generated scripts:
- **Windows**: Double-click `start_services.bat`
- **Linux/Mac**: Run `./start_services.sh`

### Step 3: Test the System

```bash
# Run comprehensive tests
python test_unified.py
```

Or use the generated scripts:
- **Windows**: Double-click `run_tests.bat`
- **Linux/Mac**: Run `./run_tests.sh`

### Step 4: Try the Demo

```bash
# Run interactive demo
python demo.py
```

## File Structure

After setup, your directory should look like this:

```
unified-ai-services/
├── app.py                  # Main unified application
├── configs.py              # Configuration management
├── setup.py                # Automated setup script
├── manage_services.py      # Service management tool
├── test_unified.py         # Comprehensive test suite
├── demo.py                 # Interactive demo
├── requirements.txt        # Python dependencies
├── .env                    # Environment configuration
├── README.md               # Documentation
├── GETTING_STARTED.md      # This file
├── services/               # Service implementations
│   ├── ner_service.py      # Named Entity Recognition
│   ├── ocr_service.py      # Optical Character Recognition
│   └── rag_service.py      # Retrieval-Augmented Generation
├── exports/                # Generated export files
├── logs/                   # Application logs
└── temp/                   # Temporary files
```

## Manual Setup (Alternative)

If you prefer to set things up manually:

### Prerequisites
- Python 3.8 or higher
- PostgreSQL with a vector extension (for example, pgvector)
- Azure OpenAI account
- Azure Document Intelligence account
- DeepSeek API account

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Configure Environment

Create a `.env` file with your configuration:

```bash
# Server Configuration
HOST=0.0.0.0
MAIN_PORT=8000
NER_PORT=8500
OCR_PORT=8400
RAG_PORT=8401

# PostgreSQL Configuration
POSTGRES_HOST=your-postgres-server.com
POSTGRES_PORT=5432
POSTGRES_USER=your-username
POSTGRES_PASSWORD=your-password
POSTGRES_DATABASE=postgres

# Azure OpenAI Configuration
AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key
EMBEDDING_MODEL=text-embedding-3-large

# DeepSeek Configuration (for advanced NER)
DEEPSEEK_ENDPOINT=https://your-deepseek-endpoint/
DEEPSEEK_API_KEY=your-deepseek-key
DEEPSEEK_MODEL=DeepSeek-R1-0528

# Azure Document Intelligence Configuration
AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-di.cognitiveservices.azure.com/
AZURE_DOCUMENT_INTELLIGENCE_KEY=your-di-key

# Azure Storage Configuration
AZURE_STORAGE_ACCOUNT_URL=https://yourstorage.blob.core.windows.net/
AZURE_BLOB_SAS_TOKEN=your-sas-token
BLOB_CONTAINER=historylog
```
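To show how these variables might be consumed, here is a minimal sketch that reads the server settings and reports missing credentials. It is not the project's `configs.py`, whose contents are not shown in this guide. `ServerSettings`, `load_server_settings`, and `missing_required` are hypothetical names, and the fallback ports simply mirror the values listed above.

```python
import os
from dataclasses import dataclass

# Variables this sketch treats as mandatory (an assumption, not a rule
# taken from the project's own validation logic).
REQUIRED = (
    "POSTGRES_HOST",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
)


@dataclass(frozen=True)
class ServerSettings:
    host: str
    main_port: int
    ner_port: int
    ocr_port: int
    rag_port: int


def load_server_settings(env=None):
    """Read the server variables, falling back to the ports used in this guide."""
    env = os.environ if env is None else env
    return ServerSettings(
        host=env.get("HOST", "0.0.0.0"),
        main_port=int(env.get("MAIN_PORT", "8000")),
        ner_port=int(env.get("NER_PORT", "8500")),
        ocr_port=int(env.get("OCR_PORT", "8400")),
        rag_port=int(env.get("RAG_PORT", "8401")),
    )


def missing_required(env=None, required=REQUIRED):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Both functions accept a plain dictionary in place of `os.environ`, which makes them easy to try without touching your real environment.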

### 3. Create Directory Structure

```bash
mkdir -p services exports logs temp tests data
```

### 4. Place Service Files

Ensure your service files are in the correct locations:
- `services/ner_service.py`
- `services/ocr_service.py`
- `services/rag_service.py`

## Service Management

### Using the Service Manager

The `manage_services.py` script starts, stops, tests, and reports on the services:

```bash
# Start individual services
python manage_services.py start ner
python manage_services.py start ocr
python manage_services.py start rag
python manage_services.py start unified

# Start all services
python manage_services.py start all

# Check status
python manage_services.py status

# Test services
python manage_services.py test ner
python manage_services.py test all

# Stop services
python manage_services.py stop all

# Restart services
python manage_services.py restart all

# List available services
python manage_services.py list
```

### Direct Service Management

Start services individually for development:

```bash
# Terminal 1: Start OCR service
cd services && python ocr_service.py

# Terminal 2: Start RAG service
cd services && python rag_service.py

# Terminal 3: Start NER service
cd services && python ner_service.py

# Terminal 4: Start unified application
python app.py
```

## Testing and Validation

### Comprehensive System Tests

```bash
# Run all tests
python test_unified.py

# Test output will show:
# ✅ Unified App Health Check
# ✅ Individual Service Health
# ✅ Unified Analysis (Text)
# ✅ Unified Analysis (URL)
# ✅ Combined Search
# ✅ Service Proxies
# ✅ File Upload (Unified)
# ✅ Service Discovery
# ✅ System Performance
# ✅ Error Handling
```

### Individual Service Tests

```bash
# Test NER service specifically
python test_ner.py

# Test RAG service specifically
python test_rag.py
```

### Quick Health Checks

```bash
# Check unified system
curl http://localhost:8000/health

# Check individual services
curl http://localhost:8500/health  # NER
curl http://localhost:8400/health  # OCR
curl http://localhost:8401/health  # RAG
```
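The same checks can be scripted when `curl` is not available. The sketch below uses only the standard library and assumes that each `/health` endpoint answers a plain GET with HTTP 200 when the service is up. The response format is not documented here, so the body is ignored. `SERVICES`, `health_url`, `is_healthy`, and `report` are hypothetical names introduced for illustration.

```python
import urllib.error
import urllib.request

# Ports as listed in this guide.
SERVICES = {
    "unified": 8000,
    "ner": 8500,
    "ocr": 8400,
    "rag": 8401,
}


def health_url(port, host="localhost"):
    """Build the health-check URL for a service port."""
    return f"http://{host}:{port}/health"


def is_healthy(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def report(opener=urllib.request.urlopen):
    """Map each service name to True (up) or False (down)."""
    return {
        name: is_healthy(health_url(port), opener=opener)
        for name, port in SERVICES.items()
    }
```

Calling `report()` with the services running returns a dictionary you can print or feed into a monitoring script. The `opener` parameter exists only so the function can be exercised without a network.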

## Interactive Demo

The demo script showcases all system capabilities:

```bash
python demo.py
```

The demo includes:
- Multi-language text analysis (Thai + English)
- Entity and relationship extraction
- RAG document indexing
- Combined search functionality
- Service proxy testing
- Real-time performance monitoring

## API Usage

### API Documentation

Once the services are running, open the interactive documentation:
- **Unified API**: http://localhost:8000/docs
- **NER Service**: http://localhost:8500/docs
- **OCR Service**: http://localhost:8400/docs
- **RAG Service**: http://localhost:8401/docs

### Key Endpoints

#### Unified Analysis
```text
# Analyze text with automatic RAG indexing
POST /analyze/unified
{
  "text": "Your text here...",
  "extract_relationships": true,
  "enable_rag_indexing": true,
  "rag_title": "Document Title"
}
```
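As a rough illustration of calling this endpoint from Python, the sketch below builds the POST request with the standard library. The payload fields come from the example above. The base URL assumes the Unified App is running locally on port 8000, and the assumption that the reply is JSON is not confirmed by this guide. `build_unified_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # Unified App port from this guide


def build_unified_request(text, title, base_url=BASE_URL):
    """Prepare a POST to /analyze/unified without sending it."""
    payload = {
        "text": text,
        "extract_relationships": True,
        "enable_rag_indexing": True,
        "rag_title": title,
    }
    return urllib.request.Request(
        url=f"{base_url}/analyze/unified",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and decode the reply, assuming a JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the system running, `send(build_unified_request("Your text here...", "Document Title"))` would submit the text for analysis. Separating the build step from the send step keeps the payload easy to inspect before anything goes over the network.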

#### Combined Search
```text
# Search with automatic NER enhancement
POST /search/combined
{
  "query": "search terms",
  "include_ner_analysis": true,
  "limit": 10
}
```

#### Service Proxies
```text
# Direct access to individual services
POST /ner/analyze/text      # NER analysis
POST /ocr/upload            # OCR processing
POST /rag/search            # RAG search
GET  /rag/documents         # List documents
```

## Health Monitoring

### System Status

```bash
# Get overall system health
curl http://localhost:8000/health

# Get detailed status
curl http://localhost:8000/status

# Discover available services
curl http://localhost:8000/services
```

### Service Monitoring

Each service provides health information:
- Response times
- Uptime
- Resource usage
- Configuration status
- Error rates

## Troubleshooting

### Common Issues

#### 1. Services Won't Start

**Check ports:**
```bash
netstat -an | grep :8000
netstat -an | grep :8500
netstat -an | grep :8400
netstat -an | grep :8401
```
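On systems without `netstat` or `grep` (Windows, for instance), a small Python check can answer the same question. This is a sketch using the standard `socket` module; `port_in_use` is a hypothetical helper name, and it only detects listeners that accept TCP connections on the given host.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports=(8000, 8500, 8400, 8401)):
    """Return the subset of the guide's service ports that are already taken."""
    return [port for port in ports if port_in_use(port)]
```

If `busy_ports()` returns a non-empty list before you start the system, another process is already bound to one of the ports the services expect.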

**Verify configuration:**
```bash
python configs.py
```

**Check dependencies:**
```bash
pip list | grep fastapi
pip list | grep asyncpg
```

#### 2. Database Connection Issues

**Test connection:**
```bash
# Use your actual connection details
python -c "
import asyncio
import asyncpg

async def test():
    conn = await asyncpg.connect('postgresql://user:pass@host:5432/db')
    print('Connected successfully')
    await conn.close()

asyncio.run(test())
"
```

**Common fixes:**
- Verify PostgreSQL is running
- Check firewall rules
- Confirm SSL requirements
- Validate credentials
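A frequent cause of credential errors is a password containing characters such as `@`, `/`, or `:` that break the connection URI. The sketch below assembles a `postgresql://` URI from the `POSTGRES_*` variables shown in the `.env` example and percent-encodes the user name and password. `build_dsn` is a hypothetical helper, not part of the project, and how the services actually build their connection string is not shown in this guide.

```python
from urllib.parse import quote


def build_dsn(env):
    """Assemble a postgresql:// URI with percent-encoded credentials."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST"]
    port = env.get("POSTGRES_PORT", "5432")
    database = env.get("POSTGRES_DATABASE", "postgres")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

Passing `os.environ` (after loading your `.env` file) yields a string you can substitute into the connection test above.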

#### 3. Azure Service Issues

**Check API keys:**
```bash
# Test Azure OpenAI (the embeddings endpoint expects a POST with a JSON body;
# YOUR_MODEL is the name of your embedding deployment)
curl -X POST -H "api-key: YOUR_KEY" -H "Content-Type: application/json" \
  -d '{"input": "test"}' \
  "YOUR_ENDPOINT/openai/deployments/YOUR_MODEL/embeddings?api-version=2024-02-01"

# Test Document Intelligence
curl -H "Ocp-Apim-Subscription-Key: YOUR_KEY" "YOUR_ENDPOINT/formrecognizer/info?api-version=2023-07-31"
```

**Common fixes:**
- Verify API keys are correct
- Check service regions
- Confirm quota limits
- Validate endpoint URLs

#### 4. Performance Issues

**Monitor resources:**
```bash
# Check system resources
top
htop
python manage_services.py status
```

**Common solutions:**
- Increase system memory
- Optimize database queries
- Reduce concurrent requests
- Check network latency

### Getting Help

1. **Check logs**: Services log to console
2. **Run health checks**: Use `/health` endpoints
3. **Validate configuration**: Run `python configs.py`
4. **Test individual services**: Use service manager
5. **Check database connectivity**: Test connection strings
6. **Verify Azure services**: Check API endpoints

### Debug Mode

Enable debug mode for detailed logging:

```bash
# In .env file
DEBUG=True

# Or set environment variable
export DEBUG=true
python app.py
```
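The snippet above spells the flag as both `True` and `true`. How the application parses `DEBUG` is not shown in this guide, so the following is only a sketch of one tolerant approach: a case-insensitive comparison against a small set of truthy strings. `TRUTHY` and `debug_enabled` are hypothetical names.

```python
import os

# Strings treated as "on"; this set is an assumption for illustration.
TRUTHY = {"1", "true", "yes", "on"}


def debug_enabled(env=None):
    """Return True when DEBUG is set to a truthy value, ignoring case."""
    env = os.environ if env is None else env
    return env.get("DEBUG", "").strip().lower() in TRUTHY
```

If your own code reads the flag, a comparison like this avoids the common mistake of `bool(os.environ["DEBUG"])`, which is true for any non-empty string, including `"false"`.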

## Production Deployment

### Security Considerations

1. **Environment Variables**: Use secure secret management
2. **HTTPS**: Enable SSL/TLS in production
3. **Authentication**: Implement API authentication
4. **Rate Limiting**: Add request rate limiting
5. **Input Validation**: Validate all input data

### Performance Optimization

1. **Caching**: Implement Redis caching
2. **Load Balancing**: Use reverse proxy (nginx)
3. **Database**: Optimize PostgreSQL configuration
4. **Monitoring**: Set up application monitoring
5. **Scaling**: Consider horizontal scaling

### Deployment Options

1. **Docker**: Containerize services
2. **Cloud**: Deploy to Azure/AWS/GCP
3. **Kubernetes**: Orchestrate with k8s
4. **CI/CD**: Automate deployments

## Next Steps

After successful setup:

1. **Explore the API**: Use the interactive documentation
2. **Try the demo**: Run `python demo.py`
3. **Run tests**: Execute `python test_unified.py`
4. **Monitor system**: Check health endpoints
5. **Customize**: Modify services for your needs
6. **Scale**: Consider production deployment

## Success Indicators

You know the system is working when:
- ✅ All health checks pass
- ✅ Tests complete successfully
- ✅ Demo runs without errors
- ✅ API documentation is accessible
- ✅ Services respond to requests
- ✅ Database connections work
- ✅ Azure integrations function
- ✅ File uploads process correctly
- ✅ Search returns results
- ✅ Export files generate properly

**Congratulations! Your Unified AI Services system is ready to use!**