description: docker, infrastructure
alwaysApply: false

# Infrastructure & DevOps Guidelines

## Docker & Containerization

### Dockerfile Best Practices
```dockerfile
# Multi-stage build pattern for production optimization
FROM node:18-alpine AS frontend-builder
WORKDIR /app
COPY frontend/package*.json ./
# Install devDependencies too: the build step below needs them
RUN npm ci
COPY frontend/ ./
RUN npm run build

FROM python:3.11-slim AS backend-base
WORKDIR /app
COPY backend/pyproject.toml ./
RUN pip install -e .

FROM backend-base AS backend-production
COPY backend/ ./

# Security best practices: run as a non-root user
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs
USER nextjs

EXPOSE 8000

# Health check implementation (uses Python because the slim image does not ship curl)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=2)" || exit 1

# gunicorn binds with --bind (it has no --host/--port flags); the uvicorn worker
# class assumes an ASGI app such as FastAPI and that uvicorn is installed
CMD ["gunicorn", "api:app", "--worker-class", "uvicorn.workers.UvicornWorker", "--bind", "0.0.0.0:8000"]
```
### Docker Compose Patterns

```yaml
# Production-ready docker-compose.yml
version: "3.8"

services:
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile
      target: production
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - NEXT_PUBLIC_SUPABASE_URL=${SUPABASE_URL}
    depends_on:
      - backend
    restart: unless-stopped
    networks:
      - app-network

  backend:
    build:
      context: ./backend
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
    volumes:
      - ./backend/logs:/app/logs
    depends_on:
      redis:
        condition: service_healthy
    restart: unless-stopped
    networks:
      - app-network

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
    restart: unless-stopped
    networks:
      - app-network

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro
    depends_on:
      - frontend
      - backend
    restart: unless-stopped
    networks:
      - app-network

volumes:
  redis-data:

networks:
  app-network:
    driver: bridge
```
## Environment Management

### Environment Configuration

```bash
# .env.local (development)
NODE_ENV=development
NEXT_PUBLIC_SUPABASE_URL=http://localhost:54321
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_anon_key
DATABASE_URL=postgresql://user:pass@localhost:5432/suna_dev
REDIS_URL=redis://localhost:6379
LOG_LEVEL=debug

# .env.production
NODE_ENV=production
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
DATABASE_URL=postgresql://user:pass@prod-db:5432/suna_prod
REDIS_URL=redis://prod-redis:6379
LOG_LEVEL=info
SENTRY_DSN=your_sentry_dsn
```
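
A startup check can catch a missing variable before the service begins handling traffic. The following is a minimal sketch, not part of the project: the `REQUIRED_VARS` tuple and the `missing_env_vars` helper are hypothetical names, and the list of required variables is an assumption based on the files above.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Hypothetical list of variables the backend needs; adjust to your deployment.
REQUIRED_VARS = ("DATABASE_URL", "REDIS_URL", "LOG_LEVEL")


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`.

    `env` defaults to the process environment; passing a plain dict makes
    the function easy to test.
    """
    source = os.environ if env is None else env
    return [name for name in required if not source.get(name, "").strip()]
```

A service entry point could call `missing_env_vars()` once at startup and exit with a clear message if the returned list is not empty.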
### Tool Version Management (mise.toml)

```toml
[tools]
node = "18.17.0"
python = "3.11.5"
docker = "24.0.0"
docker-compose = "2.20.0"

[env]
UV_VENV = ".venv"
PYTHON_KEYRING_BACKEND = "keyring.backends.null.Keyring"
```
### Environment-Specific Scripts

```bash
#!/bin/bash
# scripts/start-dev.sh
set -e

echo "Starting development environment..."

# Check if required tools are installed
command -v docker >/dev/null 2>&1 || { echo "Docker is required but not installed. Aborting." >&2; exit 1; }
command -v docker-compose >/dev/null 2>&1 || { echo "Docker Compose is required but not installed. Aborting." >&2; exit 1; }

# Start services
docker-compose -f docker-compose.dev.yml up -d
echo "✅ Development services started"

# Wait for services to be healthy
echo "Waiting for services to be ready..."
sleep 10

# Run database migrations
docker-compose -f docker-compose.dev.yml exec backend python -m alembic upgrade head
echo "✅ Database migrations completed"

echo "🚀 Development environment is ready!"
echo "Frontend: http://localhost:3000"
echo "Backend: http://localhost:8000"
echo "Redis: localhost:6379"
```
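
The fixed `sleep 10` in the script above may be too short on a slow machine and needlessly long on a fast one. One alternative is to poll the health endpoint until it answers. The sketch below uses only the standard library; `http_ok` and `wait_until` are hypothetical helper names, and the timeout and interval values are assumptions.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if `url` answers with a 2xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP error statuses all land here.
        return False


def wait_until(
    check: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

For example, `wait_until(lambda: http_ok("http://localhost:8000/health"))` returns `True` as soon as the backend responds, or `False` after roughly a minute, at which point the caller can abort before running migrations.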
## Deployment Strategies

### GitHub Actions CI/CD

```yaml
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "18"
          cache: "npm"
          cache-dependency-path: frontend/package-lock.json

      - name: Install backend dependencies
        run: |
          cd backend
          pip install -e .

      - name: Install frontend dependencies
        run: |
          cd frontend
          npm ci

      - name: Run backend tests
        run: |
          cd backend
          pytest

      - name: Run frontend tests
        run: |
          cd frontend
          npm run test

      - name: Lint code
        # Subshells keep each cd local, so the second command still starts at the repo root
        run: |
          (cd backend && python -m black --check .)
          (cd frontend && npm run lint)

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - name: Deploy to production
        run: |
          # Add your deployment commands here
          # e.g., kubectl apply, docker-compose up, etc.
          echo "Deploying to production..."
```
### Database Migration Management

```bash
#!/bin/bash
# scripts/migrate.sh
set -e

ENVIRONMENT=${1:-development}

echo "Running migrations for $ENVIRONMENT environment..."

case "$ENVIRONMENT" in
  development)
    docker-compose exec backend python -m alembic upgrade head
    ;;
  production)
    # Production migration: take a backup first (e.g. scripts/backup-database.sh).
    # No -it here, because scripts and CI jobs usually run without a TTY.
    kubectl exec backend-pod -- python -m alembic upgrade head
    ;;
  *)
    echo "Unknown environment: $ENVIRONMENT"
    exit 1
    ;;
esac

echo "✅ Migrations completed for $ENVIRONMENT"
```
## Monitoring & Observability

### Health Check Endpoints

```python
# backend/health.py
import os
import time

from fastapi import APIRouter, Depends
from fastapi.responses import JSONResponse
from redis import Redis
from sqlalchemy import text
from sqlalchemy.orm import Session

# Assumption: get_db and get_redis are dependency providers defined elsewhere
# in the project; this module path is hypothetical.
from dependencies import get_db, get_redis

router = APIRouter()


@router.get("/health")
def health_check(
    db: Session = Depends(get_db),
    redis_client: Redis = Depends(get_redis),
):
    """Comprehensive health check endpoint"""
    # Plain `def` lets FastAPI run the blocking DB and Redis calls in a thread pool
    start_time = time.time()
    checks = {
        "status": "healthy",
        "timestamp": start_time,
        "version": "1.0.0",
        "environment": os.getenv("NODE_ENV", "development"),
        "checks": {},
    }

    # Database health check
    try:
        db_start = time.perf_counter()
        db.execute(text("SELECT 1"))
        latency_ms = (time.perf_counter() - db_start) * 1000
        checks["checks"]["database"] = {"status": "healthy", "latency_ms": latency_ms}
    except Exception as e:
        checks["status"] = "unhealthy"
        checks["checks"]["database"] = {"status": "unhealthy", "error": str(e)}

    # Redis health check
    try:
        redis_client.ping()
        checks["checks"]["redis"] = {"status": "healthy"}
    except Exception as e:
        checks["status"] = "unhealthy"
        checks["checks"]["redis"] = {"status": "unhealthy", "error": str(e)}

    checks["response_time_ms"] = (time.time() - start_time) * 1000
    # Return 503 when unhealthy so HTTP-based probes actually fail
    status_code = 200 if checks["status"] == "healthy" else 503
    return JSONResponse(content=checks, status_code=status_code)


@router.get("/metrics")
async def metrics():
    """Basic runtime metrics as JSON (Prometheus itself scrapes a text format)"""
    # The get_* helpers are assumed to be defined elsewhere in the project
    return {
        "active_connections": get_active_connections(),
        "memory_usage_mb": get_memory_usage(),
        "cpu_usage_percent": get_cpu_usage(),
        "request_count": get_request_count(),
    }
```
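
The `/metrics` handler above returns a JSON object, whereas a Prometheus server scrapes a line-based text exposition format. As a rough sketch of the difference, the hypothetical helper below renders a flat dictionary of numbers in that format. Treating every value as a gauge and prefixing names with `app` are simplifying assumptions; a real service would more likely use an official Prometheus client library.

```python
from typing import Mapping


def render_prometheus(metrics: Mapping[str, float], prefix: str = "app") -> str:
    """Render numeric values as gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        metric = f"{prefix}_{name}"
        # Each metric gets a TYPE line followed by a "name value" sample line.
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric} {float(value)}")
    return "\n".join(lines) + "\n"


print(render_prometheus({"active_connections": 3, "memory_usage_mb": 128.5}), end="")
# prints:
# # TYPE app_active_connections gauge
# app_active_connections 3.0
# # TYPE app_memory_usage_mb gauge
# app_memory_usage_mb 128.5
```

The resulting string would be returned as plain text rather than JSON, for example through FastAPI's `PlainTextResponse`.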
### Logging Configuration

```python
# backend/utils/logging.py
import logging

import structlog


def setup_logging(environment: str = "development"):
    """Configure structured logging"""
    # filter_by_level and LoggerFactory build on the standard library logger,
    # so it needs a handler and a level before structlog can emit anything
    level = logging.INFO if environment == "production" else logging.DEBUG
    logging.basicConfig(format="%(message)s", level=level)

    processors = [
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
    ]

    # JSON lines in production, human-readable output everywhere else
    if environment == "production":
        processors.append(structlog.processors.JSONRenderer())
    else:
        processors.append(structlog.dev.ConsoleRenderer())

    structlog.configure(
        processors=processors,
        wrapper_class=structlog.stdlib.BoundLogger,
        logger_factory=structlog.stdlib.LoggerFactory(),
        cache_logger_on_first_use=True,
    )
```
## Security & Compliance

### Security Headers (Nginx)

```nginx
# nginx.conf security configuration

# Rate limiting zone: limit_req_zone is only valid in the http context,
# so it is declared outside the server block
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    listen 443 ssl http2;
    server_name your-domain.com;

    # SSL configuration
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    # AES256-GCM suites pair with SHA384
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;

    # Security headers
    add_header X-Frame-Options DENY always;
    add_header X-Content-Type-Options nosniff always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline';" always;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

    # Rate limiting
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://backend:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```
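
To build intuition for what `rate=10r/s` with `burst=20 nodelay` permits, the sketch below simulates a simplified leaky bucket for a single client. It is a model written for illustration, not nginx's implementation: the `LeakyBucket` class is hypothetical, and details such as millisecond rounding and shared-memory zone eviction are ignored.

```python
from typing import Optional


class LeakyBucket:
    """Simplified per-client model of a leaky-bucket rate limiter."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate            # requests drained per second
        self.burst = burst          # excess requests tolerated above the rate
        self.excess = 0.0           # current backlog, in requests
        self.last: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        if self.last is None:
            # First request from this client: nothing is queued yet.
            self.last = now
            return True
        drained = self.rate * (now - self.last)
        excess = max(self.excess - drained + 1.0, 0.0)
        if excess > self.burst:
            return False            # rejected; state is left unchanged
        self.excess = excess
        self.last = now
        return True


bucket = LeakyBucket(rate=10, burst=20)
accepted_at_once = sum(bucket.allow(0.0) for _ in range(25))
print(accepted_at_once)   # 21: the first request plus a burst of 20
accepted_later = sum(bucket.allow(1.0) for _ in range(15))
print(accepted_later)     # 10: one second at 10 r/s frees ten slots
```

In this model a client can send 21 requests back to back, after which further requests are rejected until the backlog drains at the configured rate.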
### Secrets Management

```bash
#!/bin/bash
# scripts/setup-secrets.sh
set -e

echo "Setting up secrets for production..."

# Create Kubernetes secrets (values are quoted so special characters in URLs and keys survive)
kubectl create secret generic suna-secrets \
  --from-literal=database-url="$DATABASE_URL" \
  --from-literal=redis-url="$REDIS_URL" \
  --from-literal=supabase-key="$SUPABASE_SERVICE_KEY" \
  --from-literal=openai-api-key="$OPENAI_API_KEY" \
  --dry-run=client -o yaml | kubectl apply -f -

echo "✅ Secrets configured"
```
## Performance & Scaling

### Load Balancing Configuration

```yaml
# kubernetes/ingress.yml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: suna-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # ingress-nginx rate limit: 100 requests per minute per client IP
    nginx.ingress.kubernetes.io/limit-rpm: "100"
spec:
  tls:
    - hosts:
        - suna.example.com
      secretName: suna-tls
  rules:
    - host: suna.example.com
      http:
        paths:
          - path: /api/
            pathType: Prefix
            backend:
              service:
                name: backend-service
                port:
                  number: 8000
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-service
                port:
                  number: 3000
```
### Auto-scaling Configuration

```yaml
# kubernetes/hpa.yml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
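
The autoscaler picks a replica count from the ratio of observed to target utilization, roughly `ceil(currentReplicas * current / target)` per metric, and then takes the largest proposal. The sketch below works through that arithmetic with the targets from the manifest above. The `desired_replicas` function is a hypothetical illustration, and the 10% tolerance is assumed to be the controller default. The real controller also accounts for pod readiness, missing metrics, and stabilization windows, none of which are modelled here.

```python
import math
from typing import Mapping


def desired_replicas(
    current_replicas: int,
    utilization: Mapping[str, float],
    targets: Mapping[str, float],
    min_replicas: int = 2,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA replica calculation for utilization metrics."""
    proposals = []
    for name, target in targets.items():
        ratio = utilization[name] / target
        if abs(ratio - 1.0) <= tolerance:
            # Close enough to the target: this metric proposes no change.
            proposals.append(current_replicas)
        else:
            proposals.append(math.ceil(current_replicas * ratio))
    # The largest proposal wins, clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, max(proposals)))


targets = {"cpu": 70, "memory": 80}
print(desired_replicas(4, {"cpu": 105, "memory": 40}, targets))  # 6
print(desired_replicas(8, {"cpu": 140, "memory": 80}, targets))  # 10 (capped)
```

With 4 replicas at 105% CPU against a 70% target, the CPU metric proposes `ceil(4 * 1.5) = 6` replicas while memory proposes fewer, so the result is 6.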
## Backup & Recovery

### Database Backup Strategy

```bash
#!/bin/bash
# scripts/backup-database.sh
# pipefail makes a failing pg_dump abort the script instead of uploading a truncated dump
set -euo pipefail

TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backups"
# Fail early with a clear message if DATABASE_URL is not set
: "${DATABASE_URL:?DATABASE_URL must be set}"

echo "Creating database backup..."

# Create backup with compression
pg_dump "$DATABASE_URL" | gzip > "$BACKUP_DIR/suna_backup_$TIMESTAMP.sql.gz"

# Upload to cloud storage (adjust for your provider)
aws s3 cp "$BACKUP_DIR/suna_backup_$TIMESTAMP.sql.gz" "s3://suna-backups/"

# Clean up local backups older than 7 days
find "$BACKUP_DIR" -name "suna_backup_*.sql.gz" -mtime +7 -delete

echo "✅ Database backup completed: suna_backup_$TIMESTAMP.sql.gz"
```
### Disaster Recovery Plan

```bash
#!/bin/bash
# scripts/restore-database.sh
set -euo pipefail

BACKUP_FILE=${1:-}

if [ -z "$BACKUP_FILE" ]; then
  echo "Usage: $0 <backup-file>"
  exit 1
fi

echo "Restoring database from $BACKUP_FILE..."

# Download backup from cloud storage
aws s3 cp "s3://suna-backups/$BACKUP_FILE" "/tmp/$BACKUP_FILE"

# Restore database; ON_ERROR_STOP makes psql exit on the first failed statement
gunzip -c "/tmp/$BACKUP_FILE" | psql --set ON_ERROR_STOP=1 "$DATABASE_URL"

echo "✅ Database restored from $BACKUP_FILE"
```
## Key Infrastructure Tools

### Container & Orchestration

- Docker 24+ for containerization
- Docker Compose for local development
- Kubernetes for production orchestration
- Helm for package management

### CI/CD & Automation

- GitHub Actions for CI/CD pipelines
- Terraform for infrastructure as code
- Ansible for configuration management
- ArgoCD for GitOps deployments

### Monitoring & Observability

- Prometheus for metrics collection
- Grafana for dashboards and visualization
- Jaeger for distributed tracing
- ELK stack for log aggregation

### Security & Compliance

- Vault for secrets management
- OWASP ZAP for security testing
- Trivy for container vulnerability scanning
- Falco for runtime security monitoring