# Backend Inference Service
FastAPI-based REST API for waste classification inference and feedback collection.
## Setup

### 1. Install Dependencies

```bash
pip install -r backend/requirements.txt
pip install -r ml/requirements.txt
```
### 2. Train or Download Model

Ensure you have a trained model at `ml/models/best_model.pth`:
```bash
# Train a model
python ml/train.py

# Or download a pretrained model (if available)
# and place it in ml/models/best_model.pth
```
### 3. Start Service

```bash
# Development
python backend/inference_service.py

# Production with Gunicorn
gunicorn backend.inference_service:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```
Service will be available at http://localhost:8000
## API Endpoints

### Health Check

```bash
GET /
GET /health
```

Response:

```json
{
  "status": "healthy",
  "model_loaded": true,
  "timestamp": "2024-01-01T00:00:00"
}
```
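For example, a deployment or smoke-test script could poll the health endpoint before sending traffic; a minimal sketch (assuming the service runs locally on port 8000):

```python
import requests

# Wait for the service and verify the model is loaded before routing traffic
health = requests.get("http://localhost:8000/health", timeout=5).json()
assert health["model_loaded"], "Model is not loaded yet"
```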
### Predict

```bash
POST /predict
Content-Type: application/json

{ "image": "data:image/jpeg;base64,/9j/4AAQ..." }
```

Response:

```json
{
  "category": "recyclable",
  "confidence": 0.95,
  "probabilities": {
    "recyclable": 0.95,
    "organic": 0.02,
    "wet-waste": 0.01,
    "dry-waste": 0.01,
    "ewaste": 0.005,
    "hazardous": 0.003,
    "landfill": 0.002
  },
  "timestamp": 1704067200000
}
```
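A minimal Python client for this endpoint might look like the following sketch; it assumes the service is running locally on port 8000 and that `sample.jpg` is an image on disk:

```python
import base64
import requests

# Encode a local image as a base64 data URL, the format /predict expects
with open("sample.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {"image": f"data:image/jpeg;base64,{image_b64}"}
response = requests.post("http://localhost:8000/predict", json=payload, timeout=30)
response.raise_for_status()

result = response.json()
print(result["category"], result["confidence"])
```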
### Feedback

```bash
POST /feedback
Content-Type: application/json

{
  "image": "data:image/jpeg;base64,/9j/4AAQ...",
  "predicted_category": "recyclable",
  "corrected_category": "organic",
  "confidence": 0.75
}
```

Response:

```json
{
  "status": "success",
  "message": "Feedback saved for retraining",
  "saved_path": "ml/data/retraining/organic/feedback_20240101_120000.jpg"
}
```
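Submitting a correction from the same client could look like this sketch (the field values are illustrative, and the base64 image is prepared exactly as in the predict example above):

```python
import base64
import requests

with open("sample.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Report a correction: the model said "recyclable" but the item was organic
feedback = {
    "image": f"data:image/jpeg;base64,{image_b64}",
    "predicted_category": "recyclable",
    "corrected_category": "organic",
    "confidence": 0.75,
}
resp = requests.post("http://localhost:8000/feedback", json=feedback, timeout=30)
resp.raise_for_status()
print(resp.json()["saved_path"])
```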
### Trigger Retraining

```bash
POST /retrain
Authorization: Bearer <ADMIN_API_KEY>
```

Response:

```json
{
  "status": "started",
  "message": "Retraining initiated with 150 new samples",
  "feedback_count": 150
}
```
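A sketch of triggering retraining from a script, assuming the admin key is available through the `ADMIN_API_KEY` environment variable described under Environment Variables:

```python
import os
import requests

# The retraining endpoint is protected; pass the admin key as a bearer token
headers = {"Authorization": f"Bearer {os.environ['ADMIN_API_KEY']}"}
resp = requests.post("http://localhost:8000/retrain", headers=headers, timeout=30)
print(resp.json())
```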
### Retraining Status

```bash
GET /retrain/status
```

Response:

```json
{
  "status": "success",
  "total_retrains": 3,
  "events": [...],
  "latest": {
    "version": 3,
    "timestamp": "2024-01-01T00:00:00",
    "accuracy": 92.5,
    "improvement": 2.3,
    "new_samples": 150
  }
}
```
### Statistics

```bash
GET /stats
```

Response:

```json
{
  "model_loaded": true,
  "categories": ["recyclable", "organic", ...],
  "feedback_samples": 150,
  "feedback_by_category": {
    "recyclable": 45,
    "organic": 38,
    ...
  }
}
```
## Docker Deployment

### Build and Run

```bash
# Build image
docker build -f backend/Dockerfile -t waste-classification-api .

# Run container
docker run -p 8000:8000 \
  -v $(pwd)/ml/models:/app/ml/models \
  -v $(pwd)/ml/data:/app/ml/data \
  waste-classification-api
```
### Using Docker Compose

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```
## Environment Variables

- `PORT`: Server port (default: 8000)
- `ADMIN_API_KEY`: Admin key for the retraining endpoint
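A minimal sketch of how these variables might be consumed when starting the service; the actual startup code in `backend/inference_service.py` may differ:

```python
import os

import uvicorn

# Read configuration from the environment, with sensible defaults
PORT = int(os.getenv("PORT", "8000"))
ADMIN_API_KEY = os.getenv("ADMIN_API_KEY")  # required to call POST /retrain

if __name__ == "__main__":
    # "app" is assumed to be the FastAPI instance defined in backend/inference_service.py
    uvicorn.run("backend.inference_service:app", host="0.0.0.0", port=PORT)
```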
## Performance
- Inference Time: ~50ms per image (CPU)
- Throughput: ~20 requests/second (single worker)
- Memory: ~500MB with model loaded
- Scaling: Deploy multiple workers for higher throughput
## Production Deployment

### Railway / Render

- Connect your repository
- Set build command: `pip install -r backend/requirements.txt -r ml/requirements.txt`
- Set start command: `python backend/inference_service.py`
- Set environment variables
- Deploy
### AWS EC2
- Launch EC2 instance (t3.medium or higher)
- Install Docker
- Clone repository
- Run with Docker Compose
- Configure security group (port 8000)
- Set up SSL with Nginx reverse proxy
### Vercel (Not Recommended)
FastAPI with ML models exceeds serverless function limits. Use Railway, Render, or AWS EC2 instead.
## Monitoring

Add application monitoring:

```python
from prometheus_fastapi_instrumentator import Instrumentator

Instrumentator().instrument(app).expose(app)
```

Access metrics at `/metrics`.
## Security

- Add rate limiting with `slowapi` (see the sketch below)
- Implement proper authentication
- Validate image sizes and formats
- Use HTTPS in production
- Restrict CORS origins
- Sanitize file uploads
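As a non-authoritative sketch of the rate-limiting and CORS points, `slowapi` and FastAPI's `CORSMiddleware` could be wired in roughly like this; the app and endpoint names mirror the service above, but the exact integration point and the frontend origin are assumptions:

```python
from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

app = FastAPI()

# Rate limiting: key requests by client IP address
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

# Restrict CORS to a known frontend origin instead of "*"
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://your-frontend.example.com"],  # placeholder origin
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)

@app.post("/predict")
@limiter.limit("20/minute")  # cap prediction requests per client IP
async def predict(request: Request):
    ...  # run inference as in backend/inference_service.py
```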