
GeoQuery Setup Guide

Complete guide for setting up the GeoQuery development environment.


Prerequisites

Required Software

| Requirement | Minimum Version | Purpose            |
|-------------|-----------------|--------------------|
| Python      | 3.11+           | Backend runtime    |
| Node.js     | 18+             | Frontend runtime   |
| npm         | 9+              | Package management |
| Git         | 2.0+            | Version control    |
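
The minimum versions listed above can be checked with a short script. This is a minimal sketch and not part of the GeoQuery repository: the helper names (`parse_version`, `meets_minimum`, `check_tools`) are hypothetical, and it assumes each tool prints its version in response to `--version`.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions from the prerequisites list (Python is checked separately).
MINIMUMS = {"node": (18,), "npm": (9,), "git": (2, 0)}


def parse_version(text):
    """Extract the first dotted number from text, e.g. 'v18.17.0' -> (18, 17, 0)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    found = tuple(found) + (0,) * (width - len(found))
    minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return found >= minimum


def check_tools():
    """Return {tool: bool}; False means the tool is missing or too old."""
    results = {"python": meets_minimum(sys.version_info[:3], (3, 11))}
    for tool, minimum in MINIMUMS.items():
        path = shutil.which(tool)
        if path is None:
            results[tool] = False
            continue
        output = subprocess.run(
            [path, "--version"], capture_output=True, text=True
        ).stdout
        try:
            results[tool] = meets_minimum(parse_version(output), minimum)
        except ValueError:
            results[tool] = False
    return results
```

Calling `check_tools()` returns a dictionary with one entry per tool, so any entry set to `False` points at a prerequisite to install or upgrade.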

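API Keys

  • GEMINI_API_KEY: Google AI API key, required by the backend (see Configure Environment Variables below)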

System Requirements

  • RAM: 4 GB minimum, 8 GB recommended (DuckDB runs as an in-memory database)
  • Disk: 2 GB for datasets
  • OS: macOS, Linux, or Windows (WSL recommended)

Installation

1. Clone Repository

git clone https://github.com/GerardCB/GeoQuery.git
cd GeoQuery

2. Backend Setup

Create Virtual Environment

cd backend
python3 -m venv venv

Activate Virtual Environment

macOS/Linux:

source venv/bin/activate

Windows (PowerShell):

venv\Scripts\Activate.ps1

Windows (CMD):

venv\Scripts\activate.bat

Install Dependencies

pip install --upgrade pip
pip install -e .

This installs the package in editable mode, including all dependencies from setup.py.

Key Dependencies:

  • fastapi - Web framework
  • uvicorn - ASGI server
  • duckdb - Embedded database
  • geopandas - Geospatial data processing
  • sentence-transformers - Embeddings
  • google-generativeai - Gemini SDK

Configure Environment Variables

Create a .env file in the backend/ directory:

# Required
GEMINI_API_KEY=your-api-key-here

# Optional (defaults shown)
PORT=8000
HOST=0.0.0.0
LOG_LEVEL=INFO

Alternative: export the variable directly in the terminal (macOS/Linux):

export GEMINI_API_KEY="your-api-key-here"

Windows (PowerShell):

$env:GEMINI_API_KEY="your-api-key-here"
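
To illustrate which variables are required and which fall back to defaults, here is a minimal sketch of a settings loader. It is an assumption for illustration only: this guide does not show how the backend actually reads its configuration, and `load_settings` is a hypothetical helper.

```python
import os


def load_settings(env=None):
    """Read backend settings from a mapping (defaults to os.environ).

    GEMINI_API_KEY is required; PORT, HOST and LOG_LEVEL use the
    defaults listed in this guide when they are not set.
    """
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required but not set")
    return {
        "gemini_api_key": api_key,
        "port": int(env.get("PORT", "8000")),
        "host": env.get("HOST", "0.0.0.0"),
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
    }
```

Passing a plain dictionary instead of `os.environ` makes the behavior easy to try out, for example `load_settings({"GEMINI_API_KEY": "test", "PORT": "8001"})`.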

Verify Backend Installation

python -c "import backend; print('Backend installed successfully')"

3. Frontend Setup

cd ../frontend  # From backend directory
npm install

Key Dependencies:

  • next - React framework
  • react - UI library
  • leaflet - Map library
  • react-leaflet - React bindings for Leaflet
  • @dnd-kit/core - Drag and drop

Configure Frontend (Optional)

Edit frontend/.env.local if the backend is not running on the default port:

NEXT_PUBLIC_API_URL=http://localhost:8000

Running Locally

Start Backend

From the backend/ directory, with the virtual environment activated:

uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

Flags:

  • --reload: Auto-restart on code changes
  • --host 0.0.0.0: Listen on all network interfaces, so other machines can connect
  • --port 8000: Port number

Expected Output:

INFO:     Uvicorn running on http://0.0.0.0:8000
INFO:     Application startup complete.

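Verify: running curl http://localhost:8000/api/catalog from another terminal should return the dataset catalog.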

Start Frontend

From frontend/ directory:

npm run dev

Expected Output:

▲ Next.js 15.1.3
- Local:        http://localhost:3000
- Ready in 2.1s

Verify: open http://localhost:3000 in a browser.


Database Setup

DuckDB Initialization

Automatic: Database is created in-memory on first query.

Manual Test:

from backend.core.geo_engine import get_geo_engine

engine = get_geo_engine()
print(f"Loaded tables: {list(engine.loaded_tables.keys())}")

Load Initial Datasets

Datasets are loaded lazily (on-demand). To pre-load common datasets:

from backend.core.geo_engine import get_geo_engine

engine = get_geo_engine()
engine.ensure_table_loaded("pan_admin1")  # Provinces
engine.ensure_table_loaded("panama_healthsites_geojson")  # Hospitals

Generate Embeddings

Required for semantic search:

cd backend
python -c "from backend.core.semantic_search import get_semantic_search; get_semantic_search()"

This generates backend/data/embeddings.npy (cached for future use).


Directory Structure After Setup

GeoQuery/
├── backend/
│   ├── venv/                   # Virtual environment (created)
│   ├── .env                    # Environment variables (created)
│   ├── data/
│   │   ├── embeddings.npy      # Generated embeddings (created)
│   │   ├── catalog.json        # Dataset registry (existing)
│   │   └── osm/                # GeoJSON datasets (existing)
│   └── <source files>
├── frontend/
│   ├── node_modules/           # npm packages (created)
│   ├── .next/                  # Build output (created)
│   └── <source files>
└── <other files>
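
The layout above can be checked programmatically once setup is finished. The following is a minimal sketch under stated assumptions: `missing_paths` is a hypothetical helper, and `frontend/.next/` is left out of the list on the assumption that it only appears after the frontend has been started or built.

```python
from pathlib import Path

# Paths from the tree above, relative to the repository root.
EXPECTED_PATHS = [
    "backend/venv",
    "backend/.env",
    "backend/data/embeddings.npy",
    "backend/data/catalog.json",
    "backend/data/osm",
    "frontend/node_modules",
]


def missing_paths(repo_root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]
```

Running `missing_paths(".")` from the repository root returns an empty list when every expected file and directory is in place.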

Common Issues & Troubleshooting

Backend Issues

Issue: "ModuleNotFoundError: No module named 'backend'"

Cause: Virtual environment not activated or package not installed.

Solution:

source venv/bin/activate  # Activate venv
pip install -e .          # Install package

Issue: "duckdb.IOException: No files found that match the pattern"

Cause: GeoJSON file missing or incorrect path in catalog.json.

Solution:

  1. Check file exists: ls backend/data/osm/hospitals.geojson
  2. Verify path in catalog.json
  3. Download missing data: python backend/scripts/download_geofabrik.py
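
Steps 1 and 2 can be combined into a single check over the whole catalog. This is a sketch, not a script shipped with the project: `find_missing_datasets` is a hypothetical helper, and it assumes catalog.json is an object keyed by dataset name whose `path` values are relative to the data directory (as in the catalog example in this guide).

```python
import json
from pathlib import Path


def find_missing_datasets(catalog_path, data_dir):
    """Return {dataset_name: path} for entries whose file does not exist.

    Assumes each catalog entry has a "path" relative to data_dir.
    Entries without a "path" are reported with a value of None.
    """
    catalog = json.loads(Path(catalog_path).read_text(encoding="utf-8"))
    missing = {}
    for name, entry in catalog.items():
        rel_path = entry.get("path")
        if rel_path is None or not (Path(data_dir) / rel_path).is_file():
            missing[name] = rel_path
    return missing
```

For example, `find_missing_datasets("backend/data/catalog.json", "backend/data")` lists every dataset that would trigger the error above when loaded.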

Issue: "google.api_core.exceptions.PermissionDenied: API key not valid"

Cause: Invalid or missing GEMINI_API_KEY.

Solution:

export GEMINI_API_KEY="your-actual-api-key"
# Restart backend

Issue: "Module 'sentence_transformers' has no attribute 'SentenceTransformer'"

Cause: Corrupted installation.

Solution:

pip uninstall sentence-transformers
pip install sentence-transformers --no-cache-dir

Frontend Issues

Issue: "Error: Cannot find module 'next'"

Cause: npm packages not installed.

Solution:

cd frontend
rm -rf node_modules package-lock.json
npm install

Issue: "Failed to fetch from localhost:8000"

Cause: Backend not running or CORS issue.

Solution:

  1. Verify backend is running: curl http://localhost:8000/api/catalog
  2. Check CORS settings in backend/main.py
  3. Verify NEXT_PUBLIC_API_URL in frontend .env.local

Issue: "Map tiles not loading"

Cause: Network issue or ad blocker.

Solution:

  1. Check internet connection
  2. Disable ad blocker for localhost
  3. Try an alternative tile server in MapViewer.tsx, for example:
    url="https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png"
    

General Issues

Issue: Port 8000 already in use

Solution:

# Find process using port
lsof -ti:8000

# Kill process
kill -9 $(lsof -ti:8000)

# Or use different port
uvicorn backend.main:app --port 8001
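
The lsof commands above are specific to macOS and Linux. As a cross-platform alternative, the following minimal sketch checks a port from Python using only the standard library; `is_port_in_use` and `first_free_port` are hypothetical helper names, not part of GeoQuery.

```python
import socket


def is_port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def first_free_port(start=8000, attempts=20):
    """Return the first port in [start, start + attempts) with no listener."""
    for port in range(start, start + attempts):
        if not is_port_in_use(port):
            return port
    raise RuntimeError(f"no free port found from {start} to {start + attempts - 1}")
```

The result of `first_free_port()` can then be passed to uvicorn with `--port`.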

Issue: Out of memory errors

Cause: Loading too many large datasets.

Solution:

  1. Reduce dataset size (filter before loading)
  2. Increase system RAM
  3. Add a LIMIT clause to queries, e.g. LIMIT 10000

Development Workflow

Code Changes

Backend:

  • With the --reload flag, Uvicorn restarts automatically when Python files change
  • Changes in core/, services/, api/ take effect after that restart

Frontend:

  • Hot Module Replacement (HMR) enabled
  • Changes in components/, app/ reload automatically

Adding New Datasets

  1. Add GeoJSON file to appropriate directory (e.g., backend/data/osm/)

  2. Update catalog.json:

    "my_new_dataset": {
      "path": "osm/my_new_dataset.geojson",
      "description": "Description for display",
      "semantic_description": "Detailed description for AI",
      "categories": ["infrastructure"],
      "tags": ["roads", "transport"]
    }
    
  3. Regenerate embeddings:

    rm backend/data/embeddings.npy
    python -c "from backend.core.semantic_search import get_semantic_search; get_semantic_search()"
    
  4. Test: Query for the new dataset

See docs/backend/SCRIPTS.md for data ingestion scripts.
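
Step 2 can also be scripted instead of editing catalog.json by hand. The sketch below is illustrative only: `add_catalog_entry` is a hypothetical helper, and treating all five fields from the example entry as required is an assumption, since the actual catalog schema is not documented here.

```python
import json
from pathlib import Path

# Fields taken from the example entry above; treating all of them
# as required is an assumption made for this sketch.
REQUIRED_FIELDS = ("path", "description", "semantic_description", "categories", "tags")


def add_catalog_entry(catalog_path, name, entry):
    """Add a dataset entry to catalog.json and return the updated catalog."""
    missing = [field for field in REQUIRED_FIELDS if field not in entry]
    if missing:
        raise ValueError(f"entry is missing fields: {', '.join(missing)}")
    path = Path(catalog_path)
    catalog = json.loads(path.read_text(encoding="utf-8"))
    if name in catalog:
        raise KeyError(f"dataset {name!r} already exists in the catalog")
    catalog[name] = entry
    path.write_text(
        json.dumps(catalog, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return catalog
```

After adding an entry this way, the embeddings still need to be regenerated as described in step 3.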

Testing API Endpoints

Using curl:

# Get catalog
curl http://localhost:8000/api/catalog

# Query chat endpoint
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Show me provinces", "history": []}'
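
The same chat request can be issued from Python with only the standard library. This is a minimal sketch: `build_chat_request` and `send` are hypothetical helpers, the payload mirrors the curl example above, and the response is returned as raw text because this guide does not document the response format.

```python
import json
import urllib.request


def build_chat_request(message, history=None, base_url="http://localhost:8000"):
    """Build a POST request for /api/chat with the same body as the curl example."""
    payload = {"message": message, "history": history or []}
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the response body as text.

    Requires the backend to be running; the body is not parsed here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the backend running, `send(build_chat_request("Show me provinces"))` performs the same call as the curl command.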

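Using Swagger UI: open http://localhost:8000/docs in a browser (FastAPI serves interactive API documentation at /docs by default).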


Environment Variables Reference

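| Variable       | Required | Default  | Description                                  |
|----------------|----------|----------|----------------------------------------------|
| GEMINI_API_KEY | ✅ Yes   | -        | Google AI API key                            |
| PORT           | ❌ No    | 8000     | Backend server port                          |
| HOST           | ❌ No    | 0.0.0.0  | Backend host                                 |
| LOG_LEVEL      | ❌ No    | INFO     | Logging level (DEBUG, INFO, WARNING, ERROR)  |
| DATABASE_PATH  | ❌ No    | :memory: | DuckDB database path (set for persistence)   |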

IDE Setup

VS Code

Recommended Extensions:

  • Python (ms-python.python)
  • Pylance (ms-python.vscode-pylance)
  • ESLint (dbaeumer.vscode-eslint)
  • Prettier (esbenp.prettier-vscode)

Settings (.vscode/settings.json):

{
  "python.defaultInterpreterPath": "./backend/venv/bin/python",
  "python.linting.enabled": true,
  "python.formatting.provider": "black",
  "editor.formatOnSave": true,
  "[typescript]": {
    "editor.defaultFormatter": "esbenp.prettier-vscode"
  }
}

PyCharm

  1. Set Python Interpreter: Settings → Project → Python Interpreter → Add → Existing Environment → backend/venv/bin/python
  2. Enable FastAPI: Settings → Languages & Frameworks → FastAPI
  3. Configure Run: Run → Edit Configurations → Add → Python → Script path: backend/main.py

Next Steps