GeoQuery Setup Guide

Complete guide for setting up the GeoQuery development environment.

Prerequisites

Required Software

| Requirement | Minimum Version | Purpose |
|---|---|---|
| Python | 3.11+ | Backend runtime |
| Node.js | 18+ | Frontend runtime |
| npm | 9+ | Package management |
| Git | 2.0+ | Version control |
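
The minimum versions above can be checked before starting the installation. The script below is a minimal sketch, not part of the GeoQuery repository: the helper names (`parse_version`, `meets_minimum`, `check_tool`) are hypothetical, and it assumes each tool prints its version in response to `--version`. On Windows the Python command is usually `python` rather than `python3`.

```python
import re
import shutil
import subprocess

# Minimum versions copied from the prerequisites table.
REQUIREMENTS = {
    "python3": (3, 11),
    "node": (18,),
    "npm": (9,),
    "git": (2, 0),
}


def parse_version(text):
    """Extract the first dotted version number from a tool's --version output."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    padded_found = found + (0,) * (width - len(found))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_found >= padded_minimum


def check_tool(command, minimum):
    """Return (ok, detail) for one command-line tool."""
    executable = shutil.which(command)
    if executable is None:
        return False, "not found on PATH"
    result = subprocess.run([executable, "--version"], capture_output=True, text=True)
    output = result.stdout or result.stderr
    try:
        found = parse_version(output)
    except ValueError:
        return False, f"could not parse {output.strip()!r}"
    return meets_minimum(found, minimum), ".".join(map(str, found))


if __name__ == "__main__":
    for command, minimum in REQUIREMENTS.items():
        ok, detail = check_tool(command, minimum)
        print(f"{'OK  ' if ok else 'FAIL'} {command}: {detail}")
```

Comparing padded tuples rather than strings avoids the classic mistake of treating "3.9" as newer than "3.11".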

API Keys

Google AI API Key (Gemini): Required for LLM functionality
- Get one free at: https://aistudio.google.com/app/apikey
- Free tier: 15 requests/minute, 1,500 requests/day

System Requirements

RAM: 4GB minimum, 8GB recommended (for DuckDB in-memory database)
Disk: 2GB for datasets
OS: macOS, Linux, or Windows (WSL recommended)

Installation

1. Clone Repository

git clone https://github.com/GerardCB/GeoQuery.git
cd GeoQuery

2. Backend Setup

Create Virtual Environment

cd backend
python3 -m venv venv

Activate Virtual Environment

macOS/Linux:

source venv/bin/activate

Windows (PowerShell):

venv\Scripts\Activate.ps1

Windows (CMD):

venv\Scripts\activate.bat

Install Dependencies

pip install --upgrade pip
pip install -e .

This installs the package in editable mode, including all dependencies from setup.py.

Key Dependencies:

fastapi - Web framework
uvicorn - ASGI server
duckdb - Embedded database
geopandas - Geospatial data processing
sentence-transformers - Embeddings
google-generativeai - Gemini SDK

Configure Environment Variables

Create a .env file in the backend/ directory:

# Required
GEMINI_API_KEY=your-api-key-here

# Optional (defaults shown)
PORT=8000
HOST=0.0.0.0
LOG_LEVEL=INFO

Alternative: Export directly in terminal:

export GEMINI_API_KEY="your-api-key-here"

Windows (PowerShell):

$env:GEMINI_API_KEY="your-api-key-here"
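
To make the relationship between the .env file, exported variables, and the defaults concrete, here is a small sketch using only the standard library. How the GeoQuery backend actually reads its configuration is not shown in this guide, so the function names and the precedence rule (exported variables override .env values, which override defaults) are assumptions for illustration.

```python
# Defaults taken from the "Optional (defaults shown)" block above.
DEFAULTS = {"PORT": "8000", "HOST": "0.0.0.0", "LOG_LEVEL": "INFO"}


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in KEY="value".
        values[key.strip()] = value.strip().strip("\"'")
    return values


def resolve_settings(file_values, environ):
    """Merge defaults, .env values, and the process environment (highest priority)."""
    settings = {**DEFAULTS, **file_values}
    for key in ("GEMINI_API_KEY", *DEFAULTS):
        if key in environ:
            settings[key] = environ[key]
    if not settings.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is not set; add it to backend/.env or export it")
    settings["PORT"] = int(settings["PORT"])
    return settings
```

A typical call would read backend/.env into a string, pass it to `parse_env_file`, and then call `resolve_settings` with the result and `os.environ`. Failing early on a missing key gives a clearer message than a rejected API call later.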

Verify Backend Installation

python -c "import backend; print('Backend installed successfully')"

3. Frontend Setup

cd ../frontend  # From backend directory
npm install

Key Dependencies:

next - React framework
react - UI library
leaflet - Map library
react-leaflet - React bindings for Leaflet
@dnd-kit/core - Drag and drop

Configure Frontend (Optional)

Create or edit frontend/.env.local if the backend is not running at the default address:

NEXT_PUBLIC_API_URL=http://localhost:8000

Running Locally

Start Backend

From backend/ directory with venv activated:

uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

Flags:

--reload: Auto-restart on code changes
--host 0.0.0.0: Listen on all network interfaces so other machines can connect (use 127.0.0.1 to restrict access to this machine)
--port 8000: Port number

Expected Output:

INFO:     Uvicorn running on http://0.0.0.0:8000
INFO:     Application startup complete.

Verify:

Open http://localhost:8000/docs → Should show FastAPI Swagger UI
Check http://localhost:8000/api/catalog → Should return GeoJSON catalog
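
The two checks above can also be run from a script, which is handy in CI or after a restart. This is a minimal sketch using only the standard library. The function names are hypothetical, and it only checks for HTTP 200 without inspecting the response body.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def check_backend(base_url="http://localhost:8000",
                  paths=("/docs", "/api/catalog"),
                  fetch=fetch_status):
    """Map each path to True when it answers with HTTP 200."""
    return {path: fetch(base_url.rstrip("/") + path) == 200 for path in paths}


if __name__ == "__main__":
    for path, ok in check_backend().items():
        print(f"{'OK  ' if ok else 'FAIL'} {path}")
```

The `fetch` parameter exists so the check can be exercised without a running server.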

Start Frontend

From frontend/ directory:

npm run dev

Expected Output:

▲ Next.js 15.1.3
- Local:        http://localhost:3000
- Ready in 2.1s

Verify:

Open http://localhost:3000 → Should show GeoQuery chat interface

Database Setup

DuckDB Initialization

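Automatic: the DuckDB database is created in memory on the first query, so no manual initialization is needed. To persist it to disk, set DATABASE_PATH (see Environment Variables Reference).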

Manual Test:

from backend.core.geo_engine import get_geo_engine

engine = get_geo_engine()
print(f"Loaded tables: {list(engine.loaded_tables.keys())}")

Load Initial Datasets

Datasets are loaded lazily (on-demand). To pre-load common datasets:

from backend.core.geo_engine import get_geo_engine

engine = get_geo_engine()
engine.ensure_table_loaded("pan_admin1")  # Provinces
engine.ensure_table_loaded("panama_healthsites_geojson")  # Hospitals
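
For readers unfamiliar with the pattern, lazy loading means a table is read only when it is first requested and is then served from memory. The class below is a generic illustration of that idea, not GeoQuery's implementation: `LazyTableRegistry` and `fake_loader` are hypothetical, and only the names `ensure_table_loaded` and `loaded_tables` are borrowed from the snippets above.

```python
class LazyTableRegistry:
    """Load each table at most once, the first time it is requested."""

    def __init__(self, loader):
        self._loader = loader      # callable: table name -> loaded object
        self.loaded_tables = {}    # cache of tables loaded so far

    def ensure_table_loaded(self, name):
        if name not in self.loaded_tables:
            # First request: pay the loading cost and remember the result.
            self.loaded_tables[name] = self._loader(name)
        return self.loaded_tables[name]


calls = []


def fake_loader(name):
    """Stand-in for reading a GeoJSON file into the database."""
    calls.append(name)
    return f"<table {name}>"


registry = LazyTableRegistry(fake_loader)
registry.ensure_table_loaded("pan_admin1")
registry.ensure_table_loaded("pan_admin1")  # cache hit, loader is not called again
print(list(registry.loaded_tables), calls)
```

Pre-loading, as shown above, simply moves the first-request cost to startup.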

Generate Embeddings

Required for semantic search:

cd backend
python -c "from backend.core.semantic_search import get_semantic_search; get_semantic_search()"

This generates backend/data/embeddings.npy (cached for future use).
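
Because the embeddings file is a cache, it can fall out of date when catalog.json changes. One way to detect that is to compare modification times, sketched below. Whether GeoQuery performs such a check itself is not stated in this guide, so treat `embeddings_are_stale` as a hypothetical helper.

```python
import os


def embeddings_are_stale(catalog_path, embeddings_path):
    """True when the cache is missing or older than the catalog it was built from."""
    if not os.path.exists(embeddings_path):
        return True
    return os.path.getmtime(embeddings_path) < os.path.getmtime(catalog_path)


if __name__ == "__main__":
    catalog = os.path.join("backend", "data", "catalog.json")
    embeddings = os.path.join("backend", "data", "embeddings.npy")
    if os.path.exists(catalog):
        print("stale" if embeddings_are_stale(catalog, embeddings) else "up to date")
```

Modification times are a heuristic: a fresh checkout or a copied file can reset them, so deleting the cache and regenerating it remains the reliable fallback.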

Directory Structure After Setup

GeoQuery/
├── backend/
│   ├── venv/                   # Virtual environment (created)
│   ├── .env                    # Environment variables (created)
│   ├── data/
│   │   ├── embeddings.npy      # Generated embeddings (created)
│   │   ├── catalog.json        # Dataset registry (existing)
│   │   └── osm/                # GeoJSON datasets (existing)
│   └── <source files>
├── frontend/
│   ├── node_modules/           # npm packages (created)
│   ├── .next/                  # Build output (created)
│   └── <source files>
└── <other files>

Common Issues & Troubleshooting

Backend Issues

Issue: "ModuleNotFoundError: No module named 'backend'"

Cause: Virtual environment not activated or package not installed.

Solution:

source venv/bin/activate  # Activate venv
pip install -e .          # Install package

Issue: "duckdb.IOException: No files found that match the pattern"

Cause: GeoJSON file missing or incorrect path in catalog.json.

Solution:

Check file exists: ls backend/data/osm/hospitals.geojson
Verify path in catalog.json
Download missing data: python backend/scripts/download_geofabrik.py
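
Checking every catalog entry at once is quicker than testing files one by one. The sketch below assumes catalog.json is a JSON object keyed by dataset name, with each entry's "path" relative to backend/data/ (as in the example entry under Adding New Datasets). The helper name is hypothetical.

```python
import json
import os


def find_missing_files(catalog, data_dir):
    """Return {dataset name: expected path} for entries whose file is absent."""
    missing = {}
    for name, entry in catalog.items():
        full_path = os.path.join(data_dir, entry["path"])
        if not os.path.isfile(full_path):
            missing[name] = full_path
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    data_dir = os.path.join("backend", "data")
    catalog_file = os.path.join(data_dir, "catalog.json")
    if os.path.isfile(catalog_file):
        with open(catalog_file, encoding="utf-8") as handle:
            catalog = json.load(handle)
        for name, path in find_missing_files(catalog, data_dir).items():
            print(f"{name}: missing {path}")
```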

Issue: "google.api_core.exceptions.PermissionDenied: API key not valid"

Cause: Invalid or missing GEMINI_API_KEY.

Solution:

export GEMINI_API_KEY="your-actual-api-key"
# Restart backend

Issue: "Module 'sentence_transformers' has no attribute 'SentenceTransformer'"

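Cause: Corrupted or partial installation, or a local file named sentence_transformers.py shadowing the package (rename that file if present).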

Solution:

pip uninstall sentence-transformers
pip install sentence-transformers --no-cache-dir

Frontend Issues

Issue: "Error: Cannot find module 'next'"

Cause: npm packages not installed.

Solution:

cd frontend
rm -rf node_modules package-lock.json
npm install

Issue: "Failed to fetch from localhost:8000"

Cause: Backend not running or CORS issue.

Solution:

Verify backend is running: curl http://localhost:8000/api/catalog
Check CORS settings in backend/main.py
Verify NEXT_PUBLIC_API_URL in frontend .env.local
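
A plain curl request succeeds even when CORS is misconfigured, because only browsers enforce it. To tell the two causes apart, one option is to send the preflight request a browser would send and inspect the response headers. The sketch below uses only the standard library. The function names are hypothetical, and the origin http://localhost:3000 is the frontend dev address from this guide.

```python
import urllib.error
import urllib.request


def cors_allows(response_headers, origin):
    """Check Access-Control-Allow-Origin for a request without credentials."""
    lowered = {key.lower(): value for key, value in response_headers.items()}
    allowed = lowered.get("access-control-allow-origin")
    return allowed is not None and allowed in ("*", origin)


def preflight_headers(url, origin, method="POST"):
    """Send an OPTIONS preflight like a browser would and return the reply headers."""
    request = urllib.request.Request(url, method="OPTIONS", headers={
        "Origin": origin,
        "Access-Control-Request-Method": method,
        "Access-Control-Request-Headers": "content-type",
    })
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return dict(response.headers.items())
    except urllib.error.HTTPError as error:
        # A 4xx reply still carries headers worth inspecting.
        return dict(error.headers.items())
```

With the backend running, `cors_allows(preflight_headers("http://localhost:8000/api/chat", "http://localhost:3000"), "http://localhost:3000")` returning False points at the CORS settings rather than at a stopped server. If the server is down, `preflight_headers` raises `urllib.error.URLError` instead.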

Issue: "Map tiles not loading"

Cause: Network issue or ad blocker.

Solution:

Check internet connection
Disable ad blocker for localhost

Alternative tile server in MapViewer.tsx:

url="https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png"

General Issues

Issue: Port 8000 already in use

Solution:

# Find process using port
lsof -ti:8000

# Kill process
kill -9 $(lsof -ti:8000)

# Or use a different port (then set NEXT_PUBLIC_API_URL in frontend/.env.local to match)
uvicorn backend.main:app --port 8001
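
The lsof commands above are specific to macOS and Linux. A portable alternative is to probe the port from Python, as in this sketch. The helper names are hypothetical, and the probe only detects listeners reachable on the given host.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1)
        # connect_ex returns 0 only when a listener accepted the connection.
        return probe.connect_ex((host, port)) != 0


def first_free_port(start=8000, attempts=20):
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, min(start + attempts, 65536)):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port found starting at {start}")


if __name__ == "__main__":
    print(f"Suggested backend port: {first_free_port()}")
```

There is an unavoidable gap between the probe and the moment uvicorn binds the port, so treat the result as a suggestion rather than a guarantee.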

Issue: Out of memory errors

Cause: Loading too many large datasets.

Solution:

Reduce dataset size (filter before loading)
Increase system RAM
Use query limits: LIMIT 10000

Development Workflow

Code Changes

Backend:

Python files auto-reload with --reload flag
Changes in core/, services/, api/ take effect immediately

Frontend:

Hot Module Replacement (HMR) enabled
Changes in components/, app/ reload automatically

Adding New Datasets

1. Add the GeoJSON file to the appropriate directory (e.g., backend/data/osm/).

2. Update catalog.json with an entry for the dataset:

"my_new_dataset": {
  "path": "osm/my_new_dataset.geojson",
  "description": "Description for display",
  "semantic_description": "Detailed description for AI",
  "categories": ["infrastructure"],
  "tags": ["roads", "transport"]
}

3. Regenerate the embeddings (from the repository root):

rm backend/data/embeddings.npy
python -c "from backend.core.semantic_search import get_semantic_search; get_semantic_search()"

4. Test: query for the new dataset.

See docs/backend/SCRIPTS.md for data ingestion scripts.
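
The catalog update and cache reset described above could be scripted roughly as follows. This is a sketch under assumptions, not a script shipped with the project: `register_dataset` is a hypothetical helper, and treating all five keys from the example entry as required is a guess based on that example.

```python
import json
import os

# Keys shown in the example entry; whether all are mandatory is an assumption.
REQUIRED_KEYS = ("path", "description", "semantic_description", "categories", "tags")


def register_dataset(catalog_path, name, entry, embeddings_path):
    """Add one entry to catalog.json and drop the embeddings cache so it is rebuilt."""
    missing = [key for key in REQUIRED_KEYS if key not in entry]
    if missing:
        raise ValueError(f"entry for {name!r} is missing keys: {missing}")

    with open(catalog_path, encoding="utf-8") as handle:
        catalog = json.load(handle)
    catalog[name] = entry
    with open(catalog_path, "w", encoding="utf-8") as handle:
        json.dump(catalog, handle, indent=2, ensure_ascii=False)

    # Equivalent to the rm step: the cache is regenerated on next use.
    if os.path.exists(embeddings_path):
        os.remove(embeddings_path)
```

Validating before writing means a malformed entry never reaches catalog.json, and the embeddings cache is left untouched in that case.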

Testing API Endpoints

Using curl:

# Get catalog
curl http://localhost:8000/api/catalog

# Query chat endpoint
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Show me provinces", "history": []}'
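
The same chat request can be sent from Python without extra dependencies. The sketch below mirrors the curl call. The function name is hypothetical, and the response format of /api/chat is not documented here, so the example prints the raw body.

```python
import json
import urllib.error
import urllib.request


def build_chat_request(message, history=None, base_url="http://localhost:8000"):
    """Build the same POST request as the curl example, without sending it."""
    payload = {"message": message, "history": history or []}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_chat_request("Show me provinces")
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            print(response.read().decode("utf-8"))
    except urllib.error.URLError as error:
        print(f"Request failed: {error}")
```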

Using Swagger UI:

Open http://localhost:8000/docs
Try endpoints interactively

Environment Variables Reference

| Variable | Required | Default | Description |
|---|---|---|---|
| `GEMINI_API_KEY` | ✅ Yes | - | Google AI API key |
| `PORT` | ❌ No | 8000 | Backend server port |
| `HOST` | ❌ No | 0.0.0.0 | Backend host |
| `LOG_LEVEL` | ❌ No | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
| `DATABASE_PATH` | ❌ No | :memory: | DuckDB database path (use for persistence) |

IDE Setup

VS Code

Recommended Extensions:

Python (ms-python.python)
Pylance (ms-python.vscode-pylance)
ESLint (dbaeumer.vscode-eslint)
Prettier (esbenp.prettier-vscode)

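Settings (.vscode/settings.json). Current versions of the Python extension no longer read python.linting.* or python.formatting.* settings, so formatting is configured per language; the [python] entry assumes the Black Formatter extension (ms-python.black-formatter) is installed:

{
  "python.defaultInterpreterPath": "./backend/venv/bin/python",
  "editor.formatOnSave": true,
  "[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter"
  },
  "[typescript]": {
    "editor.defaultFormatter": "esbenp.prettier-vscode"
  }
}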

PyCharm

Set Python Interpreter: Settings → Project → Python Interpreter → Add → Existing Environment → backend/venv/bin/python
Enable FastAPI: Settings → Languages & Frameworks → FastAPI
Configure Run: Run → Edit Configurations → Add → Python → Script path: backend/main.py

Next Steps

✅ Verify installation by running a test query
📖 Read ARCHITECTURE.md to understand the system
🔧 Explore docs/backend/CORE_SERVICES.md for component details
📊 Review docs/data/DATASET_SOURCES.md for available data