---
title: SentinelAI
emoji: π
colorFrom: red
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: SentinelAI – Autonomous Multi-Agent AI SOC
---
This Hugging Face Space runs the FastAPI control plane in Docker with `SKIP_DB=1` for demo-grade startup (no bundled PostgreSQL/Redis); the API is served on port 7860. The SOC dashboard (static Next.js behind FastAPI) lives at `/ui/`: opening the Space URL redirects from `/` to the deck, and the API docs stay at `/docs`.
# SentinelAI – Autonomous Multi-Agent AI SOC
SentinelAI is a hackathon-grade, production-shaped autonomous Security Operations Center. It continuously ingests telemetry through collector agents, normalizes and enriches events, runs multi-modal detection (rules + heuristics + optional LLM reasoning on AMD ROCm), correlates attack chains, scores risk, drafts analyst narratives, emits remediation, and fans out alerts, while a Next.js 15 command deck visualizes live operations.
## Powered by AMD ROCm compute

- Local open models: wire `OLLAMA_HOST` to an Ollama instance backed by AMD ROCm on Linux (`ollama/ollama:rocm` in `docker/docker-compose.yml` comments).
- Parallel agents: FastAPI + `asyncio` execute enrichment, detection, correlation, and analyst tasks concurrently; GPU inference accelerates the analyst LLM path without shipping prompts to a proprietary SaaS.
- Throughput: ROCm lowers per-token latency for Llama 3, Qwen 2.5, Mistral, or DeepSeek-class models so multiple agents can reason on overlapping incidents.
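The parallel-agent pattern above can be sketched with plain `asyncio`. The `enrich`/`detect` stubs below are hypothetical stand-ins for the real agents under `agents/`; the point is the `asyncio.gather` fan-out the control plane relies on:

```python
import asyncio

# Hypothetical agent stubs; the real implementations live under agents/.
async def enrich(event: dict) -> dict:
    await asyncio.sleep(0)  # yield control, like real I/O-bound enrichment
    return {**event, "intel": "clean"}

async def detect(event: dict) -> dict:
    await asyncio.sleep(0)
    return {**event, "severity": "high" if "failed" in event["msg"] else "low"}

async def run_pipeline(events: list[dict]) -> list[dict]:
    # Fan out each stage over all events concurrently, stage by stage.
    enriched = await asyncio.gather(*(enrich(e) for e in events))
    return list(await asyncio.gather(*(detect(e) for e in enriched)))

results = asyncio.run(run_pipeline([{"msg": "failed password"}, {"msg": "session opened"}]))
```

Because the agents are coroutines rather than threads, adding a slow GPU-backed analyst stage does not block collection or detection.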
## Architecture

```mermaid
flowchart TB
    subgraph Infra[Infrastructure]
        L[Linux auth/syslog]
        D[Docker / K8s / Cloud mocks]
    end
    C[Collector Agent]
    P[Parser Agent]
    N[Normalization Agent]
    E[Threat Enrichment Agent]
    T[Threat Detection Agent]
    G[LangGraph orchestration]
    X[Incident Correlation Agent]
    R[Risk Scoring Agent]
    A[AI Analyst Agent]
    M[Remediation Agent]
    AL[Alerting Agent]
    DB[(PostgreSQL)]
    RD[(Redis)]
    V[(Chroma optional)]
    UI[Next.js 15 Dashboard]

    Infra --> C
    C --> P --> N --> E --> T
    T --> G
    G --> X --> R --> A --> M
    E -.intel.-> V
    T --> DB
    X --> DB
    AL --> RD
    A --> UI
    T --> UI
```
## Repository layout

| Path | Role |
|---|---|
| `frontend/` | Next.js 15 + Tailwind + shadcn + Framer Motion SOC deck |
| `backend/app/main.py` | FastAPI control plane + WebSockets |
| `agents/` | Threat, risk, analyst, remediation, alerting logic |
| `collectors/` | Autonomous async tailing collectors |
| `parsers/` | Log → structured `SecurityEvent` |
| `workflows/` | LangGraph multi-agent DAG |
| `database/` | SQLAlchemy models + async session |
| `models/` | Shared Pydantic schemas |
| `services/` | Pipeline, hub, metrics, optional Chroma |
| `docker/` | Compose + GPU-ready notes |
| `scripts/` | Demo attack replay |
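To make the `parsers/` → `models/` hand-off concrete, here is a minimal sketch of a parser emitting a structured event. Field names and the parsing rule are illustrative; the real schema is the Pydantic `SecurityEvent` in `models/`:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SecurityEvent:
    # Illustrative fields only; see models/ for the actual Pydantic schema.
    source: str                     # e.g. "auth.log"
    message: str                    # raw log line, whitespace-stripped
    severity: str = "info"
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def parse_auth_line(line: str) -> SecurityEvent:
    # Toy rule: flag failed SSH logins, pass everything else through as info.
    sev = "warning" if "Failed password" in line else "info"
    return SecurityEvent(source="auth.log", message=line.strip(), severity=sev)

evt = parse_auth_line("Jan 01 sshd[42]: Failed password for root from 10.0.0.5\n")
```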
## Quick start (local)

```bash
cd SentinelAI
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# optional: start postgres + redis, or export SKIP_DB=1 for demo-only persistence
export PYTHONPATH=$PWD
export SKIP_DB=1  # remove when PostgreSQL is available
./scripts/run-backend-dev.sh
```
Use `./scripts/run-backend-dev.sh` instead of `uvicorn ... --reload` from the repo root: reloading the whole tree also watches `.venv/site-packages` and can restart endlessly. The script scopes `--reload-dir` to Python source folders only.
```bash
cd frontend
npm install
export NEXT_PUBLIC_API_URL=http://127.0.0.1:8000
npm run dev:22
```
Use `npm run dev:22` (Node 22) if `npm run dev` fails with a Next.js semver error on newer Node versions.
Replay the scripted attack chain:

```bash
python scripts/demo_attack.py
```
Continuous demo stream (keeps generating traffic for judges):

```bash
python scripts/continuous_demo.py
```
Linux auth.log (production-style): set `COLLECT_AUTH_LOG=1` (and optionally `AUTH_LOG_PATH`) or add paths to `COLLECTOR_FILE_PATHS`. The collector waits until the file exists and tails new lines asynchronously.
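The wait-then-tail behavior can be sketched in a few lines of `asyncio`. This is a simplified poll-based tail, not the repo's collector; the `max_lines` bound exists only so the demo terminates:

```python
import asyncio
import os
import tempfile

async def tail_file(path: str, out: list[str], *, poll: float = 0.05, max_lines: int = 2) -> None:
    # Wait until the file exists, tolerating logs that appear later.
    while not os.path.exists(path):
        await asyncio.sleep(poll)
    with open(path) as fh:
        while len(out) < max_lines:      # the real collector loops forever
            line = fh.readline()
            if line:
                out.append(line.rstrip("\n"))
            else:
                await asyncio.sleep(poll)  # no new data yet; yield and retry

async def demo() -> list[str]:
    path = os.path.join(tempfile.mkdtemp(), "auth_demo.log")
    lines: list[str] = []
    tail = asyncio.create_task(tail_file(path, lines))
    await asyncio.sleep(0.1)             # tail is now polling for the file
    with open(path, "a") as fh:          # file appears; collector picks it up
        fh.write("Failed password for root\nAccepted publickey for admin\n")
    await tail
    return lines

captured = asyncio.run(demo())
```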
Attack replay (WOW): after traffic has populated the buffer, call `POST /replay/start` with `{"delay_ms": 420}` or use the dashboard "Replay last chain" button to re-broadcast buffered detections/incidents over WebSockets.
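Driving the replay endpoint from Python needs only the standard library. This sketch builds the request but does not send it, since it assumes a backend listening at `127.0.0.1:8000`:

```python
import json
from urllib import request

def replay_request(base_url: str, delay_ms: int = 420) -> request.Request:
    # Build the POST /replay/start call described above (not sent here).
    body = json.dumps({"delay_ms": delay_ms}).encode()
    return request.Request(
        f"{base_url}/replay/start",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = replay_request("http://127.0.0.1:8000")
# request.urlopen(req)  # uncomment with the backend running
```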
vLLM / OpenAI-compatible inference: set `VLLM_BASE_URL` (or `OPENAI_BASE_URL`) and `SENTINEL_LLM_MODEL` to your served model; analyst reports use `/v1/chat/completions` before falling back to Ollama.
The UI listens on `NEXT_PUBLIC_API_URL` and opens a WebSocket to `/live-events`.
## Docker Compose

```bash
docker compose -f docker/docker-compose.yml up --build
```
- API: http://localhost:8000
- UI: http://localhost:3000
- Uncomment the `ollama` service for ROCm hosts and align `OLLAMA_HOST`.
Install optional vector memory:

```bash
pip install -r requirements-optional.txt
```
## Required API surface

| Endpoint | Description |
|---|---|
| `POST /ingest-logs` | Push raw logs / JSON events |
| `WS /live-events` | Real-time detections + incidents |
| `POST /detect-threats` | Parser → enrich → detect |
| `POST /correlate-incidents` | Recompute chains |
| `POST /generate-summary` | Body: `{ "incident_id": "..." }` |
| `POST /remediation` | Body: `{ "incident_id": "..." }` |
| `POST /send-alert` | Slack / Discord / Teams / webhook |
| `GET /dashboard-metrics` | KPIs for the deck |
| `POST /replay/start` | Re-stream buffered threat frames to WebSocket clients |
| `GET /replay-buffer` | Inspect replay buffer (debug) |
| `GET /rocm-panel` | AMD ROCm story + simulated GPU/agent load for the UI |
## Open-source model matrix
| Role | Suggested weights |
|---|---|
| Reasoning | Llama 3, Qwen 2.5, DeepSeek, Mistral |
| Vision (future) | Qwen-VL, LLaVA for phishing/malware screenshots |
| Embeddings | BGE, E5 (plug into Chroma ingestion) |
Set `SENTINEL_LLM_MODEL` to the tag served by your ROCm Ollama runtime.
## Live demo script (judges)

- Start stack – Docker Compose or local `uvicorn` + `npm run dev`.
- Show autonomous collection – tail `demo_logs/auth_demo.log` without manual uploads.
- Fire demo – `python scripts/demo_attack.py` or the in-UI "Simulate attack chain" button.
- Narrate agents – Collector → Parser → Normalization → Enrichment → Detection → LangGraph hop → Correlation → Risk → (optional) Analyst LLM on ROCm.
- Pivot to response – call `/remediation` + `/send-alert` with a webhook sink.
- Close with differentiation – autonomous agents, not a chatbot; on-prem models on AMD GPUs; evidence in PostgreSQL.
## Pitch deck outline (copy into Slides / Gamma)

- Problem – SOC teams drown in telemetry; correlation is manual; cloud-only AI breaks data residency.
- Solution – SentinelAI fuses autonomous collectors, graph-based correlation, and open-weight LLMs.
- Why now – AMD ROCm makes on-prem inference cost-viable; LangGraph standardizes agent choreography.
- Demo – live WebSocket feed + incident graph + analyst summary.
- Moat – modular agents, MITRE mapping, optional TI hooks, Terraform-ready remediation stubs.
- Ask – design partners for managed SOC + on-prem appliance.
## Demo & pitch (read before presenting)

- Exact demo steps: `docs/DEMO_SCRIPT.md`
- One-line pitch: `docs/PITCH.md`
- Backup recording: `docs/RECORDING_CHECKLIST.md`
- AMD panel API: `GET /rocm-panel` (drives the "Powered by AMD ROCm" dashboard section)
## Judge explanation notes
- Autonomy: collectors run continuously; pipeline executes without human prompts.
- Multi-agent: LangGraph DAG + discrete services per concern (enrichment vs detection vs correlation).
- Enterprise UX: glassmorphism SOC deck, severity analytics, world heatmap, terminal channel.
- Honest scope: optional APIs (AbuseIPDB, VT, OTX) degrade gracefully; LLM path falls back to deterministic narratives if Ollama is offline.
## Security notice
This repository ships defensive tooling and demo payloads. Only run against systems you own or have permission to test.