cacodex committed on
Commit c74679b · verified · 1 Parent(s): 85201c3

Upload 14 files
.dockerignore ADDED
@@ -0,0 +1,9 @@
+ __pycache__/
+ .pytest_cache/
+ .pytest_tmp/
+ *.pyc
+ *.pyo
+ *.sqlite3
+ .smoke.sqlite3
+ uvicorn.log
+ uvicorn.err.log
.env.example ADDED
@@ -0,0 +1,9 @@
+ PASSWORD=change-me
+ SESSION_SECRET=change-me-too
+ GATEWAY_API_KEY=
+ NVIDIA_API_BASE=https://integrate.api.nvidia.com/v1
+ NVIDIA_NIM_API_KEY=
+ HEALTHCHECK_INTERVAL_MINUTES=60
+ HEALTHCHECK_PROMPT=Reply with the single word OK.
+ PUBLIC_HISTORY_HOURS=48
+ DATABASE_PATH=./data.sqlite3
Dockerfile ADDED
@@ -0,0 +1,16 @@
+ FROM python:3.13-slim
+
+ ENV PYTHONDONTWRITEBYTECODE=1 \
+     PYTHONUNBUFFERED=1 \
+     PORT=7860
+
+ WORKDIR /app
+
+ COPY requirements.txt ./
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ EXPOSE 7860
+
+ CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,112 @@
  ---
- title: N2r
- emoji: 🏢
- colorFrom: pink
- colorTo: purple
+ title: NVIDIA NIM Responses Gateway
  sdk: docker
+ app_port: 7860
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # NVIDIA NIM Responses Gateway
+
+ A FastAPI gateway that converts NVIDIA NIM's official chat endpoint:
+
+ `https://integrate.api.nvidia.com/v1/chat/completions`
+
+ into an OpenAI-style `/v1/responses` interface, with:
+
+ - tool calling / function calling passthrough
+ - `previous_response_id` conversation chaining
+ - `/v1/models` model listing
+ - a public model health dashboard
+ - an admin SPA for model management, NVIDIA NIM key management, health checks, and scheduler settings
+ - Docker packaging for Hugging Face Spaces
+
+ ## Included NVIDIA models
+
+ The app seeds these models on first startup:
+
+ - `z-ai/glm5`
+ - `minimaxai/minimax-m2.5`
+ - `moonshotai/kimi-k2.5`
+ - `deepseek-ai/deepseek-v3.2`
+ - `google/gemma-4-31b-it`
+ - `qwen/qwen3.5-397b-a17b`
+
+ You can add or remove models from the admin page.
+
+ ## Routes
+
+ - `GET /` public health dashboard
+ - `GET /admin` admin SPA
+ - `GET /api/health/public` public hourly health data
+ - `GET /v1/models` OpenAI-style model list
+ - `POST /v1/responses` OpenAI-style responses endpoint
+ - `GET /v1/responses/{response_id}` retrieve a stored response
+
+ Admin API:
+
+ - `POST /admin/api/login`
+ - `GET /admin/api/overview`
+ - `GET/POST/DELETE /admin/api/models...`
+ - `GET/POST/DELETE /admin/api/keys...`
+ - `GET /admin/api/healthchecks`
+ - `POST /admin/api/healthchecks/run`
+ - `GET/PUT /admin/api/settings`
+
+ ## Environment variables
+
+ - `PASSWORD` required for admin login
+ - `SESSION_SECRET` optional cookie-signing secret; falls back to `PASSWORD`
+ - `GATEWAY_API_KEY` optional bearer token to protect `/v1/models` and `/v1/responses`
+ - `NVIDIA_API_BASE` defaults to `https://integrate.api.nvidia.com/v1`
+ - `NVIDIA_NIM_API_KEY` optional bootstrap key inserted on first startup
+ - `HEALTHCHECK_INTERVAL_MINUTES` default `60`
+ - `HEALTHCHECK_PROMPT` default `Reply with the single word OK.`
+ - `PUBLIC_HISTORY_HOURS` default `48`
+ - `DATABASE_PATH` default `./data.sqlite3`
+
+ A starter file is available at `.env.example`.
+
+ ## Local run
+
+ Install runtime dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ For local verification with the smoke script:
+
+ ```bash
+ pip install -r requirements-dev.txt
+ python scripts/local_smoke_test.py
+ ```
+
+ Run the app:
+
+ ```bash
+ uvicorn app.main:app --host 0.0.0.0 --port 7860
+ ```
+
+ ## Hugging Face Space deployment
+
+ This repository is prepared as a Docker Space.
+
+ 1. Create a new Hugging Face Space with `SDK: Docker`.
+ 2. Push this repository to the Space.
+ 3. Add Space secrets for at least `PASSWORD` and one NVIDIA NIM key.
+ 4. Open `/admin`, add or verify the stored keys, then run health checks.
+
+ ## Notes on API compatibility
+
+ - The gateway accepts OpenAI-style `input` payloads and converts them to chat-completions `messages`.
+ - Function tools are mapped to NVIDIA NIM's OpenAI-compatible `tools` format.
+ - Returned tool calls are exposed as `function_call` items inside the `output` array.
+ - `stream: true` is supported as SSE, but the current implementation emits buffered response events after the upstream completion finishes.
+
+ ## References
+
+ - OpenAI Responses API guide: https://platform.openai.com/docs/guides/responses-vs-chat-completions
+ - OpenAI function calling guide: https://platform.openai.com/docs/guides/function-calling
+ - NVIDIA Build portal: https://build.nvidia.com/
+ - NVIDIA NIM API reference: https://docs.api.nvidia.com/
app/__init__.py ADDED
@@ -0,0 +1 @@
+ """NVIDIA NIM to OpenAI Responses gateway."""
app/__pycache__/__init__.cpython-313.pyc ADDED
Binary file (199 Bytes).
app/__pycache__/main.cpython-313.pyc ADDED
Binary file (79.4 kB).
app/main.py ADDED
@@ -0,0 +1,1314 @@
+ from __future__ import annotations
+
+ import json
+ import os
+ import sqlite3
+ import time
+ import uuid
+ from contextlib import asynccontextmanager
+ from datetime import UTC, datetime, timedelta
+ from pathlib import Path
+ from typing import Any
+
+ import httpx
+ from apscheduler.schedulers.asyncio import AsyncIOScheduler
+ from fastapi import Depends, FastAPI, Header, HTTPException, Request, Response, status
+ from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
+ from fastapi.staticfiles import StaticFiles
+ from itsdangerous import BadSignature, SignatureExpired, URLSafeTimedSerializer
+
+
+ BASE_DIR = Path(__file__).resolve().parent.parent
+ STATIC_DIR = BASE_DIR / "static"
+ DB_PATH = Path(os.getenv("DATABASE_PATH", BASE_DIR / "data.sqlite3"))
+ RAW_NVIDIA_API_BASE = os.getenv("NVIDIA_API_BASE", os.getenv("NIM_BASE_URL", "https://integrate.api.nvidia.com/v1")).rstrip("/")
+ NVIDIA_API_BASE = RAW_NVIDIA_API_BASE if RAW_NVIDIA_API_BASE.endswith("/v1") else f"{RAW_NVIDIA_API_BASE}/v1"
+ CHAT_COMPLETIONS_URL = f"{NVIDIA_API_BASE}/chat/completions"
+ MODELS_URL = f"{NVIDIA_API_BASE}/models"
+ ADMIN_PASSWORD = os.getenv("PASSWORD")
+ SESSION_SECRET = os.getenv("SESSION_SECRET") or ADMIN_PASSWORD or "nim-responses-dev-secret"
+ COOKIE_NAME = os.getenv("COOKIE_NAME", "nim_admin_session")
+ GATEWAY_API_KEY = os.getenv("GATEWAY_API_KEY")
+ DEFAULT_ENV_KEY = os.getenv("NVIDIA_NIM_API_KEY") or os.getenv("NVIDIA_API_KEY")
+ REQUEST_TIMEOUT_SECONDS = float(os.getenv("REQUEST_TIMEOUT_SECONDS", "90"))
+ DEFAULT_HEALTH_INTERVAL_MINUTES = int(os.getenv("HEALTHCHECK_INTERVAL_MINUTES", "60"))
+ DEFAULT_HEALTH_PROMPT = os.getenv("HEALTHCHECK_PROMPT", "Reply with the single word OK.")
+ PUBLIC_HISTORY_HOURS = int(os.getenv("PUBLIC_HISTORY_HOURS", "48"))
+
+ DEFAULT_MODELS = [
+     ("z-ai/glm5", "GLM-5", "Reasoning and general assistant model from Z.ai", 10, 1),
+     ("minimaxai/minimax-m2.5", "MiniMax M2.5", "Long-context assistant model from MiniMax", 20, 1),
+     ("moonshotai/kimi-k2.5", "Kimi K2.5", "Kimi family model tuned for tool use and code", 30, 1),
+     ("deepseek-ai/deepseek-v3.2", "DeepSeek V3.2", "DeepSeek production general-purpose model", 40, 1),
+     ("google/gemma-4-31b-it", "Gemma 4 31B IT", "Instruction-tuned Gemma model", 50, 0),
+     ("qwen/qwen3.5-397b-a17b", "Qwen 3.5 397B A17B", "Large-scale Qwen model with broad capabilities", 60, 0),
+ ]
+
+ scheduler = AsyncIOScheduler(timezone="UTC")
+
+
+ def utcnow() -> datetime:
+     return datetime.now(UTC)
+
+
+ def utcnow_iso() -> str:
+     return utcnow().isoformat()
+
+
+ def parse_datetime(value: str | None) -> datetime | None:
+     if not value:
+         return None
+     try:
+         return datetime.fromisoformat(value)
+     except ValueError:
+         return None
+
+
+ def bool_value(value: Any) -> bool:
+     if isinstance(value, bool):
+         return value
+     if isinstance(value, (int, float)):
+         return bool(value)
+     if value is None:
+         return False
+     return str(value).strip().lower() in {"1", "true", "yes", "on", "enabled"}
+
+
+ def json_dumps(value: Any) -> str:
+     return json.dumps(value, ensure_ascii=False)
+
+
+ def get_db_connection() -> sqlite3.Connection:
+     conn = sqlite3.connect(DB_PATH, check_same_thread=False)
+     conn.row_factory = sqlite3.Row
+     return conn
+
+
+ def init_db() -> None:
+     DB_PATH.parent.mkdir(parents=True, exist_ok=True)
+     conn = get_db_connection()
+     try:
+         conn.executescript(
+             """
+             CREATE TABLE IF NOT EXISTS proxy_models (
+                 id INTEGER PRIMARY KEY AUTOINCREMENT,
+                 model_id TEXT UNIQUE NOT NULL,
+                 display_name TEXT NOT NULL,
+                 provider TEXT NOT NULL DEFAULT 'nvidia-nim',
+                 description TEXT,
+                 enabled INTEGER NOT NULL DEFAULT 1,
+                 featured INTEGER NOT NULL DEFAULT 0,
+                 sort_order INTEGER NOT NULL DEFAULT 0,
+                 request_count INTEGER NOT NULL DEFAULT 0,
+                 success_count INTEGER NOT NULL DEFAULT 0,
+                 failure_count INTEGER NOT NULL DEFAULT 0,
+                 healthcheck_count INTEGER NOT NULL DEFAULT 0,
+                 healthcheck_success_count INTEGER NOT NULL DEFAULT 0,
+                 last_used_at TEXT,
+                 last_healthcheck_at TEXT,
+                 last_health_status INTEGER,
+                 last_latency_ms REAL,
+                 created_at TEXT NOT NULL,
+                 updated_at TEXT NOT NULL
+             );
+
+             CREATE TABLE IF NOT EXISTS api_keys (
+                 id INTEGER PRIMARY KEY AUTOINCREMENT,
+                 name TEXT UNIQUE NOT NULL,
+                 api_key TEXT NOT NULL,
+                 enabled INTEGER NOT NULL DEFAULT 1,
+                 request_count INTEGER NOT NULL DEFAULT 0,
+                 success_count INTEGER NOT NULL DEFAULT 0,
+                 failure_count INTEGER NOT NULL DEFAULT 0,
+                 healthcheck_count INTEGER NOT NULL DEFAULT 0,
+                 healthcheck_success_count INTEGER NOT NULL DEFAULT 0,
+                 last_used_at TEXT,
+                 last_tested_at TEXT,
+                 last_latency_ms REAL,
+                 created_at TEXT NOT NULL,
+                 updated_at TEXT NOT NULL
+             );
+
+             CREATE TABLE IF NOT EXISTS response_records (
+                 id INTEGER PRIMARY KEY AUTOINCREMENT,
+                 response_id TEXT UNIQUE NOT NULL,
+                 parent_response_id TEXT,
+                 model_id INTEGER,
+                 api_key_id INTEGER,
+                 request_json TEXT NOT NULL,
+                 input_items_json TEXT NOT NULL,
+                 output_json TEXT NOT NULL,
+                 output_items_json TEXT NOT NULL,
+                 status TEXT NOT NULL,
+                 created_at TEXT NOT NULL
+             );
+
+             CREATE TABLE IF NOT EXISTS health_check_records (
+                 id INTEGER PRIMARY KEY AUTOINCREMENT,
+                 model_id INTEGER NOT NULL,
+                 api_key_id INTEGER,
+                 ok INTEGER NOT NULL,
+                 status_code INTEGER,
+                 latency_ms REAL,
+                 error_message TEXT,
+                 response_excerpt TEXT,
+                 checked_at TEXT NOT NULL
+             );
+
+             CREATE TABLE IF NOT EXISTS settings (
+                 key TEXT PRIMARY KEY,
+                 value TEXT NOT NULL
+             );
+             """
+         )
+
+         now = utcnow_iso()
+         for model_id, display_name, description, sort_order, featured in DEFAULT_MODELS:
+             conn.execute(
+                 """
+                 INSERT OR IGNORE INTO proxy_models (
+                     model_id, display_name, provider, description, enabled, featured, sort_order, created_at, updated_at
+                 ) VALUES (?, ?, 'nvidia-nim', ?, 1, ?, ?, ?, ?)
+                 """,
+                 (model_id, display_name, description, featured, sort_order, now, now),
+             )
+
+         defaults = {
+             "healthcheck_enabled": "true",
+             "healthcheck_interval_minutes": str(DEFAULT_HEALTH_INTERVAL_MINUTES),
+             "healthcheck_prompt": DEFAULT_HEALTH_PROMPT,
+             "public_history_hours": str(PUBLIC_HISTORY_HOURS),
+         }
+         for key, value in defaults.items():
+             conn.execute("INSERT OR IGNORE INTO settings (key, value) VALUES (?, ?)", (key, value))
+
+         if DEFAULT_ENV_KEY:
+             conn.execute(
+                 """
+                 INSERT OR IGNORE INTO api_keys (name, api_key, enabled, created_at, updated_at)
+                 VALUES ('env-default', ?, 1, ?, ?)
+                 """,
+                 (DEFAULT_ENV_KEY, now, now),
+             )
+
+         conn.commit()
+     finally:
+         conn.close()
+
+
+ def get_setting(conn: sqlite3.Connection, key: str, default: str) -> str:
+     row = conn.execute("SELECT value FROM settings WHERE key = ?", (key,)).fetchone()
+     return row["value"] if row else default
+
+
+ def set_setting(conn: sqlite3.Connection, key: str, value: str) -> None:
+     conn.execute(
+         """
+         INSERT INTO settings (key, value) VALUES (?, ?)
+         ON CONFLICT(key) DO UPDATE SET value = excluded.value
+         """,
+         (key, value),
+     )
+
+
+ def get_settings_payload(conn: sqlite3.Connection) -> dict[str, Any]:
+     return {
+         "healthcheck_enabled": bool_value(get_setting(conn, "healthcheck_enabled", "true")),
+         "healthcheck_interval_minutes": int(get_setting(conn, "healthcheck_interval_minutes", str(DEFAULT_HEALTH_INTERVAL_MINUTES))),
+         "healthcheck_prompt": get_setting(conn, "healthcheck_prompt", DEFAULT_HEALTH_PROMPT),
+         "public_history_hours": int(get_setting(conn, "public_history_hours", str(PUBLIC_HISTORY_HOURS))),
+     }
+
+
+ def mask_secret(secret: str) -> str:
+     if len(secret) <= 8:
+         return f"{secret[:2]}***"
+     return f"{secret[:4]}...{secret[-4:]}"
+
+
+ def create_admin_token() -> str:
+     serializer = URLSafeTimedSerializer(SESSION_SECRET, salt="nim-admin-auth")
+     return serializer.dumps({"role": "admin"})
+
+
+ def verify_admin_token(token: str) -> bool:
+     serializer = URLSafeTimedSerializer(SESSION_SECRET, salt="nim-admin-auth")
+     try:
+         payload = serializer.loads(token, max_age=60 * 60 * 24 * 7)
+     except (BadSignature, SignatureExpired):
+         return False
+     return payload.get("role") == "admin"
+
+
+ def require_admin(request: Request, authorization: str | None = Header(default=None)) -> bool:
+     token: str | None = None
+     if authorization and authorization.startswith("Bearer "):
+         token = authorization.removeprefix("Bearer ").strip()
+     if not token:
+         token = request.cookies.get(COOKIE_NAME)
+     if not token or not verify_admin_token(token):
+         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Admin authentication required.")
+     return True
+
+
+ def require_proxy_token_if_configured(authorization: str | None = Header(default=None)) -> bool:
+     if not GATEWAY_API_KEY:
+         return True
+     if not authorization or not authorization.startswith("Bearer "):
+         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Missing bearer token.")
+     token = authorization.removeprefix("Bearer ").strip()
+     if token != GATEWAY_API_KEY:
+         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid bearer token.")
+     return True
+
+
+ def fetch_model_by_identifier(conn: sqlite3.Connection, identifier: str | int, enabled_only: bool = False) -> sqlite3.Row | None:
+     clause = "AND enabled = 1" if enabled_only else ""
+     if isinstance(identifier, int) or (isinstance(identifier, str) and identifier.isdigit()):
+         row = conn.execute(f"SELECT * FROM proxy_models WHERE id = ? {clause}", (int(identifier),)).fetchone()
+         if row:
+             return row
+     return conn.execute(f"SELECT * FROM proxy_models WHERE model_id = ? {clause}", (str(identifier),)).fetchone()
+
+
+ def fetch_key_by_identifier(conn: sqlite3.Connection, identifier: str | int, enabled_only: bool = False) -> sqlite3.Row | None:
+     clause = "AND enabled = 1" if enabled_only else ""
+     if isinstance(identifier, int) or (isinstance(identifier, str) and str(identifier).isdigit()):
+         row = conn.execute(f"SELECT * FROM api_keys WHERE id = ? {clause}", (int(identifier),)).fetchone()
+         if row:
+             return row
+     return conn.execute(f"SELECT * FROM api_keys WHERE name = ? {clause}", (str(identifier),)).fetchone()
+
+
+ def select_api_key(conn: sqlite3.Connection, explicit_id: int | None = None) -> sqlite3.Row:
+     if explicit_id is not None:
+         row = fetch_key_by_identifier(conn, explicit_id, enabled_only=True)
+         if row:
+             return row
+     row = conn.execute(
+         """
+         SELECT * FROM api_keys
+         WHERE enabled = 1
+         ORDER BY CASE WHEN last_used_at IS NULL THEN 0 ELSE 1 END, last_used_at ASC, id ASC
+         LIMIT 1
+         """
+     ).fetchone()
+     if not row:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="No enabled NVIDIA NIM API key is configured.")
+     return row
+
+
+ def row_to_model_item(row: sqlite3.Row) -> dict[str, Any]:
+     status_name = "unknown"
+     if row["last_health_status"] is not None:
+         status_name = "healthy" if bool(row["last_health_status"]) else "down"
+     return {
+         "id": row["id"],
+         "model_id": row["model_id"],
+         "name": row["model_id"],
+         "display_name": row["display_name"],
+         "endpoint": "/v1/responses",
+         "provider": row["provider"],
+         "description": row["description"],
+         "enabled": bool(row["enabled"]),
+         "featured": bool(row["featured"]),
+         "sort_order": row["sort_order"],
+         "status": status_name,
+         "request_count": row["request_count"],
+         "success_count": row["success_count"],
+         "failure_count": row["failure_count"],
+         "healthcheck_count": row["healthcheck_count"],
+         "healthcheck_success_count": row["healthcheck_success_count"],
+         "last_used_at": row["last_used_at"],
+         "last_healthcheck_at": row["last_healthcheck_at"],
+         "last_health_status": None if row["last_health_status"] is None else bool(row["last_health_status"]),
+         "last_latency_ms": row["last_latency_ms"],
+         "created_at": row["created_at"],
+         "updated_at": row["updated_at"],
+     }
+
+
+ def row_to_key_item(row: sqlite3.Row) -> dict[str, Any]:
+     total_checks = row["healthcheck_count"] or 0
+     ok_checks = row["healthcheck_success_count"] or 0
+     success_ratio = (ok_checks / total_checks) if total_checks else None
+     status_name = "healthy" if success_ratio and success_ratio >= 0.8 else "unknown"
+     return {
+         "id": row["id"],
+         "name": row["name"],
+         "label": row["name"],
+         "masked_key": mask_secret(row["api_key"]),
+         "enabled": bool(row["enabled"]),
+         "status": status_name,
+         "request_count": row["request_count"],
+         "success_count": row["success_count"],
+         "failure_count": row["failure_count"],
+         "healthcheck_count": row["healthcheck_count"],
+         "healthcheck_success_count": row["healthcheck_success_count"],
+         "last_used_at": row["last_used_at"],
+         "last_tested": row["last_tested_at"],
+         "last_tested_at": row["last_tested_at"],
+         "last_latency_ms": row["last_latency_ms"],
+         "created_at": row["created_at"],
+         "updated_at": row["updated_at"],
+     }
+
+
+ def make_error(status_code: int, message: str, error_type: str = "invalid_request_error") -> JSONResponse:
+     return JSONResponse(
+         status_code=status_code,
+         content={"error": {"message": message, "type": error_type, "code": status_code}},
+     )
+
+
+ def normalize_content(content: Any, role: str) -> list[dict[str, Any]]:
+     if content is None:
+         return []
+     if isinstance(content, str):
+         return [{"type": "output_text" if role == "assistant" else "input_text", "text": content}]
+     if isinstance(content, list):
+         normalized: list[dict[str, Any]] = []
+         for part in content:
+             if isinstance(part, str):
+                 normalized.append({"type": "output_text" if role == "assistant" else "input_text", "text": part})
+                 continue
+             if not isinstance(part, dict):
+                 normalized.append({"type": "input_text", "text": str(part)})
+                 continue
+             if part.get("type") in {"input_text", "output_text", "text", "tool_call", "function_call"}:
+                 normalized.append(part)
+                 continue
+             if "text" in part:
+                 normalized.append({"type": part.get("type", "input_text"), "text": part.get("text", "")})
+         return normalized
+     if isinstance(content, dict):
+         if "text" in content:
+             return [{"type": content.get("type", "input_text"), "text": content.get("text", "")}]
+         return [{"type": "input_text", "text": json_dumps(content)}]
+     return [{"type": "input_text", "text": str(content)}]
+
+
+ def normalize_input_items(value: Any) -> list[dict[str, Any]]:
+     if value is None:
+         return []
+     if isinstance(value, str):
+         return [{"type": "message", "role": "user", "content": [{"type": "input_text", "text": value}]}]
+     if isinstance(value, dict):
+         value = [value]
+
+     items: list[dict[str, Any]] = []
+     for item in value:
+         if isinstance(item, str):
+             items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": item}]})
+             continue
+         if not isinstance(item, dict):
+             items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": str(item)}]})
+             continue
+
+         item_type = item.get("type")
+         if item_type == "message" or item.get("role"):
+             role = item.get("role", "user")
+             items.append({"type": "message", "role": role, "content": normalize_content(item.get("content"), role)})
+             continue
+         if item_type == "function_call_output":
+             output = item.get("output")
+             if not isinstance(output, str):
+                 output = json_dumps(output) if output is not None else ""
+             items.append({"type": "function_call_output", "call_id": item.get("call_id"), "output": output})
+             continue
+         if item_type == "function_call":
+             arguments = item.get("arguments", "{}")
+             if not isinstance(arguments, str):
+                 arguments = json_dumps(arguments)
+             items.append(
+                 {
+                     "type": "function_call",
+                     "call_id": item.get("call_id") or f"call_{uuid.uuid4().hex[:12]}",
+                     "name": item.get("name"),
+                     "arguments": arguments,
+                 }
+             )
+             continue
+         if item_type in {"input_text", "output_text", "text"}:
+             items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": item.get("text", "")}]})
+             continue
+         items.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": json_dumps(item)}]})
+     return items
+
+
+ def extract_text_from_content(content: Any) -> str:
+     if content is None:
+         return ""
+     if isinstance(content, str):
+         return content
+     if isinstance(content, dict):
+         if "text" in content:
+             return str(content.get("text", ""))
+         return json_dumps(content)
+     if isinstance(content, list):
+         chunks: list[str] = []
+         for part in content:
+             if isinstance(part, str):
+                 chunks.append(part)
+             elif isinstance(part, dict) and part.get("type") in {"input_text", "output_text", "text"}:
+                 chunks.append(str(part.get("text", "")))
+         return "\n".join(filter(None, chunks))
+     return str(content)
+
+
+ def load_previous_conversation_items(conn: sqlite3.Connection, previous_response_id: str | None) -> list[dict[str, Any]]:
+     if not previous_response_id:
+         return []
+     records: list[sqlite3.Row] = []
+     current = previous_response_id
+     while current:
+         row = conn.execute("SELECT * FROM response_records WHERE response_id = ?", (current,)).fetchone()
+         if not row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"previous_response_id '{current}' was not found.")
+         records.append(row)
+         current = row["parent_response_id"]
+
+     items: list[dict[str, Any]] = []
+     for row in reversed(records):
+         items.extend(json.loads(row["input_items_json"]))
+         items.extend(json.loads(row["output_items_json"]))
+     return items
+
+
+ def items_to_chat_messages(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
+     messages: list[dict[str, Any]] = []
+     pending_tool_calls: list[dict[str, Any]] = []
+
+     def flush_pending_tool_calls() -> None:
+         nonlocal pending_tool_calls
+         if pending_tool_calls:
+             messages.append({"role": "assistant", "content": "", "tool_calls": pending_tool_calls})
+             pending_tool_calls = []
+
+     for item in items:
+         item_type = item.get("type")
+         if item_type == "function_call":
+             pending_tool_calls.append(
+                 {
+                     "id": item.get("call_id") or f"call_{uuid.uuid4().hex[:12]}",
+                     "type": "function",
+                     "function": {"name": item.get("name"), "arguments": item.get("arguments", "{}")},
+                 }
+             )
+             continue
+         if item_type == "function_call_output":
+             flush_pending_tool_calls()
+             messages.append({"role": "tool", "tool_call_id": item.get("call_id"), "content": item.get("output", "")})
+             continue
+         if item_type != "message":
+             continue
+         flush_pending_tool_calls()
+         role = item.get("role", "user")
+         text_value = extract_text_from_content(item.get("content"))
+         if role in {"system", "developer"}:
+             messages.append({"role": "system", "content": text_value})
+         elif role == "assistant":
+             messages.append({"role": "assistant", "content": text_value})
+         else:
+             messages.append({"role": role, "content": text_value})
+
+     flush_pending_tool_calls()
+     return [message for message in messages if message.get("content") is not None or message.get("tool_calls")]
+
517
+
518
+ def response_tools_to_chat_tools(tools: Any) -> list[dict[str, Any]]:
519
+ normalized: list[dict[str, Any]] = []
520
+ for tool in tools or []:
521
+ if not isinstance(tool, dict) or tool.get("type") != "function":
522
+ continue
523
+ function_payload = tool.get("function") if isinstance(tool.get("function"), dict) else tool
524
+ name = function_payload.get("name")
525
+ if not name:
526
+ continue
527
+ normalized.append(
528
+ {
529
+ "type": "function",
530
+ "function": {
531
+ "name": name,
532
+ "description": function_payload.get("description"),
533
+ "parameters": function_payload.get("parameters") or {"type": "object", "properties": {}},
534
+ },
535
+ }
536
+ )
537
+ return normalized
538
+
539
+
540
+ def normalize_tool_choice(tool_choice: Any, tools: list[dict[str, Any]]) -> tuple[Any, list[dict[str, Any]]]:
541
+ if tool_choice is None:
542
+ return None, tools
543
+ if isinstance(tool_choice, str):
544
+ return tool_choice, tools
545
+ if not isinstance(tool_choice, dict):
546
+ return None, tools
547
+ if tool_choice.get("type") == "function":
548
+ function_name = tool_choice.get("name") or (tool_choice.get("function") or {}).get("name")
549
+ if function_name:
550
+ return {"type": "function", "function": {"name": function_name}}, tools
551
+ if tool_choice.get("type") == "allowed_tools":
552
+ allowed = tool_choice.get("tools") or []
553
+ allowed_names = {
554
+ entry if isinstance(entry, str) else entry.get("name")
555
+ for entry in allowed
556
+ if entry is not None
557
+ }
558
+ filtered_tools = [tool for tool in tools if tool["function"]["name"] in allowed_names]
559
+ mode = tool_choice.get("mode", "auto")
560
+ return mode if isinstance(mode, str) else "auto", filtered_tools
561
+ return None, tools
562
+
563
+
564
+ def build_chat_payload(body: dict[str, Any], items: list[dict[str, Any]]) -> dict[str, Any]:
+     tools = response_tools_to_chat_tools(body.get("tools"))
+     tool_choice, tools = normalize_tool_choice(body.get("tool_choice"), tools)
+     payload: dict[str, Any] = {"model": body.get("model"), "messages": items_to_chat_messages(items)}
+     if tools:
+         payload["tools"] = tools
+     if tool_choice is not None:
+         payload["tool_choice"] = tool_choice
+     if body.get("temperature") is not None:
+         payload["temperature"] = body.get("temperature")
+     if body.get("top_p") is not None:
+         payload["top_p"] = body.get("top_p")
+     if body.get("parallel_tool_calls") is not None:
+         payload["parallel_tool_calls"] = body.get("parallel_tool_calls")
+     if body.get("max_output_tokens") is not None:
+         payload["max_tokens"] = body.get("max_output_tokens")
+     if body.get("instructions"):
+         payload["messages"] = [{"role": "system", "content": body["instructions"]}] + payload["messages"]
+     text_config = body.get("text") or {}
+     text_format = text_config.get("format") if isinstance(text_config, dict) else None
+     if isinstance(text_format, dict):
+         if text_format.get("type") == "json_object":
+             payload["response_format"] = {"type": "json_object"}
+         elif text_format.get("type") == "json_schema":
+             payload["response_format"] = {"type": "json_schema", "json_schema": text_format.get("json_schema") or {}}
+     return payload
+
+
+ def extract_upstream_message(upstream_json: dict[str, Any]) -> tuple[dict[str, Any], str | None]:
+     choices = upstream_json.get("choices") or []
+     if not choices:
+         return {}, None
+     choice = choices[0] or {}
+     return choice.get("message") or {}, choice.get("finish_reason")
+
+
+ def extract_text_and_tool_calls(message: dict[str, Any]) -> tuple[str, list[dict[str, Any]]]:
+     content = message.get("content")
+     text_chunks: list[str] = []
+     tool_calls: list[dict[str, Any]] = []
+
+     if isinstance(content, str):
+         text_chunks.append(content)
+     elif isinstance(content, list):
+         for part in content:
+             if isinstance(part, str):
+                 text_chunks.append(part)
+                 continue
+             if not isinstance(part, dict):
+                 text_chunks.append(str(part))
+                 continue
+             if part.get("type") in {"input_text", "output_text", "text"}:
+                 text_chunks.append(str(part.get("text", "")))
+                 continue
+             if part.get("type") in {"tool_call", "function_call"}:
+                 arguments = part.get("arguments") or "{}"
+                 if not isinstance(arguments, str):
+                     arguments = json_dumps(arguments)
+                 tool_calls.append({
+                     "id": part.get("id") or part.get("call_id") or f"call_{uuid.uuid4().hex[:12]}",
+                     "name": part.get("name"),
+                     "arguments": arguments,
+                 })
+
+     for tool_call in message.get("tool_calls") or []:
+         if not isinstance(tool_call, dict):
+             continue
+         function_data = tool_call.get("function") or {}
+         arguments = function_data.get("arguments") or tool_call.get("arguments") or "{}"
+         if not isinstance(arguments, str):
+             arguments = json_dumps(arguments)
+         tool_calls.append({
+             "id": tool_call.get("id") or f"call_{uuid.uuid4().hex[:12]}",
+             "name": function_data.get("name") or tool_call.get("name"),
+             "arguments": arguments,
+         })
+
+     deduped: list[dict[str, Any]] = []
+     seen_ids: set[str] = set()
+     for tool_call in tool_calls:
+         if tool_call["id"] in seen_ids:
+             continue
+         seen_ids.add(tool_call["id"])
+         deduped.append(tool_call)
+     return "\n".join(filter(None, text_chunks)).strip(), deduped
+
+
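Because tool calls can surface both in `content` parts and in the `tool_calls` field, the dedup-by-id step above is what prevents double execution. A standalone rehearsal of just the `tool_calls`-field path (plain `json`/`uuid` stand in for the gateway's `json_dumps` helper; names are illustrative):

```python
import json
import uuid


def extract_tool_calls_sketch(message):
    # Mirrors the dedup-by-id logic above for the `tool_calls` field only.
    calls, seen = [], set()
    for tc in message.get("tool_calls") or []:
        fn = tc.get("function") or {}
        args = fn.get("arguments") or "{}"
        if not isinstance(args, str):
            args = json.dumps(args)  # arguments are normalized to a JSON string
        call_id = tc.get("id") or f"call_{uuid.uuid4().hex[:12]}"
        if call_id in seen:
            continue
        seen.add(call_id)
        calls.append({"id": call_id, "name": fn.get("name"), "arguments": args})
    return calls


message = {
    "content": "Looking that up.",
    "tool_calls": [
        {"id": "call_1", "function": {"name": "get_weather", "arguments": {"city": "Oslo"}}},
        {"id": "call_1", "function": {"name": "get_weather", "arguments": {"city": "Oslo"}}},
    ],
}
calls = extract_tool_calls_sketch(message)
```

The duplicated upstream entry collapses to a single call with a JSON-string `arguments` payload.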
+ def build_choice_alias(output_items: list[dict[str, Any]], finish_reason: str | None) -> list[dict[str, Any]]:
+     content_parts: list[dict[str, Any]] = []
+     for item in output_items:
+         if item.get("type") == "message":
+             for part in item.get("content", []):
+                 content_parts.append({"type": part.get("type", "output_text"), "text": part.get("text", "")})
+         elif item.get("type") == "function_call":
+             arguments = item.get("arguments") or "{}"
+             try:
+                 parsed_arguments = json.loads(arguments)
+             except Exception:
+                 parsed_arguments = arguments
+             content_parts.append({
+                 "type": "tool_call",
+                 "id": item.get("call_id"),
+                 "name": item.get("name"),
+                 "arguments": parsed_arguments,
+             })
+     return [{"index": 0, "message": {"role": "assistant", "content": content_parts}, "finish_reason": finish_reason or "stop"}]
+
+
+ def chat_completion_to_response(body: dict[str, Any], upstream_json: dict[str, Any], previous_response_id: str | None) -> dict[str, Any]:
+     upstream_message, finish_reason = extract_upstream_message(upstream_json)
+     assistant_text, tool_calls = extract_text_and_tool_calls(upstream_message)
+     response_id = upstream_json.get("id") or f"resp_{uuid.uuid4().hex}"
+     output_items: list[dict[str, Any]] = []
+     if assistant_text:
+         output_items.append({
+             "id": f"msg_{uuid.uuid4().hex[:24]}",
+             "type": "message",
+             "status": "completed",
+             "role": "assistant",
+             "content": [{"type": "output_text", "text": assistant_text, "annotations": []}],
+         })
+     for tool_call in tool_calls:
+         output_items.append({
+             "id": f"fc_{uuid.uuid4().hex[:24]}",
+             "type": "function_call",
+             "status": "completed",
+             "call_id": tool_call["id"],
+             "name": tool_call.get("name"),
+             "arguments": tool_call.get("arguments", "{}"),
+         })
+     usage = upstream_json.get("usage") or {}
+     return {
+         "id": response_id,
+         "object": "response",
+         "created_at": int(time.time()),
+         "status": "completed",
+         "model": body.get("model"),
+         "output": output_items,
+         "output_text": assistant_text,
+         "parallel_tool_calls": bool(body.get("parallel_tool_calls", True)),
+         "previous_response_id": previous_response_id,
+         "store": True,
+         "text": body.get("text") or {"format": {"type": "text"}},
+         "usage": {"input_tokens": usage.get("prompt_tokens"), "output_tokens": usage.get("completion_tokens"), "total_tokens": usage.get("total_tokens")},
+         "choices": build_choice_alias(output_items, finish_reason),
+         "upstream": {"id": upstream_json.get("id"), "object": upstream_json.get("object", "chat.completion"), "finish_reason": finish_reason or "stop"},
+     }
+
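The conversion above turns one chat `choices[0].message` into a list of Responses-style output items: an optional `message` item followed by one `function_call` item per tool call. A stripped-down rehearsal of those item shapes (fixed ids for readability; the real code generates `msg_`/`fc_` ids with `uuid`):

```python
def to_output_items_sketch(text, tool_calls):
    # Same item shapes as chat_completion_to_response, with fixed demo ids.
    items = []
    if text:
        items.append({
            "id": "msg_demo",
            "type": "message",
            "status": "completed",
            "role": "assistant",
            "content": [{"type": "output_text", "text": text, "annotations": []}],
        })
    for call in tool_calls:
        items.append({
            "id": "fc_demo",
            "type": "function_call",
            "status": "completed",
            "call_id": call["id"],
            "name": call["name"],
            "arguments": call.get("arguments", "{}"),
        })
    return items


items = to_output_items_sketch("Done.", [{"id": "call_1", "name": "get_weather", "arguments": "{}"}])
```

The resulting list holds a `message` item first, then the `function_call` item carrying the original `call_id`.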
+ def store_response_record(conn: sqlite3.Connection, response_payload: dict[str, Any], request_body: dict[str, Any], input_items: list[dict[str, Any]], model_row: sqlite3.Row, api_key_row: sqlite3.Row) -> None:
+     conn.execute(
+         """
+         INSERT OR REPLACE INTO response_records (
+             response_id, parent_response_id, model_id, api_key_id, request_json,
+             input_items_json, output_json, output_items_json, status, created_at
+         ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+         """,
+         (
+             response_payload["id"],
+             request_body.get("previous_response_id"),
+             model_row["id"],
+             api_key_row["id"],
+             json_dumps(request_body),
+             json_dumps(input_items),
+             json_dumps(response_payload),
+             json_dumps(response_payload.get("output") or []),
+             response_payload.get("status", "completed"),
+             utcnow_iso(),
+         ),
+     )
+
+
+ def update_usage_stats(conn: sqlite3.Connection, model_row: sqlite3.Row, api_key_row: sqlite3.Row, *, ok: bool, latency_ms: float | None, is_healthcheck: bool) -> None:
+     now = utcnow_iso()
+     if is_healthcheck:
+         conn.execute(
+             """
+             UPDATE proxy_models
+             SET healthcheck_count = healthcheck_count + 1,
+                 healthcheck_success_count = healthcheck_success_count + ?,
+                 last_healthcheck_at = ?,
+                 last_health_status = ?,
+                 last_latency_ms = ?,
+                 updated_at = ?
+             WHERE id = ?
+             """,
+             (1 if ok else 0, now, 1 if ok else 0, latency_ms, now, model_row["id"]),
+         )
+         conn.execute(
+             """
+             UPDATE api_keys
+             SET healthcheck_count = healthcheck_count + 1,
+                 healthcheck_success_count = healthcheck_success_count + ?,
+                 last_tested_at = ?,
+                 last_latency_ms = ?,
+                 updated_at = ?
+             WHERE id = ?
+             """,
+             (1 if ok else 0, now, latency_ms, now, api_key_row["id"]),
+         )
+         return
+     conn.execute(
+         """
+         UPDATE proxy_models
+         SET request_count = request_count + 1,
+             success_count = success_count + ?,
+             failure_count = failure_count + ?,
+             last_used_at = ?,
+             last_latency_ms = ?,
+             updated_at = ?
+         WHERE id = ?
+         """,
+         (1 if ok else 0, 0 if ok else 1, now, latency_ms, now, model_row["id"]),
+     )
+     conn.execute(
+         """
+         UPDATE api_keys
+         SET request_count = request_count + 1,
+             success_count = success_count + ?,
+             failure_count = failure_count + ?,
+             last_used_at = ?,
+             last_latency_ms = ?,
+             updated_at = ?
+         WHERE id = ?
+         """,
+         (1 if ok else 0, 0 if ok else 1, now, latency_ms, now, api_key_row["id"]),
+     )
+
+
+ def insert_health_record(conn: sqlite3.Connection, model_row: sqlite3.Row, api_key_row: sqlite3.Row, *, ok: bool, status_code: int | None, latency_ms: float | None, error_message: str | None, response_excerpt: str | None) -> None:
+     conn.execute(
+         """
+         INSERT INTO health_check_records (
+             model_id, api_key_id, ok, status_code, latency_ms, error_message, response_excerpt, checked_at
+         ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
+         """,
+         (model_row["id"], api_key_row["id"], 1 if ok else 0, status_code, latency_ms, error_message, response_excerpt, utcnow_iso()),
+     )
+
+
+ async def post_nvidia_chat_completion(api_key: str, payload: dict[str, Any]) -> tuple[dict[str, Any], float]:
+     started = time.perf_counter()
+     async with httpx.AsyncClient(timeout=REQUEST_TIMEOUT_SECONDS) as client:
+         response = await client.post(
+             CHAT_COMPLETIONS_URL,
+             headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
+             json=payload,
+         )
+         latency_ms = round((time.perf_counter() - started) * 1000, 2)
+         if response.status_code >= 400:
+             try:
+                 error_payload = response.json()
+                 detail = error_payload.get("error", {}).get("message") or json_dumps(error_payload)
+             except Exception:
+                 detail = response.text
+             raise HTTPException(status_code=response.status_code, detail=f"NVIDIA NIM request failed: {detail}")
+         return response.json(), latency_ms
+
+
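`post_nvidia_chat_completion` opens a fresh `httpx.AsyncClient` per call and fails on the first transport error. One possible hardening is a retry-with-backoff wrapper around the upstream call; the sketch below is generic over any awaitable factory (names and the retry policy are illustrative, not part of the gateway), demonstrated with a stub instead of a real network call:

```python
import asyncio


async def with_retries(call, attempts=3, base_delay=0.0):
    # Retry an async callable on exceptions, with linear backoff between tries.
    last_exc = None
    for attempt in range(attempts):
        try:
            return await call()
        except Exception as exc:  # real code would catch httpx.TransportError only
            last_exc = exc
            await asyncio.sleep(base_delay * (attempt + 1))
    raise last_exc


# Stubbed demonstration: fail twice, then succeed on the third attempt.
state = {"calls": 0}


async def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient upstream failure")
    return {"ok": True}


result = asyncio.run(with_retries(flaky))
```

Retrying only transport-level errors (and never 4xx responses) keeps client mistakes from being replayed against the upstream quota.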
+ async def perform_healthcheck(conn: sqlite3.Connection, model_row: sqlite3.Row, api_key_row: sqlite3.Row, prompt: str) -> dict[str, Any]:
+     payload = {"model": model_row["model_id"], "messages": [{"role": "user", "content": prompt}], "max_tokens": 32, "temperature": 0}
+     try:
+         upstream_json, latency_ms = await post_nvidia_chat_completion(api_key_row["api_key"], payload)
+         message, _finish_reason = extract_upstream_message(upstream_json)
+         assistant_text, _tool_calls = extract_text_and_tool_calls(message)
+         ok = True
+         detail = assistant_text or "Model responded successfully."
+         status_code = 200
+         error_message = None
+         response_excerpt = detail[:200]
+     except HTTPException as exc:
+         ok = False
+         latency_ms = None
+         detail = exc.detail
+         status_code = exc.status_code
+         error_message = exc.detail
+         response_excerpt = None
+     update_usage_stats(conn, model_row, api_key_row, ok=ok, latency_ms=latency_ms, is_healthcheck=True)
+     insert_health_record(conn, model_row, api_key_row, ok=ok, status_code=status_code, latency_ms=latency_ms, error_message=error_message, response_excerpt=response_excerpt)
+     conn.commit()
+     return {
+         "model": model_row["model_id"],
+         "display_name": model_row["display_name"],
+         "api_key": api_key_row["name"],
+         "status": "healthy" if ok else "down",
+         "ok": ok,
+         "latency": latency_ms,
+         "status_code": status_code,
+         "detail": detail,
+         "checked_at": utcnow_iso(),
+     }
+
+
+ async def run_healthchecks(model_identifier: str | int | None = None, api_key_identifier: str | int | None = None, prompt: str | None = None) -> list[dict[str, Any]]:
+     conn = get_db_connection()
+     try:
+         settings_payload = get_settings_payload(conn)
+         effective_prompt = prompt or settings_payload["healthcheck_prompt"]
+         if api_key_identifier is not None:
+             api_key_row = fetch_key_by_identifier(conn, api_key_identifier, enabled_only=True)
+             if not api_key_row:
+                 raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="API key not found.")
+             key_rows = [api_key_row]
+         else:
+             key_rows = conn.execute("SELECT * FROM api_keys WHERE enabled = 1 ORDER BY id ASC").fetchall()
+         if not key_rows:
+             raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="No enabled NVIDIA NIM API keys are configured.")
+         if model_identifier is not None:
+             model_row = fetch_model_by_identifier(conn, model_identifier, enabled_only=True)
+             if not model_row:
+                 raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Model not found.")
+             model_rows = [model_row]
+         else:
+             model_rows = conn.execute("SELECT * FROM proxy_models WHERE enabled = 1 ORDER BY sort_order ASC, model_id ASC").fetchall()
+         results: list[dict[str, Any]] = []
+         for index, model_row in enumerate(model_rows):
+             api_key_row = key_rows[index % len(key_rows)]
+             results.append(await perform_healthcheck(conn, model_row, api_key_row, effective_prompt))
+         return results
+     finally:
+         conn.close()
+
+
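Keys are assigned to models round-robin via `key_rows[index % len(key_rows)]`, so with fewer keys than models each key is reused in order:

```python
# Round-robin pairing, as in run_healthchecks (toy values).
models = ["model-a", "model-b", "model-c"]
keys = ["key-1", "key-2"]
pairs = [(model, keys[i % len(keys)]) for i, model in enumerate(models)]
# Three models over two keys wrap back to key-1 for the third model.
```

This spreads healthcheck load across keys without any per-key bookkeeping.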
+ def build_public_health_payload(hours: int | None = None) -> dict[str, Any]:
+     conn = get_db_connection()
+     try:
+         settings_payload = get_settings_payload(conn)
+         effective_hours = hours or settings_payload["public_history_hours"]
+         since = utcnow() - timedelta(hours=effective_hours)
+         models = conn.execute("SELECT * FROM proxy_models WHERE enabled = 1 ORDER BY sort_order ASC, model_id ASC").fetchall()
+         result_models: list[dict[str, Any]] = []
+         last_updated: str | None = None
+         for model in models:
+             rows = conn.execute(
+                 "SELECT * FROM health_check_records WHERE model_id = ? AND checked_at >= ? ORDER BY checked_at ASC",
+                 (model["id"], since.isoformat()),
+             ).fetchall()
+             hourly = []
+             ok_count = 0
+             for row in rows:
+                 status_name = "healthy" if row["ok"] else "down"
+                 hourly.append({"time": row["checked_at"], "status": status_name, "latency": row["latency_ms"]})
+                 ok_count += 1 if row["ok"] else 0
+                 last_updated = row["checked_at"]
+             total = len(rows)
+             success_rate = round((ok_count / total) * 100, 1) if total else 0.0
+             model_status = "unknown" if model["last_health_status"] is None else ("healthy" if model["last_health_status"] else "down")
+             result_models.append({
+                 "id": model["id"],
+                 "model_id": model["model_id"],
+                 "name": model["display_name"],
+                 "display_name": model["display_name"],
+                 "endpoint": "/v1/responses",
+                 "status": model_status,
+                 "beat": f"{success_rate}%",
+                 "hourly": hourly,
+                 "last_health_status": None if model["last_health_status"] is None else bool(model["last_health_status"]),
+                 "last_healthcheck_at": model["last_healthcheck_at"],
+                 "success_rate": success_rate,
+                 "points": [
+                     {
+                         "hour": entry["time"],
+                         "label": parse_datetime(entry["time"]).strftime("%H:%M") if parse_datetime(entry["time"]) else entry["time"],
+                         "ok": entry["status"] == "healthy",
+                         "latency_ms": entry["latency"],
+                     }
+                     for entry in hourly
+                 ],
+             })
+         return {"generated_at": utcnow_iso(), "last_updated": last_updated, "hours": effective_hours, "models": result_models}
+     finally:
+         conn.close()
+
+
+ def schedule_healthchecks() -> None:
+     conn = get_db_connection()
+     try:
+         settings_payload = get_settings_payload(conn)
+     finally:
+         conn.close()
+     interval = max(5, int(settings_payload["healthcheck_interval_minutes"]))
+     enabled = bool(settings_payload["healthcheck_enabled"])
+     if scheduler.get_job("nim-hourly-healthcheck"):
+         scheduler.remove_job("nim-hourly-healthcheck")
+     if enabled:
+         scheduler.add_job(
+             run_healthchecks,
+             "interval",
+             minutes=interval,
+             id="nim-hourly-healthcheck",
+             replace_existing=True,
+             next_run_time=utcnow() + timedelta(seconds=10),
+         )
+
+
+ init_db()
+
+
+ @asynccontextmanager
+ async def lifespan(_app: FastAPI):
+     init_db()
+     if not scheduler.running:
+         scheduler.start()
+     schedule_healthchecks()
+     try:
+         yield
+     finally:
+         if scheduler.running:
+             scheduler.shutdown(wait=False)
+
+
+ app = FastAPI(title="NIM Responses Gateway", lifespan=lifespan)
+ app.mount("/static", StaticFiles(directory=str(STATIC_DIR)), name="static")
+
+
+ @app.get("/")
+ async def public_dashboard() -> FileResponse:
+     return FileResponse(STATIC_DIR / "index.html")
+
+
+ @app.get("/admin")
+ async def admin_dashboard() -> FileResponse:
+     return FileResponse(STATIC_DIR / "admin.html")
+
+
+ @app.get("/api/health/public")
+ async def public_health(hours: int | None = None) -> dict[str, Any]:
+     return build_public_health_payload(hours)
+
+ @app.get("/v1/models")
+ async def list_models(_: bool = Depends(require_proxy_token_if_configured)) -> dict[str, Any]:
+     conn = get_db_connection()
+     try:
+         rows = conn.execute("SELECT * FROM proxy_models WHERE enabled = 1 ORDER BY sort_order ASC, model_id ASC").fetchall()
+         data = [
+             {
+                 "id": row["model_id"],
+                 "object": "model",
+                 "created": 0,
+                 "owned_by": "nvidia-nim",
+                 "display_name": row["display_name"],
+                 "status": "unknown" if row["last_health_status"] is None else ("healthy" if row["last_health_status"] else "down"),
+             }
+             for row in rows
+         ]
+         return {"object": "list", "data": data, "models": data}
+     finally:
+         conn.close()
+
+
+ @app.get("/v1/responses/{response_id}")
+ async def get_response(response_id: str, _: bool = Depends(require_proxy_token_if_configured)):
+     conn = get_db_connection()
+     try:
+         row = conn.execute("SELECT output_json FROM response_records WHERE response_id = ?", (response_id,)).fetchone()
+         if not row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Response not found.")
+         return json.loads(row["output_json"])
+     finally:
+         conn.close()
+
+
+ @app.post("/v1/responses")
+ async def create_response(request: Request, _: bool = Depends(require_proxy_token_if_configured)):
+     body = await request.json()
+     if not isinstance(body, dict):
+         return make_error(status.HTTP_400_BAD_REQUEST, "Request body must be a JSON object.")
+     if not body.get("model"):
+         return make_error(status.HTTP_400_BAD_REQUEST, "The 'model' field is required.")
+     if body.get("input") is None:
+         return make_error(status.HTTP_400_BAD_REQUEST, "The 'input' field is required.")
+
+     conn = get_db_connection()
+     try:
+         model_row = fetch_model_by_identifier(conn, body["model"], enabled_only=True)
+         if not model_row:
+             return make_error(status.HTTP_404_NOT_FOUND, f"Model '{body['model']}' is not configured or is disabled.")
+         api_key_row = select_api_key(conn)
+         previous_items = load_previous_conversation_items(conn, body.get("previous_response_id"))
+         input_items = normalize_input_items(body.get("input"))
+         merged_items = previous_items + input_items
+         chat_payload = build_chat_payload(body, merged_items)
+         try:
+             upstream_json, latency_ms = await post_nvidia_chat_completion(api_key_row["api_key"], chat_payload)
+         except HTTPException:
+             update_usage_stats(conn, model_row, api_key_row, ok=False, latency_ms=None, is_healthcheck=False)
+             conn.commit()
+             raise
+         response_payload = chat_completion_to_response(body, upstream_json, body.get("previous_response_id"))
+         update_usage_stats(conn, model_row, api_key_row, ok=True, latency_ms=latency_ms, is_healthcheck=False)
+         store_response_record(conn, response_payload, body, input_items, model_row, api_key_row)
+         conn.commit()
+
+         if body.get("stream"):
+             # The upstream call has already completed; replay the stored result
+             # as SSE events rather than proxying a live token stream.
+             async def event_stream() -> Any:
+                 yield f"event: response.created\ndata: {json_dumps({'type': 'response.created', 'response': {'id': response_payload['id'], 'model': response_payload['model'], 'status': 'in_progress'}})}\n\n"
+                 for index, item in enumerate(response_payload.get("output") or []):
+                     yield f"event: response.output_item.added\ndata: {json_dumps({'type': 'response.output_item.added', 'output_index': index, 'item': item})}\n\n"
+                     if item.get("type") == "message":
+                         text_value = extract_text_from_content(item.get("content"))
+                         if text_value:
+                             yield f"event: response.output_text.delta\ndata: {json_dumps({'type': 'response.output_text.delta', 'output_index': index, 'delta': text_value})}\n\n"
+                             yield f"event: response.output_text.done\ndata: {json_dumps({'type': 'response.output_text.done', 'output_index': index, 'text': text_value})}\n\n"
+                     if item.get("type") == "function_call":
+                         yield f"event: response.function_call_arguments.done\ndata: {json_dumps({'type': 'response.function_call_arguments.done', 'output_index': index, 'arguments': item.get('arguments', '{}'), 'call_id': item.get('call_id')})}\n\n"
+                     yield f"event: response.output_item.done\ndata: {json_dumps({'type': 'response.output_item.done', 'output_index': index, 'item': item})}\n\n"
+                 yield f"event: response.completed\ndata: {json_dumps({'type': 'response.completed', 'response': response_payload})}\n\n"
+             return StreamingResponse(event_stream(), media_type="text/event-stream")
+         return response_payload
+     finally:
+         conn.close()
+
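The stream branch emits `event:`/`data:` SSE frames separated by blank lines. A minimal client-side parser for that framing (illustrative only; a real client would read frames incrementally from an httpx response stream rather than from a string):

```python
import json


def parse_sse(raw):
    # Split a text/event-stream body into (event_name, decoded_data) pairs.
    events = []
    for frame in raw.strip().split("\n\n"):
        event, data_lines = None, []
        for line in frame.splitlines():
            if line.startswith("event: "):
                event = line[len("event: "):]
            elif line.startswith("data: "):
                data_lines.append(line[len("data: "):])
        events.append((event, json.loads("\n".join(data_lines))))
    return events


raw = (
    'event: response.created\ndata: {"type": "response.created"}\n\n'
    'event: response.completed\ndata: {"type": "response.completed"}\n\n'
)
events = parse_sse(raw)
```

Clients watching for `response.completed` get the full response payload in that frame's `data` field.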
+ @app.post("/admin/api/login")
+ async def admin_login(request: Request, response: Response):
+     if not ADMIN_PASSWORD:
+         raise HTTPException(status_code=status.HTTP_503_SERVICE_UNAVAILABLE, detail="PASSWORD is not configured.")
+     body = await request.json()
+     password = body.get("password") if isinstance(body, dict) else None
+     if password != ADMIN_PASSWORD:
+         raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Invalid password.")
+     token = create_admin_token()
+     response.set_cookie(COOKIE_NAME, token, httponly=True, samesite="lax", secure=False, max_age=60 * 60 * 24 * 7)
+     return {"token": token, "access_token": token, "token_type": "bearer"}
+
+
+ @app.post("/admin/api/logout")
+ async def admin_logout(response: Response, _: bool = Depends(require_admin)):
+     response.delete_cookie(COOKIE_NAME)
+     return {"message": "Logged out."}
+
+
+ @app.get("/admin/api/session")
+ async def admin_session(_: bool = Depends(require_admin)):
+     return {"ok": True}
+
+
+ @app.get("/admin/api/overview")
+ async def admin_overview(_: bool = Depends(require_admin)):
+     conn = get_db_connection()
+     try:
+         total_models = conn.execute("SELECT COUNT(*) AS count FROM proxy_models").fetchone()["count"]
+         enabled_models = conn.execute("SELECT COUNT(*) AS count FROM proxy_models WHERE enabled = 1").fetchone()["count"]
+         total_keys = conn.execute("SELECT COUNT(*) AS count FROM api_keys").fetchone()["count"]
+         enabled_keys = conn.execute("SELECT COUNT(*) AS count FROM api_keys WHERE enabled = 1").fetchone()["count"]
+         usage = conn.execute("SELECT COALESCE(SUM(request_count), 0) AS total_requests, COALESCE(SUM(success_count), 0) AS total_success, COALESCE(SUM(failure_count), 0) AS total_failures FROM proxy_models").fetchone()
+         recent_rows = conn.execute("SELECT h.checked_at, h.ok, h.latency_ms, m.model_id FROM health_check_records h JOIN proxy_models m ON m.id = h.model_id ORDER BY h.checked_at DESC LIMIT 8").fetchall()
+         return {
+             "metrics": [
+                 {"label": "Enabled Models", "value": enabled_models},
+                 {"label": "Enabled Keys", "value": enabled_keys},
+                 {"label": "Proxy Requests", "value": usage["total_requests"]},
+                 {"label": "Failures", "value": usage["total_failures"]},
+             ],
+             "recent_checks": [
+                 {"time": row["checked_at"], "model": row["model_id"], "status": "healthy" if row["ok"] else "down", "latency": row["latency_ms"]}
+                 for row in recent_rows
+             ],
+             "totals": {
+                 "total_models": total_models,
+                 "enabled_models": enabled_models,
+                 "total_keys": total_keys,
+                 "enabled_keys": enabled_keys,
+                 "total_requests": usage["total_requests"],
+                 "total_success": usage["total_success"],
+                 "total_failures": usage["total_failures"],
+             },
+         }
+     finally:
+         conn.close()
+
+
+ @app.get("/admin/api/models")
+ async def admin_models(_: bool = Depends(require_admin)):
+     conn = get_db_connection()
+     try:
+         rows = conn.execute("SELECT * FROM proxy_models ORDER BY sort_order ASC, model_id ASC").fetchall()
+         return {"items": [row_to_model_item(row) for row in rows]}
+     finally:
+         conn.close()
+
+
+ @app.get("/admin/api/models/usage")
+ async def admin_models_usage(_: bool = Depends(require_admin)):
+     conn = get_db_connection()
+     try:
+         rows = conn.execute("SELECT * FROM proxy_models ORDER BY request_count DESC, model_id ASC").fetchall()
+         return {"items": [row_to_model_item(row) for row in rows]}
+     finally:
+         conn.close()
+
+
+ @app.post("/admin/api/models")
+ async def admin_add_model(request: Request, _: bool = Depends(require_admin)):
+     body = await request.json()
+     model_id = (body.get("model_id") or body.get("name") or "").strip()
+     display_name = (body.get("display_name") or model_id).strip()
+     if not model_id:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="model_id is required.")
+     conn = get_db_connection()
+     try:
+         now = utcnow_iso()
+         conn.execute(
+             """
+             INSERT INTO proxy_models (model_id, display_name, provider, description, enabled, featured, sort_order, created_at, updated_at)
+             VALUES (?, ?, 'nvidia-nim', ?, ?, ?, ?, ?, ?)
+             ON CONFLICT(model_id) DO UPDATE SET
+                 display_name = excluded.display_name,
+                 description = excluded.description,
+                 enabled = excluded.enabled,
+                 featured = excluded.featured,
+                 sort_order = excluded.sort_order,
+                 updated_at = excluded.updated_at
+             """,
+             (model_id, display_name, body.get("description"), 1 if body.get("enabled", True) else 0, 1 if body.get("featured", False) else 0, int(body.get("sort_order", 0)), now, now),
+         )
+         conn.commit()
+         row = fetch_model_by_identifier(conn, model_id)
+         return {"item": row_to_model_item(row)}
+     finally:
+         conn.close()
+
+
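The `ON CONFLICT(model_id) DO UPDATE` clause above makes the endpoint an idempotent upsert; it relies on a UNIQUE constraint on `model_id`. A minimal in-memory demonstration of that SQLite pattern (toy schema and values, not the gateway's tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (model_id TEXT UNIQUE, display_name TEXT)")
sql = """
    INSERT INTO models (model_id, display_name) VALUES (?, ?)
    ON CONFLICT(model_id) DO UPDATE SET display_name = excluded.display_name
"""
# Inserting the same model_id twice updates in place instead of raising.
conn.execute(sql, ("example/toy-model", "Toy Model"))
conn.execute(sql, ("example/toy-model", "Toy Model v2"))
rows = conn.execute("SELECT model_id, display_name FROM models").fetchall()
```

Note that `ON CONFLICT ... DO UPDATE` requires SQLite 3.24 or newer, which ships with all currently supported Python versions.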
+ def delete_model_internal(model_identifier: str) -> dict[str, Any]:
+     conn = get_db_connection()
+     try:
+         row = fetch_model_by_identifier(conn, model_identifier)
+         if not row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Model not found.")
+         conn.execute("DELETE FROM proxy_models WHERE id = ?", (row["id"],))
+         conn.commit()
+         return {"message": "Model deleted."}
+     finally:
+         conn.close()
+
+
+ @app.delete("/admin/api/models/{model_identifier}")
+ async def admin_delete_model(model_identifier: str, _: bool = Depends(require_admin)):
+     return delete_model_internal(model_identifier)
+
+
+ @app.post("/admin/api/models/remove")
+ async def admin_remove_model_alias(request: Request, _: bool = Depends(require_admin)):
+     body = await request.json()
+     value = body.get("value") if isinstance(body, dict) else None
+     if not value:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
+     return delete_model_internal(str(value))
+
+
+ async def test_model_internal(model_identifier: str, payload: dict[str, Any] | None = None) -> dict[str, Any]:
+     conn = get_db_connection()
+     try:
+         row = fetch_model_by_identifier(conn, model_identifier, enabled_only=True)
+         if not row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Model not found.")
+         api_key_row = select_api_key(conn, payload.get("api_key_id") if payload else None)
+         return await perform_healthcheck(conn, row, api_key_row, (payload or {}).get("prompt") or DEFAULT_HEALTH_PROMPT)
+     finally:
+         conn.close()
+
+
+ @app.post("/admin/api/models/test")
+ async def admin_test_model_alias(request: Request, _: bool = Depends(require_admin)):
+     body = await request.json()
+     identifier = body.get("value") or body.get("model_id")
+     if not identifier:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
+     return await test_model_internal(str(identifier), body)
+
+
+ @app.post("/admin/api/models/{model_identifier}/test")
+ async def admin_test_model(model_identifier: str, request: Request, _: bool = Depends(require_admin)):
+     # This route only accepts POST, so checking request.method is redundant;
+     # tolerate an empty body instead of failing on JSON parsing.
+     try:
+         body = await request.json()
+     except Exception:
+         body = {}
+     return await test_model_internal(model_identifier, body)
+
+ @app.get("/admin/api/keys")
+ async def admin_keys(_: bool = Depends(require_admin)):
+     conn = get_db_connection()
+     try:
+         rows = conn.execute("SELECT * FROM api_keys ORDER BY id ASC").fetchall()
+         return {"items": [row_to_key_item(row) for row in rows]}
+     finally:
+         conn.close()
+
+
+ @app.get("/admin/api/keys/usage")
+ async def admin_keys_usage(_: bool = Depends(require_admin)):
+     conn = get_db_connection()
+     try:
+         rows = conn.execute("SELECT * FROM api_keys ORDER BY request_count DESC, id ASC").fetchall()
+         return {"items": [row_to_key_item(row) for row in rows]}
+     finally:
+         conn.close()
+
+
+ @app.post("/admin/api/keys")
+ async def admin_add_key(request: Request, _: bool = Depends(require_admin)):
+     body = await request.json()
+     name = (body.get("name") or body.get("label") or "").strip()
+     api_key = (body.get("api_key") or body.get("key") or "").strip()
+     if not name or not api_key:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Both name and api_key are required.")
+     conn = get_db_connection()
+     try:
+         now = utcnow_iso()
+         conn.execute(
+             """
+             INSERT INTO api_keys (name, api_key, enabled, created_at, updated_at)
+             VALUES (?, ?, ?, ?, ?)
+             ON CONFLICT(name) DO UPDATE SET api_key = excluded.api_key, enabled = excluded.enabled, updated_at = excluded.updated_at
+             """,
+             (name, api_key, 1 if body.get("enabled", True) else 0, now, now),
+         )
+         conn.commit()
+         row = fetch_key_by_identifier(conn, name)
+         return {"item": row_to_key_item(row)}
+     finally:
+         conn.close()
+
+
+ def delete_key_internal(key_identifier: str) -> dict[str, Any]:
+     conn = get_db_connection()
+     try:
+         row = fetch_key_by_identifier(conn, key_identifier)
+         if not row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="API key not found.")
+         conn.execute("DELETE FROM api_keys WHERE id = ?", (row["id"],))
+         conn.commit()
+         return {"message": "API key deleted."}
+     finally:
+         conn.close()
+
+
+ @app.delete("/admin/api/keys/{key_identifier}")
+ async def admin_delete_key(key_identifier: str, _: bool = Depends(require_admin)):
+     return delete_key_internal(key_identifier)
+
+
+ @app.post("/admin/api/keys/remove")
+ async def admin_remove_key_alias(request: Request, _: bool = Depends(require_admin)):
+     body = await request.json()
+     value = body.get("value") if isinstance(body, dict) else None
+     if not value:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
+     return delete_key_internal(str(value))
+
+
+ async def test_key_internal(key_identifier: str, payload: dict[str, Any] | None = None) -> dict[str, Any]:
+     conn = get_db_connection()
+     try:
+         key_row = fetch_key_by_identifier(conn, key_identifier, enabled_only=True)
+         if not key_row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="API key not found.")
+         model_identifier = (payload or {}).get("model_id") or DEFAULT_MODELS[0][0]
+         model_row = fetch_model_by_identifier(conn, model_identifier, enabled_only=True)
+         if not model_row:
+             raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Model not found.")
+         return await perform_healthcheck(conn, model_row, key_row, (payload or {}).get("prompt") or DEFAULT_HEALTH_PROMPT)
+     finally:
+         conn.close()
+
+
+ @app.post("/admin/api/keys/test")
+ async def admin_test_key_alias(request: Request, _: bool = Depends(require_admin)):
+     body = await request.json()
+     identifier = body.get("value") or body.get("name") or body.get("label")
+     if not identifier:
+         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="value is required.")
+     return await test_key_internal(str(identifier), body)
+
+
+ @app.post("/admin/api/keys/{key_identifier}/test")
+ async def admin_test_key(key_identifier: str, request: Request, _: bool = Depends(require_admin)):
+     # POST-only route; the request.method check was redundant, and an empty
+     # body should not abort the test.
+     try:
+         body = await request.json()
+     except Exception:
+         body = {}
+     return await test_key_internal(key_identifier, body)
+
+
1256
+ @app.get("/admin/api/healthchecks")
1257
+ async def admin_healthchecks(hours: int = 48, _: bool = Depends(require_admin)):
1258
+ conn = get_db_connection()
1259
+ try:
1260
+ since = utcnow() - timedelta(hours=hours)
1261
+ rows = conn.execute(
1262
+ """
1263
+ SELECT h.*, m.model_id, m.display_name, k.name AS key_name
1264
+ FROM health_check_records h
1265
+ JOIN proxy_models m ON m.id = h.model_id
1266
+ LEFT JOIN api_keys k ON k.id = h.api_key_id
1267
+ WHERE h.checked_at >= ?
1268
+ ORDER BY h.checked_at DESC
1269
+ LIMIT 200
1270
+ """,
1271
+ (since.isoformat(),),
1272
+ ).fetchall()
1273
+ items = [{"id": row["id"], "model": row["display_name"], "model_id": row["model_id"], "api_key": row["key_name"], "status": "healthy" if row["ok"] else "down", "detail": row["response_excerpt"] or row["error_message"] or "No details available.", "latency": row["latency_ms"], "status_code": row["status_code"], "checked_at": row["checked_at"]} for row in rows]
1274
+ return {"items": items}
1275
+ finally:
1276
+ conn.close()
1277
+
1278
+
1279
+ @app.post("/admin/api/healthchecks/run")
1280
+ async def admin_run_healthchecks(request: Request, _: bool = Depends(require_admin)):
1281
+ body = await request.json() if request.method == "POST" else {}
1282
+ results = await run_healthchecks(model_identifier=body.get("model_id") or body.get("model"), api_key_identifier=body.get("api_key_id") or body.get("key_id"), prompt=body.get("prompt"))
1283
+ return {"items": results, "results": results}
1284
+
1285
+
1286
+ @app.get("/admin/api/settings")
1287
+ async def admin_settings(_: bool = Depends(require_admin)):
1288
+ conn = get_db_connection()
1289
+ try:
1290
+ return get_settings_payload(conn)
1291
+ finally:
1292
+ conn.close()
1293
+
1294
+
1295
+ @app.put("/admin/api/settings")
1296
+ async def admin_update_settings(request: Request, _: bool = Depends(require_admin)):
1297
+ body = await request.json()
1298
+ conn = get_db_connection()
1299
+ try:
1300
+ set_setting(conn, "healthcheck_enabled", "true" if body.get("healthcheck_enabled", True) else "false")
1301
+ set_setting(conn, "healthcheck_interval_minutes", str(max(5, int(body.get("healthcheck_interval_minutes", DEFAULT_HEALTH_INTERVAL_MINUTES)))))
1302
+ set_setting(conn, "healthcheck_prompt", body.get("healthcheck_prompt") or DEFAULT_HEALTH_PROMPT)
1303
+ if body.get("public_history_hours"):
1304
+ set_setting(conn, "public_history_hours", str(max(1, int(body.get("public_history_hours")))))
1305
+ conn.commit()
1306
+ finally:
1307
+ conn.close()
1308
+ schedule_healthchecks()
1309
+ conn = get_db_connection()
1310
+ try:
1311
+ return get_settings_payload(conn)
1312
+ finally:
1313
+ conn.close()
1314
+
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ fastapi>=0.116.0,<1.0.0
+ uvicorn[standard]>=0.35.0,<1.0.0
+ httpx>=0.28.1,<1.0.0
+ apscheduler>=3.10.4,<4.0.0
+ python-multipart>=0.0.20,<1.0.0
+ itsdangerous>=2.2.0,<3.0.0
static/admin.html ADDED
@@ -0,0 +1,141 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+   <meta charset="UTF-8" />
+   <meta name="viewport" content="width=device-width, initial-scale=1" />
+   <title>Admin - NVIDIA NIM Operations</title>
+   <link rel="preconnect" href="https://fonts.googleapis.com" />
+   <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
+   <link
+     href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700&display=swap"
+     rel="stylesheet"
+   />
+   <link rel="stylesheet" href="/static/style.css" />
+ </head>
+ <body>
+   <div class="admin-shell">
+     <aside class="admin-sidebar">
+       <h3>Sections</h3>
+       <button class="sidebar-btn active" data-panel="overview">Overview</button>
+       <button class="sidebar-btn" data-panel="models">Models</button>
+       <button class="sidebar-btn" data-panel="keys">API Keys</button>
+       <button class="sidebar-btn" data-panel="health">Health Checks</button>
+       <button class="sidebar-btn" data-panel="settings">Settings</button>
+     </aside>
+     <section class="admin-content">
+       <div class="glass-panel" data-panel="overview">
+         <h2>Command center</h2>
+         <div class="section-grid" id="overview-metrics"></div>
+         <div class="glass-panel" style="margin-top: 1rem;">
+           <h3>Recent checks</h3>
+           <table class="table">
+             <thead>
+               <tr>
+                 <th>Time</th>
+                 <th>Model</th>
+                 <th>Status</th>
+                 <th>Latency</th>
+               </tr>
+             </thead>
+             <tbody id="recent-checks"></tbody>
+           </table>
+         </div>
+       </div>
+ 
+       <div class="glass-panel hidden" data-panel="models">
+         <div class="section-grid compact-grid">
+           <div class="metric-card">
+             <h3>Total models</h3>
+             <strong id="model-count">-</strong>
+           </div>
+           <div class="metric-card">
+             <h3>Healthy</h3>
+             <strong id="model-healthy">-</strong>
+           </div>
+         </div>
+         <div class="form-grid" style="margin-top: 1rem;">
+           <input id="model-id" placeholder="Model ID (e.g. z-ai/glm5)" />
+           <input id="model-display-name" placeholder="Display name" />
+           <textarea id="model-description" placeholder="Description for the admin catalog"></textarea>
+           <button id="model-add" type="button">Add or update model</button>
+         </div>
+         <table class="table" style="margin-top: 1rem;">
+           <thead>
+             <tr>
+               <th>Model</th>
+               <th>Status</th>
+               <th>Requests</th>
+               <th>Health</th>
+               <th>Actions</th>
+             </tr>
+           </thead>
+           <tbody id="model-table"></tbody>
+         </table>
+       </div>
+ 
+       <div class="glass-panel hidden" data-panel="keys">
+         <h3>API Keys</h3>
+         <div class="form-grid compact-grid">
+           <input id="key-label" placeholder="Key label" />
+           <input id="key-value" placeholder="NVIDIA NIM key" />
+           <button id="key-add" type="button">Store key</button>
+         </div>
+         <table class="table" style="margin-top: 1rem;">
+           <thead>
+             <tr>
+               <th>Label</th>
+               <th>Masked</th>
+               <th>Requests</th>
+               <th>Last tested</th>
+               <th>Status</th>
+               <th>Actions</th>
+             </tr>
+           </thead>
+           <tbody id="key-table"></tbody>
+         </table>
+       </div>
+ 
+       <div class="glass-panel hidden" data-panel="health">
+         <div class="toolbar-row">
+           <div>
+             <h3>Health checks</h3>
+             <p class="status-text">Manual runs are stored and surfaced on the public board hour by hour.</p>
+           </div>
+           <button id="run-healthcheck" type="button">Run checks now</button>
+         </div>
+         <div class="section-grid" id="health-grid"></div>
+       </div>
+ 
+       <div class="glass-panel hidden" data-panel="settings">
+         <h3>Scheduler settings</h3>
+         <div class="form-grid">
+           <label class="checkbox-row">
+             <input id="healthcheck-enabled" type="checkbox" />
+             <span>Enable scheduled health checks</span>
+           </label>
+           <input id="healthcheck-interval" type="number" min="5" step="5" placeholder="Interval in minutes" />
+           <input id="public-history-hours" type="number" min="1" step="1" placeholder="Public history hours" />
+           <textarea id="healthcheck-prompt" placeholder="Prompt used for hourly health checks"></textarea>
+           <div class="inline-actions">
+             <button id="settings-save" type="button">Save settings</button>
+             <button class="secondary-btn" id="refresh-now" type="button">Reload dashboard</button>
+           </div>
+         </div>
+         <p class="status-text" id="settings-status"></p>
+       </div>
+     </section>
+   </div>
+ 
+   <div class="login-overlay" id="login-overlay">
+     <div class="login-card">
+       <h2>Admin login</h2>
+       <p class="status-text">Enter the PASSWORD environment variable to continue.</p>
+       <label for="admin-password">Password</label>
+       <input type="password" id="admin-password" autocomplete="current-password" />
+       <button id="login-btn">Unlock dashboard</button>
+       <p class="status-text" id="login-status"></p>
+     </div>
+   </div>
+   <script src="/static/admin.js" defer></script>
+ </body>
+ </html>
@@ -0,0 +1,273 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ const PANEL_ATTR = "data-panel";
2
+ const sidebarButtons = document.querySelectorAll(".sidebar-btn");
3
+ const panels = document.querySelectorAll(`.glass-panel[${PANEL_ATTR}]`);
4
+ const loginOverlay = document.getElementById("login-overlay");
5
+ const loginBtn = document.getElementById("login-btn");
6
+ const loginStatus = document.getElementById("login-status");
7
+ const overviewMetrics = document.getElementById("overview-metrics");
8
+ const recentChecks = document.getElementById("recent-checks");
9
+ const modelTable = document.getElementById("model-table");
10
+ const keyTable = document.getElementById("key-table");
11
+ const healthGrid = document.getElementById("health-grid");
12
+ const modelCount = document.getElementById("model-count");
13
+ const modelHealthy = document.getElementById("model-healthy");
14
+ const settingsStatus = document.getElementById("settings-status");
15
+
16
+ const state = {
17
+ token: sessionStorage.getItem("nim_token"),
18
+ panel: "overview",
19
+ };
20
+
21
+ const showPanel = (name) => {
22
+ panels.forEach((panel) => panel.classList.toggle("hidden", panel.getAttribute(PANEL_ATTR) !== name));
23
+ sidebarButtons.forEach((button) => button.classList.toggle("active", button.dataset.panel === name));
24
+ state.panel = name;
25
+ };
26
+
27
+ sidebarButtons.forEach((button) => button.addEventListener("click", () => showPanel(button.dataset.panel)));
28
+
29
+ const apiRequest = async (endpoint, opts = {}) => {
30
+ const headers = { "Content-Type": "application/json" };
31
+ if (state.token) headers.Authorization = `Bearer ${state.token}`;
32
+ const response = await fetch(`/admin/api/${endpoint}`, { ...opts, headers: { ...headers, ...(opts.headers || {}) } });
33
+ if (!response.ok) {
34
+ const payload = await response.json().catch(() => ({}));
35
+ throw new Error(payload.message || payload.detail || payload.error?.message || "Request failed");
36
+ }
37
+ return response.json();
38
+ };
39
+
40
+ const metricCard = ({ label, value }) => {
41
+ const div = document.createElement("div");
42
+ div.className = "metric-card";
43
+ div.innerHTML = `<h3>${label}</h3><strong>${value}</strong>`;
44
+ return div;
45
+ };
46
+
47
+ const pill = (status) => `<span class="pill">${status || "unknown"}</span>`;
48
+
49
+ async function renderOverview() {
50
+ const payload = await apiRequest("overview");
51
+ overviewMetrics.innerHTML = "";
52
+ (payload.metrics || []).forEach((metric) => overviewMetrics.appendChild(metricCard(metric)));
53
+
54
+ recentChecks.innerHTML = "";
55
+ (payload.recent_checks || []).forEach((check) => {
56
+ const row = document.createElement("tr");
57
+ row.innerHTML = `
58
+ <td>${new Date(check.time).toLocaleString()}</td>
59
+ <td>${check.model}</td>
60
+ <td>${pill(check.status)}</td>
61
+ <td>${check.latency ? `${check.latency} ms` : "-"}</td>
62
+ `;
63
+ recentChecks.appendChild(row);
64
+ });
65
+ }
66
+
67
+ async function renderModels() {
68
+ const payload = await apiRequest("models");
69
+ const items = payload.items || [];
70
+ modelCount.textContent = items.length;
71
+ modelHealthy.textContent = items.filter((item) => item.status === "healthy").length;
72
+ modelTable.innerHTML = "";
73
+ items.forEach((item) => {
74
+ const row = document.createElement("tr");
75
+ row.innerHTML = `
76
+ <td>
77
+ <strong>${item.display_name || item.model_id}</strong><br />
78
+ <span class="status-text">${item.model_id}</span>
79
+ </td>
80
+ <td>${pill(item.status)}</td>
81
+ <td>${item.request_count}</td>
82
+ <td>${item.healthcheck_success_count}/${item.healthcheck_count}</td>
83
+ <td>
84
+ <div class="inline-actions">
85
+ <button class="secondary-btn" data-action="test-model" data-id="${item.model_id}">Test</button>
86
+ <button class="secondary-btn" data-action="remove-model" data-id="${item.model_id}">Remove</button>
87
+ </div>
88
+ </td>
89
+ `;
90
+ modelTable.appendChild(row);
91
+ });
92
+ }
93
+
94
+ async function renderKeys() {
95
+ const payload = await apiRequest("keys");
96
+ const items = payload.items || [];
97
+ keyTable.innerHTML = "";
98
+ items.forEach((item) => {
99
+ const row = document.createElement("tr");
100
+ row.innerHTML = `
101
+ <td>${item.label}</td>
102
+ <td>${item.masked_key}</td>
103
+ <td>${item.request_count}</td>
104
+ <td>${item.last_tested ? new Date(item.last_tested).toLocaleString() : "-"}</td>
105
+ <td>${pill(item.status)}</td>
106
+ <td>
107
+ <div class="inline-actions">
108
+ <button class="secondary-btn" data-action="test-key" data-id="${item.name}">Test</button>
109
+ <button class="secondary-btn" data-action="remove-key" data-id="${item.name}">Delete</button>
110
+ </div>
111
+ </td>
112
+ `;
113
+ keyTable.appendChild(row);
114
+ });
115
+ }
116
+
117
+ async function renderHealth() {
118
+ const payload = await apiRequest("healthchecks");
119
+ healthGrid.innerHTML = "";
120
+ (payload.items || []).slice(0, 12).forEach((item) => {
121
+ const card = document.createElement("div");
122
+ card.className = "glass-panel";
123
+ card.innerHTML = `
124
+ <div class="toolbar-row">
125
+ <h4>${item.model}</h4>
126
+ ${pill(item.status)}
127
+ </div>
128
+ <p class="status-text">${item.detail || "No detail"}</p>
129
+ <div class="health-meta">
130
+ <span>${item.api_key || "No key recorded"}</span>
131
+ <span>${item.latency ? `${item.latency} ms` : "-"}</span>
132
+ <span>${item.checked_at ? new Date(item.checked_at).toLocaleString() : "-"}</span>
133
+ </div>
134
+ `;
135
+ healthGrid.appendChild(card);
136
+ });
137
+ }
138
+
139
+ async function renderSettings() {
140
+ const payload = await apiRequest("settings");
141
+ document.getElementById("healthcheck-enabled").checked = Boolean(payload.healthcheck_enabled);
142
+ document.getElementById("healthcheck-interval").value = payload.healthcheck_interval_minutes || 60;
143
+ document.getElementById("public-history-hours").value = payload.public_history_hours || 48;
144
+ document.getElementById("healthcheck-prompt").value = payload.healthcheck_prompt || "Reply with the single word OK.";
145
+ }
146
+
147
+ async function loadAll() {
148
+ await Promise.all([renderOverview(), renderModels(), renderKeys(), renderHealth(), renderSettings()]);
149
+ }
150
+
151
+ async function testModel(modelId) {
152
+ const payload = await apiRequest(`models/${encodeURIComponent(modelId)}/test`, { method: "POST", body: JSON.stringify({}) });
153
+ alert(`${payload.display_name || payload.model} -> ${payload.status}`);
154
+ await loadAll();
155
+ }
156
+
157
+ async function removeModel(modelId) {
158
+ await apiRequest("models/remove", { method: "POST", body: JSON.stringify({ value: modelId }) });
159
+ await loadAll();
160
+ }
161
+
162
+ async function testKey(keyName) {
163
+ const payload = await apiRequest("keys/test", { method: "POST", body: JSON.stringify({ value: keyName }) });
164
+ alert(`${payload.api_key} -> ${payload.status}`);
165
+ await loadAll();
166
+ }
167
+
168
+ async function removeKey(keyName) {
169
+ await apiRequest("keys/remove", { method: "POST", body: JSON.stringify({ value: keyName }) });
170
+ await loadAll();
171
+ }
172
+
173
+ modelTable.addEventListener("click", (event) => {
174
+ const button = event.target.closest("button[data-action]");
175
+ if (!button) return;
176
+ if (button.dataset.action === "test-model") testModel(button.dataset.id);
177
+ if (button.dataset.action === "remove-model") removeModel(button.dataset.id);
178
+ });
179
+
180
+ keyTable.addEventListener("click", (event) => {
181
+ const button = event.target.closest("button[data-action]");
182
+ if (!button) return;
183
+ if (button.dataset.action === "test-key") testKey(button.dataset.id);
184
+ if (button.dataset.action === "remove-key") removeKey(button.dataset.id);
185
+ });
186
+
187
+ document.getElementById("model-add")?.addEventListener("click", async () => {
188
+ const modelId = document.getElementById("model-id").value.trim();
189
+ const displayName = document.getElementById("model-display-name").value.trim();
190
+ const description = document.getElementById("model-description").value.trim();
191
+ if (!modelId) {
192
+ alert("Model ID is required.");
193
+ return;
194
+ }
195
+ await apiRequest("models", { method: "POST", body: JSON.stringify({ model_id: modelId, display_name: displayName || modelId, description }) });
196
+ document.getElementById("model-id").value = "";
197
+ document.getElementById("model-display-name").value = "";
198
+ document.getElementById("model-description").value = "";
199
+ await renderModels();
200
+ });
201
+
202
+ document.getElementById("key-add")?.addEventListener("click", async () => {
203
+ const name = document.getElementById("key-label").value.trim();
204
+ const apiKey = document.getElementById("key-value").value.trim();
205
+ if (!name || !apiKey) {
206
+ alert("Label and key are required.");
207
+ return;
208
+ }
209
+ await apiRequest("keys", { method: "POST", body: JSON.stringify({ name, api_key: apiKey }) });
210
+ document.getElementById("key-label").value = "";
211
+ document.getElementById("key-value").value = "";
212
+ await renderKeys();
213
+ });
214
+
215
+ document.getElementById("run-healthcheck")?.addEventListener("click", async () => {
216
+ await apiRequest("healthchecks/run", { method: "POST", body: JSON.stringify({}) });
217
+ await loadAll();
218
+ });
219
+
220
+ document.getElementById("settings-save")?.addEventListener("click", async () => {
221
+ try {
222
+ const payload = {
223
+ healthcheck_enabled: document.getElementById("healthcheck-enabled").checked,
224
+ healthcheck_interval_minutes: Number(document.getElementById("healthcheck-interval").value || 60),
225
+ public_history_hours: Number(document.getElementById("public-history-hours").value || 48),
226
+ healthcheck_prompt: document.getElementById("healthcheck-prompt").value.trim(),
227
+ };
228
+ await apiRequest("settings", { method: "PUT", body: JSON.stringify(payload) });
229
+ settingsStatus.textContent = "Settings saved.";
230
+ await loadAll();
231
+ } catch (error) {
232
+ settingsStatus.textContent = error.message;
233
+ }
234
+ });
235
+
236
+ document.getElementById("refresh-now")?.addEventListener("click", loadAll);
237
+
238
+ loginBtn.addEventListener("click", async () => {
239
+ const password = document.getElementById("admin-password").value.trim();
240
+ if (!password) {
241
+ loginStatus.textContent = "Enter a password to continue.";
242
+ return;
243
+ }
244
+ try {
245
+ loginStatus.textContent = "Authenticating...";
246
+ const response = await fetch("/admin/api/login", {
247
+ method: "POST",
248
+ headers: { "Content-Type": "application/json" },
249
+ body: JSON.stringify({ password }),
250
+ });
251
+ const payload = await response.json().catch(() => ({}));
252
+ if (!response.ok) throw new Error(payload.detail || payload.message || "Invalid password");
253
+ state.token = payload.access_token || payload.token;
254
+ sessionStorage.setItem("nim_token", state.token);
255
+ loginOverlay.classList.add("hidden");
256
+ await loadAll();
257
+ } catch (error) {
258
+ loginStatus.textContent = error.message;
259
+ }
260
+ });
261
+
262
+ window.addEventListener("DOMContentLoaded", async () => {
263
+ showPanel(state.panel);
264
+ if (!state.token) return;
265
+ loginOverlay.classList.add("hidden");
266
+ try {
267
+ await loadAll();
268
+ setInterval(loadAll, 90 * 1000);
269
+ } catch (error) {
270
+ sessionStorage.removeItem("nim_token");
271
+ loginOverlay.classList.remove("hidden");
272
+ }
273
+ });
static/index.html ADDED
@@ -0,0 +1,42 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8" />
5
+ <meta name="viewport" content="width=device-width, initial-scale=1" />
6
+ <title>Model Health �� NVIDIA NIM</title>
7
+ <link rel="preconnect" href="https://fonts.googleapis.com" />
8
+ <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
9
+ <link
10
+ href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700&display=swap"
11
+ rel="stylesheet"
12
+ />
13
+ <link rel="stylesheet" href="/static/style.css" />
14
+ </head>
15
+ <body>
16
+ <main class="app-shell">
17
+ <section class="glass-panel">
18
+ <div class="hero">
19
+ <div>
20
+ <p class="chip chip--healthy">Live metrics</p>
21
+ <h1>Model health, hour by hour</h1>
22
+ <p>
23
+ Each hour block shows whether the model responded with a healthy,
24
+ intermittent, or degraded signal. We poll NVIDIA NIM to keep the
25
+ grid in sync.
26
+ </p>
27
+ </div>
28
+ <div class="chip-list" id="summary-chips"></div>
29
+ </div>
30
+ </section>
31
+ <section class="glass-panel">
32
+ <div class="status-line">
33
+ <strong>Heat map</strong>
34
+ <span id="last-updated">��</span>
35
+ </div>
36
+ <div class="hour-grid" id="model-grid"></div>
37
+ <p class="status-text" id="error-text"></p>
38
+ </section>
39
+ </main>
40
+ <script src="/static/public.js" defer></script>
41
+ </body>
42
+ </html>
static/public.js ADDED
@@ -0,0 +1,93 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ const summaryChips = document.getElementById("summary-chips");
2
+ const modelGrid = document.getElementById("model-grid");
3
+ const lastUpdated = document.getElementById("last-updated");
4
+ const errorText = document.getElementById("error-text");
5
+
6
+ const statusStyles = {
7
+ healthy: "ok",
8
+ degraded: "warn",
9
+ down: "down",
10
+ unknown: "warn",
11
+ };
12
+
13
+ function formatHourSegment(segment) {
14
+ const span = document.createElement("span");
15
+ span.textContent = new Date(segment.time).getHours();
16
+ span.classList.add(statusStyles[segment.status] || "warn");
17
+ span.title = `${segment.status} �� ${new Date(segment.time).toLocaleTimeString()} `;
18
+ return span;
19
+ }
20
+
21
+ function renderModel(model) {
22
+ const card = document.createElement("article");
23
+ card.className = "model-card";
24
+
25
+ card.innerHTML = `
26
+ <div class="health-meta">
27
+ <span class="pill">${model.status || "unknown"}</span>
28
+ <span>Beat: ${model.beat || "��"}</span>
29
+ </div>
30
+ <h2>${model.name}</h2>
31
+ <small>${model.endpoint || "NIM chat"}</small>
32
+ `.trim();
33
+
34
+ const timeline = document.createElement("div");
35
+ timeline.className = "timeline";
36
+
37
+ (model.hourly || [])
38
+ .slice(-12)
39
+ .forEach((segment) => timeline.appendChild(formatHourSegment(segment)));
40
+
41
+ card.appendChild(timeline);
42
+ return card;
43
+ }
44
+
45
+ function renderSummary(models) {
46
+ summaryChips.innerHTML = "";
47
+ const total = models.length;
48
+ const healthy = models.filter((m) => m.status === "healthy").length;
49
+ const open = models.filter((m) => m.status === "down").length;
50
+
51
+ [
52
+ { label: `Monitored models`, value: total },
53
+ { label: `Healthy`, value: healthy },
54
+ { label: `Issues`, value: open },
55
+ ].forEach((metric) => {
56
+ const chip = document.createElement("span");
57
+ chip.className = "chip";
58
+ chip.textContent = `${metric.label}: ${metric.value}`;
59
+ if (metric.label === "Issues" && metric.value > 0) {
60
+ chip.style.borderColor = "#ff5f6d";
61
+ chip.style.color = "#ffb3a6";
62
+ }
63
+ summaryChips.appendChild(chip);
64
+ });
65
+ }
66
+
67
+ async function loadHealth() {
68
+ try {
69
+ errorText.textContent = "";
70
+ const response = await fetch("/api/health/public");
71
+ if (!response.ok) {
72
+ throw new Error("Health endpoint unavailable");
73
+ }
74
+ const payload = await response.json();
75
+ const models = payload.models || [];
76
+
77
+ renderSummary(models);
78
+ modelGrid.innerHTML = "";
79
+ models.forEach((model) => modelGrid.appendChild(renderModel(model)));
80
+
81
+ lastUpdated.textContent = payload.last_updated
82
+ ? new Date(payload.last_updated).toLocaleString()
83
+ : new Date().toLocaleString();
84
+ } catch (err) {
85
+ errorText.textContent = "Unable to reach NVIDIA NIM. Please check your keys.";
86
+ lastUpdated.textContent = "��";
87
+ }
88
+ }
89
+
90
+ window.addEventListener("DOMContentLoaded", () => {
91
+ loadHealth();
92
+ setInterval(loadHealth, 60 * 1000);
93
+ });
static/style.css ADDED
@@ -0,0 +1,455 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ :root {
2
+ --base-bg: #030711;
3
+ --panel-bg: rgba(7, 18, 34, 0.87);
4
+ --accent: #00f18d;
5
+ --accent-strong: #32ffd3;
6
+ --muted: #8ca3c5;
7
+ --border: rgba(255, 255, 255, 0.12);
8
+ --glow: 0 10px 40px rgba(0, 241, 141, 0.25);
9
+ --font-sans: "Space Grotesk", "Titillium Web", "Segoe UI", sans-serif;
10
+ color-scheme: dark;
11
+ }
12
+
13
+ * {
14
+ box-sizing: border-box;
15
+ }
16
+
17
+ body {
18
+ margin: 0;
19
+ font-family: var(--font-sans);
20
+ background: radial-gradient(circle at top right, rgba(0, 241, 141, 0.18), transparent 40%),
21
+ linear-gradient(180deg, #050a15 0%, #020408 50%, #030711 100%);
22
+ color: #f1f6ff;
23
+ min-height: 100vh;
24
+ }
25
+
26
+ .app-shell {
27
+ padding: 2rem;
28
+ max-width: 1200px;
29
+ margin: 0 auto;
30
+ }
31
+
32
+ .glass-panel {
33
+ background: var(--panel-bg);
34
+ border: 1px solid var(--border);
35
+ padding: 1.5rem;
36
+ border-radius: 18px;
37
+ box-shadow: var(--glow);
38
+ backdrop-filter: blur(16px);
39
+ margin-bottom: 1.75rem;
40
+ }
41
+
42
+ .hero {
43
+ display: flex;
44
+ flex-wrap: wrap;
45
+ gap: 1rem;
46
+ align-items: center;
47
+ justify-content: space-between;
48
+ }
49
+
50
+ .hero h1 {
51
+ font-size: clamp(2rem, 1.8vw + 2rem, 3rem);
52
+ margin: 0;
53
+ line-height: 1.2;
54
+ }
55
+
56
+ .hero p {
57
+ color: var(--muted);
58
+ max-width: 540px;
59
+ margin: 0.5rem 0 0;
60
+ font-size: 1rem;
61
+ }
62
+
63
+ .chip-list {
64
+ display: flex;
65
+ flex-wrap: wrap;
66
+ gap: 0.5rem;
67
+ margin-top: 1rem;
68
+ }
69
+
70
+ .chip {
71
+ padding: 0.3rem 0.9rem;
72
+ border-radius: 999px;
73
+ border: 1px solid rgba(255, 255, 255, 0.15);
74
+ font-size: 0.9rem;
75
+ color: #b1c2dd;
76
+ }
77
+
78
+ .chip--healthy {
79
+ border-color: rgba(0, 241, 141, 0.5);
80
+ color: var(--accent-strong);
81
+ }
82
+
83
+ .hour-grid {
84
+ display: grid;
85
+ grid-template-columns: repeat(auto-fit, minmax(240px, 1fr));
86
+ gap: 1rem;
87
+ }
88
+
89
+ .model-card {
90
+ padding: 1.25rem;
91
+ border-radius: 16px;
92
+ background: linear-gradient(135deg, rgba(255, 255, 255, 0.02), rgba(255, 255, 255, 0.04));
93
+ border: 1px solid transparent;
94
+ transition: border 0.3s ease, transform 0.3s ease;
95
+ }
96
+
97
+ .model-card:hover {
98
+ transform: translateY(-6px);
99
+ border-color: rgba(0, 241, 141, 0.6);
100
+ }
101
+
102
+ .model-card h2 {
103
+ margin: 0;
104
+ font-size: 1.25rem;
105
+ }
106
+
107
+ .model-card small {
108
+ color: var(--muted);
109
+ }
110
+
111
+ .timeline {
112
+ display: flex;
113
+ align-items: center;
114
+ gap: 0.3rem;
115
+ margin-top: 0.9rem;
116
+ flex-wrap: wrap;
117
+ }
118
+
119
+ .timeline span {
120
+ width: 28px;
121
+ height: 28px;
122
+ border-radius: 8px;
123
+ background: rgba(255, 255, 255, 0.05);
124
+ display: inline-flex;
125
+ align-items: center;
126
+ justify-content: center;
127
+ font-size: 0.7rem;
128
+ font-weight: 600;
129
+ }
130
+
131
+ .timeline span.ok {
132
+ background: linear-gradient(120deg, #00d97a, #00f18d);
133
+ box-shadow: 0 6px 12px rgba(0, 241, 141, 0.4);
134
+ }
135
+
136
+ .timeline span.warn {
137
+ background: linear-gradient(120deg, #ff9a56, #ffa63a);
138
+ }
139
+
140
+ .timeline span.down {
141
+ background: linear-gradient(120deg, #ff5f6d, #ffc371);
142
+ }
143
+
144
+ .status-line {
145
+ margin-top: 1rem;
146
+ display: flex;
147
+ justify-content: space-between;
148
+ font-size: 0.9rem;
149
+ color: var(--muted);
150
+ align-items: center;
151
+ }
152
+
153
+ .status-line strong {
154
+ color: #fff;
155
+ }
156
+
157
+ .health-meta {
158
+ display: flex;
159
+ gap: 0.75rem;
160
+ flex-wrap: wrap;
161
+ align-items: center;
162
+ margin-top: 0.5rem;
163
+ color: var(--muted);
164
+ }
165
+
166
+ .pulse {
167
+ width: 8px;
168
+ height: 8px;
169
+ border-radius: 50%;
170
+ background: var(--accent);
171
+ animation: pulse 1.6s infinite;
172
+ }
173
+
174
+ @keyframes pulse {
175
+ 0% {
176
+ box-shadow: 0 0 0 0 rgba(0, 241, 141, 0.6);
177
+ }
178
+ 70% {
179
+ box-shadow: 0 0 0 12px rgba(0, 241, 141, 0);
180
+ }
181
+ 100% {
182
+ box-shadow: 0 0 0 0 rgba(0, 241, 141, 0);
183
+ }
184
+ }
185
+
186
+ button {
187
+ font-family: var(--font-sans);
188
+ border: none;
189
+ cursor: pointer;
190
+ border-radius: 999px;
191
+ padding: 0.65rem 1.2rem;
192
+ background: linear-gradient(120deg, #16a085, #00f18d);
193
+ color: #020408;
194
+ font-weight: 600;
195
+ transition: transform 0.2s ease;
196
+ }
197
+
198
+ button:hover {
199
+ transform: translateY(-2px);
200
+ }
201
+
202
+ .admin-shell {
203
+ display: grid;
204
+ grid-template-columns: 260px 1fr;
205
+ min-height: 100vh;
206
+ }
207
+
208
+ .admin-sidebar {
209
+ background: rgba(3, 7, 17, 0.9);
210
+ border-right: 1px solid rgba(255, 255, 255, 0.06);
211
+ padding: 2rem 1.5rem;
212
+ display: flex;
213
+ flex-direction: column;
214
+ gap: 0.75rem;
215
+ }
216
+
217
+ .admin-sidebar h3 {
218
+ margin: 0 0 1rem;
219
+ font-size: 1rem;
220
+ letter-spacing: 0.2em;
221
+ text-transform: uppercase;
222
+ color: var(--muted);
223
+ }
224
+
225
+ .admin-sidebar button {
226
+ width: 100%;
227
+ justify-content: flex-start;
228
+ background: transparent;
229
+ border-radius: 12px;
230
+ border: 1px solid rgba(255, 255, 255, 0.1);
231
+ color: #fff;
232
+ padding-left: 0.9rem;
233
+ text-align: left;
234
+ letter-spacing: 0.05em;
235
+ }
236
+
237
+ .admin-sidebar button.active {
238
+ border-color: var(--accent);
239
+ color: var(--accent);
240
+ box-shadow: var(--glow);
241
+ }
242
+
243
+ .admin-content {
244
+ padding: 2rem;
245
+ background: linear-gradient(180deg, rgba(4, 6, 15, 0.9), rgba(2, 3, 6, 0.95));
246
+ }
247
+
248
+ .login-overlay {
249
+ position: fixed;
250
+ inset: 0;
251
+ background: rgba(2, 3, 6, 0.8);
252
+ display: flex;
253
+ align-items: center;
254
+ justify-content: center;
255
+ z-index: 10;
256
+ }
257
+
258
+ .login-card {
259
+ width: min(400px, 90vw);
260
+ padding: 2rem;
261
+ background: var(--panel-bg);
262
+ border-radius: 22px;
263
+ border: 1px solid var(--border);
264
+ box-shadow: var(--glow);
265
+ }
266
+
267
+ .login-card h2 {
268
+ margin-top: 0;
269
+ letter-spacing: 0.08em;
270
+ }
271
+
272
+ .login-card label {
273
+ display: block;
274
+ font-size: 0.85rem;
275
+ text-transform: uppercase;
276
+ margin-bottom: 0.25rem;
277
+ color: var(--muted);
278
+ letter-spacing: 0.2em;
279
+ }
280
+
281
+ .login-card input {
282
+ width: 100%;
283
+ padding: 0.9rem;
284
+ border-radius: 12px;
285
+ border: 1px solid rgba(255, 255, 255, 0.15);
286
+ background: rgba(255, 255, 255, 0.03);
287
+ color: #fff;
288
+ margin-bottom: 1rem;
289
+ font-size: 1rem;
290
+ }
291
+
292
+ .section-grid {
293
+ display: grid;
294
+ grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
295
+ gap: 1.25rem;
296
+ }
297
+
298
+ .metric-card {
299
+ background: rgba(255, 255, 255, 0.03);
300
+ border-radius: 16px;
301
+ padding: 1rem;
302
+ border: 1px solid rgba(255, 255, 255, 0.06);
303
+ }
304
+
305
+ .metric-card h3 {
306
+ margin: 0;
307
+ font-size: 1.1rem;
308
+ }
309
+
310
+ .metric-card strong {
311
+ font-size: 2rem;
312
+ display: block;
313
+ margin-top: 0.5rem;
314
+ }
315
+
316
+ .table {
317
+ width: 100%;
318
+ border-collapse: separate;
319
+ border-spacing: 0;
320
+ }
321
+
322
+ .table thead th {
323
+ text-align: left;
324
+ font-size: 0.85rem;
325
+ text-transform: uppercase;
326
+ color: var(--muted);
327
+ padding-bottom: 0.5rem;
328
+ border-bottom: 1px solid rgba(255, 255, 255, 0.1);
329
+ }
330
+
331
+ .table tbody tr {
332
+ border-bottom: 1px solid rgba(255, 255, 255, 0.05);
333
+ }
334
+
335
+ .table td {
336
+ padding: 0.75rem 0;
337
+ }
338
+
339
+ .inline-actions {
340
+ display: flex;
341
+ gap: 0.5rem;
342
+ }
343
+
344
+ .pill {
345
+ padding: 0.25rem 0.8rem;
346
+ border-radius: 999px;
347
+ border: 1px solid transparent;
348
+ font-size: 0.75rem;
349
+ letter-spacing: 0.1em;
350
+ text-transform: uppercase;
351
+ background: rgba(0, 241, 141, 0.1);
352
+ color: var(--accent);
353
+ }
354
+
355
+ .form-inline {
356
+ display: flex;
357
+ gap: 0.6rem;
358
+ flex-wrap: wrap;
359
+ margin-top: 0.5rem;
360
+ }
361
+
362
+ .form-inline input {
363
+ flex: 1;
364
+ min-width: 120px;
365
+ background: rgba(255, 255, 255, 0.03);
366
+ border: 1px solid rgba(255, 255, 255, 0.1);
367
+ border-radius: 12px;
368
+ padding: 0.75rem;
369
+ color: #fff;
370
+ }
371
+
372
+ .status-text {
373
+ font-size: 0.85rem;
374
+ color: var(--muted);
375
+ }
376
+
377
+ .secondary-btn {
378
+ border-radius: 12px;
379
+ padding: 0.55rem 1rem;
380
+ background: transparent;
381
+ border: 1px solid rgba(255, 255, 255, 0.25);
382
+ color: #fff;
383
+ }
384
+
385
+ .secondary-btn:hover {
386
+ border-color: var(--accent);
387
+ color: var(--accent);
388
+ }
389
+
390
+ @media (max-width: 768px) {
391
+ .admin-shell {
392
+ grid-template-columns: 1fr;
393
+ }
394
+
395
+ .admin-sidebar {
396
+ flex-direction: row;
397
+ overflow-x: auto;
398
+ }
399
+ }
400
+
401
+ .hidden { display: none !important; }
402
+
403
+ .form-grid {
404
+ display: grid;
405
+ gap: 0.75rem;
406
+ grid-template-columns: repeat(2, minmax(0, 1fr));
407
+ }
408
+
409
+ .compact-grid {
410
+ grid-template-columns: repeat(auto-fit, minmax(220px, 1fr));
411
+ }
412
+
413
+ .form-grid textarea,
414
+ .form-grid input {
415
+ width: 100%;
416
+ min-width: 0;
417
+ background: rgba(255, 255, 255, 0.03);
418
+ border: 1px solid rgba(255, 255, 255, 0.1);
419
+ border-radius: 12px;
420
+ padding: 0.85rem;
421
+ color: #fff;
422
+ font: inherit;
423
+ }
424
+
425
+ .form-grid textarea {
426
+ min-height: 110px;
427
+ grid-column: 1 / -1;
428
+ resize: vertical;
429
+ }
430
+
431
+ .toolbar-row {
432
+ display: flex;
433
+ justify-content: space-between;
434
+ gap: 1rem;
435
+ align-items: center;
436
+ flex-wrap: wrap;
437
+ }
438
+
439
+ .checkbox-row {
440
+ display: flex;
441
+ align-items: center;
442
+ gap: 0.75rem;
443
+ color: #fff;
444
+ }
445
+
446
+ .checkbox-row input {
447
+ width: 18px;
448
+ height: 18px;
449
+ }
450
+
451
+ @media (max-width: 768px) {
452
+ .form-grid {
453
+ grid-template-columns: 1fr;
454
+ }
455
+ }