Spaces: k22056537 committed
Commit eb4abb8 · Parent(s): 9d50060

feat: sync integration updates across app and ML pipeline

Update backend/frontend integration, dataset config flow, and training/evaluation scripts, and refresh generated evaluation reports to match the latest runs.
- README.md +62 -0
- api/__init__.py +1 -0
- api/db.py +201 -0
- api/drawing.py +124 -0
- config/__init__.py +57 -0
- config/default.yaml +79 -0
- data_preparation/prepare_dataset.py +47 -9
- evaluation/GROUPED_SPLIT_BENCHMARK.md +13 -0
- evaluation/README.md +4 -1
- evaluation/THRESHOLD_JUSTIFICATION.md +19 -125
- evaluation/feature_importance.py +106 -57
- evaluation/feature_selection_justification.md +15 -16
- evaluation/grouped_split_benchmark.py +107 -0
- evaluation/justify_thresholds.py +17 -14
- evaluation/plots/roc_xgb.png +0 -0
- main.py +57 -341
- models/L2CS-Net/l2cs/datasets.py +0 -10
- models/mlp/eval_accuracy.py +0 -2
- models/mlp/sweep.py +3 -3
- models/mlp/train.py +169 -51
- models/xgboost/add_accuracy.py +1 -3
- models/xgboost/config.py +52 -0
- models/xgboost/eval_accuracy.py +0 -2
- models/xgboost/sweep_local.py +2 -3
- models/xgboost/train.py +126 -57
- requirements.txt +1 -0
- src/App.jsx +1 -1
- src/components/Achievement.jsx +0 -16
- src/components/Customise.jsx +1 -1
- src/components/FocusPageLocal.jsx +5 -9
- src/utils/VideoManagerLocal.js +5 -1
- tests/test_api_settings.py +4 -18
- tests/test_data_preparation.py +18 -5
README.md
CHANGED

@@ -2,6 +2,8 @@
 
 Webcam-based focus detection: MediaPipe face mesh -> 17 features (EAR, gaze, head pose, PERCLOS, etc.) -> MLP or XGBoost for focused/unfocused. React + FastAPI app with WebSocket video.
 
+**Repository:** Add your repo link here (e.g. `https://github.com/your-org/FocusGuard`).
+
 ## Project layout
 
 ```
@@ -27,6 +29,10 @@ Webcam-based focus detection: MediaPipe face mesh -> 17 features (EAR, gaze, hea
 └── package.json
 ```
 
+## Config
+
+Hyperparameters and app settings live in `config/default.yaml` (learning rates, batch size, thresholds, L2CS weights, etc.). Override with env `FOCUSGUARD_CONFIG` pointing to another YAML.
+
 ## Setup
 
 ```bash
@@ -74,10 +80,30 @@ python -m models.mlp.train
 python -m models.xgboost.train
 ```
 
+### ClearML experiment tracking
+
+All training and evaluation config (from `config/default.yaml`) is exposed as ClearML task parameters. Enable logging with `USE_CLEARML=1`; optionally run on a **remote GPU agent** instead of locally:
+
+```bash
+USE_CLEARML=1 CLEARML_QUEUE=gpu python -m models.mlp.train
+USE_CLEARML=1 CLEARML_QUEUE=gpu python -m models.xgboost.train
+USE_CLEARML=1 CLEARML_QUEUE=gpu python -m evaluation.justify_thresholds --clearml
+```
+
+The script enqueues the task and exits; a `clearml-agent` listening on the named queue (e.g. `gpu`) runs the same command with the same parameters. Start an agent with:
+
+```bash
+clearml-agent daemon --queue gpu
+```
+
+Logged to ClearML: **parameters** (full flattened config), **scalars** (loss, accuracy, F1, ROC-AUC, per-class precision/recall/F1, dataset sizes and class counts), **artifacts** (best checkpoint, training log JSON), and **plots** (confusion matrix, ROC curves in evaluation).
+
 ## Data
 
 9 participants, 144,793 samples, 10 features, binary labels. Collect with `python -m models.collect_features --name <name>`. Data lives in `data/collected_<name>/`.
 
+**Train/val/test split:** All pooled training and evaluation use the same split for reproducibility. The test set is held out before any preprocessing; `StandardScaler` is fit on the training set only, then applied to val and test. Split ratios and random seed come from `config/default.yaml` (`data.split_ratios`, `mlp.seed`) via `data_preparation.prepare_dataset.get_default_split_config()`. MLP train, XGBoost train, eval_accuracy scripts, and benchmarks all use this single source so reported test accuracy is on the same held-out set.
+
 ## Models
 
 | Model | What it uses | Best for |
@@ -95,6 +121,42 @@ python -m models.xgboost.train
 | XGBoost (600 trees, depth 8) | 95.87% | 0.959 | 0.991 |
 | MLP (64->32) | 92.92% | 0.929 | 0.971 |
 
+## Model numbers (LOPO, 9 participants)
+
+| Model   | LOPO AUC | Best threshold (Youden's J) | F1 @ best threshold | F1 @ 0.50 |
+|---------|----------|-----------------------------|---------------------|-----------|
+| MLP     | 0.8624   | 0.228                       | 0.8578              | 0.8149    |
+| XGBoost | 0.8695   | 0.280                       | 0.8549              | 0.8324    |
+
+From the latest `python -m evaluation.justify_thresholds` run:
+- Best geometric face weight (`alpha`) = `0.7` (mean LOPO F1 = `0.8195`)
+- Best hybrid MLP weight (`w_mlp`) = `0.3` (mean LOPO F1 = `0.8409`)
+
+## Grouped vs pooled benchmark
+
+Latest quick benchmark (`python -m evaluation.grouped_split_benchmark --quick`) shows the expected gap between pooled random split and person-held-out LOPO:
+
+| Protocol               | Accuracy | F1 (weighted) | ROC-AUC |
+|------------------------|---------:|--------------:|--------:|
+| Pooled random split    |   0.9510 |        0.9507 |  0.9869 |
+| Grouped LOPO (9 folds) |   0.8303 |        0.8304 |  0.8801 |
+
+This is why LOPO is the primary generalisation metric for reporting.
+
+## Feature ablation snapshot
+
+Latest quick feature-selection run (`python -m evaluation.feature_importance --quick --skip-lofo`):
+
+| Subset    | Mean LOPO F1 |
+|-----------|--------------|
+| all_10    | 0.8286       |
+| eye_state | 0.8071       |
+| head_pose | 0.7480       |
+| gaze      | 0.7260       |
+
+Top-5 XGBoost gain features: `s_face`, `ear_right`, `head_deviation`, `ear_avg`, `perclos`.
+For full leave-one-feature-out ablation, run `python -m evaluation.feature_importance` (slower).
+
 ## L2CS Gaze Tracking
 
 L2CS-Net predicts where your eyes are looking, not just where your head is pointed. This catches the scenario where your head faces the screen but your eyes wander.
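The "Best threshold (Youden's J)" column above is the cutoff that maximises TPR − FPR on the LOPO ROC curve. A minimal numpy sketch of that selection (an illustration, not the repo's `justify_thresholds` implementation):

```python
import numpy as np

def best_threshold_youden(y_true: np.ndarray, scores: np.ndarray) -> float:
    """Pick the score cutoff maximising Youden's J = TPR - FPR."""
    best_t, best_j = 0.5, -1.0
    pos = (y_true == 1).sum()
    neg = (y_true == 0).sum()
    for t in np.unique(scores):
        pred = scores >= t
        tpr = (pred & (y_true == 1)).sum() / pos  # true positive rate
        fpr = (pred & (y_true == 0)).sum() / neg  # false positive rate
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, float(t)
    return best_t

# Toy data: positives mostly score high, one hard negative at 0.6.
y = np.array([0, 0, 0, 1, 1, 1])
s = np.array([0.1, 0.2, 0.6, 0.5, 0.8, 0.9])
print(best_threshold_youden(y, s))  # → 0.5
```

On imbalanced or shifted score distributions this typically lands well away from 0.50, which is why the table reports F1 at both cutoffs.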
api/__init__.py
ADDED

@@ -0,0 +1 @@
+# API package: db, drawing, routes, websocket.
api/db.py
ADDED

@@ -0,0 +1,201 @@
```python
"""SQLite DB for focus sessions and user settings."""

from __future__ import annotations

import asyncio
import json
from datetime import datetime

import aiosqlite


def get_db_path() -> str:
    """Database file path from config or default."""
    try:
        from config import get
        return get("app.db_path") or "focus_guard.db"
    except Exception:
        return "focus_guard.db"


async def init_database(db_path: str | None = None) -> None:
    """Create focus_sessions, focus_events, user_settings tables if missing."""
    path = db_path or get_db_path()
    async with aiosqlite.connect(path) as db:
        await db.execute("""
            CREATE TABLE IF NOT EXISTS focus_sessions (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                start_time TIMESTAMP NOT NULL,
                end_time TIMESTAMP,
                duration_seconds INTEGER DEFAULT 0,
                focus_score REAL DEFAULT 0.0,
                total_frames INTEGER DEFAULT 0,
                focused_frames INTEGER DEFAULT 0,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        """)
        await db.execute("""
            CREATE TABLE IF NOT EXISTS focus_events (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                session_id INTEGER NOT NULL,
                timestamp TIMESTAMP NOT NULL,
                is_focused BOOLEAN NOT NULL,
                confidence REAL NOT NULL,
                detection_data TEXT,
                FOREIGN KEY (session_id) REFERENCES focus_sessions (id)
            )
        """)
        await db.execute("""
            CREATE TABLE IF NOT EXISTS user_settings (
                id INTEGER PRIMARY KEY CHECK (id = 1),
                model_name TEXT DEFAULT 'mlp'
            )
        """)
        await db.execute("""
            INSERT OR IGNORE INTO user_settings (id, model_name)
            VALUES (1, 'mlp')
        """)
        await db.commit()


async def create_session(db_path: str | None = None) -> int:
    """Insert a new focus session. Returns session id."""
    path = db_path or get_db_path()
    async with aiosqlite.connect(path) as db:
        cursor = await db.execute(
            "INSERT INTO focus_sessions (start_time) VALUES (?)",
            (datetime.now().isoformat(),),
        )
        await db.commit()
        return cursor.lastrowid


async def end_session(session_id: int, db_path: str | None = None) -> dict | None:
    """Close session and return summary (duration, focus_score, etc.)."""
    path = db_path or get_db_path()
    async with aiosqlite.connect(path) as db:
        cursor = await db.execute(
            "SELECT start_time, total_frames, focused_frames FROM focus_sessions WHERE id = ?",
            (session_id,),
        )
        row = await cursor.fetchone()
    if not row:
        return None
    start_time_str, total_frames, focused_frames = row
    start_time = datetime.fromisoformat(start_time_str)
    end_time = datetime.now()
    duration = (end_time - start_time).total_seconds()
    focus_score = focused_frames / total_frames if total_frames > 0 else 0.0
    async with aiosqlite.connect(path) as db:
        await db.execute("""
            UPDATE focus_sessions
            SET end_time = ?, duration_seconds = ?, focus_score = ?
            WHERE id = ?
        """, (end_time.isoformat(), int(duration), focus_score, session_id))
        await db.commit()
    return {
        "session_id": session_id,
        "start_time": start_time_str,
        "end_time": end_time.isoformat(),
        "duration_seconds": int(duration),
        "focus_score": round(focus_score, 3),
        "total_frames": total_frames,
        "focused_frames": focused_frames,
    }


async def store_focus_event(
    session_id: int,
    is_focused: bool,
    confidence: float,
    metadata: dict,
    db_path: str | None = None,
) -> None:
    """Append one focus event and update session counters."""
    path = db_path or get_db_path()
    async with aiosqlite.connect(path) as db:
        await db.execute("""
            INSERT INTO focus_events (session_id, timestamp, is_focused, confidence, detection_data)
            VALUES (?, ?, ?, ?, ?)
        """, (session_id, datetime.now().isoformat(), is_focused, confidence, json.dumps(metadata)))
        await db.execute("""
            UPDATE focus_sessions
            SET total_frames = total_frames + 1,
                focused_frames = focused_frames + ?
            WHERE id = ?
        """, (1 if is_focused else 0, session_id))
        await db.commit()


class EventBuffer:
    """Buffer focus events and flush to DB in batches to avoid per-frame writes."""

    def __init__(self, db_path: str | None = None, flush_interval: float = 2.0):
        self._db_path = db_path or get_db_path()
        self._flush_interval = flush_interval
        self._buf: list = []
        self._lock = asyncio.Lock()
        self._task: asyncio.Task | None = None
        self._total_frames = 0
        self._focused_frames = 0

    def start(self) -> None:
        if self._task is None:
            self._task = asyncio.create_task(self._flush_loop())

    async def stop(self) -> None:
        if self._task:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None
        await self._flush()

    def add(self, session_id: int, is_focused: bool, confidence: float, metadata: dict) -> None:
        self._buf.append((
            session_id,
            datetime.now().isoformat(),
            is_focused,
            confidence,
            json.dumps(metadata),
        ))
        self._total_frames += 1
        if is_focused:
            self._focused_frames += 1

    async def _flush_loop(self) -> None:
        while True:
            await asyncio.sleep(self._flush_interval)
            await self._flush()

    async def _flush(self) -> None:
        async with self._lock:
            if not self._buf:
                return
            batch = self._buf[:]
            total = self._total_frames
            focused = self._focused_frames
            self._buf.clear()
            self._total_frames = 0
            self._focused_frames = 0
            if not batch:
                return
            session_id = batch[0][0]
            try:
                async with aiosqlite.connect(self._db_path) as db:
                    await db.executemany("""
                        INSERT INTO focus_events (session_id, timestamp, is_focused, confidence, detection_data)
                        VALUES (?, ?, ?, ?, ?)
                    """, batch)
                    await db.execute("""
                        UPDATE focus_sessions
                        SET total_frames = total_frames + ?,
                            focused_frames = focused_frames + ?
                        WHERE id = ?
                    """, (total, focused, session_id))
                    await db.commit()
            except Exception as e:
                import logging
                logging.getLogger(__name__).warning("DB flush error: %s", e)
```
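The core of `EventBuffer` is replacing one `INSERT` per video frame with one `executemany` per flush interval. A synchronous stand-in using stdlib `sqlite3` (in place of `aiosqlite`, and with only the `focus_events` columns) shows the same batching pattern:

```python
import sqlite3

# In-memory stand-in for focus_guard.db; focus_events schema as in api/db.py.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE focus_events (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        session_id INTEGER NOT NULL,
        timestamp TEXT NOT NULL,
        is_focused BOOLEAN NOT NULL,
        confidence REAL NOT NULL,
        detection_data TEXT
    )
""")

# Events buffered over a flush interval, written in a single batch
# instead of one INSERT (and one commit) per frame.
batch = [
    (1, "2024-01-01T00:00:00", True, 0.91, "{}"),
    (1, "2024-01-01T00:00:01", False, 0.40, "{}"),
    (1, "2024-01-01T00:00:02", True, 0.87, "{}"),
]
conn.executemany(
    "INSERT INTO focus_events (session_id, timestamp, is_focused, confidence, detection_data) "
    "VALUES (?, ?, ?, ?, ?)",
    batch,
)
conn.commit()

total = conn.execute("SELECT COUNT(*) FROM focus_events").fetchone()[0]
focused = conn.execute("SELECT COUNT(*) FROM focus_events WHERE is_focused").fetchone()[0]
print(total, focused)  # → 3 2
```

At ~30 fps with a 2 s flush interval this turns roughly 60 commits into one, which is the point of the buffer.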
api/drawing.py
ADDED

@@ -0,0 +1,124 @@
```python
"""Server-side face mesh and HUD drawing for WebRTC/WS video frames."""

from __future__ import annotations

import cv2
import numpy as np

from mediapipe.tasks.python.vision import FaceLandmarksConnections
from models.face_mesh import FaceMeshDetector

_FONT = cv2.FONT_HERSHEY_SIMPLEX
_CYAN = (255, 255, 0)
_GREEN = (0, 255, 0)
_MAGENTA = (255, 0, 255)
_ORANGE = (0, 165, 255)
_RED = (0, 0, 255)
_WHITE = (255, 255, 255)
_LIGHT_GREEN = (144, 238, 144)

_TESSELATION_CONNS = [(c.start, c.end) for c in FaceLandmarksConnections.FACE_LANDMARKS_TESSELATION]
_CONTOUR_CONNS = [(c.start, c.end) for c in FaceLandmarksConnections.FACE_LANDMARKS_CONTOURS]
_LEFT_EYEBROW = [70, 63, 105, 66, 107, 55, 65, 52, 53, 46]
_RIGHT_EYEBROW = [300, 293, 334, 296, 336, 285, 295, 282, 283, 276]
_NOSE_BRIDGE = [6, 197, 195, 5, 4, 1, 19, 94, 2]
_LIPS_OUTER = [61, 146, 91, 181, 84, 17, 314, 405, 321, 375, 291, 409, 270, 269, 267, 0, 37, 39, 40, 185, 61]
_LIPS_INNER = [78, 95, 88, 178, 87, 14, 317, 402, 318, 324, 308, 415, 310, 311, 312, 13, 82, 81, 80, 191, 78]
_LEFT_EAR_POINTS = [33, 160, 158, 133, 153, 145]
_RIGHT_EAR_POINTS = [362, 385, 387, 263, 373, 380]


def _lm_px(lm: np.ndarray, idx: int, w: int, h: int) -> tuple[int, int]:
    return (int(lm[idx, 0] * w), int(lm[idx, 1] * h))


def _draw_polyline(
    frame: np.ndarray, lm: np.ndarray, indices: list[int], w: int, h: int, color: tuple, thickness: int
) -> None:
    for i in range(len(indices) - 1):
        cv2.line(
            frame,
            _lm_px(lm, indices[i], w, h),
            _lm_px(lm, indices[i + 1], w, h),
            color,
            thickness,
            cv2.LINE_AA,
        )


def draw_face_mesh(frame: np.ndarray, lm: np.ndarray, w: int, h: int) -> None:
    """Draw tessellation, contours, eyebrows, nose, lips, eyes, irises, gaze lines on frame."""
    overlay = frame.copy()
    for s, e in _TESSELATION_CONNS:
        cv2.line(overlay, _lm_px(lm, s, w, h), _lm_px(lm, e, w, h), (200, 200, 200), 1, cv2.LINE_AA)
    cv2.addWeighted(overlay, 0.3, frame, 0.7, 0, frame)
    for s, e in _CONTOUR_CONNS:
        cv2.line(frame, _lm_px(lm, s, w, h), _lm_px(lm, e, w, h), _CYAN, 1, cv2.LINE_AA)
    _draw_polyline(frame, lm, _LEFT_EYEBROW, w, h, _LIGHT_GREEN, 2)
    _draw_polyline(frame, lm, _RIGHT_EYEBROW, w, h, _LIGHT_GREEN, 2)
    _draw_polyline(frame, lm, _NOSE_BRIDGE, w, h, _ORANGE, 1)
    _draw_polyline(frame, lm, _LIPS_OUTER, w, h, _MAGENTA, 1)
    _draw_polyline(frame, lm, _LIPS_INNER, w, h, (200, 0, 200), 1)
    left_pts = np.array([_lm_px(lm, i, w, h) for i in FaceMeshDetector.LEFT_EYE_INDICES], dtype=np.int32)
    cv2.polylines(frame, [left_pts], True, _GREEN, 2, cv2.LINE_AA)
    right_pts = np.array([_lm_px(lm, i, w, h) for i in FaceMeshDetector.RIGHT_EYE_INDICES], dtype=np.int32)
    cv2.polylines(frame, [right_pts], True, _GREEN, 2, cv2.LINE_AA)
    for indices in [_LEFT_EAR_POINTS, _RIGHT_EAR_POINTS]:
        for idx in indices:
            cv2.circle(frame, _lm_px(lm, idx, w, h), 3, (0, 255, 255), -1, cv2.LINE_AA)
    for iris_idx, eye_inner, eye_outer in [
        (FaceMeshDetector.LEFT_IRIS_INDICES, 133, 33),
        (FaceMeshDetector.RIGHT_IRIS_INDICES, 362, 263),
    ]:
        iris_pts = np.array([_lm_px(lm, i, w, h) for i in iris_idx], dtype=np.int32)
        center = iris_pts[0]
        if len(iris_pts) >= 5:
            radii = [np.linalg.norm(iris_pts[j] - center) for j in range(1, 5)]
            radius = max(int(np.mean(radii)), 2)
            cv2.circle(frame, tuple(center), radius, _MAGENTA, 2, cv2.LINE_AA)
        cv2.circle(frame, tuple(center), 2, _WHITE, -1, cv2.LINE_AA)
        eye_cx = int((lm[eye_inner, 0] + lm[eye_outer, 0]) / 2.0 * w)
        eye_cy = int((lm[eye_inner, 1] + lm[eye_outer, 1]) / 2.0 * h)
        dx, dy = center[0] - eye_cx, center[1] - eye_cy
        cv2.line(
            frame,
            tuple(center),
            (int(center[0] + dx * 3), int(center[1] + dy * 3)),
            _RED,
            1,
            cv2.LINE_AA,
        )


def draw_hud(frame: np.ndarray, result: dict, model_name: str) -> None:
    """Draw status bar and detail overlay (FOCUSED/NOT FOCUSED, conf, s_face, s_eye, MAR, yawn)."""
    h, w = frame.shape[:2]
    is_focused = result["is_focused"]
    status = "FOCUSED" if is_focused else "NOT FOCUSED"
    color = _GREEN if is_focused else _RED
    cv2.rectangle(frame, (0, 0), (w, 55), (0, 0, 0), -1)
    cv2.putText(frame, status, (10, 28), _FONT, 0.8, color, 2, cv2.LINE_AA)
    cv2.putText(frame, model_name.upper(), (w - 150, 28), _FONT, 0.45, _WHITE, 1, cv2.LINE_AA)
    conf = result.get("mlp_prob", result.get("raw_score", 0.0))
    mar_s = f" MAR:{result['mar']:.2f}" if result.get("mar") is not None else ""
    sf, se = result.get("s_face", 0), result.get("s_eye", 0)
    detail = f"conf:{conf:.2f} S_face:{sf:.2f} S_eye:{se:.2f}{mar_s}"
    cv2.putText(frame, detail, (10, 48), _FONT, 0.4, _WHITE, 1, cv2.LINE_AA)
    if result.get("yaw") is not None:
        cv2.putText(
            frame,
            f"yaw:{result['yaw']:+.0f} pitch:{result['pitch']:+.0f} roll:{result['roll']:+.0f}",
            (w - 280, 48),
            _FONT,
            0.4,
            (180, 180, 180),
            1,
            cv2.LINE_AA,
        )
    if result.get("is_yawning"):
        cv2.putText(frame, "YAWN", (10, 75), _FONT, 0.7, _ORANGE, 2, cv2.LINE_AA)


def get_tesselation_connections() -> list[tuple[int, int]]:
    """Return tessellation edge pairs for client-side face mesh (cached by client)."""
    return list(_TESSELATION_CONNS)
```
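Everything in `api/drawing.py` hinges on `_lm_px`: MediaPipe landmarks arrive as normalized `(x, y)` in `[0, 1]`, and each drawing call scales them to integer pixel coordinates for the current frame size. The conversion in isolation, on two hypothetical landmarks:

```python
import numpy as np

def lm_px(lm: np.ndarray, idx: int, w: int, h: int) -> tuple[int, int]:
    """Normalized landmark (x, y in [0, 1]) -> integer pixel coordinates."""
    return (int(lm[idx, 0] * w), int(lm[idx, 1] * h))

# Two made-up landmarks on a 640x480 frame (the app's default inference size).
lm = np.array([[0.5, 0.5], [0.25, 0.75]])
print(lm_px(lm, 0, 640, 480))  # → (320, 240)
print(lm_px(lm, 1, 640, 480))  # → (160, 360)
```

Because coordinates stay normalized until draw time, the same landmark array works at any resolution the frame is resized to.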
config/__init__.py
ADDED

@@ -0,0 +1,57 @@
```python
"""Load app and model config from YAML. Single source for hyperparameters and tunables."""

from __future__ import annotations

import os
from pathlib import Path
from typing import Any

_CONFIG: dict[str, Any] | None = None


def _default_path() -> Path:
    return Path(__file__).resolve().parent / "default.yaml"


def load_config(path: str | Path | None = None) -> dict[str, Any]:
    """Load YAML config. Uses FOCUSGUARD_CONFIG env or config/default.yaml."""
    global _CONFIG
    if _CONFIG is not None:
        return _CONFIG
    import yaml
    p = path or os.environ.get("FOCUSGUARD_CONFIG") or _default_path()
    p = Path(p)
    if not p.is_file():
        _CONFIG = {}
        return _CONFIG
    with open(p, "r", encoding="utf-8") as f:
        _CONFIG = yaml.safe_load(f) or {}
    return _CONFIG


def get(key_path: str, default: Any = None) -> Any:
    """Return a nested config value. E.g. get('app.db_path'), get('mlp.epochs')."""
    cfg = load_config()
    for part in key_path.split("."):
        if not isinstance(cfg, dict) or part not in cfg:
            return default
        cfg = cfg[part]
    return cfg


def flatten_for_clearml(cfg: dict[str, Any] | None = None, prefix: str = "") -> dict[str, Any]:
    """Flatten nested config so every value appears as a ClearML task parameter (no nested dicts)."""
    cfg = cfg if cfg is not None else load_config()
    out = {}
    for k, v in cfg.items():
        key = f"{prefix}/{k}" if prefix else k
        if isinstance(v, dict) and v and not any(isinstance(x, (dict, list)) for x in v.values()):
            for k2, v2 in v.items():
                out[f"{key}/{k2}"] = v2
        elif isinstance(v, dict) and v:
            out.update(flatten_for_clearml(v, key))
        elif isinstance(v, list):
            out[key] = str(v)
        else:
            out[key] = v
    return out
```
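`flatten_for_clearml` turns the nested YAML into `section/key` pairs, stringifying lists, so every hyperparameter shows up as a flat ClearML task parameter. Since the module itself needs PyYAML and the package layout, here is the same flattening logic copied into a standalone function and exercised on a small dict:

```python
from typing import Any

def flatten(cfg: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Standalone copy of flatten_for_clearml's logic, for illustration only."""
    out: dict[str, Any] = {}
    for k, v in cfg.items():
        key = f"{prefix}/{k}" if prefix else k
        if isinstance(v, dict) and v and not any(isinstance(x, (dict, list)) for x in v.values()):
            for k2, v2 in v.items():          # shallow dict of scalars: one level of keys
                out[f"{key}/{k2}"] = v2
        elif isinstance(v, dict) and v:
            out.update(flatten(v, key))       # deeper dicts: recurse
        elif isinstance(v, list):
            out[key] = str(v)                 # lists become strings (ClearML-friendly)
        else:
            out[key] = v
    return out

cfg = {"mlp": {"epochs": 30, "hidden_sizes": [64, 32]}, "app": {"db_path": "focus_guard.db"}}
print(flatten(cfg))
# → {'mlp/epochs': 30, 'mlp/hidden_sizes': '[64, 32]', 'app/db_path': 'focus_guard.db'}
```

Note the asymmetry: `app` (all scalars) is flattened by the fast first branch, while `mlp` (contains a list) goes through the recursive branch and the list is stringified.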
config/default.yaml
ADDED

@@ -0,0 +1,79 @@
```yaml
# FocusGuard app and model config. Override with FOCUSGUARD_CONFIG env path if needed.

app:
  db_path: "focus_guard.db"
  inference_size: [640, 480]
  inference_workers: 4
  default_model: "mlp"
  calibration_verify_target: [0.5, 0.5]
  no_face_confidence_cap: 0.1

l2cs_boost:
  base_weight: 0.35
  l2cs_weight: 0.65
  veto_threshold: 0.38
  fused_threshold: 0.52

mlp:
  model_name: "face_orientation"
  epochs: 30
  batch_size: 32
  lr: 0.001
  seed: 42
  split_ratios: [0.7, 0.15, 0.15]
  hidden_sizes: [64, 32]

xgboost:
  n_estimators: 600
  max_depth: 8
  learning_rate: 0.1489
  subsample: 0.9625
  colsample_bytree: 0.9013
  reg_alpha: 1.1407
  reg_lambda: 2.4181
  eval_metric: "logloss"

data:
  split_ratios: [0.7, 0.15, 0.15]
  clip:
    yaw: [-45, 45]
    pitch: [-30, 30]
    roll: [-30, 30]
    ear: [0, 0.85]
    mar: [0, 1.0]
    gaze_offset: [0, 0.50]
    perclos: [0, 0.80]
    blink_rate: [0, 30.0]
    closure_duration: [0, 10.0]
    yawn_duration: [0, 10.0]

pipeline:
  geometric:
    max_angle: 22.0
    alpha: 0.7
    beta: 0.3
    threshold: 0.55
  smoother:
    alpha_up: 0.55
    alpha_down: 0.45
    grace_frames: 10
  hybrid_defaults:
    w_mlp: 0.3
    w_geo: 0.7
    threshold: 0.35
    geo_face_weight: 0.7
    geo_eye_weight: 0.3
    mlp_threshold: 0.23

evaluation:
  seed: 42
  mlp_sklearn:
    hidden_layer_sizes: [64, 32]
    max_iter: 200
    validation_fraction: 0.15
  geo_weights:
    face: 0.7
    eye: 0.3
  threshold_search:
    alphas: [0.2, 0.85]
    w_mlps: [0.3, 0.85]
```
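The `pipeline.hybrid_defaults` block suggests a weighted blend of the MLP probability and the geometric score compared against a threshold. The fusion code itself is not part of this commit, so the following is only a sketch of how those three values would plausibly combine (the function name and linear form are assumptions, not the repo's implementation):

```python
# Assumed linear fusion using pipeline.hybrid_defaults values from the YAML above.
W_MLP, W_GEO, THRESHOLD = 0.3, 0.7, 0.35

def hybrid_focused(p_mlp: float, s_geo: float) -> bool:
    """Blend MLP probability and geometric score; compare against the tuned cutoff."""
    score = W_MLP * p_mlp + W_GEO * s_geo
    return score >= THRESHOLD

print(hybrid_focused(0.9, 0.6))  # 0.3*0.9 + 0.7*0.6 = 0.69 → True
print(hybrid_focused(0.2, 0.1))  # 0.3*0.2 + 0.7*0.1 = 0.13 → False
```

The `w_mlp = 0.3` default matches the best hybrid MLP weight reported by `justify_thresholds` in the README.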
data_preparation/prepare_dataset.py
CHANGED

@@ -1,3 +1,11 @@
+"""
+Single source for pooled train/val/test data and splits.
+
+- Data: load_all_pooled() / load_per_person() from data/collected_*/*.npz (same pattern everywhere).
+- Splits: get_numpy_splits() / get_dataloaders() use stratified train/val/test with a fixed seed from config.
+- Test is held out before any preprocessing; StandardScaler is fit on train only, then applied to val and test.
+"""
+
 import os
 import glob
 
@@ -9,6 +17,10 @@ torch = None
 Dataset = object  # type: ignore
 DataLoader = None
 
+# Defaults for stratified split (overridden by config when available)
+_DEFAULT_SPLIT_RATIOS = (0.7, 0.15, 0.15)
+_DEFAULT_SPLIT_SEED = 42
+
 
 def _require_torch():
     global torch, Dataset, DataLoader
@@ -90,9 +102,10 @@ def load_all_pooled(model_name: str = "face_orientation", data_dir: str = None):
     npz_files = sorted(glob.glob(pattern))
 
     if not npz_files:
+        raise FileNotFoundError(
+            f"No .npz files matching {pattern}. "
+            "Collect data first with `python -m models.collect_features --name <name>`."
+        )
 
     all_X, all_y = [], []
     all_names = None
@@ -178,8 +191,23 @@ def _generate_synthetic_data(model_name):
     return features, labels
 
 
+def get_default_split_config():
+    """Return (split_ratios, seed) from config so all scripts use the same split. Reproducible and consistent."""
+    try:
+        from config import get
+        data = get("data") or {}
+        ratios = data.get("split_ratios", list(_DEFAULT_SPLIT_RATIOS))
+        seed = get("mlp.seed") or _DEFAULT_SPLIT_SEED
+        return (tuple(ratios), int(seed))
+    except Exception:
+        return (_DEFAULT_SPLIT_RATIOS, _DEFAULT_SPLIT_SEED)
+
+
 def _split_and_scale(features, labels, split_ratios, seed, scale):
-    """
+    """Stratified train/val/test split. Test is held out first; val is split from the rest.
     test_ratio = split_ratios[2]
     val_ratio = split_ratios[1] / (split_ratios[0] + split_ratios[1])
@@ -196,7 +224,7 @@ def _split_and_scale(features, labels, split_ratios, seed, scale):
     X_train = scaler.fit_transform(X_train)
     X_val = scaler.transform(X_val)
     X_test = scaler.transform(X_test)
-    print("[DATA] Applied StandardScaler (fitted on training split)")
 
     splits = {
         "X_train": X_train, "y_train": y_train,
@@ -208,8 +236,13 @@
     return splits, scaler
 
 
-def get_numpy_splits(model_name: str, split_ratios=
-    """Return
     features, labels = _load_real_data(model_name)
     num_features = features.shape[1]
     num_classes = int(labels.max()) + 1
@@ -219,8 +252,13 @@ def get_numpy_splits(model_name: str, split_ratios=(0.7, 0.15, 0.15), seed: int
     return splits, num_features, num_classes, scaler
 
 
-def get_dataloaders(model_name: str, batch_size: int = 32, split_ratios=
-    """Return PyTorch DataLoaders for
     _, _, dataloader_cls = _require_torch()
     features, labels = _load_real_data(model_name)
     num_features = features.shape[1]
|
| 208 |
+
No training data is used for validation or test. Scaler is fit on train only, then
|
| 209 |
+
applied to val and test (no leakage from val/test into scaling).
|
| 210 |
+
"""
|
| 211 |
test_ratio = split_ratios[2]
|
| 212 |
val_ratio = split_ratios[1] / (split_ratios[0] + split_ratios[1])
|
| 213 |
|
|
|
|
| 224 |
X_train = scaler.fit_transform(X_train)
|
| 225 |
X_val = scaler.transform(X_val)
|
| 226 |
X_test = scaler.transform(X_test)
|
| 227 |
+
print("[DATA] Applied StandardScaler (fitted on training split only)")
|
| 228 |
|
| 229 |
splits = {
|
| 230 |
"X_train": X_train, "y_train": y_train,
|
|
|
|
| 236 |
return splits, scaler
|
| 237 |
|
| 238 |
|
| 239 |
+
def get_numpy_splits(model_name: str, split_ratios=None, seed=None, scale: bool = True):
|
| 240 |
+
"""Return train/val/test numpy arrays. Uses config defaults for split_ratios/seed when None.
|
| 241 |
+
Same dataset and split logic as get_dataloaders for consistent evaluation."""
|
| 242 |
+
if split_ratios is None or seed is None:
|
| 243 |
+
_ratios, _seed = get_default_split_config()
|
| 244 |
+
split_ratios = split_ratios if split_ratios is not None else _ratios
|
| 245 |
+
seed = seed if seed is not None else _seed
|
| 246 |
features, labels = _load_real_data(model_name)
|
| 247 |
num_features = features.shape[1]
|
| 248 |
num_classes = int(labels.max()) + 1
|
|
|
|
| 252 |
return splits, num_features, num_classes, scaler
|
| 253 |
|
| 254 |
|
| 255 |
+
def get_dataloaders(model_name: str, batch_size: int = 32, split_ratios=None, seed=None, scale: bool = True):
|
| 256 |
+
"""Return PyTorch DataLoaders. Uses config defaults for split_ratios/seed when None.
|
| 257 |
+
Test set is held out before preprocessing; scaler fit on train only."""
|
| 258 |
+
if split_ratios is None or seed is None:
|
| 259 |
+
_ratios, _seed = get_default_split_config()
|
| 260 |
+
split_ratios = split_ratios if split_ratios is not None else _ratios
|
| 261 |
+
seed = seed if seed is not None else _seed
|
| 262 |
_, _, dataloader_cls = _require_torch()
|
| 263 |
features, labels = _load_real_data(model_name)
|
| 264 |
num_features = features.shape[1]
|
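The leak-free split discipline the new docstring describes (hold out test first, carve val from the remainder, fit the scaler on train only) can be sketched in isolation. `split_and_scale` below is an illustrative stand-alone version on random data, not the project's `_split_and_scale`:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def split_and_scale(X, y, ratios=(0.7, 0.15, 0.15), seed=42):
    """Hold out test first, then carve val from the remainder; fit the scaler on train only."""
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=ratios[2], stratify=y, random_state=seed)
    # Validation fraction relative to what is left after the test hold-out.
    val_ratio = ratios[1] / (ratios[0] + ratios[1])
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=val_ratio, stratify=y_rest, random_state=seed)
    scaler = StandardScaler().fit(X_train)  # no val/test leakage into scaling stats
    return (scaler.transform(X_train), y_train,
            scaler.transform(X_val), y_val,
            scaler.transform(X_test), y_test)

X = np.random.RandomState(0).randn(200, 10)
y = np.arange(200) % 2
X_tr, y_tr, X_va, y_va, X_te, y_te = split_and_scale(X, y)
print(X_tr.shape, X_va.shape, X_te.shape)
```

Because the scaler never sees val or test rows, its mean/variance come from the training split alone, which is exactly the property the diff enforces.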
evaluation/GROUPED_SPLIT_BENCHMARK.md
ADDED

@@ -0,0 +1,13 @@
+# Grouped vs pooled split benchmark
+
+This compares the same XGBoost config under two evaluation protocols.
+
+Config: `{'n_estimators': 600, 'max_depth': 8, 'learning_rate': 0.1489, 'subsample': 0.9625, 'colsample_bytree': 0.9013, 'reg_alpha': 1.1407, 'reg_lambda': 2.4181, 'eval_metric': 'logloss'}`
+Quick mode: yes (n_estimators=200)
+
+| Protocol | Accuracy | F1 (weighted) | ROC-AUC |
+|----------|---------:|--------------:|--------:|
+| Pooled random split (70/15/15) | 0.9510 | 0.9507 | 0.9869 |
+| Grouped LOPO (9 folds) | 0.8303 | 0.8304 | 0.8801 |
+
+Use grouped LOPO as the primary generalisation metric when reporting model quality.
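The grouped protocol in the table (each fold holds out one participant entirely) can be sketched with scikit-learn's `LeaveOneGroupOut`; the data and the logistic-regression stand-in below are synthetic, not the project's features or XGBoost config:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 9 "participants" with 100 frames each.
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 10))
y = (X[:, 0] + 0.5 * rng.normal(size=900) > 0).astype(int)
groups = np.repeat(np.arange(9), 100)

f1s = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    # Scaler and model see only the 8 training participants.
    scaler = StandardScaler().fit(X[train_idx])
    clf = LogisticRegression(max_iter=1000).fit(
        scaler.transform(X[train_idx]), y[train_idx])
    pred = clf.predict(scaler.transform(X[test_idx]))
    f1s.append(f1_score(y[test_idx], pred, average="weighted"))

print(f"grouped LOPO mean F1 over {len(f1s)} folds: {np.mean(f1s):.3f}")
```

One fold per group is what makes this a person-generalisation estimate: no frames from the held-out participant ever reach the scaler or the model, unlike the pooled random split.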
evaluation/README.md
CHANGED

@@ -14,6 +14,9 @@ python -m evaluation.justify_thresholds

 (LOPO over 9 participants, Youden’s J, weight grid search; ~10–15 min.) Outputs go to `plots/` and the markdown file.

-**Feature importance:** Run `python -m evaluation.feature_importance` for XGBoost gain …
+**Feature importance:** Run `python -m evaluation.feature_importance` for full XGBoost gain + leave-one-feature-out LOPO (slow).
+Fast iteration mode: `python -m evaluation.feature_importance --quick --skip-lofo` (channel ablation + gain only).
+
+**Grouped benchmark:** Run `python -m evaluation.grouped_split_benchmark` for full run, or `python -m evaluation.grouped_split_benchmark --quick` for faster approximate numbers.

 **Who writes here:** `models.mlp.train`, `models.xgboost.train`, `evaluation.justify_thresholds`, `evaluation.feature_importance`, and the notebooks.
evaluation/THRESHOLD_JUSTIFICATION.md
CHANGED

@@ -2,105 +2,31 @@

 Auto-generated by `evaluation/justify_thresholds.py` using LOPO cross-validation over 9 participants (~145k samples).

+## 0. Latest random split checkpoints (15% test split)
+
+From the latest training runs:
+
+| Model | Accuracy | F1 | ROC-AUC |
+|-------|----------|-----|---------|
+| XGBoost | 95.87% | 0.9585 | 0.9908 |
+| MLP | 92.92% | 0.9287 | 0.9714 |
+
 ## 1. ML Model Decision Thresholds

+XGBoost config used for this report: `{'n_estimators': 600, 'max_depth': 8, 'learning_rate': 0.1489, 'subsample': 0.9625, 'colsample_bytree': 0.9013, 'reg_alpha': 1.1407, 'reg_lambda': 2.4181, 'eval_metric': 'logloss'}`.
+
 Thresholds selected via **Youden's J statistic** (J = sensitivity + specificity - 1) on pooled LOPO held-out predictions.

 | Model | LOPO AUC | Optimal Threshold (Youden's J) | F1 @ Optimal | F1 @ 0.50 |
 |-------|----------|-------------------------------|--------------|-----------|
 | MLP | 0.8624 | **0.228** | 0.8578 | 0.8149 |
-| XGBoost | 0.…
+| XGBoost | 0.8695 | **0.280** | 0.8549 | 0.8324 |

 ![…](…)

 ![…](…)

-## 2. …
-
-At the optimal threshold (Youden's J), pooled over all LOPO held-out predictions:
-
-| Model | Threshold | Precision | Recall | F1 | Accuracy |
-|-------|----------:|----------:|-------:|---:|---------:|
-| MLP | 0.228 | 0.8187 | 0.9008 | 0.8578 | 0.8164 |
-| XGBoost | 0.377 | 0.8426 | 0.8750 | 0.8585 | 0.8228 |
-
-Higher threshold → fewer positive predictions → higher precision, lower recall. Youden's J picks the threshold that balances sensitivity and specificity (recall for the positive class and true negative rate).
-
-## 3. Confusion Matrix (Pooled LOPO)
-
-At optimal threshold. Rows = true label, columns = predicted label (0 = unfocused, 1 = focused).
-
-### MLP
-
-| | Pred 0 | Pred 1 |
-|--|-------:|-------:|
-| **True 0** | 38065 (TN) | 17750 (FP) |
-| **True 1** | 8831 (FN) | 80147 (TP) |
-
-TN=38065, FP=17750, FN=8831, TP=80147.
-
-### XGBoost
-
-| | Pred 0 | Pred 1 |
-|--|-------:|-------:|
-| **True 0** | 41271 (TN) | 14544 (FP) |
-| **True 1** | 11118 (FN) | 77860 (TP) |
-
-TN=41271, FP=14544, FN=11118, TP=77860.
-
-![…](…)
-
-![…](…)
-
-## 4. Per-Person Performance Variance (LOPO)
-
-One fold per left-out person; metrics at optimal threshold.
-
-### MLP — per held-out person
-
-| Person | Accuracy | F1 | Precision | Recall |
-|--------|---------:|---:|----------:|-------:|
-| Abdelrahman | 0.8628 | 0.9029 | 0.8760 | 0.9314 |
-| Jarek | 0.8400 | 0.8770 | 0.8909 | 0.8635 |
-| Junhao | 0.8872 | 0.8986 | 0.8354 | 0.9723 |
-| Kexin | 0.7941 | 0.8123 | 0.7965 | 0.8288 |
-| Langyuan | 0.5877 | 0.6169 | 0.4972 | 0.8126 |
-| Mohamed | 0.8432 | 0.8653 | 0.7931 | 0.9519 |
-| Yingtao | 0.8794 | 0.9263 | 0.9217 | 0.9309 |
-| ayten | 0.8307 | 0.8986 | 0.8558 | 0.9459 |
-| saba | 0.9192 | 0.9243 | 0.9260 | 0.9226 |
-
-### XGBoost — per held-out person
-
-| Person | Accuracy | F1 | Precision | Recall |
-|--------|---------:|---:|----------:|-------:|
-| Abdelrahman | 0.8601 | 0.8959 | 0.9129 | 0.8795 |
-| Jarek | 0.8680 | 0.8993 | 0.9070 | 0.8917 |
-| Junhao | 0.9099 | 0.9180 | 0.8627 | 0.9810 |
-| Kexin | 0.7363 | 0.7385 | 0.7906 | 0.6928 |
-| Langyuan | 0.6738 | 0.6945 | 0.5625 | 0.9074 |
-| Mohamed | 0.8868 | 0.8988 | 0.8529 | 0.9498 |
-| Yingtao | 0.8711 | 0.9195 | 0.9347 | 0.9048 |
-| ayten | 0.8451 | 0.9070 | 0.8654 | 0.9528 |
-| saba | 0.9393 | 0.9421 | 0.9615 | 0.9235 |
-
-### Summary across persons
-
-| Model | Accuracy mean ± std | F1 mean ± std | Precision mean ± std | Recall mean ± std |
-|-------|---------------------|---------------|----------------------|-------------------|
-| MLP | 0.8271 ± 0.0968 | 0.8580 ± 0.0968 | 0.8214 ± 0.1307 | 0.9067 ± 0.0572 |
-| XGBoost | 0.8434 ± 0.0847 | 0.8682 ± 0.0879 | 0.8500 ± 0.1191 | 0.8981 ± 0.0836 |
-
-## 5. Confidence Intervals (95%, LOPO over 9 persons)
-
-Mean ± half-width of 95% t-interval (df=8) for each metric across the 9 left-out persons.
-
-| Model | F1 | Accuracy | Precision | Recall |
-|-------|---:|--------:|----------:|-------:|
-| MLP | 0.8580 [0.7835, 0.9326] | 0.8271 [0.7526, 0.9017] | 0.8214 [0.7207, 0.9221] | 0.9067 [0.8626, 0.9507] |
-| XGBoost | 0.8682 [0.8005, 0.9358] | 0.8434 [0.7781, 0.9086] | 0.8500 [0.7583, 0.9417] | 0.8981 [0.8338, 0.9625] |
-
-## 6. Geometric Pipeline Weights (s_face vs s_eye)
+## 2. Geometric Pipeline Weights (s_face vs s_eye)

 Grid search over face weight alpha in {0.2 ... 0.8}. Eye weight = 1 - alpha. Threshold per fold via Youden's J.

@@ -118,9 +44,9 @@ Grid search over face weight alpha in {0.2 ... 0.8}. Eye weight = 1 - alpha. Thr…

 ![…](…)

-## …
+## 3. Hybrid Pipeline Weights (MLP vs Geometric)

-Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. Geometric sub-score uses same weights as geometric pipeline (face=0.7, eye=0.3).
+Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. Geometric sub-score uses same weights as geometric pipeline (face=0.7, eye=0.3). If you change geometric weights, re-run this script — optimal w_mlp can shift.

 | MLP Weight (w_mlp) | Mean LOPO F1 |
 |-------------------:|-------------:|

@@ -131,43 +57,11 @@ Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. Geometric sub-score …
 | 0.7 | 0.8039 |
 | 0.8 | 0.8016 |

-**Best:** w_mlp = 0.3 (MLP 30%, geometric 70%)
-
-![…](…)
-
-## 8. Hybrid Pipeline: XGBoost vs Geometric
-
-Same grid over w_xgb in {0.3 ... 0.8}. w_geo = 1 - w_xgb.
-
-| XGBoost Weight (w_xgb) | Mean LOPO F1 |
-|-----------------------:|-------------:|
-| 0.3 | 0.8639 **<-- selected** |
-| 0.4 | 0.8552 |
-| 0.5 | 0.8451 |
-| 0.6 | 0.8419 |
-| 0.7 | 0.8382 |
-| 0.8 | 0.8353 |
-
-**Best:** w_xgb = 0.3 → mean LOPO F1 = 0.8639
-
-![…](…)
-
-### Which hybrid is used in the app?
-
-**XGBoost hybrid is better** (F1 = 0.8639 vs MLP hybrid F1 = 0.8409).
-
-### Logistic regression combiner (replaces heuristic weights)
-
-Instead of a fixed linear blend (e.g. 0.3·ML + 0.7·geo), a **logistic regression** combines model probability and geometric score: meta-features = [model_prob, geo_score], trained on the same LOPO splits. Threshold from Youden's J on combiner output.
-
-| Method | Mean LOPO F1 |
-|--------|-------------:|
-| Heuristic weight grid (best w) | 0.8639 |
-| **LR combiner** | **0.8241** |
-
-## …
+**Best:** w_mlp = 0.3 (MLP 30%, geometric 70%)
+
+![…](…)
+
+## 4. Eye and Mouth Aspect Ratio Thresholds

 ### EAR (Eye Aspect Ratio)

@@ -193,7 +87,7 @@ Between 0.16 and 0.30 the `_ear_score` function linearly interpolates from 0 to …

 ![…](…)

-## …
+## 5. Other Constants

 | Constant | Value | Rationale |
 |----------|------:|-----------|
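The Youden's J selection used throughout the report amounts to maximising tpr - fpr over the candidate thresholds returned by the ROC curve; a minimal sketch with scikit-learn on toy scores (not the report's LOPO predictions):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Toy labels/scores standing in for pooled LOPO held-out predictions.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
j = tpr - fpr  # Youden's J = sensitivity + specificity - 1
best_thr = thresholds[int(np.argmax(j))]
print(f"optimal threshold by Youden's J: {best_thr}")
```

On real, imbalanced probabilities this is what pushes the operating point well below 0.5 (0.228 for the MLP, 0.280 for XGBoost in the tables above).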
evaluation/feature_importance.py
CHANGED

@@ -10,6 +10,7 @@ Outputs:

 import os
 import sys
+import argparse

 import numpy as np
 from sklearn.preprocessing import StandardScaler

@@ -20,9 +21,10 @@
 if _PROJECT_ROOT not in sys.path:
     sys.path.insert(0, _PROJECT_ROOT)

-from data_preparation.prepare_dataset import load_per_person, SELECTED_FEATURES
+from data_preparation.prepare_dataset import get_default_split_config, load_per_person, SELECTED_FEATURES
+from models.xgboost.config import XGB_BASE_PARAMS, build_xgb_classifier, get_xgb_params

-SEED = …
+_, SEED = get_default_split_config()
 FEATURES = SELECTED_FEATURES["face_orientation"]

@@ -45,14 +47,22 @@ def xgb_feature_importance():
     return dict(zip(FEATURES, order))


-def …
-
-
-
-
+def _make_eval_model(seed: int, quick: bool):
+    if not quick:
+        return build_xgb_classifier(seed, verbosity=0)
+
+    params = get_xgb_params()
+    params["n_estimators"] = 200
+    params["random_state"] = seed
+    params["verbosity"] = 0
+    return XGBClassifier(**params)
+
+
+def run_ablation_lopo(by_person, persons, quick: bool):
+    """Leave-one-feature-out: for each feature, train XGBoost on the other 9 with LOPO, report mean F1."""
     results = {}
     for drop_feat in FEATURES:
+        print(f" -> dropping {drop_feat} ({len(results)+1}/{len(FEATURES)})")
         idx_keep = [i for i, f in enumerate(FEATURES) if f != drop_feat]
         f1s = []
         for held_out in persons:

@@ -66,13 +76,7 @@ def run_ablation_lopo():
             X_tr_sc = scaler.transform(X_tr)
             X_te_sc = scaler.transform(X_te)

-            xgb = XGBClassifier(
-                n_estimators=600, max_depth=8, learning_rate=0.05,
-                subsample=0.8, colsample_bytree=0.8,
-                reg_alpha=0.1, reg_lambda=1.0,
-                eval_metric="logloss",
-                random_state=SEED, verbosity=0,
-            )
+            xgb = _make_eval_model(SEED, quick)
             xgb.fit(X_tr_sc, train_y)
             pred = xgb.predict(X_te_sc)
             f1s.append(f1_score(y_test, pred, average="weighted"))

@@ -80,10 +84,8 @@ def run_ablation_lopo():
     return results


-def run_baseline_lopo_f1():
+def run_baseline_lopo_f1(by_person, persons, quick: bool):
     """Full 10-feature LOPO mean F1 for reference."""
-    by_person, _, _ = load_per_person("face_orientation")
-    persons = sorted(by_person.keys())
     f1s = []
     for held_out in persons:
         train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])

@@ -92,13 +94,7 @@ def run_baseline_lopo_f1():
         scaler = StandardScaler().fit(train_X)
         X_tr_sc = scaler.transform(train_X)
         X_te_sc = scaler.transform(X_test)
-        xgb = XGBClassifier(
-            n_estimators=600, max_depth=8, learning_rate=0.05,
-            subsample=0.8, colsample_bytree=0.8,
-            reg_alpha=0.1, reg_lambda=1.0,
-            eval_metric="logloss",
-            random_state=SEED, verbosity=0,
-        )
+        xgb = _make_eval_model(SEED, quick)
         xgb.fit(X_tr_sc, train_y)
         pred = xgb.predict(X_te_sc)
         f1s.append(f1_score(y_test, pred, average="weighted"))

@@ -113,12 +109,11 @@ CHANNEL_SUBSETS = {
 }


-def run_channel_ablation():
+def run_channel_ablation(by_person, persons, quick: bool, baseline: float):
     """LOPO XGBoost with head-only, eye-only, gaze-only, and all 10. Returns dict subset_name -> mean F1."""
-    by_person, _, _ = load_per_person("face_orientation")
-    persons = sorted(by_person.keys())
     results = {}
     for subset_name, feat_list in CHANNEL_SUBSETS.items():
+        print(f" -> channel {subset_name}")
         idx_keep = [FEATURES.index(f) for f in feat_list]
         f1s = []
         for held_out in persons:

@@ -130,24 +125,40 @@ def run_channel_ablation():
             scaler = StandardScaler().fit(X_tr)
             X_tr_sc = scaler.transform(X_tr)
             X_te_sc = scaler.transform(X_te)
-            xgb = XGBClassifier(
-                n_estimators=600, max_depth=8, learning_rate=0.05,
-                subsample=0.8, colsample_bytree=0.8,
-                reg_alpha=0.1, reg_lambda=1.0,
-                eval_metric="logloss",
-                random_state=SEED, verbosity=0,
-            )
+            xgb = _make_eval_model(SEED, quick)
             xgb.fit(X_tr_sc, train_y)
             pred = xgb.predict(X_te_sc)
             f1s.append(f1_score(y_test, pred, average="weighted"))
         results[subset_name] = np.mean(f1s)
-    baseline = run_baseline_lopo_f1()
     results["all_10"] = baseline
     return results


+def _parse_args():
+    parser = argparse.ArgumentParser(description="Feature importance + LOPO ablation")
+    parser.add_argument(
+        "--quick",
+        action="store_true",
+        help="Use fewer trees (200) for faster iteration.",
+    )
+    parser.add_argument(
+        "--skip-lofo",
+        action="store_true",
+        help="Skip leave-one-feature-out ablation.",
+    )
+    parser.add_argument(
+        "--skip-channel",
+        action="store_true",
+        help="Skip channel ablation.",
+    )
+    return parser.parse_args()
+
+
 def main():
+    args = _parse_args()
     print("=== Feature importance (XGBoost gain) ===")
+    if args.quick:
+        print("Running in quick mode (n_estimators=200).")
     imp = xgb_feature_importance()
     if imp:
         for name in FEATURES:

@@ -155,20 +166,37 @@ def main():
         order = sorted(imp.items(), key=lambda x: -x[1])
         print("  Top-5 by gain:", [x[0] for x in order[:5]])

-    print("\n…
-
+    print("\n[DATA] Loading per-person splits once...")
+    by_person, _, _ = load_per_person("face_orientation")
+    persons = sorted(by_person.keys())
+
+    print("\n=== Baseline LOPO (all 10 features) ===")
+    baseline = run_baseline_lopo_f1(by_person, persons, quick=args.quick)
     print(f"  Baseline (all 10 features) mean LOPO F1: {baseline:.4f}")
-
-
-
-
-
-
-
-
-
-
+
+    ablation = None
+    worst_drop = None
+    if args.skip_lofo:
+        print("\n=== Leave-one-feature-out ablation (LOPO mean F1) ===")
+        print("  skipped (--skip-lofo)")
+    else:
+        print("\n=== Leave-one-feature-out ablation (LOPO mean F1) ===")
+        ablation = run_ablation_lopo(by_person, persons, quick=args.quick)
+        for feat in FEATURES:
+            delta = baseline - ablation[feat]
+            print(f" drop {feat}: F1={ablation[feat]:.4f} (Δ={delta:+.4f})")
+        worst_drop = min(ablation.items(), key=lambda x: x[1])
+        print(f" Largest F1 drop when dropping: {worst_drop[0]} (F1={worst_drop[1]:.4f})")
+
+    channel_f1 = None
+    if args.skip_channel:
+        print("\n=== Channel ablation (LOPO mean F1) ===")
+        print("  skipped (--skip-channel)")
+    else:
+        print("\n=== Channel ablation (LOPO mean F1) ===")
+        channel_f1 = run_channel_ablation(by_person, persons, quick=args.quick, baseline=baseline)
+        for name, f1 in channel_f1.items():
+            print(f" {name}: {f1:.4f}")

     out_dir = os.path.join(_PROJECT_ROOT, "evaluation")
     out_path = os.path.join(out_dir, "feature_selection_justification.md")

@@ -188,6 +216,9 @@ def main():
         "",
         "## 2. XGBoost feature importance (gain)",
         "",
+        f"Config used: `{XGB_BASE_PARAMS}`.",
+        "Quick mode: " + ("yes (200 trees)" if args.quick else "no (full config)"),
+        "",
         "From the trained XGBoost checkpoint (gain on the 10 features):",
         "",
         "| Feature | Gain |",

@@ -207,19 +238,37 @@ def main():
         "",
         f"Baseline (all 10 features) mean LOPO F1: **{baseline:.4f}**.",
         "",
-        "| Feature dropped | Mean LOPO F1 | Δ vs baseline |",
-        "|------------------|--------------|---------------|",
     ])
-
-
-
+    if ablation is None:
+        lines.append("Skipped in this run (`--skip-lofo`).")
+    else:
+        lines.extend([
+            "| Feature dropped | Mean LOPO F1 | Δ vs baseline |",
+            "|------------------|--------------|---------------|",
+        ])
+        for feat in FEATURES:
+            delta = baseline - ablation[feat]
+            lines.append(f"| {feat} | {ablation[feat]:.4f} | {delta:+.4f} |")
+        lines.append("")
+        lines.append(f"Dropping **{worst_drop[0]}** hurts most (F1={worst_drop[1]:.4f}), consistent with it being important.")
+
     lines.append("")
-    lines.append(…
+    lines.append("## 4. Channel ablation (LOPO)")
     lines.append("")
-
+    if channel_f1 is None:
+        lines.append("Skipped in this run (`--skip-channel`).")
+    else:
+        lines.append("| Subset | Mean LOPO F1 |")
+        lines.append("|--------|--------------|")
+        for name in ["head_pose", "eye_state", "gaze", "all_10"]:
+            lines.append(f"| {name} | {channel_f1[name]:.4f} |")
+    lines.append("")
+    lines.append("## 5. Conclusion")
     lines.append("")
-
+    if ablation is None:
+        lines.append("Selection is supported by (1) domain rationale (three attention channels), (2) XGBoost gain importance, and (3) channel ablation. Run without `--skip-lofo` for full leave-one-out ablation.")
+    else:
+        lines.append("Selection is supported by (1) domain rationale (three attention channels), (2) XGBoost gain importance, and (3) leave-one-out ablation. SHAP or correlation-based pruning can be added in future work.")
     lines.append("")
     with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))
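The leave-one-feature-out loop in this script can be illustrated end-to-end on synthetic data; here `LogisticRegression` stands in for the XGBoost model and the feature names are made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

FEATURES = [f"f{i}" for i in range(5)]  # made-up feature names
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 5))
# Only f0 (strongly) and f1 (weakly) carry signal; f2..f4 are noise.
y = (2 * X[:, 0] + X[:, 1] + rng.normal(size=600) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

def f1_with(idx_keep):
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:, idx_keep], y_tr)
    return f1_score(y_te, clf.predict(X_te[:, idx_keep]), average="weighted")

baseline = f1_with(list(range(len(FEATURES))))
drops = {}
for i, name in enumerate(FEATURES):
    kept = [j for j in range(len(FEATURES)) if j != i]
    drops[name] = baseline - f1_with(kept)  # positive delta = feature carried signal

print({k: round(v, 4) for k, v in drops.items()})
```

The script applies the same idea with one LOPO sweep per dropped feature, which is why the full run is slow and `--skip-lofo` exists.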
evaluation/feature_selection_justification.md
CHANGED

```diff
@@ -13,6 +13,9 @@ Excluded: v_gaze (noisy), mar (rare events), yaw/roll (redundant with head_deviation)
 
 ## 2. XGBoost feature importance (gain)
 
+Config used: `{'n_estimators': 600, 'max_depth': 8, 'learning_rate': 0.1489, 'subsample': 0.9625, 'colsample_bytree': 0.9013, 'reg_alpha': 1.1407, 'reg_lambda': 2.4181, 'eval_metric': 'logloss'}`.
+Quick mode: yes (200 trees)
+
 From the trained XGBoost checkpoint (gain on the 10 features):
 
 | Feature | Gain |
@@ -32,23 +35,19 @@ From the trained XGBoost checkpoint (gain on the 10 features):
 
 ## 3. Leave-one-feature-out ablation (LOPO)
 
-Baseline (all 10 features) mean LOPO F1: **0.…**
-
-|------------------|--------------|---------------|
-| head_deviation | 0.8395 | -0.0068 |
-| s_face | 0.8390 | -0.0063 |
-| s_eye | 0.8342 | -0.0015 |
-| h_gaze | 0.8244 | +0.0083 |
-| pitch | 0.8250 | +0.0077 |
-| ear_left | 0.8326 | +0.0001 |
-| ear_avg | 0.8350 | -0.0023 |
-| ear_right | 0.8344 | -0.0017 |
-| gaze_offset | 0.8351 | -0.0024 |
-| perclos | 0.8258 | +0.0069 |
-
-## …
-
-Selection is supported by (1) domain rationale (three attention channels), (2) XGBoost gain importance, and (3) …
+Baseline (all 10 features) mean LOPO F1: **0.8286**.
+
+Skipped in this run (`--skip-lofo`).
+
+## 4. Channel ablation (LOPO)
+
+| Subset | Mean LOPO F1 |
+|--------|--------------|
+| head_pose | 0.7480 |
+| eye_state | 0.8071 |
+| gaze | 0.7260 |
+| all_10 | 0.8286 |
+
+## 5. Conclusion
+
+Selection is supported by (1) domain rationale (three attention channels), (2) XGBoost gain importance, and (3) channel ablation. Run without `--skip-lofo` for full leave-one-out ablation.
```
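Section 3's leave-one-feature-out table (skipped in this run) is produced by retraining with each feature dropped and comparing F1 against the all-features baseline. A minimal sketch of that loop on synthetic data — the logistic-regression classifier, feature count, and dataset here are stand-ins for illustration, not the project's actual XGBoost/LOPO pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Toy data: 10 features, feature 0 carries all of the signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = (X[:, 0] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def f1_with(cols):
    """Train on a subset of feature columns and return test F1."""
    clf = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
    return f1_score(y_te, clf.predict(X_te[:, cols]))

baseline = f1_with(list(range(10)))
# Delta < 0 means removing the feature hurts; the most negative delta
# flags the most important feature.
deltas = {i: f1_with([j for j in range(10) if j != i]) - baseline for i in range(10)}
worst = min(deltas, key=deltas.get)
```

In the report above the same deltas are computed per LOPO fold with the project's XGBoost model, then averaged.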
evaluation/grouped_split_benchmark.py
ADDED

```python
"""Compare pooled random split vs grouped LOPO for XGBoost."""

import os
import sys

import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

_PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
if _PROJECT_ROOT not in sys.path:
    sys.path.insert(0, _PROJECT_ROOT)

from data_preparation.prepare_dataset import get_default_split_config, get_numpy_splits, load_per_person
# XGB_BASE_PARAMS is referenced in write_report(), so it must be imported here.
from models.xgboost.config import XGB_BASE_PARAMS, build_xgb_classifier

MODEL_NAME = "face_orientation"
OUT_PATH = os.path.join(_PROJECT_ROOT, "evaluation", "GROUPED_SPLIT_BENCHMARK.md")


def run_pooled_split():
    split_ratios, seed = get_default_split_config()
    splits, _, _, _ = get_numpy_splits(
        model_name=MODEL_NAME,
        split_ratios=split_ratios,
        seed=seed,
        scale=False,
    )
    model = build_xgb_classifier(seed, verbosity=0, early_stopping_rounds=30)
    model.fit(
        splits["X_train"],
        splits["y_train"],
        eval_set=[(splits["X_val"], splits["y_val"])],
        verbose=False,
    )
    probs = model.predict_proba(splits["X_test"])[:, 1]
    preds = (probs >= 0.5).astype(int)
    y = splits["y_test"]
    return {
        "accuracy": float(accuracy_score(y, preds)),
        "f1": float(f1_score(y, preds, average="weighted")),
        "auc": float(roc_auc_score(y, probs)),
    }


def run_grouped_lopo():
    by_person, _, _ = load_per_person(MODEL_NAME)
    persons = sorted(by_person.keys())
    scores = {"accuracy": [], "f1": [], "auc": []}

    _, seed = get_default_split_config()
    for held_out in persons:
        train_x = np.concatenate([by_person[p][0] for p in persons if p != held_out], axis=0)
        train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out], axis=0)
        test_x, test_y = by_person[held_out]

        model = build_xgb_classifier(seed, verbosity=0)
        model.fit(train_x, train_y, verbose=False)
        probs = model.predict_proba(test_x)[:, 1]
        preds = (probs >= 0.5).astype(int)

        scores["accuracy"].append(float(accuracy_score(test_y, preds)))
        scores["f1"].append(float(f1_score(test_y, preds, average="weighted")))
        scores["auc"].append(float(roc_auc_score(test_y, probs)))

    return {
        "accuracy": float(np.mean(scores["accuracy"])),
        "f1": float(np.mean(scores["f1"])),
        "auc": float(np.mean(scores["auc"])),
        "folds": len(persons),
    }


def write_report(pooled, grouped):
    lines = [
        "# Grouped vs pooled split benchmark",
        "",
        "This compares the same XGBoost config under two evaluation protocols.",
        "",
        f"Config: `{XGB_BASE_PARAMS}`",
        "",
        "| Protocol | Accuracy | F1 (weighted) | ROC-AUC |",
        "|----------|---------:|--------------:|--------:|",
        f"| Pooled random split (70/15/15) | {pooled['accuracy']:.4f} | {pooled['f1']:.4f} | {pooled['auc']:.4f} |",
        f"| Grouped LOPO ({grouped['folds']} folds) | {grouped['accuracy']:.4f} | {grouped['f1']:.4f} | {grouped['auc']:.4f} |",
        "",
        "Use grouped LOPO as the primary generalisation metric when reporting model quality.",
        "",
    ]

    with open(OUT_PATH, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))
    print(f"[LOG] Wrote {OUT_PATH}")


def main():
    pooled = run_pooled_split()
    grouped = run_grouped_lopo()
    write_report(pooled, grouped)
    print(
        "[DONE] pooled_f1={:.4f} grouped_f1={:.4f}".format(
            pooled["f1"], grouped["f1"]
        )
    )


if __name__ == "__main__":
    main()
```
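The grouped LOPO loop in `run_grouped_lopo` is the standard leave-one-group-out protocol, which scikit-learn also ships as `LeaveOneGroupOut`. A small sketch with synthetic "persons" (the data and group names here are made up, not the project's dataset):

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Synthetic stand-in: 3 persons, 4 samples each.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))
y = rng.integers(0, 2, size=12)
groups = np.repeat(["p1", "p2", "p3"], 4)

logo = LeaveOneGroupOut()
folds = list(logo.split(X, y, groups=groups))

# One fold per person; each test fold contains only that person's samples,
# which is what prevents identity leakage between train and test.
for train_idx, test_idx in folds:
    assert len(set(groups[test_idx])) == 1
    assert set(groups[test_idx]).isdisjoint(set(groups[train_idx]))
```

This is why the pooled random split tends to report optimistic numbers: frames from the same person land on both sides of the split.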
evaluation/justify_thresholds.py
CHANGED

```diff
@@ -12,20 +12,19 @@ import matplotlib.pyplot as plt
 from sklearn.neural_network import MLPClassifier
 from sklearn.preprocessing import StandardScaler
 from sklearn.metrics import roc_curve, roc_auc_score, f1_score
-from xgboost import XGBClassifier
 
 _PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
 sys.path.insert(0, _PROJECT_ROOT)
 
-from data_preparation.prepare_dataset import load_per_person, SELECTED_FEATURES
+from data_preparation.prepare_dataset import get_default_split_config, load_per_person, SELECTED_FEATURES
+from models.xgboost.config import XGB_BASE_PARAMS, build_xgb_classifier
 
 PLOTS_DIR = os.path.join(os.path.dirname(__file__), "plots")
 REPORT_PATH = os.path.join(os.path.dirname(__file__), "THRESHOLD_JUSTIFICATION.md")
-SEED = …
+_, SEED = get_default_split_config()
 
-_USE_CLEARML = os.environ.get("USE_CLEARML", "0") == "1" or "--clearml" in sys.argv
+_USE_CLEARML = os.environ.get("USE_CLEARML", "0") == "1" or "--clearml" in sys.argv or bool(os.environ.get("CLEARML_TASK_ID"))
+_CLEARML_QUEUE = os.environ.get("CLEARML_QUEUE", "")
 
 _task = None
 _logger = None
@@ -33,13 +32,21 @@ _logger = None
 if _USE_CLEARML:
     try:
         from clearml import Task
+        from config import flatten_for_clearml
         _task = Task.init(
             project_name="Focus Guard",
             task_name="Threshold Justification",
             tags=["evaluation", "thresholds"],
         )
+        flat = flatten_for_clearml()
+        flat["evaluation/SEED"] = SEED
+        flat["evaluation/n_participants"] = 9
+        _task.connect(flat)
         _logger = _task.get_logger()
+        if _CLEARML_QUEUE:
+            print(f"[ClearML] Enqueuing to queue '{_CLEARML_QUEUE}'.")
+            _task.execute_remotely(queue_name=_CLEARML_QUEUE)
+            sys.exit(0)
         print("ClearML enabled — logging to project 'Focus Guard'")
     except ImportError:
         print("WARNING: ClearML not installed. Continuing without logging.")
@@ -107,13 +114,7 @@ def run_lopo_models():
         results["mlp"]["y"].append(y_test)
         results["mlp"]["p"].append(mlp_prob)
 
-        xgb = XGBClassifier(
-            n_estimators=600, max_depth=8, learning_rate=0.05,
-            subsample=0.8, colsample_bytree=0.8,
-            reg_alpha=0.1, reg_lambda=1.0,
-            use_label_encoder=False, eval_metric="logloss",
-            random_state=SEED, verbosity=0,
-        )
+        xgb = build_xgb_classifier(SEED, verbosity=0)
         xgb.fit(X_tr_sc, train_y)
         xgb_prob = xgb.predict_proba(X_te_sc)[:, 1]
         results["xgb"]["y"].append(y_test)
@@ -422,6 +423,8 @@ def write_report(model_stats, geo_f1, best_alpha, hybrid_f1, best_w, dist_stats)
 
     lines.append("## 1. ML Model Decision Thresholds")
     lines.append("")
+    lines.append(f"XGBoost config used for this report: `{XGB_BASE_PARAMS}`.")
+    lines.append("")
     lines.append("Thresholds selected via **Youden's J statistic** (J = sensitivity + specificity - 1) "
                  "on pooled LOPO held-out predictions.")
     lines.append("")
```
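The report generated here picks decision thresholds by Youden's J on pooled LOPO predictions. A minimal sketch of that selection rule — the toy labels and scores below are illustrative only, not the project's data:

```python
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, y_prob):
    """Return the cutoff maximising J = TPR - FPR (sensitivity + specificity - 1)."""
    fpr, tpr, thresholds = roc_curve(y_true, y_prob)
    j = tpr - fpr
    return float(thresholds[int(np.argmax(j))])

# Toy example: scores cleanly separated around 0.5.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
t = youden_threshold(y, p)
```

At the selected cutoff every positive clears the threshold and no negative does, which is exactly the corner of the ROC curve that J rewards.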
evaluation/plots/roc_xgb.png
CHANGED
main.py
CHANGED
|
@@ -1,174 +1,68 @@
|
|
| 1 |
from __future__ import annotations
|
| 2 |
|
| 3 |
-
|
| 4 |
-
from fastapi.staticfiles import StaticFiles
|
| 5 |
-
from fastapi.responses import FileResponse
|
| 6 |
-
from fastapi.middleware.cors import CORSMiddleware
|
| 7 |
-
from pydantic import BaseModel
|
| 8 |
-
from typing import Optional, List, Any
|
| 9 |
import base64
|
| 10 |
-
import
|
| 11 |
-
import numpy as np
|
| 12 |
-
import aiosqlite
|
| 13 |
import json
|
| 14 |
-
|
| 15 |
-
import math
|
| 16 |
import os
|
| 17 |
-
from pathlib import Path
|
| 18 |
-
from typing import Callable
|
| 19 |
-
from contextlib import asynccontextmanager
|
| 20 |
-
import asyncio
|
| 21 |
-
import concurrent.futures
|
| 22 |
import threading
|
| 23 |
-
import
|
|
|
|
|
|
|
|
|
|
| 24 |
|
|
|
|
|
|
|
|
|
|
| 25 |
from aiortc import RTCPeerConnection, RTCSessionDescription, VideoStreamTrack
|
| 26 |
-
|
| 27 |
-
logger = logging.getLogger(__name__)
|
| 28 |
from av import VideoFrame
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 29 |
|
| 30 |
-
from
|
| 31 |
-
from
|
| 32 |
-
|
| 33 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 34 |
)
|
|
|
|
| 35 |
from models.face_mesh import FaceMeshDetector
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 36 |
|
| 37 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 38 |
|
| 39 |
_FONT = cv2.FONT_HERSHEY_SIMPLEX
|
| 40 |
-
_CYAN = (255, 255, 0)
|
| 41 |
-
_GREEN = (0, 255, 0)
|
| 42 |
-
_MAGENTA = (255, 0, 255)
|
| 43 |
-
_ORANGE = (0, 165, 255)
|
| 44 |
_RED = (0, 0, 255)
|
| 45 |
-
|
| 46 |
-
_LIGHT_GREEN = (144, 238, 144)
|
| 47 |
-
|
| 48 |
-
_TESSELATION_CONNS = [(c.start, c.end) for c in FaceLandmarksConnections.FACE_LANDMARKS_TESSELATION]
|
| 49 |
-
_CONTOUR_CONNS = [(c.start, c.end) for c in FaceLandmarksConnections.FACE_LANDMARKS_CONTOURS]
|
| 50 |
-
_LEFT_EYEBROW = [70, 63, 105, 66, 107, 55, 65, 52, 53, 46]
|
| 51 |
-
_RIGHT_EYEBROW = [300, 293, 334, 296, 336, 285, 295, 282, 283, 276]
|
| 52 |
-
_NOSE_BRIDGE = [6, 197, 195, 5, 4, 1, 19, 94, 2]
|
| 53 |
-
_LIPS_OUTER = [61, 146, 91, 181, 84, 17, 314, 405, 321, 375, 291, 409, 270, 269, 267, 0, 37, 39, 40, 185, 61]
|
| 54 |
-
_LIPS_INNER = [78, 95, 88, 178, 87, 14, 317, 402, 318, 324, 308, 415, 310, 311, 312, 13, 82, 81, 80, 191, 78]
|
| 55 |
-
_LEFT_EAR_POINTS = [33, 160, 158, 133, 153, 145]
|
| 56 |
-
_RIGHT_EAR_POINTS = [362, 385, 387, 263, 373, 380]
|
| 57 |
-
|
| 58 |
-
|
| 59 |
-
def _lm_px(lm, idx, w, h):
|
| 60 |
-
return (int(lm[idx, 0] * w), int(lm[idx, 1] * h))
|
| 61 |
-
|
| 62 |
-
|
| 63 |
-
def _draw_polyline(frame, lm, indices, w, h, color, thickness):
|
| 64 |
-
for i in range(len(indices) - 1):
|
| 65 |
-
cv2.line(frame, _lm_px(lm, indices[i], w, h), _lm_px(lm, indices[i + 1], w, h), color, thickness, cv2.LINE_AA)
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
def _draw_face_mesh(frame, lm, w, h):
|
| 69 |
-
"""Draw tessellation, contours, eyebrows, nose, lips, eyes, irises, gaze lines."""
|
| 70 |
-
# Tessellation (gray triangular grid, semi-transparent)
|
| 71 |
-
overlay = frame.copy()
|
| 72 |
-
for s, e in _TESSELATION_CONNS:
|
| 73 |
-
cv2.line(overlay, _lm_px(lm, s, w, h), _lm_px(lm, e, w, h), (200, 200, 200), 1, cv2.LINE_AA)
|
| 74 |
-
cv2.addWeighted(overlay, 0.3, frame, 0.7, 0, frame)
|
| 75 |
-
# Contours
|
| 76 |
-
for s, e in _CONTOUR_CONNS:
|
| 77 |
-
cv2.line(frame, _lm_px(lm, s, w, h), _lm_px(lm, e, w, h), _CYAN, 1, cv2.LINE_AA)
|
| 78 |
-
# Eyebrows
|
| 79 |
-
_draw_polyline(frame, lm, _LEFT_EYEBROW, w, h, _LIGHT_GREEN, 2)
|
| 80 |
-
_draw_polyline(frame, lm, _RIGHT_EYEBROW, w, h, _LIGHT_GREEN, 2)
|
| 81 |
-
# Nose
|
| 82 |
-
_draw_polyline(frame, lm, _NOSE_BRIDGE, w, h, _ORANGE, 1)
|
| 83 |
-
# Lips
|
| 84 |
-
_draw_polyline(frame, lm, _LIPS_OUTER, w, h, _MAGENTA, 1)
|
| 85 |
-
_draw_polyline(frame, lm, _LIPS_INNER, w, h, (200, 0, 200), 1)
|
| 86 |
-
# Eyes
|
| 87 |
-
left_pts = np.array([_lm_px(lm, i, w, h) for i in FaceMeshDetector.LEFT_EYE_INDICES], dtype=np.int32)
|
| 88 |
-
cv2.polylines(frame, [left_pts], True, _GREEN, 2, cv2.LINE_AA)
|
| 89 |
-
right_pts = np.array([_lm_px(lm, i, w, h) for i in FaceMeshDetector.RIGHT_EYE_INDICES], dtype=np.int32)
|
| 90 |
-
cv2.polylines(frame, [right_pts], True, _GREEN, 2, cv2.LINE_AA)
|
| 91 |
-
# EAR key points
|
| 92 |
-
for indices in [_LEFT_EAR_POINTS, _RIGHT_EAR_POINTS]:
|
| 93 |
-
for idx in indices:
|
| 94 |
-
cv2.circle(frame, _lm_px(lm, idx, w, h), 3, (0, 255, 255), -1, cv2.LINE_AA)
|
| 95 |
-
# Irises + gaze lines
|
| 96 |
-
for iris_idx, eye_inner, eye_outer in [
|
| 97 |
-
(FaceMeshDetector.LEFT_IRIS_INDICES, 133, 33),
|
| 98 |
-
(FaceMeshDetector.RIGHT_IRIS_INDICES, 362, 263),
|
| 99 |
-
]:
|
| 100 |
-
iris_pts = np.array([_lm_px(lm, i, w, h) for i in iris_idx], dtype=np.int32)
|
| 101 |
-
center = iris_pts[0]
|
| 102 |
-
if len(iris_pts) >= 5:
|
| 103 |
-
radii = [np.linalg.norm(iris_pts[j] - center) for j in range(1, 5)]
|
| 104 |
-
radius = max(int(np.mean(radii)), 2)
|
| 105 |
-
cv2.circle(frame, tuple(center), radius, _MAGENTA, 2, cv2.LINE_AA)
|
| 106 |
-
cv2.circle(frame, tuple(center), 2, _WHITE, -1, cv2.LINE_AA)
|
| 107 |
-
eye_cx = int((lm[eye_inner, 0] + lm[eye_outer, 0]) / 2.0 * w)
|
| 108 |
-
eye_cy = int((lm[eye_inner, 1] + lm[eye_outer, 1]) / 2.0 * h)
|
| 109 |
-
dx, dy = center[0] - eye_cx, center[1] - eye_cy
|
| 110 |
-
cv2.line(frame, tuple(center), (int(center[0] + dx * 3), int(center[1] + dy * 3)), _RED, 1, cv2.LINE_AA)
|
| 111 |
-
|
| 112 |
-
|
| 113 |
-
def _draw_hud(frame, result, model_name):
|
| 114 |
-
"""Draw status bar and detail overlay matching live_demo.py."""
|
| 115 |
-
h, w = frame.shape[:2]
|
| 116 |
-
is_focused = result["is_focused"]
|
| 117 |
-
status = "FOCUSED" if is_focused else "NOT FOCUSED"
|
| 118 |
-
color = _GREEN if is_focused else _RED
|
| 119 |
-
|
| 120 |
-
# Top bar
|
| 121 |
-
cv2.rectangle(frame, (0, 0), (w, 55), (0, 0, 0), -1)
|
| 122 |
-
cv2.putText(frame, status, (10, 28), _FONT, 0.8, color, 2, cv2.LINE_AA)
|
| 123 |
-
cv2.putText(frame, model_name.upper(), (w - 150, 28), _FONT, 0.45, _WHITE, 1, cv2.LINE_AA)
|
| 124 |
-
|
| 125 |
-
# Detail line
|
| 126 |
-
conf = result.get("mlp_prob", result.get("raw_score", 0.0))
|
| 127 |
-
mar_s = f" MAR:{result['mar']:.2f}" if result.get("mar") is not None else ""
|
| 128 |
-
sf = result.get("s_face", 0)
|
| 129 |
-
se = result.get("s_eye", 0)
|
| 130 |
-
detail = f"conf:{conf:.2f} S_face:{sf:.2f} S_eye:{se:.2f}{mar_s}"
|
| 131 |
-
cv2.putText(frame, detail, (10, 48), _FONT, 0.4, _WHITE, 1, cv2.LINE_AA)
|
| 132 |
-
|
| 133 |
-
# Head pose (top right)
|
| 134 |
-
if result.get("yaw") is not None:
|
| 135 |
-
cv2.putText(frame, f"yaw:{result['yaw']:+.0f} pitch:{result['pitch']:+.0f} roll:{result['roll']:+.0f}",
|
| 136 |
-
(w - 280, 48), _FONT, 0.4, (180, 180, 180), 1, cv2.LINE_AA)
|
| 137 |
-
|
| 138 |
-
# Yawn indicator
|
| 139 |
-
if result.get("is_yawning"):
|
| 140 |
-
cv2.putText(frame, "YAWN", (10, 75), _FONT, 0.7, _ORANGE, 2, cv2.LINE_AA)
|
| 141 |
-
|
| 142 |
-
# Landmark indices used for face mesh drawing on client (union of all groups).
|
| 143 |
-
# Sending only these instead of all 478 saves ~60% of the landmarks payload.
|
| 144 |
-
_MESH_INDICES = sorted(
|
| 145 |
-
set(
|
| 146 |
-
[
|
| 147 |
-
10, 338, 297, 332, 284, 251, 389, 356, 454,
|
| 148 |
-
323, 361, 288, 397, 365, 379, 378, 400, 377,
|
| 149 |
-
152, 148, 176, 149, 150, 136, 172, 58, 132,
|
| 150 |
-
93, 234, 127, 162, 21, 54, 103, 67, 109,
|
| 151 |
-
] # face oval
|
| 152 |
-
+ [33, 7, 163, 144, 145, 153, 154, 155, 133, 173, 157, 158, 159, 160, 161, 246] # left eye
|
| 153 |
-
+ [362, 382, 381, 380, 374, 373, 390, 249, 263, 466, 388, 387, 386, 385, 384, 398] # right eye
|
| 154 |
-
+ [468, 469, 470, 471, 472, 473, 474, 475, 476, 477] # irises
|
| 155 |
-
+ [70, 63, 105, 66, 107, 55, 65, 52, 53, 46] # left eyebrow
|
| 156 |
-
+ [300, 293, 334, 296, 336, 285, 295, 282, 283, 276] # right eyebrow
|
| 157 |
-
+ [6, 197, 195, 5, 4, 1, 19, 94, 2] # nose bridge
|
| 158 |
-
+ [61, 146, 91, 181, 84, 17, 314, 405, 321, 375, 291, 409, 270, 269, 267, 0, 37, 39, 40, 185] # lips outer
|
| 159 |
-
+ [78, 95, 88, 178, 87, 14, 317, 402, 318, 324, 308, 415, 310, 311, 312, 13, 82, 81, 80, 191] # lips inner
|
| 160 |
-
+ [33, 160, 158, 133, 153, 145] # left EAR key points
|
| 161 |
-
+ [362, 385, 387, 263, 373, 380] # right EAR key points
|
| 162 |
-
)
|
| 163 |
-
)
|
| 164 |
-
# Build a lookup: original_index -> position in sparse array, so client can reconstruct.
|
| 165 |
-
_MESH_INDEX_SET = set(_MESH_INDICES)
|
| 166 |
|
| 167 |
@asynccontextmanager
|
| 168 |
async def lifespan(app):
|
| 169 |
global _cached_model_name
|
| 170 |
print("Starting Focus Guard API")
|
| 171 |
-
await init_database()
|
| 172 |
async with aiosqlite.connect(db_path) as db:
|
| 173 |
cursor = await db.execute("SELECT model_name FROM user_settings WHERE id = 1")
|
| 174 |
row = await cursor.fetchone()
|
|
@@ -226,9 +120,8 @@ app.add_middleware(
|
|
| 226 |
)
|
| 227 |
|
| 228 |
# Global variables
|
| 229 |
-
db_path = "focus_guard.db"
|
| 230 |
pcs = set()
|
| 231 |
-
_cached_model_name = "mlp"
|
| 232 |
_l2cs_boost_enabled = False
|
| 233 |
|
| 234 |
async def _wait_for_ice_gathering(pc: RTCPeerConnection):
|
|
@@ -243,54 +136,6 @@ async def _wait_for_ice_gathering(pc: RTCPeerConnection):
|
|
| 243 |
|
| 244 |
await done.wait()
|
| 245 |
|
| 246 |
-
# ================ DATABASE MODELS ================
|
| 247 |
-
|
| 248 |
-
async def init_database():
|
| 249 |
-
"""Initialize SQLite database with required tables"""
|
| 250 |
-
async with aiosqlite.connect(db_path) as db:
|
| 251 |
-
# FocusSessions table
|
| 252 |
-
await db.execute("""
|
| 253 |
-
CREATE TABLE IF NOT EXISTS focus_sessions (
|
| 254 |
-
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
| 255 |
-
start_time TIMESTAMP NOT NULL,
|
| 256 |
-
end_time TIMESTAMP,
|
| 257 |
-
duration_seconds INTEGER DEFAULT 0,
|
| 258 |
-
focus_score REAL DEFAULT 0.0,
|
| 259 |
-
total_frames INTEGER DEFAULT 0,
|
| 260 |
-
focused_frames INTEGER DEFAULT 0,
|
| 261 |
-
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
| 262 |
-
)
|
| 263 |
-
""")
|
| 264 |
-
|
| 265 |
-
# FocusEvents table
|
| 266 |
-
await db.execute("""
|
| 267 |
-
CREATE TABLE IF NOT EXISTS focus_events (
|
| 268 |
-
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
| 269 |
-
session_id INTEGER NOT NULL,
|
| 270 |
-
timestamp TIMESTAMP NOT NULL,
|
| 271 |
-
is_focused BOOLEAN NOT NULL,
|
| 272 |
-
confidence REAL NOT NULL,
|
| 273 |
-
detection_data TEXT,
|
| 274 |
-
FOREIGN KEY (session_id) REFERENCES focus_sessions (id)
|
| 275 |
-
)
|
| 276 |
-
""")
|
| 277 |
-
|
| 278 |
-
# UserSettings table
|
| 279 |
-
await db.execute("""
|
| 280 |
-
CREATE TABLE IF NOT EXISTS user_settings (
|
| 281 |
-
id INTEGER PRIMARY KEY CHECK (id = 1),
|
| 282 |
-
model_name TEXT DEFAULT 'mlp'
|
| 283 |
-
)
|
| 284 |
-
""")
|
| 285 |
-
|
| 286 |
-
# Insert default settings if not exists
|
| 287 |
-
await db.execute("""
|
| 288 |
-
INSERT OR IGNORE INTO user_settings (id, model_name)
|
| 289 |
-
VALUES (1, 'mlp')
|
| 290 |
-
""")
|
| 291 |
-
|
| 292 |
-
await db.commit()
|
| 293 |
-
|
| 294 |
# ================ PYDANTIC MODELS ================
|
| 295 |
|
| 296 |
class SessionCreate(BaseModel):
|
|
@@ -319,8 +164,8 @@ class VideoTransformTrack(VideoStreamTrack):
|
|
| 319 |
if img is None:
|
| 320 |
return frame
|
| 321 |
|
| 322 |
-
|
| 323 |
-
img = cv2.resize(img, (
|
| 324 |
|
| 325 |
now = datetime.now().timestamp()
|
| 326 |
do_infer = (now - self.last_inference_time) >= self.min_inference_interval
|
|
@@ -357,8 +202,8 @@ class VideoTransformTrack(VideoStreamTrack):
|
|
| 357 |
h_f, w_f = img.shape[:2]
|
| 358 |
lm = out.get("landmarks")
|
| 359 |
if lm is not None:
|
| 360 |
-
|
| 361 |
-
|
| 362 |
else:
|
| 363 |
is_focused = False
|
| 364 |
confidence = 0.0
|
|
@@ -391,135 +236,6 @@ class VideoTransformTrack(VideoStreamTrack):
|
|
| 391 |
new_frame.time_base = frame.time_base
|
| 392 |
return new_frame
|
| 393 |
|
| 394 |
-
# ================ DATABASE OPERATIONS ================
|
| 395 |
-
|
| 396 |
-
async def create_session():
|
| 397 |
-
async with aiosqlite.connect(db_path) as db:
|
| 398 |
-
cursor = await db.execute(
|
| 399 |
-
"INSERT INTO focus_sessions (start_time) VALUES (?)",
|
| 400 |
-
(datetime.now().isoformat(),)
|
| 401 |
-
)
|
| 402 |
-
await db.commit()
|
| 403 |
-
return cursor.lastrowid
|
| 404 |
-
|
| 405 |
-
async def end_session(session_id: int):
|
| 406 |
-
async with aiosqlite.connect(db_path) as db:
|
| 407 |
-
cursor = await db.execute(
|
| 408 |
-
"SELECT start_time, total_frames, focused_frames FROM focus_sessions WHERE id = ?",
|
| 409 |
-
(session_id,)
|
| 410 |
-
)
|
| 411 |
-
row = await cursor.fetchone()
|
| 412 |
-
|
| 413 |
-
if not row:
|
| 414 |
-
return None
|
| 415 |
-
|
| 416 |
-
start_time_str, total_frames, focused_frames = row
|
| 417 |
-
start_time = datetime.fromisoformat(start_time_str)
|
| 418 |
-
end_time = datetime.now()
|
| 419 |
-
duration = (end_time - start_time).total_seconds()
|
| 420 |
-
focus_score = focused_frames / total_frames if total_frames > 0 else 0.0
|
| 421 |
-
|
| 422 |
-
await db.execute("""
|
| 423 |
-
UPDATE focus_sessions
|
| 424 |
-
SET end_time = ?, duration_seconds = ?, focus_score = ?
|
| 425 |
-
WHERE id = ?
|
| 426 |
-
""", (end_time.isoformat(), int(duration), focus_score, session_id))
|
| 427 |
-
|
| 428 |
-
await db.commit()
|
| 429 |
-
|
| 430 |
-
return {
|
| 431 |
-
'session_id': session_id,
|
| 432 |
-
'start_time': start_time_str,
|
| 433 |
-
'end_time': end_time.isoformat(),
|
| 434 |
-
'duration_seconds': int(duration),
|
| 435 |
-
'focus_score': round(focus_score, 3),
|
| 436 |
-
'total_frames': total_frames,
|
| 437 |
-
'focused_frames': focused_frames
|
| 438 |
-
}
|
| 439 |
-
|
| 440 |
-
async def store_focus_event(session_id: int, is_focused: bool, confidence: float, metadata: dict):
|
| 441 |
-
async with aiosqlite.connect(db_path) as db:
|
| 442 |
-
await db.execute("""
|
| 443 |
-
INSERT INTO focus_events (session_id, timestamp, is_focused, confidence, detection_data)
|
| 444 |
-
VALUES (?, ?, ?, ?, ?)
|
| 445 |
-
""", (session_id, datetime.now().isoformat(), is_focused, confidence, json.dumps(metadata)))
|
| 446 |
-
|
| 447 |
-
await db.execute("""
|
| 448 |
-
UPDATE focus_sessions
|
| 449 |
-
SET total_frames = total_frames + 1,
|
| 450 |
-
focused_frames = focused_frames + ?
|
| 451 |
-
WHERE id = ?
|
| 452 |
-
""", (1 if is_focused else 0, session_id))
|
| 453 |
-
await db.commit()
|
| 454 |
-
|
| 455 |
-
|
| 456 |
-
class _EventBuffer:
|
| 457 |
-
"""Buffer focus events in memory and flush to DB in batches to avoid per-frame DB writes."""
|
| 458 |
-
|
| 459 |
-
def __init__(self, flush_interval: float = 2.0):
|
| 460 |
-
self._buf: list = []
|
| 461 |
-
self._lock = asyncio.Lock()
|
| 462 |
-
self._flush_interval = flush_interval
|
| 463 |
-
self._task: asyncio.Task | None = None
|
| 464 |
-
self._total_frames = 0
|
| 465 |
-
self._focused_frames = 0
|
| 466 |
-
|
| 467 |
-
def start(self):
|
| 468 |
-
if self._task is None:
|
| 469 |
-
self._task = asyncio.create_task(self._flush_loop())
|
| 470 |
-
|
| 471 |
-
async def stop(self):
|
| 472 |
-
if self._task:
|
| 473 |
-
self._task.cancel()
|
| 474 |
-
try:
|
| 475 |
-
await self._task
|
| 476 |
-
except asyncio.CancelledError:
|
| 477 |
-
pass
|
| 478 |
-
self._task = None
|
| 479 |
-
await self._flush()
|
| 480 |
-
|
| 481 |
-
def add(self, session_id: int, is_focused: bool, confidence: float, metadata: dict):
|
| 482 |
-
self._buf.append((session_id, datetime.now().isoformat(), is_focused, confidence, json.dumps(metadata)))
|
| 483 |
-
self._total_frames += 1
|
| 484 |
-
if is_focused:
|
| 485 |
-
self._focused_frames += 1
|
| 486 |
-
|
| 487 |
-
async def _flush_loop(self):
|
| 488 |
-
while True:
|
| 489 |
-
await asyncio.sleep(self._flush_interval)
|
| 490 |
-
await self._flush()
|
| 491 |
-
|
| 492 |
-
async def _flush(self):
|
| 493 |
-
async with self._lock:
|
| 494 |
-
if not self._buf:
|
| 495 |
-
return
|
| 496 |
-
batch = self._buf[:]
|
| 497 |
-
total = self._total_frames
|
| 498 |
-
focused = self._focused_frames
|
| 499 |
-
self._buf.clear()
|
| 500 |
-
self._total_frames = 0
|
| 501 |
-
self._focused_frames = 0
|
| 502 |
-
|
| 503 |
-
if not batch:
|
| 504 |
-
return
|
| 505 |
-
|
| 506 |
-
session_id = batch[0][0]
|
| 507 |
-
try:
|
| 508 |
-
async with aiosqlite.connect(db_path) as db:
|
| 509 |
-
await db.executemany("""
|
| 510 |
-
INSERT INTO focus_events (session_id, timestamp, is_focused, confidence, detection_data)
|
| 511 |
-
VALUES (?, ?, ?, ?, ?)
|
| 512 |
-
""", batch)
|
| 513 |
-
await db.execute("""
|
| 514 |
-
UPDATE focus_sessions
|
| 515 |
-
SET total_frames = total_frames + ?,
|
| 516 |
-
focused_frames = focused_frames + ?
|
| 517 |
-
WHERE id = ?
|
| 518 |
-
""", (total, focused, session_id))
|
| 519 |
-
await db.commit()
|
| 520 |
-
except Exception as e:
|
| 521 |
-
print(f"[DB] Flush error: {e}")
|
| 522 |
-
|
| 523 |
# ================ STARTUP/SHUTDOWN ================
|
| 524 |
|
| 525 |
pipelines = {
|
|
@@ -532,7 +248,7 @@ pipelines = {
|
|
| 532 |
|
| 533 |
# Thread pool for CPU-bound inference so the event loop stays responsive.
|
| 534 |
_inference_executor = concurrent.futures.ThreadPoolExecutor(
|
| 535 |
-
max_workers=
|
| 536 |
thread_name_prefix="inference",
|
| 537 |
)
|
| 538 |
# One lock per pipeline so shared state (TemporalTracker, etc.) is not corrupted when
|
|
@@ -607,7 +323,7 @@ def _process_frame_with_l2cs_boost(base_pipeline, frame, base_model_name):
|
|
| 607 |
is_focused = False
|
| 608 |
else:
|
| 609 |
fused_score = _BOOST_BASE_W * base_score + _BOOST_L2CS_W * l2cs_score
|
| 610 |
-
is_focused = fused_score >=
|
| 611 |
|
| 612 |
base_out["raw_score"] = fused_score
|
| 613 |
base_out["is_focused"] = is_focused
|
|
@@ -680,7 +396,7 @@ async def websocket_endpoint(websocket: WebSocket):
|
|
| 680 |
session_id = None
|
| 681 |
frame_count = 0
|
| 682 |
running = True
|
| 683 |
-
event_buffer =
|
| 684 |
|
| 685 |
# Calibration state (per-connection)
|
| 686 |
# verifying: after fit, show a verification target and check gaze accuracy
|
|
@@ -855,7 +571,7 @@ async def websocket_endpoint(websocket: WebSocket):
|
|
| 855 |
frame = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
|
| 856 |
if frame is None:
|
| 857 |
continue
|
| 858 |
-
frame = cv2.resize(frame, (
|
| 859 |
|
| 860 |
# During calibration collection, always use L2CS
|
| 861 |
collecting = _cal.get("collecting", False)
|
|
@@ -937,7 +653,7 @@ async def websocket_endpoint(websocket: WebSocket):
|
|
| 937 |
elif use_boost and not fuse["on_screen"]:
|
| 938 |
# Boost mode: if gaze is clearly off-screen, override to unfocused
|
| 939 |
is_focused = False
|
| 940 |
-
confidence = min(confidence,
|
| 941 |
|
| 942 |
if session_id:
|
| 943 |
metadata = {
|
|
@@ -980,7 +696,7 @@ async def websocket_endpoint(websocket: WebSocket):
|
|
| 980 |
resp["confidence"] = round(fuse["focus_score"], 3)
|
| 981 |
elif use_boost and not fuse["on_screen"]:
|
| 982 |
resp["focused"] = False
|
| 983 |
-
resp["confidence"] = min(resp["confidence"],
|
| 984 |
if has_gaze:
|
| 985 |
resp["gaze_yaw"] = round(out["gaze_yaw"], 4)
|
| 986 |
resp["gaze_pitch"] = round(out["gaze_pitch"], 4)
|
|
@@ -1133,7 +849,7 @@ async def update_settings(settings: SettingsUpdate):
|
|
| 1133 |
cursor = await db.execute("SELECT id FROM user_settings WHERE id = 1")
|
| 1134 |
exists = await cursor.fetchone()
|
| 1135 |
if not exists:
|
| 1136 |
-
await db.execute("INSERT INTO user_settings (id,
|
| 1137 |
await db.commit()
|
| 1138 |
|
| 1139 |
updates = []
|
|
@@ -1278,7 +994,7 @@ async def l2cs_status():
|
|
| 1278 |
@app.get("/api/mesh-topology")
|
| 1279 |
async def get_mesh_topology():
|
| 1280 |
"""Return tessellation edge pairs for client-side face mesh drawing (cached by client)."""
|
| 1281 |
-
return {"tessellation":
|
| 1282 |
|
| 1283 |
@app.get("/health")
|
| 1284 |
async def health_check():
|
|
| 1 | from __future__ import annotations
| 2 |
| 3 | + import asyncio
| 4 | import base64
| 5 | + import concurrent.futures
| 6 | import json
| 7 | + import logging
| 8 | import os
| 9 | import threading
| 10 | + from contextlib import asynccontextmanager
| 11 | + from datetime import datetime, timedelta
| 12 | + from pathlib import Path
| 13 | + from typing import Any, Callable, List, Optional
| 14 |
| 15 | + import aiosqlite
| 16 | + import cv2
| 17 | + import numpy as np
| 18 | from aiortc import RTCPeerConnection, RTCSessionDescription, VideoStreamTrack
| 19 | from av import VideoFrame
| 20 | + from fastapi import FastAPI, HTTPException, Request, WebSocket, WebSocketDisconnect
| 21 | + from fastapi.middleware.cors import CORSMiddleware
| 22 | + from fastapi.responses import FileResponse
| 23 | + from fastapi.staticfiles import StaticFiles
| 24 | + from pydantic import BaseModel
| 25 |
| 26 | + from api.drawing import draw_face_mesh, draw_hud, get_tesselation_connections
| 27 | + from api.db import (
| 28 | + EventBuffer,
| 29 | + create_session,
| 30 | + end_session,
| 31 | + get_db_path,
| 32 | + init_database,
| 33 | + store_focus_event,
| 34 | )
| 35 | + from config import get
| 36 | from models.face_mesh import FaceMeshDetector
| 37 | + from ui.pipeline import (
| 38 | + FaceMeshPipeline,
| 39 | + HybridFocusPipeline,
| 40 | + L2CSPipeline,
| 41 | + MLPPipeline,
| 42 | + XGBoostPipeline,
| 43 | + is_l2cs_weights_available,
| 44 | + )
| 45 | +
| 46 | + logger = logging.getLogger(__name__)
| 47 |
| 48 | + db_path = get_db_path()
| 49 | + _inference_size = get("app.inference_size") or [640, 480]
| 50 | + _inference_workers = get("app.inference_workers") or 4
| 51 | + _fused_threshold = get("l2cs_boost.fused_threshold") or 0.52
| 52 | + _no_face_cap = get("app.no_face_confidence_cap") or 0.1
| 53 | + _BOOST_BASE_W = get("l2cs_boost.base_weight") or 0.35
| 54 | + _BOOST_L2CS_W = get("l2cs_boost.l2cs_weight") or 0.65
| 55 | + _BOOST_VETO = get("l2cs_boost.veto_threshold") or 0.38
| 56 |
| 57 | _FONT = cv2.FONT_HERSHEY_SIMPLEX
| 58 | _RED = (0, 0, 255)
| 59 | +
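The `get("app.inference_size") or [640, 480]` reads above assume a dotted-key lookup into the YAML config with a hard-coded fallback. A minimal sketch of that pattern; `get_with_default` is a hypothetical helper, not the project's `config.get`:

```python
def get_with_default(cfg: dict, dotted_key: str, default):
    """Walk a nested dict by dotted key; return default when any segment is missing."""
    node = cfg
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node

cfg = {"app": {"inference_size": [640, 480]}, "l2cs_boost": {"fused_threshold": 0.52}}
assert get_with_default(cfg, "l2cs_boost.fused_threshold", 0.5) == 0.52
assert get_with_default(cfg, "app.inference_workers", 4) == 4  # missing key -> fallback
```

Note that the `get(...) or default` idiom in main.py additionally falls back when the configured value is falsy (e.g. `0`), which the defaults here never are.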
| 60 |
| 61 | @asynccontextmanager
| 62 | async def lifespan(app):
| 63 | global _cached_model_name
| 64 | print("Starting Focus Guard API")
| 65 | + await init_database(db_path)
| 66 | async with aiosqlite.connect(db_path) as db:
| 67 | cursor = await db.execute("SELECT model_name FROM user_settings WHERE id = 1")
| 68 | row = await cursor.fetchone()

| 120 | )
| 121 |
| 122 | # Global variables
| 123 | pcs = set()
| 124 | + _cached_model_name = get("app.default_model") or "mlp"
| 125 | _l2cs_boost_enabled = False
| 126 |
| 127 | async def _wait_for_ice_gathering(pc: RTCPeerConnection):

| 136 |
| 137 | await done.wait()
| 138 |
| 139 | # ================ PYDANTIC MODELS ================
| 140 |
| 141 | class SessionCreate(BaseModel):

| 164 | if img is None:
| 165 | return frame
| 166 |
| 167 | + w_sz, h_sz = _inference_size[0], _inference_size[1]
| 168 | + img = cv2.resize(img, (w_sz, h_sz))
| 169 |
| 170 | now = datetime.now().timestamp()
| 171 | do_infer = (now - self.last_inference_time) >= self.min_inference_interval

| 202 | h_f, w_f = img.shape[:2]
| 203 | lm = out.get("landmarks")
| 204 | if lm is not None:
| 205 | + draw_face_mesh(img, lm, w_f, h_f)
| 206 | + draw_hud(img, out, model_name)
| 207 | else:
| 208 | is_focused = False
| 209 | confidence = 0.0

| 236 | new_frame.time_base = frame.time_base
| 237 | return new_frame
| 238 |
| 239 | # ================ STARTUP/SHUTDOWN ================
| 240 |
| 241 | pipelines = {

| 248 |
| 249 | # Thread pool for CPU-bound inference so the event loop stays responsive.
| 250 | _inference_executor = concurrent.futures.ThreadPoolExecutor(
| 251 | + max_workers=_inference_workers,
| 252 | thread_name_prefix="inference",
| 253 | )
| 254 | # One lock per pipeline so shared state (TemporalTracker, etc.) is not corrupted when

| 323 | is_focused = False
| 324 | else:
| 325 | fused_score = _BOOST_BASE_W * base_score + _BOOST_L2CS_W * l2cs_score
| 326 | + is_focused = fused_score >= _fused_threshold
| 327 |
| 328 | base_out["raw_score"] = fused_score
| 329 | base_out["is_focused"] = is_focused
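The fusion hunk around lines 323–329 combines the base classifier score with the L2CS gaze score using the `l2cs_boost` weights from the config. A toy sketch, assuming `_BOOST_VETO` forces an unfocused verdict when the gaze score falls below it (the veto branch itself is not fully visible in this diff):

```python
BASE_W, L2CS_W = 0.35, 0.65   # l2cs_boost.base_weight / l2cs_boost.l2cs_weight
FUSED_THRESHOLD = 0.52        # l2cs_boost.fused_threshold
VETO = 0.38                   # l2cs_boost.veto_threshold (assumed semantics)

def fuse(base_score: float, l2cs_score: float) -> tuple[float, bool]:
    """Weighted average of the two scores; gaze dominates, and a very low gaze score vetoes focus."""
    fused = BASE_W * base_score + L2CS_W * l2cs_score
    if l2cs_score < VETO:
        return fused, False   # gaze clearly off-screen: force unfocused
    return fused, fused >= FUSED_THRESHOLD

fused, focused = fuse(0.8, 0.6)   # 0.35*0.8 + 0.65*0.6 = 0.67 >= 0.52
assert focused
assert fuse(0.9, 0.3)[1] is False  # vetoed despite a strong base score
```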
|
|
|
| 396 | session_id = None
| 397 | frame_count = 0
| 398 | running = True
| 399 | + event_buffer = EventBuffer(db_path=db_path, flush_interval=2.0)
| 400 |
| 401 | # Calibration state (per-connection)
| 402 | # verifying: after fit, show a verification target and check gaze accuracy

| 571 | frame = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
| 572 | if frame is None:
| 573 | continue
| 574 | + frame = cv2.resize(frame, (_inference_size[0], _inference_size[1]))
| 575 |
| 576 | # During calibration collection, always use L2CS
| 577 | collecting = _cal.get("collecting", False)

| 653 | elif use_boost and not fuse["on_screen"]:
| 654 | # Boost mode: if gaze is clearly off-screen, override to unfocused
| 655 | is_focused = False
| 656 | + confidence = min(confidence, _no_face_cap)
| 657 |
| 658 | if session_id:
| 659 | metadata = {

| 696 | resp["confidence"] = round(fuse["focus_score"], 3)
| 697 | elif use_boost and not fuse["on_screen"]:
| 698 | resp["focused"] = False
| 699 | + resp["confidence"] = min(resp["confidence"], _no_face_cap)
| 700 | if has_gaze:
| 701 | resp["gaze_yaw"] = round(out["gaze_yaw"], 4)
| 702 | resp["gaze_pitch"] = round(out["gaze_pitch"], 4)

| 849 | cursor = await db.execute("SELECT id FROM user_settings WHERE id = 1")
| 850 | exists = await cursor.fetchone()
| 851 | if not exists:
| 852 | + await db.execute("INSERT INTO user_settings (id, model_name) VALUES (1, 'mlp')")
| 853 | await db.commit()
| 854 |
| 855 | updates = []

| 994 | @app.get("/api/mesh-topology")
| 995 | async def get_mesh_topology():
| 996 | """Return tessellation edge pairs for client-side face mesh drawing (cached by client)."""
| 997 | + return {"tessellation": get_tesselation_connections()}
| 998 |
| 999 | @app.get("/health")
| 1000 | async def health_check():
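The websocket handler above now constructs `EventBuffer(db_path=db_path, flush_interval=2.0)` instead of writing each focus event individually. A toy, synchronous stand-in for that batching idea (the real `api.db.EventBuffer` is async and writes to SQLite):

```python
class ToyEventBuffer:
    """Events accumulate in memory and are flushed as one batch at most
    every flush_interval seconds (time is passed in explicitly here)."""

    def __init__(self, flush_interval: float = 2.0):
        self.flush_interval = flush_interval
        self.pending = []
        self.flushed = []      # stands in for rows written to the database
        self.last_flush = 0.0

    def add(self, event, now: float):
        self.pending.append(event)
        if now - self.last_flush >= self.flush_interval:
            self.flushed.extend(self.pending)  # one batched insert instead of many
            self.pending.clear()
            self.last_flush = now

buf = ToyEventBuffer(flush_interval=2.0)
buf.add("focus", 0.5)
buf.add("unfocus", 1.0)
assert buf.flushed == []                       # still buffered
buf.add("focus", 2.5)                          # interval elapsed -> flush all three
assert buf.flushed == ["focus", "unfocus", "focus"]
```

Batching keeps the per-frame path cheap: the websocket loop only appends to a list, and the expensive database write happens at most once per interval.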
models/L2CS-Net/l2cs/datasets.py
CHANGED
|
@@ -59,11 +59,6 @@ class Gaze360(Dataset):
| 59 |
| 60 | img = Image.open(os.path.join(self.root, face))
| 61 |
| 62 | - # fimg = cv2.imread(os.path.join(self.root, face))
| 63 | - # fimg = cv2.resize(fimg, (448, 448))/255.0
| 64 | - # fimg = fimg.transpose(2, 0, 1)
| 65 | - # img=torch.from_numpy(fimg).type(torch.FloatTensor)
| 66 | -
| 67 | if self.transform:
| 68 | img = self.transform(img)
| 69 |

@@ -135,11 +130,6 @@ class Mpiigaze(Dataset):
| 135 |
| 136 | img = Image.open(os.path.join(self.root, face))
| 137 |
| 138 | - # fimg = cv2.imread(os.path.join(self.root, face))
| 139 | - # fimg = cv2.resize(fimg, (448, 448))/255.0
| 140 | - # fimg = fimg.transpose(2, 0, 1)
| 141 | - # img=torch.from_numpy(fimg).type(torch.FloatTensor)
| 142 | -
| 143 | if self.transform:
| 144 | img = self.transform(img)
| 145 |

| 59 |
| 60 | img = Image.open(os.path.join(self.root, face))
| 61 |
| 62 | if self.transform:
| 63 | img = self.transform(img)
| 64 |

| 130 |
| 131 | img = Image.open(os.path.join(self.root, face))
| 132 |
| 133 | if self.transform:
| 134 | img = self.transform(img)
| 135 |
models/mlp/eval_accuracy.py
CHANGED
|
@@ -25,8 +25,6 @@ def main():
| 25 | train_loader, val_loader, test_loader, num_features, num_classes, _ = get_dataloaders(
| 26 | model_name="face_orientation",
| 27 | batch_size=32,
| 28 | - split_ratios=(0.7, 0.15, 0.15),
| 29 | - seed=42,
| 30 | )
| 31 |
| 32 | model = BaseModel(num_features, num_classes).to(device)

| 25 | train_loader, val_loader, test_loader, num_features, num_classes, _ = get_dataloaders(
| 26 | model_name="face_orientation",
| 27 | batch_size=32,
| 28 | )
| 29 |
| 30 | model = BaseModel(num_features, num_classes).to(device)
models/mlp/sweep.py
CHANGED
|
@@ -14,10 +14,10 @@ REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
| 14 | if REPO_ROOT not in sys.path:
| 15 | sys.path.insert(0, REPO_ROOT)
| 16 |
| 17 | - from data_preparation.prepare_dataset import get_dataloaders
| 18 | from models.mlp.train import BaseModel, set_seed
| 19 |
| 20 | - SEED =
| 21 | N_TRIALS = 20
| 22 | EPOCHS_PER_TRIAL = 15
| 23 |

@@ -31,7 +31,7 @@ def objective(trial):
| 31 | train_loader, val_loader, _, num_features, num_classes, _ = get_dataloaders(
| 32 | model_name="face_orientation",
| 33 | batch_size=batch_size,
| 34 | - split_ratios=
| 35 | seed=SEED,
| 36 | )
| 37 |

| 14 | if REPO_ROOT not in sys.path:
| 15 | sys.path.insert(0, REPO_ROOT)
| 16 |
| 17 | + from data_preparation.prepare_dataset import get_default_split_config, get_dataloaders
| 18 | from models.mlp.train import BaseModel, set_seed
| 19 |
| 20 | + SPLIT_RATIOS, SEED = get_default_split_config()
| 21 | N_TRIALS = 20
| 22 | EPOCHS_PER_TRIAL = 15
| 23 |

| 31 | train_loader, val_loader, _, num_features, num_classes, _ = get_dataloaders(
| 32 | model_name="face_orientation",
| 33 | batch_size=batch_size,
| 34 | + split_ratios=SPLIT_RATIOS,
| 35 | seed=SEED,
| 36 | )
| 37 |
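The sweep now pulls `SPLIT_RATIOS, SEED` from `get_default_split_config()` instead of hard-coding them in each script. A hypothetical stand-in showing the intended contract (the real function lives in `data_preparation/prepare_dataset.py`; the values match the `(0.7, 0.15, 0.15)` / `seed=42` arguments removed elsewhere in this commit):

```python
def get_default_split_config():
    """Single source of truth for train/val/test ratios and the RNG seed."""
    return (0.7, 0.15, 0.15), 42

SPLIT_RATIOS, SEED = get_default_split_config()
assert abs(sum(SPLIT_RATIOS) - 1.0) < 1e-9  # ratios must cover the whole dataset
assert SEED == 42
```

Centralizing the split config means every trainer, sweep, and evaluation script sees identical splits, so reported test metrics stay comparable across models.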
models/mlp/train.py
CHANGED
|
@@ -1,46 +1,95 @@
| 1 | import json
| 2 | import os
| 3 | import random
| 4 |
| 5 | import numpy as np
| 6 | import joblib
| 7 | import torch
| 8 | import torch.nn as nn
| 9 | import torch.optim as optim
| 10 | - from sklearn.metrics import
| 11 |
| 12 | from data_preparation.prepare_dataset import get_dataloaders, SELECTED_FEATURES
| 13 |
| 14 | - USE_CLEARML = False
| 15 | -
| 16 | _PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
| 17 | -
| 18 | -
| 19 | -
| 20 | -
| 21 | -
| 22 | -
| 23 | - "
| 24 | -
| 25 | -
| 26 | - }
| 27 | -
| 28 | -
| 29 | -
| 30 | task = None
| 31 | if USE_CLEARML:
| 32 | -
| 33 | -
| 34 | -
| 35 | -
| 36 | -
| 37 | -
| 38 | -
| 39 |
| 40 |
| 41 |
| 42 | # ==== Model =============================================
| 43 | - def set_seed(seed: int):
| 44 | random.seed(seed)
| 45 | np.random.seed(seed)
| 46 | torch.manual_seed(seed)

@@ -49,15 +98,18 @@ def set_seed(seed: int):
| 49 |
| 50 |
| 51 | class BaseModel(nn.Module):
| 52 | -
| 53 | super().__init__()
| 54 | -
| 55 | -
| 56 | -
| 57 | -
| 58 | - nn.
| 59 | -
| 60 | - )
| 61 |
| 62 | def forward(self, x):
| 63 | return self.network(x)

@@ -89,6 +141,8 @@ class BaseModel(nn.Module):
| 89 | total_loss = 0.0
| 90 | correct = 0
| 91 | total = 0
| 92 |
| 93 | for features, labels in loader:
| 94 | features, labels = features.to(device), labels.to(device)

@@ -96,10 +150,14 @@ class BaseModel(nn.Module):
| 96 | loss = criterion(outputs, labels)
| 97 |
| 98 | total_loss += loss.item() * features.size(0)
| 99 | -
| 100 | total += features.size(0)
| 101 |
| 102 | -
| 103 |
| 104 | @torch.no_grad()
| 105 | def test_step(self, loader, criterion, device):

@@ -130,7 +188,8 @@ class BaseModel(nn.Module):
| 130 | return total_loss / total, correct / total, np.array(all_probs), np.array(all_preds), np.array(all_labels)
| 131 |
| 132 |
| 133 | - def main():
| 134 | set_seed(CFG["seed"])
| 135 |
| 136 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

@@ -144,7 +203,7 @@ def main():
| 144 | seed=CFG["seed"],
| 145 | )
| 146 |
| 147 | - model = BaseModel(num_features, num_classes).to(device)
| 148 | criterion = nn.CrossEntropyLoss()
| 149 | optimizer = optim.Adam(model.parameters(), lr=CFG["lr"])
| 150 |

@@ -163,22 +222,25 @@ def main():
| 163 | "train_acc": [],
| 164 | "val_loss": [],
| 165 | "val_acc": [],
| 166 | }
| 167 |
| 168 | best_val_acc = 0.0
| 169 |
| 170 | - print(f"\n{'Epoch':>6} | {'Train Loss':>10} | {'Train Acc':>9} | {'Val Loss':>10} | {'Val Acc':>9}")
| 171 | - print("-" *
| 172 |
| 173 | for epoch in range(1, CFG["epochs"] + 1):
| 174 | train_loss, train_acc = model.training_step(train_loader, optimizer, criterion, device)
| 175 | - val_loss, val_acc = model.validation_step(val_loader, criterion, device)
| 176 |
| 177 | history["epochs"].append(epoch)
| 178 | history["train_loss"].append(round(train_loss, 4))
| 179 | history["train_acc"].append(round(train_acc, 4))
| 180 | history["val_loss"].append(round(val_loss, 4))
| 181 | history["val_acc"].append(round(val_acc, 4))
| 182 |
| 183 |
| 184 | current_lr = optimizer.param_groups[0]['lr']

@@ -187,30 +249,36 @@ def main():
| 187 | task.logger.report_scalar("Accuracy", "Train", float(train_acc), iteration=epoch)
| 188 | task.logger.report_scalar("Loss", "Val", float(val_loss), iteration=epoch)
| 189 | task.logger.report_scalar("Accuracy", "Val", float(val_acc), iteration=epoch)
| 190 | task.logger.report_scalar("Learning Rate", "LR", float(current_lr), iteration=epoch)
| 191 | task.logger.flush()
| 192 |
| 193 | marker = ""
| 194 | - if
| 195 | best_val_acc = val_acc
| 196 | torch.save(model.state_dict(), best_ckpt_path)
| 197 | marker = " *"
| 198 |
| 199 | - print(
| 200 |
| 201 | - print(f"\nBest validation accuracy: {best_val_acc:.2%}")
| 202 | print(f"Checkpoint saved to: {best_ckpt_path}")
| 203 |
| 204 | model.load_state_dict(torch.load(best_ckpt_path, weights_only=True))
| 205 | test_loss, test_acc, test_probs, test_preds, test_labels = model.test_step(test_loader, criterion, device)
| 206 | -
| 207 | -
| 208 | -
| 209 | if num_classes > 2:
| 210 | - test_auc = roc_auc_score(
| 211 | else:
| 212 | - test_auc = roc_auc_score(
| 213 | -
| 214 | print(f"\n[TEST] Loss: {test_loss:.4f} | Accuracy: {test_acc:.2%}")
| 215 | print(f"[TEST] F1: {test_f1:.4f} | ROC-AUC: {test_auc:.4f}")

@@ -219,22 +287,72 @@ def main():
| 219 | history["test_f1"] = round(test_f1, 4)
| 220 | history["test_auc"] = round(test_auc, 4)
| 221 |
| 222 | logs_dir = CFG["logs_dir"]
| 223 | os.makedirs(logs_dir, exist_ok=True)
| 224 | log_path = os.path.join(logs_dir, f"{CFG['model_name']}_training_log.json")
| 225 | -
| 226 | with open(log_path, "w") as f:
| 227 | json.dump(history, f, indent=2)
| 228 | -
| 229 | print(f"[LOG] Training history saved to: {log_path}")
| 230 |
| 231 | - # Save scaler and feature names for inference (ui/pipeline.py)
| 232 | scaler_path = os.path.join(ckpt_dir, "scaler_mlp.joblib")
| 233 | joblib.dump(scaler, scaler_path)
| 234 | meta_path = os.path.join(ckpt_dir, "meta_mlp.npz")
| 235 | np.savez(meta_path, feature_names=np.array(SELECTED_FEATURES["face_orientation"]))
| 236 | print(f"[LOG] Scaler and meta saved to {ckpt_dir}")
| 237 |
| 238 |
| 239 | if __name__ == "__main__":
| 240 | main()
|
|
|
| 1 |
import json
|
| 2 |
import os
|
| 3 |
import random
|
| 4 |
+
import sys
|
| 5 |
|
| 6 |
import numpy as np
|
| 7 |
import joblib
|
| 8 |
import torch
|
| 9 |
import torch.nn as nn
|
| 10 |
import torch.optim as optim
|
| 11 |
+
from sklearn.metrics import (
|
| 12 |
+
confusion_matrix,
|
| 13 |
+
f1_score,
|
| 14 |
+
precision_recall_fscore_support,
|
| 15 |
+
roc_auc_score,
|
| 16 |
+
)
|
| 17 |
|
| 18 |
from data_preparation.prepare_dataset import get_dataloaders, SELECTED_FEATURES
|
| 19 |
|
|
|
|
|
|
|
| 20 |
_PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
|
| 21 |
+
|
| 22 |
+
USE_CLEARML = os.environ.get("USE_CLEARML", "0") == "1" or bool(os.environ.get("CLEARML_TASK_ID"))
|
| 23 |
+
CLEARML_QUEUE = os.environ.get("CLEARML_QUEUE", "")
|
| 24 |
+
|
| 25 |
+
|
| 26 |
+
def _load_cfg():
|
| 27 |
+
"""Build training config from config/default.yaml with fallbacks."""
|
| 28 |
+
try:
|
| 29 |
+
from config import get
|
| 30 |
+
mlp = get("mlp") or {}
|
| 31 |
+
data = get("data") or {}
|
| 32 |
+
ratios = data.get("split_ratios", [0.7, 0.15, 0.15])
|
| 33 |
+
return {
|
| 34 |
+
"model_name": mlp.get("model_name", "face_orientation"),
|
| 35 |
+
"epochs": mlp.get("epochs", 30),
|
| 36 |
+
"batch_size": mlp.get("batch_size", 32),
|
| 37 |
+
"lr": mlp.get("lr", 1e-3),
|
| 38 |
+
"seed": mlp.get("seed", 42),
|
| 39 |
+
"split_ratios": tuple(ratios),
|
| 40 |
+
"hidden_sizes": mlp.get("hidden_sizes", [64, 32]),
|
| 41 |
+
"checkpoints_dir": os.path.join(_PROJECT_ROOT, "checkpoints"),
|
| 42 |
+
"logs_dir": os.path.join(_PROJECT_ROOT, "evaluation", "logs"),
|
| 43 |
+
}
|
| 44 |
+
except Exception:
|
| 45 |
+
return {
|
| 46 |
+
"model_name": "face_orientation",
|
| 47 |
+
"epochs": 30,
|
| 48 |
+
"batch_size": 32,
|
| 49 |
+
"lr": 1e-3,
|
| 50 |
+
"seed": 42,
|
| 51 |
+
"split_ratios": (0.7, 0.15, 0.15),
|
| 52 |
+
"hidden_sizes": [64, 32],
|
| 53 |
+
"checkpoints_dir": os.path.join(_PROJECT_ROOT, "checkpoints"),
|
| 54 |
+
"logs_dir": os.path.join(_PROJECT_ROOT, "evaluation", "logs"),
|
| 55 |
+
}
|
| 56 |
+
|
| 57 |
+
|
| 58 |
+
CFG = _load_cfg()
|
| 59 |
+
|
| 60 |
+
# ==== ClearML: expose all config as task params, support remote execution ====
|
| 61 |
task = None
|
| 62 |
if USE_CLEARML:
|
| 63 |
+
try:
|
| 64 |
+
from clearml import Task
|
| 65 |
+
from config import flatten_for_clearml
|
| 66 |
+
task = Task.init(
|
| 67 |
+
project_name="Focus Guard",
|
| 68 |
+
task_name="MLP Model Training",
|
| 69 |
+
tags=["training", "mlp_model"],
|
| 70 |
+
)
|
| 71 |
+
flat = flatten_for_clearml()
|
| 72 |
+
flat["mlp/model_name"] = CFG.get("model_name", "face_orientation")
|
| 73 |
+
flat["mlp/epochs"] = CFG.get("epochs", 30)
|
| 74 |
+
flat["mlp/batch_size"] = CFG.get("batch_size", 32)
|
| 75 |
+
flat["mlp/lr"] = CFG.get("lr", 1e-3)
|
| 76 |
+
flat["mlp/seed"] = CFG.get("seed", 42)
|
| 77 |
+
flat["mlp/hidden_sizes"] = str(CFG.get("hidden_sizes", [64, 32]))
|
| 78 |
+
flat["mlp/split_ratios"] = str(CFG.get("split_ratios", (0.7, 0.15, 0.15)))
|
| 79 |
+
task.connect(flat)
|
| 80 |
+
if CLEARML_QUEUE:
|
| 81 |
+
print(f"[ClearML] Enqueuing to queue '{CLEARML_QUEUE}'. Agent will run training.")
|
| 82 |
+
task.execute_remotely(queue_name=CLEARML_QUEUE)
|
| 83 |
+
sys.exit(0)
|
| 84 |
+
except ImportError:
|
| 85 |
+
task = None
|
| 86 |
+
USE_CLEARML = False
|
| 87 |
|
| 88 |
|
| 89 |
|
| 90 |
# ==== Model =============================================
|
| 91 |
+
def set_seed(seed: int) -> None:
|
| 92 |
+
"""Set random seed for numpy, torch, and Python RNG for reproducibility."""
|
| 93 |
random.seed(seed)
|
| 94 |
np.random.seed(seed)
|
| 95 |
torch.manual_seed(seed)
|
|
|
|
| 98 |
|
| 99 |
|
| 100 |
class BaseModel(nn.Module):
|
| 101 |
+
"""MLP classifier: num_features -> hidden_sizes -> num_classes. Used for face_orientation focus."""
|
| 102 |
+
|
| 103 |
+
def __init__(self, num_features: int, num_classes: int, hidden_sizes: list[int] | None = None):
|
| 104 |
super().__init__()
|
| 105 |
+
sizes = hidden_sizes or CFG.get("hidden_sizes", [64, 32])
|
| 106 |
+
layers = []
|
| 107 |
+
prev = num_features
|
| 108 |
+
for h in sizes:
|
| 109 |
+
layers.extend([nn.Linear(prev, h), nn.ReLU()])
|
| 110 |
+
prev = h
|
| 111 |
+
layers.append(nn.Linear(prev, num_classes))
|
| 112 |
+
self.network = nn.Sequential(*layers)
|
| 113 |
|
| 114 |
def forward(self, x):
|
| 115 |
return self.network(x)
|
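The rewritten `BaseModel.__init__` above builds one `nn.Linear` plus `nn.ReLU` per entry of `hidden_sizes`, then a final projection to `num_classes`. A torch-free sketch of the layer shapes that loop produces:

```python
def mlp_layer_shapes(num_features, hidden_sizes, num_classes):
    """Consecutive (in, out) pairs produced by the hidden-size loop in BaseModel.__init__."""
    dims = [num_features, *hidden_sizes, num_classes]
    return list(zip(dims[:-1], dims[1:]))

# 17 MediaPipe-derived features, two focus classes, default hidden_sizes [64, 32]
assert mlp_layer_shapes(17, [64, 32], 2) == [(17, 64), (64, 32), (32, 2)]
```

Because the architecture is driven by the `hidden_sizes` list from the config, the Optuna sweep can vary depth and width without touching the model class.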
|
|
|
| 141 |
total_loss = 0.0
|
| 142 |
correct = 0
|
| 143 |
total = 0
|
| 144 |
+
all_preds = []
|
| 145 |
+
all_labels = []
|
| 146 |
|
| 147 |
for features, labels in loader:
|
| 148 |
features, labels = features.to(device), labels.to(device)
|
|
|
|
| 150 |
loss = criterion(outputs, labels)
|
| 151 |
|
| 152 |
total_loss += loss.item() * features.size(0)
|
| 153 |
+
preds = outputs.argmax(dim=1)
|
| 154 |
+
correct += (preds == labels).sum().item()
|
| 155 |
total += features.size(0)
|
| 156 |
+
all_preds.extend(preds.cpu().numpy())
|
| 157 |
+
all_labels.extend(labels.cpu().numpy())
|
| 158 |
|
| 159 |
+
val_f1 = f1_score(np.array(all_labels), np.array(all_preds), average="weighted")
|
| 160 |
+
return total_loss / total, correct / total, val_f1
|
| 161 |
|
| 162 |
@torch.no_grad()
|
| 163 |
def test_step(self, loader, criterion, device):
|
|
|
|
| 188 |
return total_loss / total, correct / total, np.array(all_probs), np.array(all_preds), np.array(all_labels)
|
| 189 |
|
| 190 |
|
| 191 |
+
def main() -> None:
|
| 192 |
+
"""Train MLP on face_orientation features, save best checkpoint and scaler to checkpoints/."""
|
| 193 |
set_seed(CFG["seed"])
|
| 194 |
|
| 195 |
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
|
|
|
|
| 203 |
seed=CFG["seed"],
|
| 204 |
)
|
| 205 |
|
| 206 |
+
model = BaseModel(num_features, num_classes, hidden_sizes=CFG.get("hidden_sizes")).to(device)
|
| 207 |
criterion = nn.CrossEntropyLoss()
|
| 208 |
optimizer = optim.Adam(model.parameters(), lr=CFG["lr"])
|
| 209 |
|
|
|
|
| 222 |
"train_acc": [],
|
| 223 |
"val_loss": [],
|
| 224 |
"val_acc": [],
|
| 225 |
+
"val_f1": [],
|
| 226 |
}
|
| 227 |
|
| 228 |
+
best_val_f1 = 0.0
|
| 229 |
best_val_acc = 0.0
|
| 230 |
|
| 231 |
+
print(f"\n{'Epoch':>6} | {'Train Loss':>10} | {'Train Acc':>9} | {'Val Loss':>10} | {'Val Acc':>9} | {'Val F1':>8}")
|
| 232 |
+
print("-" * 72)
|
| 233 |
|
| 234 |
for epoch in range(1, CFG["epochs"] + 1):
|
| 235 |
train_loss, train_acc = model.training_step(train_loader, optimizer, criterion, device)
|
| 236 |
+
val_loss, val_acc, val_f1 = model.validation_step(val_loader, criterion, device)
|
| 237 |
|
| 238 |
history["epochs"].append(epoch)
|
| 239 |
history["train_loss"].append(round(train_loss, 4))
|
| 240 |
history["train_acc"].append(round(train_acc, 4))
|
| 241 |
history["val_loss"].append(round(val_loss, 4))
|
| 242 |
history["val_acc"].append(round(val_acc, 4))
|
| 243 |
+
history["val_f1"].append(round(val_f1, 4))
|
| 244 |
|
| 245 |
|
| 246 |
current_lr = optimizer.param_groups[0]['lr']
|
|
|
|
| 249 |
task.logger.report_scalar("Accuracy", "Train", float(train_acc), iteration=epoch)
|
| 250 |
task.logger.report_scalar("Loss", "Val", float(val_loss), iteration=epoch)
|
| 251 |
task.logger.report_scalar("Accuracy", "Val", float(val_acc), iteration=epoch)
|
| 252 |
+
task.logger.report_scalar("F1", "Val", float(val_f1), iteration=epoch)
|
| 253 |
task.logger.report_scalar("Learning Rate", "LR", float(current_lr), iteration=epoch)
|
| 254 |
task.logger.flush()
|
| 255 |
|
| 256 |
marker = ""
|
| 257 |
+
if val_f1 > best_val_f1:
|
| 258 |
+
best_val_f1 = val_f1
|
| 259 |
best_val_acc = val_acc
|
| 260 |
torch.save(model.state_dict(), best_ckpt_path)
|
| 261 |
marker = " *"
|
| 262 |
|
| 263 |
+
print(
|
| 264 |
+
f"{epoch:>6} | {train_loss:>10.4f} | {train_acc:>8.2%} | {val_loss:>10.4f} | "
|
| 265 |
+
f"{val_acc:>8.2%} | {val_f1:>8.4f}{marker}"
|
| 266 |
+
)
|
| 267 |
|
| 268 |
+
print(f"\nBest validation F1: {best_val_f1:.4f} (accuracy at best F1: {best_val_acc:.2%})")
|
| 269 |
print(f"Checkpoint saved to: {best_ckpt_path}")
|
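Checkpoint selection now keys on validation F1 rather than accuracy (better for the imbalanced focus classes). The selection rule from the loop above, in isolation:

```python
def pick_best_epoch(val_f1_by_epoch):
    """Keep the epoch with the highest validation F1; ties keep the earlier epoch."""
    best_epoch, best_f1 = 0, -1.0
    for epoch, f1 in val_f1_by_epoch:
        if f1 > best_f1:
            best_epoch, best_f1 = epoch, f1
    return best_epoch, best_f1

assert pick_best_epoch([(1, 0.81), (2, 0.86), (3, 0.84)]) == (2, 0.86)
```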
| 270 |
|
| 271 |
model.load_state_dict(torch.load(best_ckpt_path, weights_only=True))
|
| 272 |
test_loss, test_acc, test_probs, test_preds, test_labels = model.test_step(test_loader, criterion, device)
|
| 273 |
+
test_labels_np = np.asarray(test_labels)
|
| 274 |
+
test_preds_np = np.asarray(test_preds)
|
| 275 |
+
|
| 276 |
+
test_f1 = f1_score(test_labels_np, test_preds_np, average="weighted")
|
| 277 |
if num_classes > 2:
|
| 278 |
+
test_auc = roc_auc_score(test_labels_np, test_probs, multi_class="ovr", average="weighted")
|
| 279 |
else:
|
| 280 |
+
test_auc = roc_auc_score(test_labels_np, test_probs[:, 1])
|
| 281 |
+
|
| 282 |
print(f"\n[TEST] Loss: {test_loss:.4f} | Accuracy: {test_acc:.2%}")
|
| 283 |
print(f"[TEST] F1: {test_f1:.4f} | ROC-AUC: {test_auc:.4f}")
|
| 284 |
|
|
|
|
| 287 |
history["test_f1"] = round(test_f1, 4)
|
| 288 |
history["test_auc"] = round(test_auc, 4)
|
| 289 |
|
| 290 |
+
# Dataset stats for ClearML
|
| 291 |
+
train_labels = train_loader.dataset.labels.numpy()
|
| 292 |
+
val_labels = val_loader.dataset.labels.numpy()
|
| 293 |
+
dataset_stats = {
|
| 294 |
+
"train_size": len(train_loader.dataset),
|
| 295 |
+
"val_size": len(val_loader.dataset),
|
| 296 |
+
"test_size": len(test_loader.dataset),
|
| 297 |
+
"train_class_counts": np.bincount(train_labels, minlength=num_classes).tolist(),
|
| 298 |
+
"val_class_counts": np.bincount(val_labels, minlength=num_classes).tolist(),
|
| 299 |
+
"test_class_counts": np.bincount(test_labels_np, minlength=num_classes).tolist(),
|
| 300 |
+
}
|
| 301 |
+
history["dataset_stats"] = dataset_stats
|
| 302 |
+
|
| 303 |
logs_dir = CFG["logs_dir"]
|
| 304 |
os.makedirs(logs_dir, exist_ok=True)
|
| 305 |
log_path = os.path.join(logs_dir, f"{CFG['model_name']}_training_log.json")
|
|
|
|
| 306 |
with open(log_path, "w") as f:
|
| 307 |
json.dump(history, f, indent=2)
|
|
|
|
| 308 |
print(f"[LOG] Training history saved to: {log_path}")
|
| 309 |
|
|
|
|
| 310 |
scaler_path = os.path.join(ckpt_dir, "scaler_mlp.joblib")
|
| 311 |
joblib.dump(scaler, scaler_path)
|
| 312 |
meta_path = os.path.join(ckpt_dir, "meta_mlp.npz")
|
| 313 |
np.savez(meta_path, feature_names=np.array(SELECTED_FEATURES["face_orientation"]))
|
| 314 |
print(f"[LOG] Scaler and meta saved to {ckpt_dir}")
|
| 315 |
|
| 316 |
+
# ClearML: artifacts, confusion matrix, per-class metrics
|
| 317 |
+
if task is not None:
|
| 318 |
+
task.upload_artifact(name="mlp_best", artifact_object=best_ckpt_path)
|
| 319 |
+
task.upload_artifact(name="training_log", artifact_object=log_path)
|
| 320 |
+
task.logger.report_single_value("test/accuracy", test_acc)
|
| 321 |
+
task.logger.report_single_value("test/f1_weighted", test_f1)
|
| 322 |
+
task.logger.report_single_value("test/roc_auc", test_auc)
|
| 323 |
+
for key, val in dataset_stats.items():
|
| 324 |
+
if isinstance(val, list):
|
| 325 |
+
task.logger.report_single_value(f"dataset/{key}", str(val))
|
| 326 |
+
else:
|
| 327 |
+
task.logger.report_single_value(f"dataset/{key}", val)
|
| 328 |
+
prec, rec, f1_per_class, _ = precision_recall_fscore_support(
|
| 329 |
+
test_labels_np, test_preds_np, average=None, zero_division=0
|
| 330 |
+
)
|
| 331 |
+
for c in range(num_classes):
|
| 332 |
+
task.logger.report_single_value(f"test/class_{c}_precision", float(prec[c]))
|
| 333 |
+
task.logger.report_single_value(f"test/class_{c}_recall", float(rec[c]))
|
| 334 |
+
task.logger.report_single_value(f"test/class_{c}_f1", float(f1_per_class[c]))
|
| 335 |
+
cm = confusion_matrix(test_labels_np, test_preds_np)
|
| 336 |
+
import matplotlib
|
| 337 |
+
matplotlib.use("Agg")
|
| 338 |
+
import matplotlib.pyplot as plt
|
| 339 |
+
fig, ax = plt.subplots(figsize=(6, 5))
|
| 340 |
+
ax.imshow(cm, cmap="Blues")
|
| 341 |
+
ax.set_xticks(range(num_classes))
|
| 342 |
+
ax.set_yticks(range(num_classes))
|
| 343 |
+
ax.set_xticklabels([f"Class {i}" for i in range(num_classes)])
|
| 344 |
+
ax.set_yticklabels([f"Class {i}" for i in range(num_classes)])
|
| 345 |
+
for i in range(num_classes):
|
| 346 |
+
for j in range(num_classes):
|
| 347 |
+
ax.text(j, i, str(cm[i, j]), ha="center", va="center", color="black")
|
| 348 |
+
ax.set_xlabel("Predicted")
|
| 349 |
+
ax.set_ylabel("True")
|
| 350 |
+
ax.set_title("Test set confusion matrix")
|
| 351 |
+
fig.tight_layout()
|
| 352 |
+
task.logger.report_matplotlib_figure(title="Confusion Matrix", series="test", figure=fig, iteration=0)
|
| 353 |
+
plt.close(fig)
|
| 354 |
+
task.logger.flush()
|
| 355 |
+
|
| 356 |
|
| 357 |
if __name__ == "__main__":
|
| 358 |
main()
|
models/xgboost/add_accuracy.py
CHANGED
|
@@ -8,9 +8,7 @@ import os
| 8 | print("Loading dataset for evaluation...")
| 9 | splits, _, _, _ = get_numpy_splits(
| 10 | model_name="face_orientation",
| 11 | -
| 12 | - seed=42,
| 13 | - scale=False
| 14 | )
| 15 | X_train, y_train = splits["X_train"], splits["y_train"]
| 16 | X_val, y_val = splits["X_val"], splits["y_val"]

| 8 | print("Loading dataset for evaluation...")
| 9 | splits, _, _, _ = get_numpy_splits(
| 10 | model_name="face_orientation",
| 11 | + scale=False,
| 12 | )
| 13 | X_train, y_train = splits["X_train"], splits["y_train"]
| 14 | X_val, y_val = splits["X_val"], splits["y_val"]
models/xgboost/config.py
ADDED
|
@@ -0,0 +1,52 @@
|
|
| 1 |
+
"""Shared XGBoost config used by training and evaluation. Loads from config/default.yaml when present."""
|
| 2 |
+
|
| 3 |
+
from copy import deepcopy
|
| 4 |
+
|
| 5 |
+
from xgboost import XGBClassifier
|
| 6 |
+
|
| 7 |
+
|
| 8 |
+
def _load_xgb_params():
|
| 9 |
+
try:
|
| 10 |
+
from config import get
|
| 11 |
+
xgb = get("xgboost") or {}
|
| 12 |
+
return {
|
| 13 |
+
"n_estimators": xgb.get("n_estimators", 600),
|
| 14 |
+
"max_depth": xgb.get("max_depth", 8),
|
| 15 |
+
"learning_rate": xgb.get("learning_rate", 0.1489),
|
| 16 |
+
"subsample": xgb.get("subsample", 0.9625),
|
| 17 |
+
"colsample_bytree": xgb.get("colsample_bytree", 0.9013),
|
| 18 |
+
"reg_alpha": xgb.get("reg_alpha", 1.1407),
|
| 19 |
+
"reg_lambda": xgb.get("reg_lambda", 2.4181),
|
| 20 |
+
"eval_metric": xgb.get("eval_metric", "logloss"),
|
| 21 |
+
}
|
| 22 |
+
except Exception:
|
| 23 |
+
return {
|
| 24 |
+
"n_estimators": 600,
|
| 25 |
+
"max_depth": 8,
|
| 26 |
+
"learning_rate": 0.1489,
|
| 27 |
+
"subsample": 0.9625,
|
| 28 |
+
"colsample_bytree": 0.9013,
|
| 29 |
+
"reg_alpha": 1.1407,
|
| 30 |
+
"reg_lambda": 2.4181,
|
| 31 |
+
"eval_metric": "logloss",
|
| 32 |
+
}
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
XGB_BASE_PARAMS = _load_xgb_params()
|
| 36 |
+
|
| 37 |
+
|
| 38 |
+
def get_xgb_params():
|
| 39 |
+
return deepcopy(XGB_BASE_PARAMS)
|
| 40 |
+
|
| 41 |
+
|
| 42 |
+
def build_xgb_classifier(seed: int, *, verbosity: int = 0, early_stopping_rounds=None):
|
| 43 |
+
params = get_xgb_params()
|
| 44 |
+
params.update(
|
| 45 |
+
{
|
| 46 |
+
"random_state": seed,
|
| 47 |
+
"verbosity": verbosity,
|
| 48 |
+
}
|
| 49 |
+
)
|
| 50 |
+
if early_stopping_rounds is not None:
|
| 51 |
+
params["early_stopping_rounds"] = early_stopping_rounds
|
| 52 |
+
return XGBClassifier(**params)
|
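The pattern in `config.py` above — read overrides from a config source, fall back to hard-coded defaults, and hand callers a deep copy so the shared dict is never mutated — can be sketched with the standard library alone. The names `DEFAULTS` and `load_params` below are hypothetical; the real module pulls overrides from `config/default.yaml` via `config.get("xgboost")`:

```python
from copy import deepcopy

# Hypothetical defaults mirroring a few of the values in models/xgboost/config.py.
DEFAULTS = {"n_estimators": 600, "max_depth": 8, "eval_metric": "logloss"}


def load_params(overrides=None):
    """Layer overrides on top of defaults. deepcopy keeps the shared
    DEFAULTS dict immutable, the same reason config.py exposes
    get_xgb_params() as a deepcopy of XGB_BASE_PARAMS."""
    params = deepcopy(DEFAULTS)
    params.update(overrides or {})
    return params


p = load_params({"max_depth": 6})
```

Because every caller gets its own copy, `train.py` can add `random_state` or `early_stopping_rounds` without leaking those keys into other consumers of the base params.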
models/xgboost/eval_accuracy.py
CHANGED

```diff
@@ -25,8 +25,6 @@ def main():
 
     splits, num_features, num_classes, _ = get_numpy_splits(
         model_name=MODEL_NAME,
-        split_ratios=(0.7, 0.15, 0.15),
-        seed=42,
         scale=False,
     )
     X_test = splits["X_test"]
```
models/xgboost/sweep_local.py
CHANGED

```diff
@@ -14,13 +14,12 @@
 from sklearn.metrics import f1_score, roc_auc_score, accuracy_score
 
 # Import your own dataset loading logic
-from data_preparation.prepare_dataset import get_numpy_splits
+from data_preparation.prepare_dataset import get_default_split_config, get_numpy_splits
 
 # ── General Settings ──────────────────────────────────────────────────────────
 PROJECT_NAME = "FocusGuards Large Group Project"
 BASE_TASK_NAME = "XGBoost Sweep Trial"
-DATA_SPLITS = (
-SEED = 42
+DATA_SPLITS, SEED = get_default_split_config()
 
 # ── Search Space ──────────────────────────────────────────────────────────────
 def objective(trial):
```
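The change above replaces per-script copies of the split ratios and seed with one shared accessor. A minimal sketch of that single-source-of-truth pattern, assuming the repo's default values of `(0.7, 0.15, 0.15)` and seed 42 (the function name below is hypothetical, standing in for `get_default_split_config`):

```python
def default_split_config():
    """One place that owns (split_ratios, seed), so the sweep, training,
    and evaluation scripts can never drift apart. Ratios must sum to 1."""
    ratios = (0.7, 0.15, 0.15)
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError(f"split ratios must sum to 1, got {ratios}")
    return ratios, 42


DATA_SPLITS, SEED = default_split_config()
```

Any script that previously hard-coded `seed=42` or `split_ratios=(0.7, 0.15, 0.15)` now asks this function instead, which is exactly why those keyword arguments disappear from `add_accuracy.py` and `eval_accuracy.py` in this commit.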
models/xgboost/train.py
CHANGED

```diff
@@ -1,42 +1,73 @@
 import json
 import os
 import random
+import sys
 
 import numpy as np
-
-from sklearn.metrics import f1_score, roc_auc_score
-from xgboost import XGBClassifier
+from sklearn.metrics import confusion_matrix, f1_score, precision_recall_fscore_support, roc_auc_score
 
 from data_preparation.prepare_dataset import get_numpy_splits
+from models.xgboost.config import XGB_BASE_PARAMS, build_xgb_classifier
 
 _PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
-
-}
-
+
+
+def _load_cfg():
+    try:
+        from config import get
+        xgb = get("xgboost") or {}
+        data = get("data") or {}
+        ratios = data.get("split_ratios", [0.7, 0.15, 0.15])
+        return {
+            "model_name": get("mlp.model_name") or "face_orientation",
+            "seed": get("mlp.seed") or 42,
+            "split_ratios": tuple(ratios),
+            "scale": False,
+            "checkpoints_dir": os.path.join(_PROJECT_ROOT, "checkpoints"),
+            "logs_dir": os.path.join(_PROJECT_ROOT, "evaluation", "logs"),
+            "xgb_params": dict(XGB_BASE_PARAMS),
+        }
+    except Exception:
+        return {
+            "model_name": "face_orientation",
+            "seed": 42,
+            "split_ratios": (0.7, 0.15, 0.15),
+            "scale": False,
+            "checkpoints_dir": os.path.join(_PROJECT_ROOT, "checkpoints"),
+            "logs_dir": os.path.join(_PROJECT_ROOT, "evaluation", "logs"),
+            "xgb_params": dict(XGB_BASE_PARAMS),
+        }
+
+
+CFG = _load_cfg()
+
+USE_CLEARML = os.environ.get("USE_CLEARML", "0") == "1" or bool(os.environ.get("CLEARML_TASK_ID"))
+CLEARML_QUEUE = os.environ.get("CLEARML_QUEUE", "")
+
 task = None
+if USE_CLEARML:
+    try:
+        from clearml import Task
+        from config import flatten_for_clearml
+        task = Task.init(
+            project_name="Focus Guard",
+            task_name="XGBoost Model Training",
+            tags=["training", "xgboost"],
+        )
+        flat = flatten_for_clearml()
+        for k, v in CFG.get("xgb_params", {}).items():
+            flat[f"xgb_params/{k}"] = v
+        flat["model_name"] = CFG["model_name"]
+        flat["seed"] = CFG["seed"]
+        flat["split_ratios"] = str(CFG["split_ratios"])
+        task.connect(flat)
+        if CLEARML_QUEUE:
+            print(f"[ClearML] Enqueuing to queue '{CLEARML_QUEUE}'.")
+            task.execute_remotely(queue_name=CLEARML_QUEUE)
+            sys.exit(0)
+    except ImportError:
+        task = None
+        USE_CLEARML = False
 
 def set_seed(seed: int):
     random.seed(seed)
@@ -62,19 +93,7 @@ def main():
     X_test, y_test = splits["X_test"], splits["y_test"]
 
     # ── Model ─────────────────────────────────────────────────────
-    model = XGBClassifier(
-        n_estimators=CFG["n_estimators"],
-        max_depth=CFG["max_depth"],
-        learning_rate=CFG["learning_rate"],
-        subsample=CFG["subsample"],
-        colsample_bytree=CFG["colsample_bytree"],
-        reg_alpha=CFG["reg_alpha"],
-        reg_lambda=CFG["reg_lambda"],
-        eval_metric=CFG["eval_metric"],
-        early_stopping_rounds=30,
-        random_state=CFG["seed"],
-        verbosity=1,
-    )
+    model = build_xgb_classifier(CFG["seed"], verbosity=1, early_stopping_rounds=30)
 
     model.fit(
         X_train, y_train,
@@ -82,12 +101,13 @@ def main():
         verbose=10,
     )
     best_it = getattr(model, "best_iteration", None)
-    print(f"[TRAIN] Best iteration: {best_it} / {CFG['n_estimators']}")
+    print(f"[TRAIN] Best iteration: {best_it} / {CFG['xgb_params']['n_estimators']}")
 
     # ── Evaluation ────────────────────────────────────────────────
     evals = model.evals_result()
-
-
+    eval_metric_name = CFG["xgb_params"]["eval_metric"]
+    train_losses = evals["validation_0"][eval_metric_name]
+    val_losses = evals["validation_1"][eval_metric_name]
 
     # Test metrics
     test_preds = model.predict(X_test)
@@ -104,14 +124,53 @@ def main():
     print(f"[TEST] F1: {test_f1:.4f}")
     print(f"[TEST] ROC-AUC: {test_auc:.4f}")
 
-    #
+    # Dataset stats
+    dataset_stats = {
+        "train_size": len(y_train),
+        "val_size": len(y_val),
+        "test_size": len(y_test),
+        "train_class_counts": np.bincount(y_train.astype(int), minlength=num_classes).tolist(),
+        "val_class_counts": np.bincount(y_val.astype(int), minlength=num_classes).tolist(),
+        "test_class_counts": np.bincount(y_test.astype(int), minlength=num_classes).tolist(),
+    }
+
     if task is not None:
         for i, (tl, vl) in enumerate(zip(train_losses, val_losses)):
             task.logger.report_scalar("Loss", "Train", tl, iteration=i + 1)
-            task.logger.report_scalar("Loss", "Val",
-        task.logger.report_single_value("
-        task.logger.report_single_value("
-        task.logger.report_single_value("
+            task.logger.report_scalar("Loss", "Val", vl, iteration=i + 1)
+        task.logger.report_single_value("test/accuracy", test_acc)
+        task.logger.report_single_value("test/f1_weighted", test_f1)
+        task.logger.report_single_value("test/roc_auc", test_auc)
+        for key, val in dataset_stats.items():
+            task.logger.report_single_value(
+                f"dataset/{key}", str(val) if isinstance(val, list) else val
+            )
+        prec, rec, f1_per_class, _ = precision_recall_fscore_support(
+            y_test, test_preds, average=None, zero_division=0
+        )
+        for c in range(num_classes):
+            task.logger.report_single_value(f"test/class_{c}_precision", float(prec[c]))
+            task.logger.report_single_value(f"test/class_{c}_recall", float(rec[c]))
+            task.logger.report_single_value(f"test/class_{c}_f1", float(f1_per_class[c]))
+        cm = confusion_matrix(y_test, test_preds)
+        import matplotlib
+        matplotlib.use("Agg")
+        import matplotlib.pyplot as plt
+        fig, ax = plt.subplots(figsize=(6, 5))
+        ax.imshow(cm, cmap="Blues")
+        ax.set_xticks(range(num_classes))
+        ax.set_yticks(range(num_classes))
+        ax.set_xticklabels([f"Class {i}" for i in range(num_classes)])
+        ax.set_yticklabels([f"Class {i}" for i in range(num_classes)])
+        for i in range(num_classes):
+            for j in range(num_classes):
+                ax.text(j, i, str(cm[i, j]), ha="center", va="center", color="black")
+        ax.set_xlabel("Predicted")
+        ax.set_ylabel("True")
+        ax.set_title("Test set confusion matrix")
+        fig.tight_layout()
+        task.logger.report_matplotlib_figure(title="Confusion Matrix", series="test", figure=fig, iteration=0)
+        plt.close(fig)
         task.logger.flush()
 
     # ── Save checkpoint ───────────────────────────────────────────
@@ -122,17 +181,23 @@ def main():
     print(f"\n[CKPT] Model saved to: {model_path}")
 
     # ── Write JSON log (same schema as MLP) ───────────────────────
+    # pandas-free tree/node count (trees_to_dataframe() needs pandas)
+    booster = model.get_booster()
+    tree_count = int(booster.num_boosted_rounds())
+    node_count = int(sum(tree.count("\n") + 1 for tree in booster.get_dump()))
+
     history = {
         "model_name": f"xgboost_{CFG['model_name']}",
-        "param_count":
-        "
-        "
+        "param_count": node_count,
+        "tree_count": tree_count,
+        "xgb_params": CFG["xgb_params"],
         "epochs": list(range(1, len(train_losses) + 1)),
        "train_loss": [round(v, 4) for v in train_losses],
-        "val_loss":
-        "test_acc":
-        "test_f1":
-        "test_auc":
+        "val_loss": [round(v, 4) for v in val_losses],
+        "test_acc": round(test_acc, 4),
+        "test_f1": round(test_f1, 4),
+        "test_auc": round(test_auc, 4),
+        "dataset_stats": dataset_stats,
     }
 
     logs_dir = CFG["logs_dir"]
@@ -144,6 +209,10 @@ def main():
 
     print(f"[LOG] Training history saved to: {log_path}")
 
+    if task is not None:
+        task.upload_artifact(name="xgboost_model", artifact_object=model_path)
+        task.upload_artifact(name="training_log", artifact_object=log_path)
+
 
 if __name__ == "__main__":
     main()
```
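The "pandas-free tree/node count" added above leans on the fact that `Booster.get_dump()` returns one text string per tree with one node per line, so nodes per tree is newline count plus one. A stand-alone sketch with hand-written stand-ins for two dumped trees (the strings below only mimic the dump format, they are not real xgboost output):

```python
def count_nodes(dump_strings):
    """Sum node counts over dumped trees: one node per line in each
    tree's text dump, so nodes = newlines + 1 per tree."""
    return sum(tree.count("\n") + 1 for tree in dump_strings)


# Hypothetical dumps: a 3-node stump (1 split + 2 leaves) and a single leaf.
fake_dump = [
    "0:[f0<0.5] yes=1,no=2\n\t1:leaf=0.1\n\t2:leaf=-0.1",
    "0:leaf=0.0",
]
n = count_nodes(fake_dump)
```

Using the node total as `param_count` keeps the JSON log schema compatible with the MLP's parameter count while avoiding a pandas dependency (`trees_to_dataframe()` would need it).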
requirements.txt
CHANGED

```diff
@@ -16,6 +16,7 @@ httpx>=0.27.0
 aiosqlite>=0.19.0
 psutil>=5.9.0
 pydantic>=2.0.0
+PyYAML>=6.0
 xgboost>=2.0.0
 clearml>=2.0.2
 pytest>=9.0.0
```
src/App.jsx
CHANGED

```diff
@@ -65,7 +65,7 @@ function App() {
           {renderMenuButton('records', 'My Records')}
           <div className="separator"></div>
 
-          {renderMenuButton('customise', '
+          {renderMenuButton('customise', 'Settings')}
           <div className="separator"></div>
 
           {renderMenuButton('help', 'Help')}
```
src/components/Achievement.jsx
CHANGED

```diff
@@ -199,22 +199,6 @@ function Achievement() {
         </div>
       ) : (
         <>
-          {systemStats && systemStats.cpu_percent != null && (
-            <div style={{
-              textAlign: 'center',
-              marginBottom: '12px',
-              padding: '8px 12px',
-              background: 'rgba(0,0,0,0.2)',
-              borderRadius: '8px',
-              fontSize: '13px',
-              color: '#aaa'
-            }}>
-              Server: CPU <strong style={{ color: '#8f8' }}>{systemStats.cpu_percent}%</strong>
-              {' · '}
-              RAM <strong style={{ color: '#8af' }}>{systemStats.memory_percent}%</strong>
-              {systemStats.memory_used_mb != null && ` (${systemStats.memory_used_mb}/${systemStats.memory_total_mb} MB)`}
-            </div>
-          )}
           <div className="stats-grid">
             <div className="stat-card">
               <div className="stat-number" id="total-sessions">{stats.total_sessions}</div>
```
src/components/Customise.jsx
CHANGED

```diff
@@ -103,7 +103,7 @@ function Customise() {
 
   return (
     <main id="page-e" className="page">
-      <h1 className="page-title">
+      <h1 className="page-title">Settings</h1>
 
       <div className="settings-container">
         {/* Data Management */}
```
src/components/FocusPageLocal.jsx
CHANGED

```diff
@@ -518,20 +518,16 @@ function FocusPageLocal({ videoManager, sessionResult, setSessionResult, isActiv
       return;
     }
 
-    //
     const sessionDuration = Math.floor((Date.now() - (videoManager.sessionStartTime || Date.now())) / 1000);
+    const totalFrames = currentStats.framesProcessed || 0;
+    const focusedFrames = currentStats.focusedFrames ?? 0;
+    const focusScore = totalFrames > 0 ? focusedFrames / totalFrames : 0;
 
-    //
-    const focusScore = currentStats.framesProcessed > 0
-      ? (currentStats.framesProcessed * (currentStats.currentStatus ? 1 : 0)) / currentStats.framesProcessed
-      : 0;
-
-    //
     setSessionResult({
       duration_seconds: sessionDuration,
       focus_score: focusScore,
-      total_frames:
-      focused_frames:
+      total_frames: totalFrames,
+      focused_frames: focusedFrames
    });
  };
```
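The old expression multiplied and divided by the same `framesProcessed`, so it always collapsed to `currentStatus ? 1 : 0` — the score of the final frame, not the session. The fix divides an accumulated focused-frame count by the total. A Python sketch of the corrected computation (mirroring, not reproducing, the JSX):

```python
def focus_score(focused_frames: int, total_frames: int) -> float:
    """Fraction of processed frames classified as focused; 0.0 for an
    empty session so there is no division by zero."""
    return focused_frames / total_frames if total_frames > 0 else 0.0


score = focus_score(45, 60)   # session score reflects all frames, not just the last
empty = focus_score(0, 0)     # empty session is well-defined
```

Under the old formula a session that was focused for 59 of 60 frames but ended unfocused would score 0; the ratio above gives the expected 45/60 = 0.75 style result instead.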
src/utils/VideoManagerLocal.js
CHANGED

```diff
@@ -59,10 +59,11 @@ export class VideoManagerLocal {
     // Calibration state
     this.calibration = createCalibrationState();
 
-    // Performance metrics
+    // Performance metrics (focusedFrames = count of frames where focused was true this session)
     this.stats = {
       framesSent: 0,
       framesProcessed: 0,
+      focusedFrames: 0,
       avgLatency: 0,
       lastLatencies: []
     };
@@ -411,6 +412,8 @@ export class VideoManagerLocal {
       case 'session_started':
         this.sessionId = data.session_id;
         this.sessionStartTime = Date.now();
+        this.stats.framesProcessed = 0;
+        this.stats.focusedFrames = 0;
        console.log('Session started:', this.sessionId);
         if (this.callbacks.onSessionStart) {
           this.callbacks.onSessionStart(this.sessionId);
@@ -419,6 +422,7 @@ export class VideoManagerLocal {
 
       case 'detection':
         this.stats.framesProcessed++;
+        if (data.focused) this.stats.focusedFrames++;
 
         // Track latency from send→receive
         const now = performance.now();
```
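The counter lifecycle above — zero everything on `session_started`, increment on each `detection` — is what keeps scores from mixing frames across sessions. A small Python sketch of the same state machine (a hypothetical analogue of the JS class, for illustration only):

```python
class SessionStats:
    """Per-session frame counters, mirroring the stats object in
    VideoManagerLocal.js."""

    def __init__(self):
        self.frames_processed = 0
        self.focused_frames = 0

    def on_session_started(self):
        # Reset so a new session never inherits counts from the last one.
        self.frames_processed = 0
        self.focused_frames = 0

    def on_detection(self, focused: bool):
        self.frames_processed += 1
        if focused:
            self.focused_frames += 1


stats = SessionStats()
stats.on_detection(True)          # stray frame before the session starts
stats.on_session_started()        # counters reset here
for focused in (True, True, False):
    stats.on_detection(focused)
```

Without the reset in `on_session_started`, the stray pre-session frame would inflate the next session's totals — the exact leak the commit's `session_started` handler fixes.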
tests/test_api_settings.py
CHANGED

```diff
@@ -24,27 +24,16 @@ def test_get_settings_default_fields():
     resp = client.get("/api/settings")
     assert resp.status_code == 200
     data = resp.json()
-    assert "sensitivity" in data
-    assert "notification_enabled" in data
-    assert "notification_threshold" in data
-    assert "frame_rate" in data
     assert "model_name" in data
+    assert "l2cs_boost" in data
 
 
-def test_update_settings_clamped_ranges():
+def test_update_settings_model_name():
     client = _make_test_client()
     with client:
-        # get setting
         r0 = client.get("/api/settings")
         assert r0.status_code == 200
-
-        # set unlogic params
-        payload = {
-            "sensitivity": 100,
-            "notification_enabled": False,
-            "notification_threshold": 1,
-            "frame_rate": 1000,
-        }
+        payload = {"model_name": "xgboost"}
         r_put = client.put("/api/settings", json=payload)
         assert r_put.status_code == 200
         body = r_put.json()
@@ -53,8 +42,5 @@ def test_update_settings_model_name():
 
         r1 = client.get("/api/settings")
         data = r1.json()
-        assert
-        assert bool(data["notification_enabled"]) is False
-        assert 5 <= data["notification_threshold"] <= 300
-        assert 5 <= data["frame_rate"] <= 60
+        assert data["model_name"] == "xgboost"
 
```
tests/test_data_preparation.py
CHANGED

```diff
@@ -10,10 +10,18 @@ if PROJECT_ROOT not in sys.path:
 from data_preparation.prepare_dataset import (
     SELECTED_FEATURES,
     _generate_synthetic_data,
+    get_default_split_config,
     get_numpy_splits,
 )
 
 
+def test_get_default_split_config():
+    ratios, seed = get_default_split_config()
+    assert len(ratios) == 3
+    assert abs(sum(ratios) - 1.0) < 1e-6
+    assert seed >= 0
+
+
 def test_generate_synthetic_data_shape():
     X, y = _generate_synthetic_data("face_orientation")
     assert X.shape[0] == 500
@@ -22,18 +30,23 @@ def test_generate_synthetic_data_shape():
 
 
 def test_get_numpy_splits_consistency():
-
+    split_ratios, seed = get_default_split_config()
+    splits, num_features, num_classes, scaler = get_numpy_splits(
+        "face_orientation", split_ratios=split_ratios, seed=seed
+    )
 
-    # train/val/test each have samples
     n_train = len(splits["y_train"])
     n_val = len(splits["y_val"])
     n_test = len(splits["y_test"])
     assert n_train > 0
     assert n_val > 0
     assert n_test > 0
-
-    # feature dim should same as num_features
     assert splits["X_train"].shape[1] == num_features
-
     assert num_classes >= 2
 
+    # Same seed and ratios produce same split (deterministic)
+    splits2, _, _, _ = get_numpy_splits(
+        "face_orientation", split_ratios=split_ratios, seed=seed
+    )
+    np.testing.assert_array_equal(splits["y_test"], splits2["y_test"])
+
```
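The determinism check added above relies on a basic property of seeded RNGs: the same seed reproduces the same shuffle, hence the same train/val/test assignment. That property can be demonstrated with the standard library alone (this sketch is not the project's split code, just the invariant the test asserts):

```python
import random


def shuffled_indices(n: int, seed: int):
    """Shuffle 0..n-1 with a dedicated seeded generator. A fixed seed
    must always reproduce the same permutation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx


a = shuffled_indices(100, seed=42)
b = shuffled_indices(100, seed=42)
```

Using `random.Random(seed)` (a private generator) rather than reseeding the module-level RNG keeps the shuffle reproducible even if other code consumes random numbers in between.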