k22056537 committed on
Commit
6114098
·
1 Parent(s): 633f159

chore: integration cleanup — remove eye CNN, add threshold justification, fix pipeline

Files changed (41)
  1. .gitignore +1 -0
  2. FOCUS_SCORE_EQUATIONS.md +147 -0
  3. checkpoints/hybrid_focus_config.json +5 -5
  4. data/CNN/eye_crops/val/open/.gitkeep +0 -1
  5. evaluation/README.md +34 -8
  6. evaluation/THRESHOLD_JUSTIFICATION.md +89 -0
  7. evaluation/justify_thresholds.py +463 -0
  8. evaluation/plots/ear_distribution.png +0 -0
  9. evaluation/plots/geo_weight_search.png +0 -0
  10. evaluation/plots/hybrid_weight_search.png +0 -0
  11. evaluation/plots/mar_distribution.png +0 -0
  12. evaluation/plots/roc_mlp.png +0 -0
  13. evaluation/plots/roc_xgb.png +0 -0
  14. models/README.md +0 -2
  15. models/cnn/CNN_MODEL/.claude/settings.local.json +0 -7
  16. models/cnn/CNN_MODEL/.gitattributes +0 -1
  17. models/cnn/CNN_MODEL/.gitignore +0 -4
  18. models/cnn/CNN_MODEL/README.md +0 -74
  19. models/cnn/CNN_MODEL/notebooks/eye_classifier_colab.ipynb +0 -0
  20. models/cnn/CNN_MODEL/scripts/focus_infer.py +0 -199
  21. models/cnn/CNN_MODEL/scripts/predict_image.py +0 -49
  22. models/cnn/CNN_MODEL/scripts/video_infer.py +0 -281
  23. models/cnn/CNN_MODEL/scripts/webcam_live.py +0 -184
  24. models/cnn/CNN_MODEL/weights/yolo11s-cls.pt +0 -3
  25. models/cnn/__init__.py +0 -0
  26. models/cnn/eye_attention/__init__.py +0 -1
  27. models/cnn/eye_attention/classifier.py +0 -169
  28. models/cnn/eye_attention/crop.py +0 -70
  29. models/cnn/eye_attention/train.py +0 -0
  30. models/cnn/notebooks/EyeCNN.ipynb +0 -107
  31. models/cnn/notebooks/EyeCNN_Train_Evaluate_new.ipynb +0 -0
  32. models/cnn/notebooks/EyeCNN_Training_Evaluate.ipynb +0 -0
  33. models/cnn/notebooks/README.md +0 -1
  34. models/eye_classifier.py +0 -69
  35. models/eye_crop.py +0 -77
  36. models/xgboost/checkpoints/face_orientation_best.json +0 -0
  37. public/assets/111.jpg +0 -0
  38. src/assets/react.svg +0 -1
  39. ui/live_demo.py +0 -14
  40. ui/pipeline.py +20 -87
  41. yolov8n.pt +0 -3
.gitignore CHANGED
@@ -37,6 +37,7 @@ ignore/
 
 # Project specific
 focus_guard.db
+test_focus_guard.db
 static/
 __pycache__/
 docs/
FOCUS_SCORE_EQUATIONS.md ADDED
@@ -0,0 +1,147 @@
+ # How the focused/unfocused score is computed
+
+ The system outputs a **focus score** in `[0, 1]` and a binary **focused/unfocused** label. The label is derived from the score and a threshold. The exact equation depends on which pipeline (model) you use.
+
+ ---
+
+ ## 1. Final output (all pipelines)
+
+ - **`raw_score`** (or **`focus_score`** in Hybrid): value in `[0, 1]` after optional smoothing.
+ - **`is_focused`**: binary label.
+
+ **Equation:**
+
+ ```text
+ is_focused = (smoothed_score >= threshold)
+ ```
+
+ - **Smoothed score:** the pipeline may apply an exponential moving average (EMA) to the raw score; that smoothed value is what you see as `raw_score` / `focus_score` in the API.
+ - **Threshold:** set in the UI (sensitivity) or in the pipeline config; typical defaults are **0.5** or **0.55**.
+
+ So: the **focus score** is the continuous value; **focused vs unfocused** is **score ≥ threshold** vs **score < threshold**.
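The smoothing-and-threshold step can be sketched in a few lines (an illustrative sketch, not the pipeline's actual code; `ema` and `classify` are hypothetical helpers, with α = 0.3 and threshold = 0.5 taken from the defaults mentioned in this document):

```python
def ema(prev, raw, alpha=0.3):
    """Exponential moving average; the first frame passes through unchanged."""
    return raw if prev is None else alpha * raw + (1 - alpha) * prev

def classify(smoothed, threshold=0.5):
    """Binary label: focused iff the smoothed score reaches the threshold."""
    return smoothed >= threshold

# Feed a short run of raw scores through the smoother.
smoothed = None
for raw in [0.9, 0.2, 0.8]:
    smoothed = ema(smoothed, raw)

print(round(smoothed, 4), classify(smoothed))
```

Note how the momentary dip to 0.2 is absorbed: the smoothed score stays above the threshold, so the label never flickers to unfocused.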
+
+ ---
+
+ ## 2. Geometric pipeline (rule-based, no ML)
+
+ **Raw score (before smoothing):**
+
+ ```text
+ raw = α · s_face + β · s_eye
+ ```
+
+ - Default: **α = 0.4**, **β = 0.6** (face weight 40%, eye weight 60%).
+ - If **yawning** (MAR > 0.55): **raw = 0**.
+
+ **Face score `s_face`** (head pose, from `HeadPoseEstimator`):
+
+ - **deviation** = √( yaw² + pitch² + (0.5·roll)² )
+ - **t** = min( deviation / max_angle , 1 ), with **max_angle = 22°** (default).
+ - **s_face** = 0.5 · (1 + cos(π · t)) → 1 when the head is straight, 0 when deviation ≥ max_angle.
+
+ **Eye score `s_eye`** (from `EyeBehaviourScorer`):
+
+ - **EAR** = Eye Aspect Ratio (from landmarks); use **min(left_ear, right_ear)**.
+ - **ear_s** = linear map of EAR to [0, 1] between `ear_closed = 0.16` and `ear_open = 0.30`.
+ - **Gaze:** horizontal/vertical gaze ratios from iris position; **offset** = distance from centre (0.5, 0.5).
+ - **gaze_s** = 0.5 · (1 + cos(π · t)), with **t** = min( offset / gaze_max_offset , 1 ), **gaze_max_offset = 0.28**.
+ - **s_eye** = ear_s · gaze_s (or just ear_s if ear_s < 0.3).
+
+ Then:
+
+ ```text
+ smoothed_score = EMA(raw)
+ is_focused = (smoothed_score >= threshold)
+ ```
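Taken together, the geometric equations can be sketched as follows (a minimal illustrative sketch, not the project's `HeadPoseEstimator`/`EyeBehaviourScorer` code; function names are hypothetical, constants are the defaults listed above):

```python
import math

def s_face(yaw, pitch, roll, max_angle=22.0):
    """Head-pose score: cosine fall-off of the normalised deviation (1 = straight)."""
    deviation = math.sqrt(yaw**2 + pitch**2 + (0.5 * roll)**2)
    t = min(deviation / max_angle, 1.0)
    return 0.5 * (1 + math.cos(math.pi * t))

def ear_score(ear, ear_closed=0.16, ear_open=0.30):
    """Linear map of EAR into [0, 1] between the closed and open thresholds."""
    return min(max((ear - ear_closed) / (ear_open - ear_closed), 0.0), 1.0)

def gaze_score(offset, gaze_max_offset=0.28):
    """Cosine fall-off of the iris offset from the eye centre."""
    t = min(offset / gaze_max_offset, 1.0)
    return 0.5 * (1 + math.cos(math.pi * t))

def geometric_raw(yaw, pitch, roll, ear, gaze_offset, mar,
                  alpha=0.4, beta=0.6, mar_yawn=0.55):
    """raw = alpha * s_face + beta * s_eye, vetoed to 0 while yawning."""
    if mar > mar_yawn:
        return 0.0
    e = ear_score(ear)
    s_eye = e if e < 0.3 else e * gaze_score(gaze_offset)  # gaze ignored near-closed
    return alpha * s_face(yaw, pitch, roll) + beta * s_eye

# Head straight, eyes open, gaze centred, no yawn -> raw score near 1.
print(round(geometric_raw(0, 0, 0, ear=0.32, gaze_offset=0.0, mar=0.2), 3))
```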
+
+ ---
+
+ ## 3. MLP pipeline
+
+ - Features are extracted (the same 17-d feature vector as in training), clipped, then optionally extended (magnitudes, velocities, variances) and scaled with the **training-time scaler**.
+ - The MLP outputs either:
+   - **Probability of class 1 (focused):** `mlp_prob = predict_proba(X_sc)[0, 1]`, or
+   - If no `predict_proba`: **mlp_prob = 1 if predict(X_sc) == 1 else 0**.
+
+ **Equations:**
+
+ ```text
+ raw_score      = mlp_prob  (clipped to [0, 1])
+ smoothed_score = EMA(raw_score)
+ is_focused     = (smoothed_score >= threshold)
+ ```
+
+ So the **focus score** is the **MLP's estimated probability of being focused** (after optional smoothing).
+
+ ---
+
+ ## 4. XGBoost pipeline
+
+ - Same feature extraction and clipping; uses the **same feature subset** as in XGBoost training (no runtime magnitude/velocity extension).
+ - **prob** = `predict_proba(X)[0]` → **[P(unfocused), P(focused)]**.
+
+ **Equations:**
+
+ ```text
+ raw_score      = prob[1]  (probability of the focused class)
+ smoothed_score = EMA(raw_score)
+ is_focused     = (smoothed_score >= threshold)
+ ```
+
+ So the **focus score** is the **XGBoost probability of the focused class**.
+
+ ---
+
+ ## 5. Hybrid pipeline (MLP + geometric)
+
+ Combines the MLP's probability with a geometric score, then applies a single threshold.
+
+ **Geometric part:**
+
+ ```text
+ geo_score = geo_face_weight · s_face + geo_eye_weight · s_eye
+ ```
+
+ - Default: **geo_face_weight = 0.4**, **geo_eye_weight = 0.6**.
+ - **s_face** and **s_eye** as in the Geometric pipeline (with optional yawn veto: if yawning, **geo_score = 0**).
+ - **geo_score** is clipped to [0, 1].
+
+ **MLP part:** same as the MLP pipeline → **mlp_prob** in [0, 1].
+
+ **Combined focus score (default weights):**
+
+ ```text
+ focus_score = w_mlp · mlp_prob + w_geo · geo_score
+ ```
+
+ - Default: **w_mlp = 0.7**, **w_geo = 0.3** (after normalising so the weights sum to 1).
+ - **focus_score** is clipped to [0, 1], then smoothed.
+
+ **Equations:**
+
+ ```text
+ focus_score    = clip( w_mlp · mlp_prob + w_geo · geo_score , 0 , 1 )
+ smoothed_score = EMA(focus_score)
+ is_focused     = (smoothed_score >= threshold)
+ ```
+
+ The default **threshold** in the hybrid config is **0.55**.
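The combination step can be sketched as follows (an illustrative sketch, not the pipeline code; `hybrid_score` is a hypothetical helper using the default 0.7/0.3 weights stated above):

```python
def hybrid_score(mlp_prob, geo_score, w_mlp=0.7, w_geo=0.3):
    """Weighted blend of MLP probability and geometric score, clipped to [0, 1].

    The weights are normalised to sum to 1 first, as described above.
    """
    total = w_mlp + w_geo
    w_mlp, w_geo = w_mlp / total, w_geo / total
    return min(max(w_mlp * mlp_prob + w_geo * geo_score, 0.0), 1.0)

# Confident MLP, mediocre geometric score, default 0.7/0.3 weights.
print(round(hybrid_score(0.9, 0.4), 3))
```

Because the weights are normalised, passing e.g. `w_mlp=2, w_geo=2` behaves the same as `0.5/0.5`.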
+
+ ---
+
+ ## 6. Summary table
+
+ | Pipeline  | Raw score formula                | Focused condition    |
+ |-----------|----------------------------------|----------------------|
+ | Geometric | α·s_face + β·s_eye (0 if yawn)   | smoothed ≥ threshold |
+ | MLP       | MLP P(focused)                   | smoothed ≥ threshold |
+ | XGBoost   | XGB P(focused)                   | smoothed ≥ threshold |
+ | Hybrid    | w_mlp·mlp_prob + w_geo·geo_score | smoothed ≥ threshold |
+
+ - **s_face** = head-pose score (cosine of normalised deviation).
+ - **s_eye** = eye score (EAR × gaze score, or blend with CNN).
+ - **geo_score** = geo_face_weight·s_face + geo_eye_weight·s_eye (with optional yawn veto).
+ - **EMA** = exponential moving average (e.g. α = 0.3) for temporal smoothing.
+
+ So: the **focus score** is always a number in [0, 1]; **focused vs unfocused** is **score ≥ threshold** in all pipelines.
checkpoints/hybrid_focus_config.json CHANGED
@@ -1,10 +1,10 @@
 {
-  "w_mlp": 0.6000000000000001,
-  "w_geo": 0.3999999999999999,
+  "w_mlp": 0.3,
+  "w_geo": 0.7,
   "threshold": 0.35,
   "use_yawn_veto": true,
-  "geo_face_weight": 0.4,
-  "geo_eye_weight": 0.6,
+  "geo_face_weight": 0.7,
+  "geo_eye_weight": 0.3,
   "mar_yawn_threshold": 0.55,
   "metric": "f1"
 }
data/CNN/eye_crops/val/open/.gitkeep DELETED
@@ -1 +0,0 @@
-
evaluation/README.md CHANGED
@@ -1,14 +1,22 @@
 # evaluation/
 
-Training logs and performance metrics.
+Training logs, threshold analysis, and performance metrics.
 
 ## 1. Contents
 
+```
+logs/                       # training run logs (JSON)
+plots/                      # threshold justification figures (ROC, weight search, EAR/MAR)
+justify_thresholds.py       # LOPO analysis script
+THRESHOLD_JUSTIFICATION.md  # report (auto-generated by script)
+```
+
+**Logs (when present):**
 ```
 logs/
-├── face_orientation_training_log.json          # MLP (latest run)
-├── mlp_face_orientation_training_log.json      # MLP (alternate)
-└── xgboost_face_orientation_training_log.json  # XGBoost
+├── face_orientation_training_log.json
+├── mlp_face_orientation_training_log.json
+└── xgboost_face_orientation_training_log.json
 ```
 
 ## 2. Log Format
@@ -39,8 +47,26 @@ Each JSON file records the full training history:
 }
 ```
 
-## 3. Generated By
+## 3. Threshold justification
+
+Thresholds and weights used in the app (geometric, MLP, XGBoost, hybrid) are justified in **THRESHOLD_JUSTIFICATION.md**. The report is generated by:
+
+```bash
+python -m evaluation.justify_thresholds
+```
+
+Run from the repo root with the venv active. The script runs LOPO over 9 participants (~145k samples), computes ROC + Youden's J for MLP/XGBoost thresholds, grid-searches geometric and hybrid weights, and plots EAR/MAR distributions. It writes:
+
+- `plots/roc_mlp.png`, `plots/roc_xgb.png`
+- `plots/geo_weight_search.png`, `plots/hybrid_weight_search.png`
+- `plots/ear_distribution.png`, `plots/mar_distribution.png`
+- `THRESHOLD_JUSTIFICATION.md`
+
+Takes ~10–15 minutes. Re-run after changing data or pipeline weights (e.g. geometric face/eye); the hybrid optimal w_mlp depends on the geometric sub-score weights.
 
-- `python -m models.mlp.train` → writes MLP log
-- `python -m models.xgboost.train` → writes XGBoost log
-- Notebooks in `notebooks/` also save logs here
+## 4. Generated by
+
+- `python -m models.mlp.train` → MLP log in `logs/`
+- `python -m models.xgboost.train` → XGBoost log in `logs/`
+- `python -m evaluation.justify_thresholds` → plots + `THRESHOLD_JUSTIFICATION.md`
+- Notebooks in `notebooks/` can also write logs here
evaluation/THRESHOLD_JUSTIFICATION.md ADDED
@@ -0,0 +1,89 @@
+ # Threshold Justification Report
+
+ Auto-generated by `evaluation/justify_thresholds.py` using LOPO cross-validation over 9 participants (~145k samples).
+
+ ## 1. ML Model Decision Thresholds
+
+ Thresholds selected via **Youden's J statistic** (J = sensitivity + specificity - 1) on pooled LOPO held-out predictions.
+
+ | Model   | LOPO AUC | Optimal Threshold (Youden's J) | F1 @ Optimal | F1 @ 0.50 |
+ |---------|----------|--------------------------------|--------------|-----------|
+ | MLP     | 0.8624   | **0.228**                      | 0.8578       | 0.8149    |
+ | XGBoost | 0.8804   | **0.377**                      | 0.8585       | 0.8424    |
+
+ ![MLP ROC](plots/roc_mlp.png)
+
+ ![XGBoost ROC](plots/roc_xgb.png)
+
+ ## 2. Geometric Pipeline Weights (s_face vs s_eye)
+
+ Grid search over face weight alpha in {0.2 ... 0.8}; eye weight = 1 - alpha. Threshold per fold via Youden's J.
+
+ | Face Weight (alpha) | Mean LOPO F1 |
+ |--------------------:|-------------:|
+ | 0.2 | 0.7926 |
+ | 0.3 | 0.8002 |
+ | 0.4 | 0.7719 |
+ | 0.5 | 0.7868 |
+ | 0.6 | 0.8184 |
+ | 0.7 | 0.8195 **<-- selected** |
+ | 0.8 | 0.8126 |
+
+ **Best:** alpha = 0.7 (face 70%, eye 30%)
+
+ ![Geometric weight search](plots/geo_weight_search.png)
+
+ ## 3. Hybrid Pipeline Weights (MLP vs Geometric)
+
+ Grid search over w_mlp in {0.3 ... 0.8}; w_geo = 1 - w_mlp. The geometric sub-score uses the same weights as the geometric pipeline (face = 0.7, eye = 0.3). If you change the geometric weights, re-run this script: the optimal w_mlp can shift.
+
+ | MLP Weight (w_mlp) | Mean LOPO F1 |
+ |-------------------:|-------------:|
+ | 0.3 | 0.8409 **<-- selected** |
+ | 0.4 | 0.8246 |
+ | 0.5 | 0.8164 |
+ | 0.6 | 0.8106 |
+ | 0.7 | 0.8039 |
+ | 0.8 | 0.8016 |
+
+ **Best:** w_mlp = 0.3 (MLP 30%, geometric 70%)
+
+ ![Hybrid weight search](plots/hybrid_weight_search.png)
+
+ ## 4. Eye and Mouth Aspect Ratio Thresholds
+
+ ### EAR (Eye Aspect Ratio)
+
+ Reference: Soukupova & Cech, "Real-Time Eye Blink Detection Using Facial Landmarks" (2016) established EAR ~ 0.2 as a blink threshold.
+
+ Our thresholds define a linear interpolation zone around this established value:
+
+ | Constant | Value | Justification |
+ |----------|------:|---------------|
+ | `ear_closed` | 0.16 | Below this, eyes are fully shut. 16.3% of samples fall here. |
+ | `EAR_BLINK_THRESH` | 0.21 | Blink detection point; close to the 0.2 reference. 21.2% of samples below. |
+ | `ear_open` | 0.30 | Above this, eyes are fully open. 70.4% of samples here. |
+
+ Between 0.16 and 0.30 the `_ear_score` function linearly interpolates from 0 to 1, providing a smooth transition rather than a hard binary cutoff.
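That interpolation can be sketched as follows (a sketch of the behaviour described above, not the project's actual `_ear_score` implementation):

```python
def ear_score(ear, ear_closed=0.16, ear_open=0.30):
    """Linearly interpolate EAR into [0, 1] between the closed and open thresholds."""
    t = (ear - ear_closed) / (ear_open - ear_closed)
    return min(max(t, 0.0), 1.0)

# Fully shut, near the blink threshold, and fully open.
print(ear_score(0.10), round(ear_score(0.21), 4), ear_score(0.35))
```

Values at or below 0.16 clamp to 0, values at or above 0.30 clamp to 1, and anything in between rises linearly.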
+
+ ![EAR distribution](plots/ear_distribution.png)
+
+ ### MAR (Mouth Aspect Ratio)
+
+ | Constant | Value | Justification |
+ |----------|------:|---------------|
+ | `MAR_YAWN_THRESHOLD` | 0.55 | Only 1.7% of samples exceed this, confirming it captures genuine yawns without false positives. |
+
+ ![MAR distribution](plots/mar_distribution.png)
+
+ ## 5. Other Constants
+
+ | Constant | Value | Rationale |
+ |----------|------:|-----------|
+ | `gaze_max_offset` | 0.28 | Max iris displacement (normalised) before the gaze score drops to zero. Corresponds to ~56% of the eye width; beyond this the iris is at the extreme edge. |
+ | `max_angle` | 22.0 deg | Head deviation beyond which the face score = 0. Based on a typical monitor-viewing cone: at 60 cm distance and a 24" monitor, the viewing angle is ~20-25 degrees. |
+ | `roll_weight` | 0.5 | Roll is less indicative of inattention than yaw/pitch (tilting the head doesn't mean looking away), so it's down-weighted by 50%. |
+ | `EMA alpha` | 0.3 | Smoothing factor for the focus score. Gives a ~3-4 frame effective window; balances responsiveness vs flicker. |
+ | `grace_frames` | 15 | ~0.5 s at 30 fps before penalising no-face. Allows brief occlusions (e.g. a hand gesture) without dropping the score. |
+ | `PERCLOS_WINDOW` | 60 frames | 2 s at 30 fps; standard PERCLOS measurement window (Dinges & Grace, 1998). |
+ | `BLINK_WINDOW_SEC` | 30 s | Blink rate measured over 30 s; typical spontaneous blink rate is 15-20/min (Bentivoglio et al., 1997). |
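The "~3-4 frame effective window" for EMA alpha = 0.3 can be sanity-checked with a short sketch (`frames_to_reach` is a hypothetical helper, not project code): at alpha = 0.3 an EMA reaches ~63% of a unit step after 3 frames and ~90% after 7.

```python
def frames_to_reach(alpha, fraction):
    """Frames for an EMA (starting at 0) to reach `fraction` of a unit step input."""
    value, frames = 0.0, 0
    while value < fraction:
        value = alpha * 1.0 + (1 - alpha) * value  # EMA update toward 1.0
        frames += 1
    return frames

print(frames_to_reach(0.3, 0.63), frames_to_reach(0.3, 0.90))
```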
evaluation/justify_thresholds.py ADDED
@@ -0,0 +1,463 @@
+ # LOPO threshold/weight analysis. Run: python -m evaluation.justify_thresholds
+
+ import glob
+ import os
+ import sys
+
+ import numpy as np
+ import matplotlib
+ matplotlib.use("Agg")
+ import matplotlib.pyplot as plt
+ from sklearn.neural_network import MLPClassifier
+ from sklearn.preprocessing import StandardScaler
+ from sklearn.metrics import roc_curve, roc_auc_score, f1_score
+ from xgboost import XGBClassifier
+
+ _PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
+ sys.path.insert(0, _PROJECT_ROOT)
+
+ from data_preparation.prepare_dataset import load_per_person, SELECTED_FEATURES
+
+ PLOTS_DIR = os.path.join(os.path.dirname(__file__), "plots")
+ REPORT_PATH = os.path.join(os.path.dirname(__file__), "THRESHOLD_JUSTIFICATION.md")
+ SEED = 42
+
+
+ def _youdens_j(y_true, y_prob):
+     fpr, tpr, thresholds = roc_curve(y_true, y_prob)
+     j = tpr - fpr
+     idx = j.argmax()
+     auc = roc_auc_score(y_true, y_prob)
+     return float(thresholds[idx]), fpr, tpr, thresholds, float(auc)
+
+
+ def _f1_at_threshold(y_true, y_prob, threshold):
+     return f1_score(y_true, (y_prob >= threshold).astype(int), zero_division=0)
+
+
+ def _plot_roc(fpr, tpr, auc, opt_thresh, opt_idx, title, path):
+     fig, ax = plt.subplots(figsize=(6, 5))
+     ax.plot(fpr, tpr, lw=2, label=f"ROC (AUC = {auc:.4f})")
+     ax.plot(fpr[opt_idx], tpr[opt_idx], "ro", markersize=10,
+             label=f"Youden's J optimum (t = {opt_thresh:.3f})")
+     ax.plot([0, 1], [0, 1], "k--", lw=1, alpha=0.5)
+     ax.set_xlabel("False Positive Rate")
+     ax.set_ylabel("True Positive Rate")
+     ax.set_title(title)
+     ax.legend(loc="lower right")
+     fig.tight_layout()
+     fig.savefig(path, dpi=150)
+     plt.close(fig)
+     print(f"  saved {path}")
+
+
+ def run_lopo_models():
+     print("\n=== LOPO: MLP and XGBoost ===")
+     by_person, _, _ = load_per_person("face_orientation")
+     persons = sorted(by_person.keys())
+
+     results = {"mlp": {"y": [], "p": []}, "xgb": {"y": [], "p": []}}
+
+     for i, held_out in enumerate(persons):
+         X_test, y_test = by_person[held_out]
+
+         train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])
+         train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out])
+
+         scaler = StandardScaler().fit(train_X)
+         X_tr_sc = scaler.transform(train_X)
+         X_te_sc = scaler.transform(X_test)
+
+         mlp = MLPClassifier(
+             hidden_layer_sizes=(64, 32), activation="relu",
+             max_iter=200, early_stopping=True, validation_fraction=0.15,
+             random_state=SEED, verbose=False,
+         )
+         mlp.fit(X_tr_sc, train_y)
+         mlp_prob = mlp.predict_proba(X_te_sc)[:, 1]
+         results["mlp"]["y"].append(y_test)
+         results["mlp"]["p"].append(mlp_prob)
+
+         xgb = XGBClassifier(
+             n_estimators=600, max_depth=8, learning_rate=0.05,
+             subsample=0.8, colsample_bytree=0.8,
+             reg_alpha=0.1, reg_lambda=1.0,
+             use_label_encoder=False, eval_metric="logloss",
+             random_state=SEED, verbosity=0,
+         )
+         xgb.fit(X_tr_sc, train_y)
+         xgb_prob = xgb.predict_proba(X_te_sc)[:, 1]
+         results["xgb"]["y"].append(y_test)
+         results["xgb"]["p"].append(xgb_prob)
+
+         print(f"  fold {i+1}/{len(persons)}: held out {held_out} "
+               f"({X_test.shape[0]} samples)")
+
+     for key in results:
+         results[key]["y"] = np.concatenate(results[key]["y"])
+         results[key]["p"] = np.concatenate(results[key]["p"])
+
+     return results
+
+
+ def analyse_model_thresholds(results):
+     print("\n=== Model threshold analysis ===")
+     model_stats = {}
+
+     for name, label in [("mlp", "MLP"), ("xgb", "XGBoost")]:
+         y, p = results[name]["y"], results[name]["p"]
+         opt_t, fpr, tpr, thresholds, auc = _youdens_j(y, p)
+         j = tpr - fpr
+         opt_idx = j.argmax()
+         f1_opt = _f1_at_threshold(y, p, opt_t)
+         f1_50 = _f1_at_threshold(y, p, 0.50)
+
+         path = os.path.join(PLOTS_DIR, f"roc_{name}.png")
+         _plot_roc(fpr, tpr, auc, opt_t, opt_idx,
+                   f"LOPO ROC — {label} (9 folds, 144k samples)", path)
+
+         model_stats[name] = {
+             "label": label, "auc": auc,
+             "opt_threshold": opt_t, "f1_opt": f1_opt, "f1_50": f1_50,
+         }
+         print(f"  {label}: AUC={auc:.4f}, optimal threshold={opt_t:.3f} "
+               f"(F1={f1_opt:.4f}), F1@0.50={f1_50:.4f}")
+
+     return model_stats
+
+
+ def run_geo_weight_search():
+     print("\n=== Geometric weight grid search ===")
+
+     by_person, _, _ = load_per_person("face_orientation")
+     persons = sorted(by_person.keys())
+     features = SELECTED_FEATURES["face_orientation"]
+     sf_idx = features.index("s_face")
+     se_idx = features.index("s_eye")
+
+     alphas = np.arange(0.2, 0.85, 0.1).round(1)
+     alpha_f1 = {a: [] for a in alphas}
+
+     for held_out in persons:
+         X_test, y_test = by_person[held_out]
+         sf = X_test[:, sf_idx]
+         se = X_test[:, se_idx]
+
+         train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])
+         train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out])
+         sf_tr = train_X[:, sf_idx]
+         se_tr = train_X[:, se_idx]
+
+         for a in alphas:
+             score_tr = a * sf_tr + (1.0 - a) * se_tr
+             opt_t, *_ = _youdens_j(train_y, score_tr)
+
+             score_te = a * sf + (1.0 - a) * se
+             f1 = _f1_at_threshold(y_test, score_te, opt_t)
+             alpha_f1[a].append(f1)
+
+     mean_f1 = {a: np.mean(f1s) for a, f1s in alpha_f1.items()}
+     best_alpha = max(mean_f1, key=mean_f1.get)
+
+     fig, ax = plt.subplots(figsize=(7, 4))
+     ax.bar([f"{a:.1f}" for a in alphas],
+            [mean_f1[a] for a in alphas], color="steelblue")
+     ax.set_xlabel("Face weight (alpha); eye weight = 1 - alpha")
+     ax.set_ylabel("Mean LOPO F1")
+     ax.set_title("Geometric Pipeline: Face vs Eye Weight Search")
+     ax.set_ylim(bottom=max(0, min(mean_f1.values()) - 0.05))
+     for i, a in enumerate(alphas):
+         ax.text(i, mean_f1[a] + 0.003, f"{mean_f1[a]:.3f}",
+                 ha="center", va="bottom", fontsize=8)
+     fig.tight_layout()
+     path = os.path.join(PLOTS_DIR, "geo_weight_search.png")
+     fig.savefig(path, dpi=150)
+     plt.close(fig)
+     print(f"  saved {path}")
+
+     print(f"  Best alpha (face weight) = {best_alpha:.1f}, "
+           f"mean LOPO F1 = {mean_f1[best_alpha]:.4f}")
+     return dict(mean_f1), best_alpha
+
+
+ def run_hybrid_weight_search(lopo_results):
+     print("\n=== Hybrid weight grid search ===")
+
+     by_person, _, _ = load_per_person("face_orientation")
+     persons = sorted(by_person.keys())
+     features = SELECTED_FEATURES["face_orientation"]
+     sf_idx = features.index("s_face")
+     se_idx = features.index("s_eye")
+
+     GEO_FACE_W = 0.7
+     GEO_EYE_W = 0.3
+
+     w_mlps = np.arange(0.3, 0.85, 0.1).round(1)
+     wmf1 = {w: [] for w in w_mlps}
+     mlp_p = lopo_results["mlp"]["p"]
+     offset = 0
+     for held_out in persons:
+         X_test, y_test = by_person[held_out]
+         n = X_test.shape[0]
+         mlp_prob_fold = mlp_p[offset:offset + n]
+         offset += n
+
+         sf = X_test[:, sf_idx]
+         se = X_test[:, se_idx]
+         geo_score = np.clip(GEO_FACE_W * sf + GEO_EYE_W * se, 0, 1)
+
+         train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])
+         train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out])
+         sf_tr = train_X[:, sf_idx]
+         se_tr = train_X[:, se_idx]
+         geo_tr = np.clip(GEO_FACE_W * sf_tr + GEO_EYE_W * se_tr, 0, 1)
+
+         scaler = StandardScaler().fit(train_X)
+         mlp_tr = MLPClassifier(
+             hidden_layer_sizes=(64, 32), activation="relu",
+             max_iter=200, early_stopping=True, validation_fraction=0.15,
+             random_state=SEED, verbose=False,
+         )
+         mlp_tr.fit(scaler.transform(train_X), train_y)
+         mlp_prob_tr = mlp_tr.predict_proba(scaler.transform(train_X))[:, 1]
+
+         for w in w_mlps:
+             combo_tr = w * mlp_prob_tr + (1.0 - w) * geo_tr
+             opt_t, *_ = _youdens_j(train_y, combo_tr)
+
+             combo_te = w * mlp_prob_fold + (1.0 - w) * geo_score
+             f1 = _f1_at_threshold(y_test, combo_te, opt_t)
+             wmf1[w].append(f1)
+
+     mean_f1 = {w: np.mean(f1s) for w, f1s in wmf1.items()}
+     best_w = max(mean_f1, key=mean_f1.get)
+
+     fig, ax = plt.subplots(figsize=(7, 4))
+     ax.bar([f"{w:.1f}" for w in w_mlps],
+            [mean_f1[w] for w in w_mlps], color="darkorange")
+     ax.set_xlabel("MLP weight (w_mlp); geo weight = 1 - w_mlp")
+     ax.set_ylabel("Mean LOPO F1")
+     ax.set_title("Hybrid Pipeline: MLP vs Geometric Weight Search")
+     ax.set_ylim(bottom=max(0, min(mean_f1.values()) - 0.05))
+     for i, w in enumerate(w_mlps):
+         ax.text(i, mean_f1[w] + 0.003, f"{mean_f1[w]:.3f}",
+                 ha="center", va="bottom", fontsize=8)
+     fig.tight_layout()
+     path = os.path.join(PLOTS_DIR, "hybrid_weight_search.png")
+     fig.savefig(path, dpi=150)
+     plt.close(fig)
+     print(f"  saved {path}")
+
+     print(f"  Best w_mlp = {best_w:.1f}, mean LOPO F1 = {mean_f1[best_w]:.4f}")
+     return dict(mean_f1), best_w
+
+
+ def plot_distributions():
+     print("\n=== EAR / MAR distributions ===")
+     npz_files = sorted(glob.glob(os.path.join(_PROJECT_ROOT, "data", "collected_*", "*.npz")))
+
+     all_ear_l, all_ear_r, all_mar, all_labels = [], [], [], []
+     for f in npz_files:
+         d = np.load(f, allow_pickle=True)
+         names = list(d["feature_names"])
+         feat = d["features"].astype(np.float32)
+         lab = d["labels"].astype(np.int64)
+         all_ear_l.append(feat[:, names.index("ear_left")])
+         all_ear_r.append(feat[:, names.index("ear_right")])
+         all_mar.append(feat[:, names.index("mar")])
+         all_labels.append(lab)
+
+     ear_l = np.concatenate(all_ear_l)
+     ear_r = np.concatenate(all_ear_r)
+     mar = np.concatenate(all_mar)
+     labels = np.concatenate(all_labels)
+     ear_min = np.minimum(ear_l, ear_r)
+     ear_plot = np.clip(ear_min, 0, 0.85)
+     mar_plot = np.clip(mar, 0, 1.5)
+
+     fig, ax = plt.subplots(figsize=(7, 4))
+     ax.hist(ear_plot[labels == 1], bins=100, alpha=0.6, label="Focused (1)", density=True)
+     ax.hist(ear_plot[labels == 0], bins=100, alpha=0.6, label="Unfocused (0)", density=True)
+     for val, lbl, c in [
+         (0.16, "ear_closed = 0.16", "red"),
+         (0.21, "EAR_BLINK = 0.21", "orange"),
+         (0.30, "ear_open = 0.30", "green"),
+     ]:
+         ax.axvline(val, color=c, ls="--", lw=1.5, label=lbl)
+     ax.set_xlabel("min(left_EAR, right_EAR)")
+     ax.set_ylabel("Density")
+     ax.set_title("EAR Distribution by Class (144k samples)")
+     ax.legend(fontsize=8)
+     fig.tight_layout()
+     path = os.path.join(PLOTS_DIR, "ear_distribution.png")
+     fig.savefig(path, dpi=150)
+     plt.close(fig)
+     print(f"  saved {path}")
+
+     fig, ax = plt.subplots(figsize=(7, 4))
+     ax.hist(mar_plot[labels == 1], bins=100, alpha=0.6, label="Focused (1)", density=True)
+     ax.hist(mar_plot[labels == 0], bins=100, alpha=0.6, label="Unfocused (0)", density=True)
+     ax.axvline(0.55, color="red", ls="--", lw=1.5, label="MAR_YAWN = 0.55")
+     ax.set_xlabel("Mouth Aspect Ratio (MAR)")
+     ax.set_ylabel("Density")
+     ax.set_title("MAR Distribution by Class (144k samples)")
+     ax.legend(fontsize=8)
+     fig.tight_layout()
+     path = os.path.join(PLOTS_DIR, "mar_distribution.png")
+     fig.savefig(path, dpi=150)
+     plt.close(fig)
+     print(f"  saved {path}")
+
+     closed_pct = np.mean(ear_min < 0.16) * 100
+     blink_pct = np.mean(ear_min < 0.21) * 100
+     open_pct = np.mean(ear_min >= 0.30) * 100
+     yawn_pct = np.mean(mar > 0.55) * 100
+
+     stats = {
+         "ear_below_016": closed_pct,
+         "ear_below_021": blink_pct,
+         "ear_above_030": open_pct,
+         "mar_above_055": yawn_pct,
+         "n_samples": len(ear_min),
+     }
+     print(f"  EAR<0.16 (closed): {closed_pct:.1f}% | EAR<0.21 (blink): {blink_pct:.1f}% | "
+           f"EAR>=0.30 (open): {open_pct:.1f}%")
+     print(f"  MAR>0.55 (yawn): {yawn_pct:.1f}%")
+     return stats
327
+
328
+
329
+ def write_report(model_stats, geo_f1, best_alpha, hybrid_f1, best_w, dist_stats):
330
+ lines = []
331
+ lines.append("# Threshold Justification Report")
332
+ lines.append("")
333
+ lines.append("Auto-generated by `evaluation/justify_thresholds.py` using LOPO cross-validation "
334
+ "over 9 participants (~145k samples).")
335
+ lines.append("")
336
+
337
+ lines.append("## 1. ML Model Decision Thresholds")
338
+ lines.append("")
339
+ lines.append("Thresholds selected via **Youden's J statistic** (J = sensitivity + specificity - 1) "
340
+ "on pooled LOPO held-out predictions.")
341
+ lines.append("")
342
+ lines.append("| Model | LOPO AUC | Optimal Threshold (Youden's J) | F1 @ Optimal | F1 @ 0.50 |")
343
+ lines.append("|-------|----------|-------------------------------|--------------|-----------|")
344
+ for key in ("mlp", "xgb"):
345
+ s = model_stats[key]
346
+ lines.append(f"| {s['label']} | {s['auc']:.4f} | **{s['opt_threshold']:.3f}** | "
347
+ f"{s['f1_opt']:.4f} | {s['f1_50']:.4f} |")
348
+ lines.append("")
349
+ lines.append("![MLP ROC](plots/roc_mlp.png)")
350
+ lines.append("")
351
+ lines.append("![XGBoost ROC](plots/roc_xgboost.png)")
352
+ lines.append("")
353
+
354
+ lines.append("## 2. Geometric Pipeline Weights (s_face vs s_eye)")
355
+ lines.append("")
356
+ lines.append("Grid search over face weight alpha in {0.2 ... 0.8}. "
357
+ "Eye weight = 1 - alpha. Threshold per fold via Youden's J.")
358
+ lines.append("")
359
+ lines.append("| Face Weight (alpha) | Mean LOPO F1 |")
360
+ lines.append("|--------------------:|-------------:|")
361
+ for a in sorted(geo_f1.keys()):
362
+ marker = " **<-- selected**" if a == best_alpha else ""
363
+ lines.append(f"| {a:.1f} | {geo_f1[a]:.4f}{marker} |")
364
+ lines.append("")
365
+ lines.append(f"**Best:** alpha = {best_alpha:.1f} (face {best_alpha*100:.0f}%, "
366
+ f"eye {(1-best_alpha)*100:.0f}%)")
367
+ lines.append("")
368
+ lines.append("![Geometric weight search](plots/geo_weight_search.png)")
369
+ lines.append("")
370
+
371
+     lines.append("## 3. Hybrid Pipeline Weights (MLP vs Geometric)")
+     lines.append("")
+     lines.append("Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. "
+                  "The geometric sub-score uses the same weights as the geometric pipeline "
+                  "(face=0.7, eye=0.3). If you change the geometric weights, re-run this "
+                  "script — the optimal w_mlp can shift.")
+     lines.append("")
+     lines.append("| MLP Weight (w_mlp) | Mean LOPO F1 |")
+     lines.append("|-------------------:|-------------:|")
+     for w in sorted(hybrid_f1.keys()):
+         marker = " **<-- selected**" if w == best_w else ""
+         lines.append(f"| {w:.1f} | {hybrid_f1[w]:.4f}{marker} |")
+     lines.append("")
+     lines.append(f"**Best:** w_mlp = {best_w:.1f} (MLP {best_w*100:.0f}%, "
+                  f"geometric {(1-best_w)*100:.0f}%)")
+     lines.append("")
+     lines.append("![Hybrid weight search](plots/hybrid_weight_search.png)")
+     lines.append("")
+
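The hybrid blend described above nests the geometric combination inside a second convex combination with the MLP probability. A hedged sketch (function name and signature are illustrative, not the repo's API):

```python
def hybrid_score(p_mlp: float, s_face: float, s_eye: float,
                 w_mlp: float = 0.5, alpha: float = 0.7) -> float:
    """Blend the MLP probability with the geometric sub-score.

    alpha is the geometric face weight (0.7 face / 0.3 eye per the report);
    w_mlp is the blend weight being swept by the grid search.
    """
    s_geo = alpha * s_face + (1.0 - alpha) * s_eye
    return w_mlp * p_mlp + (1.0 - w_mlp) * s_geo
```

This nesting is why the report warns that changing the geometric weights invalidates the selected w_mlp: the grid search optimises the blend for one fixed s_geo definition.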
+     lines.append("## 4. Eye and Mouth Aspect Ratio Thresholds")
+     lines.append("")
+     lines.append("### EAR (Eye Aspect Ratio)")
+     lines.append("")
+     lines.append("Reference: Soukupova & Cech, \"Real-Time Eye Blink Detection Using Facial "
+                  "Landmarks\" (2016) established EAR ~ 0.2 as a blink threshold.")
+     lines.append("")
+     lines.append("Our thresholds define a linear interpolation zone around this established value:")
+     lines.append("")
+     lines.append("| Constant | Value | Justification |")
+     lines.append("|----------|------:|---------------|")
+     lines.append(f"| `ear_closed` | 0.16 | Below this, eyes are fully shut. "
+                  f"{dist_stats['ear_below_016']:.1f}% of samples fall here. |")
+     lines.append(f"| `EAR_BLINK_THRESH` | 0.21 | Blink detection point; close to the 0.2 reference. "
+                  f"{dist_stats['ear_below_021']:.1f}% of samples below. |")
+     lines.append(f"| `ear_open` | 0.30 | Above this, eyes are fully open. "
+                  f"{dist_stats['ear_above_030']:.1f}% of samples here. |")
+     lines.append("")
+     lines.append("Between 0.16 and 0.30 the `_ear_score` function linearly interpolates from 0 to 1, "
+                  "providing a smooth transition rather than a hard binary cutoff.")
+     lines.append("")
+     lines.append("![EAR distribution](plots/ear_distribution.png)")
+     lines.append("")
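The `_ear_score` implementation itself is not part of this diff; the linear ramp the report describes (0 below `ear_closed`, 1 above `ear_open`, linear in between) can be sketched as:

```python
EAR_CLOSED = 0.16  # below: eyes fully shut
EAR_OPEN = 0.30    # above: eyes fully open

def ear_score(ear: float) -> float:
    """Map a raw EAR value to [0, 1] with a linear transition zone."""
    if ear <= EAR_CLOSED:
        return 0.0
    if ear >= EAR_OPEN:
        return 1.0
    # Linear interpolation across the 0.16-0.30 band
    return (ear - EAR_CLOSED) / (EAR_OPEN - EAR_CLOSED)
```

The midpoint of the band (EAR = 0.23) maps to 0.5, so small landmark jitter near the blink threshold moves the score gradually instead of flipping it.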
+     lines.append("### MAR (Mouth Aspect Ratio)")
+     lines.append("")
+     lines.append("| Constant | Value | Justification |")
+     lines.append("|----------|------:|---------------|")
+     lines.append(f"| `MAR_YAWN_THRESHOLD` | 0.55 | Only {dist_stats['mar_above_055']:.1f}% of "
+                  f"samples exceed this, suggesting it captures genuine yawns with few false positives. |")
+     lines.append("")
+     lines.append("![MAR distribution](plots/mar_distribution.png)")
+     lines.append("")
+
+     lines.append("## 5. Other Constants")
+     lines.append("")
+     lines.append("| Constant | Value | Rationale |")
+     lines.append("|----------|------:|-----------|")
+     lines.append("| `gaze_max_offset` | 0.28 | Max iris displacement (normalised) before gaze score "
+                  "drops to zero. Corresponds to ~56% of the eye width; beyond this the iris is at "
+                  "the extreme edge. |")
+     lines.append("| `max_angle` | 22.0 deg | Head deviation beyond which face score = 0. Based on "
+                  "typical monitor-viewing cone: at 60 cm distance and a 24\" monitor, the viewing "
+                  "angle is ~20-25 degrees. |")
+     lines.append("| `roll_weight` | 0.5 | Roll is less indicative of inattention than yaw/pitch "
+                  "(tilting head doesn't mean looking away), so it's down-weighted by 50%. |")
+     lines.append("| `EMA alpha` | 0.3 | Smoothing factor for focus score. "
+                  "Gives ~3-4 frame effective window; balances responsiveness vs flicker. |")
+     lines.append("| `grace_frames` | 15 | ~0.5 s at 30 fps before penalising no-face. Allows brief "
+                  "occlusions (e.g. hand gesture) without dropping score. |")
+     lines.append("| `PERCLOS_WINDOW` | 60 frames | 2 s at 30 fps; standard PERCLOS measurement "
+                  "window (Dinges & Grace, 1998). |")
+     lines.append("| `BLINK_WINDOW_SEC` | 30 s | Blink rate measured over 30 s; typical spontaneous "
+                  "blink rate is 15-20/min (Bentivoglio et al., 1997). |")
+     lines.append("")
+
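The EMA smoothing row above corresponds to the standard exponential-moving-average update, where alpha weights the newest sample. A small sketch of the decay behaviour it implies (illustrative; the pipeline's actual smoothing code is not in this diff):

```python
def ema_update(prev: float, new: float, alpha: float = 0.3) -> float:
    """One EMA step: alpha weights the newest sample, 1-alpha the history."""
    return alpha * new + (1.0 - alpha) * prev

# A focus score of 1.0 hit by four frames of 0.0 decays to (1-alpha)^4
score = 1.0
for _ in range(4):
    score = ema_update(score, 0.0)
```

After four frames the residual is (0.7)^4, about 0.24, which is the "~3-4 frame effective window" the table refers to.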
+
+     with open(REPORT_PATH, "w", encoding="utf-8") as f:
+         f.write("\n".join(lines))
+     print(f"\nReport written to {REPORT_PATH}")
+
+
+ def main():
+     os.makedirs(PLOTS_DIR, exist_ok=True)
+
+     lopo_results = run_lopo_models()
+     model_stats = analyse_model_thresholds(lopo_results)
+     geo_f1, best_alpha = run_geo_weight_search()
+     hybrid_f1, best_w = run_hybrid_weight_search(lopo_results)
+     dist_stats = plot_distributions()
+
+     write_report(model_stats, geo_f1, best_alpha, hybrid_f1, best_w, dist_stats)
+     print("\nDone.")
+
+
+ if __name__ == "__main__":
+     main()
evaluation/plots/ear_distribution.png ADDED
evaluation/plots/geo_weight_search.png ADDED
evaluation/plots/hybrid_weight_search.png ADDED
evaluation/plots/mar_distribution.png ADDED
evaluation/plots/roc_mlp.png ADDED
evaluation/plots/roc_xgb.png ADDED
models/README.md CHANGED
@@ -11,8 +11,6 @@ Root-level modules form the real-time inference pipeline:
  | `face_mesh.py` | BGR frame | 478 MediaPipe landmarks |
  | `head_pose.py` | Landmarks, frame size | yaw, pitch, roll, face/eye score, gaze offset, head deviation |
  | `eye_scorer.py` | Landmarks | EAR (left/right/avg), gaze ratio (h/v), MAR |
- | `eye_crop.py` | Landmarks, frame | Cropped eye region images |
- | `eye_classifier.py` | Eye crops or landmarks | Eye open/closed prediction (geometric fallback) |
  | `collect_features.py` | BGR frame | 17-d feature vector + temporal features (PERCLOS, blink rate, etc.) |

  ## 2. Training Scripts
models/cnn/CNN_MODEL/.claude/settings.local.json DELETED
@@ -1,7 +0,0 @@
- {
-   "permissions": {
-     "allow": [
-       "Bash(# Check Dataset_subset counts echo \"\"=== Dataset_subset/train/open ===\"\" && ls /Users/mohammedalketbi22/Downloads/GAP_Large_project-feature-dataset-model-test-92_30-clean/Dataset_subset/train/open/ | wc -l && echo \"\"=== Dataset_subset/train/closed ===\"\" && ls /Users/mohammedalketbi22/Downloads/GAP_Large_project-feature-dataset-model-test-92_30-clean/Dataset_subset/train/closed/ | wc -l && echo \"\"=== Dataset_subset/val/open ===\"\" && ls /Users/mohammedalketbi22/Downloads/GAP_Large_project-feature-dataset-model-test-92_30-clean/Dataset_subset/val/open/ | wc -l && echo \"\"=== Dataset_subset/val/closed ===\"\" && ls /Users/mohammedalketbi22/Downloads/GAP_Large_project-feature-dataset-model-test-92_30-clean/Dataset_subset/val/closed/)"
-     ]
-   }
- }
models/cnn/CNN_MODEL/.gitattributes DELETED
@@ -1 +0,0 @@
- DATA/** filter=lfs diff=lfs merge=lfs -text
models/cnn/CNN_MODEL/.gitignore DELETED
@@ -1,4 +0,0 @@
- Dataset/train/
- Dataset/val/
- Dataset/test/
- .DS_Store
models/cnn/CNN_MODEL/README.md DELETED
@@ -1,74 +0,0 @@
- # Eye Open / Closed Classifier (YOLOv11-CLS)
-
- Binary classifier: **open** vs **closed** eyes.
- Used as a baseline for eye-tracking, drowsiness, or focus detection.
-
- ---
-
- ## Model team task
-
- - **Train** the YOLOv11s-cls eye classifier in a **separate notebook** (data split, epochs, GPU, export `best.pt`).
- - Provide **trained weights** (`best.pt`) for this repo’s evaluation and inference scripts.
-
- ---
-
- ## Repo contents
-
- - **notebooks/eye_classifier_colab.ipynb** — Data download (Kaggle), clean, split, undersample, **evaluate** (needs `best.pt` from model team), export.
- - **scripts/predict_image.py** — Run classifier on single images (needs `best.pt`).
- - **scripts/webcam_live.py** — Live webcam open/closed (needs `best.pt` + optional `weights/face_landmarker.task`).
- - **scripts/video_infer.py** — Run on video files.
- - **scripts/focus_infer.py** — Focus/attention inference.
- - **weights/** — Put `best.pt` here; `face_landmarker.task` is downloaded on first webcam run if missing.
- - **docs/** — Extra docs (e.g. UNNECESSARY_FILES.md if present).
-
- ---
-
- ## Dataset
-
- - **Source:** [Kaggle — open/closed eyes](https://www.kaggle.com/datasets/sehriyarmemmedli/open-closed-eyes-dataset)
- - The Colab notebook downloads it via `kagglehub`; no local copy in repo.
-
- ---
-
- ## Weights
-
- - Put **best.pt** from the model team in **weights/best.pt** (or `runs/classify/runs_cls/eye_open_closed_cpu/weights/best.pt`).
- - For webcam: **face_landmarker.task** is downloaded into **weights/** on first run if missing.
-
- ---
-
- ## Local setup
-
- ```bash
- pip install ultralytics opencv-python mediapipe "numpy<2"
- ```
-
- Optional: use a venv. From repo root:
- - `python scripts/predict_image.py <image.png>`
- - `python scripts/webcam_live.py`
- - `python scripts/video_infer.py` (expects 1.mp4 / 2.mp4 in repo root or set `VIDEOS` env)
- - `python scripts/focus_infer.py`
-
- ---
-
- ## Project structure
-
- ```
- ├── notebooks/
- │   └── eye_classifier_colab.ipynb   # Data + eval (no training)
- ├── scripts/
- │   ├── predict_image.py
- │   ├── webcam_live.py
- │   ├── video_infer.py
- │   └── focus_infer.py
- ├── weights/   # best.pt, face_landmarker.task
- ├── docs/      # extra docs
- ├── README.md
- └── venv/      # optional
- ```
-
- Training and weight generation: **model team, separate notebook.**
models/cnn/CNN_MODEL/notebooks/eye_classifier_colab.ipynb DELETED
The diff for this file is too large to render. See raw diff
 
models/cnn/CNN_MODEL/scripts/focus_infer.py DELETED
@@ -1,199 +0,0 @@
- from __future__ import annotations
-
- from pathlib import Path
- import os
-
- import cv2
- import numpy as np
- from ultralytics import YOLO
-
-
- def list_images(folder: Path):
-     exts = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}
-     return sorted([p for p in folder.iterdir() if p.suffix.lower() in exts])
-
-
- def find_weights(project_root: Path) -> Path | None:
-     candidates = [
-         project_root / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "last.pt",
-         project_root / "runs_cls" / "eye_open_closed_cpu" / "weights" / "best.pt",
-         project_root / "runs_cls" / "eye_open_closed_cpu" / "weights" / "last.pt",
-     ]
-     return next((p for p in candidates if p.is_file()), None)
-
-
- def detect_eyelid_boundary(gray: np.ndarray) -> np.ndarray | None:
-     """
-     Returns an ellipse fit to the largest contour near the eye boundary.
-     Output format: (center(x,y), (axis1, axis2), angle) or None.
-     """
-     blur = cv2.GaussianBlur(gray, (5, 5), 0)
-     edges = cv2.Canny(blur, 40, 120)
-     edges = cv2.dilate(edges, np.ones((3, 3), np.uint8), iterations=1)
-     contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
-     if not contours:
-         return None
-     contours = sorted(contours, key=cv2.contourArea, reverse=True)
-     for c in contours:
-         if len(c) >= 5 and cv2.contourArea(c) > 50:
-             return cv2.fitEllipse(c)
-     return None
-
-
- def detect_pupil_center(gray: np.ndarray) -> tuple[int, int] | None:
-     """
-     More robust pupil detection:
-     - enhance contrast (CLAHE)
-     - find dark blobs
-     - score by circularity and proximity to center
-     """
-     h, w = gray.shape
-     clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
-     eq = clahe.apply(gray)
-     blur = cv2.GaussianBlur(eq, (7, 7), 0)
-
-     # Focus on the central region to avoid eyelashes/edges
-     cx, cy = w // 2, h // 2
-     rx, ry = int(w * 0.3), int(h * 0.3)
-     x0, x1 = max(cx - rx, 0), min(cx + rx, w)
-     y0, y1 = max(cy - ry, 0), min(cy + ry, h)
-     roi = blur[y0:y1, x0:x1]
-
-     # Inverted threshold to capture dark pupil
-     _, thresh = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
-     thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8), iterations=2)
-     thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8), iterations=1)
-
-     contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
-     if not contours:
-         return None
-
-     best = None
-     best_score = -1.0
-     for c in contours:
-         area = cv2.contourArea(c)
-         if area < 15:
-             continue
-         perimeter = cv2.arcLength(c, True)
-         if perimeter <= 0:
-             continue
-         circularity = 4 * np.pi * (area / (perimeter * perimeter))
-         if circularity < 0.3:
-             continue
-         m = cv2.moments(c)
-         if m["m00"] == 0:
-             continue
-         px = int(m["m10"] / m["m00"]) + x0
-         py = int(m["m01"] / m["m00"]) + y0
-
-         # Score by circularity and distance to center
-         dist = np.hypot(px - cx, py - cy) / max(w, h)
-         score = circularity - dist
-         if score > best_score:
-             best_score = score
-             best = (px, py)
-
-     return best
-
-
- def is_focused(pupil_center: tuple[int, int], img_shape: tuple[int, int]) -> bool:
-     """
-     Decide focus based on pupil offset from image center.
-     """
-     h, w = img_shape
-     cx, cy = w // 2, h // 2
-     px, py = pupil_center
-     dx = abs(px - cx) / max(w, 1)
-     dy = abs(py - cy) / max(h, 1)
-     return (dx < 0.12) and (dy < 0.12)
-
-
- def annotate(img_bgr: np.ndarray, ellipse, pupil_center, focused: bool, cls_label: str, conf: float):
-     out = img_bgr.copy()
-     if ellipse is not None:
-         cv2.ellipse(out, ellipse, (0, 255, 255), 2)
-     if pupil_center is not None:
-         cv2.circle(out, pupil_center, 4, (0, 0, 255), -1)
-     label = f"{cls_label} ({conf:.2f}) | focused={int(focused)}"
-     cv2.putText(out, label, (8, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
-     return out
-
-
- def main():
-     project_root = Path(__file__).resolve().parent.parent
-     data_dir = project_root / "Dataset"
-     alt_data_dir = project_root / "DATA"
-     out_dir = project_root / "runs_focus"
-     out_dir.mkdir(parents=True, exist_ok=True)
-
-     weights = find_weights(project_root)
-     if weights is None:
-         print("Weights not found. Train first.")
-         return
-
-     # Support both Dataset/test/{open,closed} and Dataset/{open,closed}
-     def resolve_test_dirs(root: Path):
-         test_open = root / "test" / "open"
-         test_closed = root / "test" / "closed"
-         if test_open.exists() and test_closed.exists():
-             return test_open, test_closed
-         test_open = root / "open"
-         test_closed = root / "closed"
-         if test_open.exists() and test_closed.exists():
-             return test_open, test_closed
-         alt_closed = root / "close"
-         if test_open.exists() and alt_closed.exists():
-             return test_open, alt_closed
-         return None, None
-
-     test_open, test_closed = resolve_test_dirs(data_dir)
-     if (test_open is None or test_closed is None) and alt_data_dir.exists():
-         test_open, test_closed = resolve_test_dirs(alt_data_dir)
-
-     # resolve_test_dirs returns None when nothing matched, so guard before .exists()
-     if test_open is None or test_closed is None:
-         print("Test folders missing. Expected:")
-         print(data_dir / "test" / "open")
-         print(data_dir / "test" / "closed")
-         return
-
-     test_files = list_images(test_open) + list_images(test_closed)
-     print("Total test images:", len(test_files))
-     max_images = int(os.getenv("MAX_IMAGES", "0"))
-     if max_images > 0:
-         test_files = test_files[:max_images]
-         print("Limiting to MAX_IMAGES:", max_images)
-
-     model = YOLO(str(weights))
-     results = model.predict(test_files, imgsz=224, device="cpu", verbose=False)
-
-     names = model.names
-     for r in results:
-         probs = r.probs
-         top_idx = int(probs.top1)
-         top_conf = float(probs.top1conf)
-         pred_label = names[top_idx]
-
-         img = cv2.imread(r.path)
-         if img is None:
-             continue
-         gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
-
-         ellipse = detect_eyelid_boundary(gray)
-         pupil_center = detect_pupil_center(gray)
-         focused = False
-         if pred_label.lower() == "open" and pupil_center is not None:
-             focused = is_focused(pupil_center, gray.shape)
-
-         annotated = annotate(img, ellipse, pupil_center, focused, pred_label, top_conf)
-         out_path = out_dir / (Path(r.path).stem + "_annotated.jpg")
-         cv2.imwrite(str(out_path), annotated)
-
-         print(f"{Path(r.path).name}: pred={pred_label} conf={top_conf:.3f} focused={focused}")
-
-     print(f"\nAnnotated outputs saved to: {out_dir}")
-
-
- if __name__ == "__main__":
-     main()
models/cnn/CNN_MODEL/scripts/predict_image.py DELETED
@@ -1,49 +0,0 @@
- """Run the eye open/closed model on one or more images."""
- import sys
- from pathlib import Path
-
- from ultralytics import YOLO
-
-
- def main():
-     project_root = Path(__file__).resolve().parent.parent
-     weight_candidates = [
-         project_root / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "last.pt",
-     ]
-     weights = next((p for p in weight_candidates if p.is_file()), None)
-     if weights is None:
-         print("Weights not found. Put best.pt in weights/ or runs/.../weights/ (from model team).")
-         sys.exit(1)
-
-     if len(sys.argv) < 2:
-         print("Usage: python scripts/predict_image.py <image1> [image2 ...]")
-         print("Example: python scripts/predict_image.py path/to/image.png")
-         sys.exit(0)
-
-     model = YOLO(str(weights))
-     names = model.names
-
-     for path in sys.argv[1:]:
-         p = Path(path)
-         if not p.is_file():
-             print(p, "- file not found")
-             continue
-         try:
-             results = model.predict(str(p), imgsz=224, device="cpu", verbose=False)
-         except Exception as e:
-             print(p, "- error:", e)
-             continue
-         if not results:
-             print(p, "- no result")
-             continue
-         r = results[0]
-         top_idx = int(r.probs.top1)
-         conf = float(r.probs.top1conf)
-         label = names[top_idx]
-         print(f"{p.name}: {label} ({conf:.2%})")
-
-
- if __name__ == "__main__":
-     main()
models/cnn/CNN_MODEL/scripts/video_infer.py DELETED
@@ -1,281 +0,0 @@
- from __future__ import annotations
-
- import os
- from pathlib import Path
-
- import cv2
- import numpy as np
- from ultralytics import YOLO
-
- try:
-     import mediapipe as mp
- except Exception:  # pragma: no cover
-     mp = None
-
-
- def find_weights(project_root: Path) -> Path | None:
-     candidates = [
-         project_root / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "last.pt",
-         project_root / "runs_cls" / "eye_open_closed_cpu" / "weights" / "best.pt",
-         project_root / "runs_cls" / "eye_open_closed_cpu" / "weights" / "last.pt",
-     ]
-     return next((p for p in candidates if p.is_file()), None)
-
-
- def detect_pupil_center(gray: np.ndarray) -> tuple[int, int] | None:
-     h, w = gray.shape
-     clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
-     eq = clahe.apply(gray)
-     blur = cv2.GaussianBlur(eq, (7, 7), 0)
-
-     cx, cy = w // 2, h // 2
-     rx, ry = int(w * 0.3), int(h * 0.3)
-     x0, x1 = max(cx - rx, 0), min(cx + rx, w)
-     y0, y1 = max(cy - ry, 0), min(cy + ry, h)
-     roi = blur[y0:y1, x0:x1]
-
-     _, thresh = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
-     thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8), iterations=2)
-     thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8), iterations=1)
-
-     contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
-     if not contours:
-         return None
-
-     best = None
-     best_score = -1.0
-     for c in contours:
-         area = cv2.contourArea(c)
-         if area < 15:
-             continue
-         perimeter = cv2.arcLength(c, True)
-         if perimeter <= 0:
-             continue
-         circularity = 4 * np.pi * (area / (perimeter * perimeter))
-         if circularity < 0.3:
-             continue
-         m = cv2.moments(c)
-         if m["m00"] == 0:
-             continue
-         px = int(m["m10"] / m["m00"]) + x0
-         py = int(m["m01"] / m["m00"]) + y0
-
-         dist = np.hypot(px - cx, py - cy) / max(w, h)
-         score = circularity - dist
-         if score > best_score:
-             best_score = score
-             best = (px, py)
-
-     return best
-
-
- def is_focused(pupil_center: tuple[int, int], img_shape: tuple[int, int]) -> bool:
-     h, w = img_shape
-     cx = w // 2
-     px, _ = pupil_center
-     dx = abs(px - cx) / max(w, 1)
-     return dx < 0.12
-
-
- def classify_frame(model: YOLO, frame: np.ndarray) -> tuple[str, float]:
-     # Use classifier directly on frame (assumes frame is eye crop)
-     results = model.predict(frame, imgsz=224, device="cpu", verbose=False)
-     r = results[0]
-     probs = r.probs
-     top_idx = int(probs.top1)
-     top_conf = float(probs.top1conf)
-     pred_label = model.names[top_idx]
-     return pred_label, top_conf
-
-
- def annotate_frame(frame: np.ndarray, label: str, focused: bool, conf: float, time_sec: float):
-     out = frame.copy()
-     text = f"{label} | focused={int(focused)} | conf={conf:.2f} | t={time_sec:.2f}s"
-     cv2.putText(out, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
-     return out
-
-
- def write_segments(path: Path, segments: list[tuple[float, float, str]]):
-     with path.open("w") as f:
-         for start, end, label in segments:
-             f.write(f"{start:.2f},{end:.2f},{label}\n")
-
-
- def process_video(video_path: Path, model: YOLO | None):
-     cap = cv2.VideoCapture(str(video_path))
-     if not cap.isOpened():
-         print(f"Failed to open {video_path}")
-         return
-
-     fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
-     width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
-     height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
-
-     out_path = video_path.with_name(video_path.stem + "_pred.mp4")
-     fourcc = cv2.VideoWriter_fourcc(*"mp4v")
-     writer = cv2.VideoWriter(str(out_path), fourcc, fps, (width, height))
-
-     csv_path = video_path.with_name(video_path.stem + "_predictions.csv")
-     seg_path = video_path.with_name(video_path.stem + "_segments.txt")
-
-     frame_idx = 0
-     last_label = None
-     seg_start = 0.0
-     segments: list[tuple[float, float, str]] = []
-
-     with csv_path.open("w") as fcsv:
-         fcsv.write("time_sec,label,focused,conf\n")
-         if mp is None:
-             print("mediapipe is not installed. Falling back to classifier-only mode.")
-         use_mp = mp is not None
-         if use_mp:
-             mp_face_mesh = mp.solutions.face_mesh
-             face_mesh = mp_face_mesh.FaceMesh(
-                 static_image_mode=False,
-                 max_num_faces=1,
-                 refine_landmarks=True,
-                 min_detection_confidence=0.5,
-                 min_tracking_confidence=0.5,
-             )
-
-         while True:
-             ret, frame = cap.read()
-             if not ret:
-                 break
-             time_sec = frame_idx / fps
-             conf = 0.0
-             pred_label = "open"
-             focused = False
-
-             if use_mp:
-                 rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
-                 res = face_mesh.process(rgb)
-                 if res.multi_face_landmarks:
-                     lm = res.multi_face_landmarks[0].landmark
-                     h, w = frame.shape[:2]
-
-                     # Eye landmarks (MediaPipe FaceMesh)
-                     left_eye = [33, 160, 158, 133, 153, 144]
-                     right_eye = [362, 385, 387, 263, 373, 380]
-                     left_iris = [468, 469, 470, 471]
-                     right_iris = [473, 474, 475, 476]
-
-                     def pts(idxs):
-                         return np.array([(int(lm[i].x * w), int(lm[i].y * h)) for i in idxs])
-
-                     def ear(eye_pts):
-                         # EAR using 6 points
-                         p1, p2, p3, p4, p5, p6 = eye_pts
-                         v1 = np.linalg.norm(p2 - p6)
-                         v2 = np.linalg.norm(p3 - p5)
-                         h1 = np.linalg.norm(p1 - p4)
-                         return (v1 + v2) / (2.0 * h1 + 1e-6)
-
-                     le = pts(left_eye)
-                     re = pts(right_eye)
-                     le_ear = ear(le)
-                     re_ear = ear(re)
-                     ear_avg = (le_ear + re_ear) / 2.0
-
-                     # openness threshold
-                     pred_label = "open" if ear_avg > 0.22 else "closed"
-
-                     # iris centers
-                     li = pts(left_iris)
-                     ri = pts(right_iris)
-                     li_c = li.mean(axis=0).astype(int)
-                     ri_c = ri.mean(axis=0).astype(int)
-
-                     # eye centers (midpoint of corners)
-                     le_c = ((le[0] + le[3]) / 2).astype(int)
-                     re_c = ((re[0] + re[3]) / 2).astype(int)
-
-                     # focus = iris close to eye center horizontally for both eyes
-                     le_dx = abs(li_c[0] - le_c[0]) / max(np.linalg.norm(le[0] - le[3]), 1)
-                     re_dx = abs(ri_c[0] - re_c[0]) / max(np.linalg.norm(re[0] - re[3]), 1)
-                     focused = (pred_label == "open") and (le_dx < 0.18) and (re_dx < 0.18)
-
-                     # draw eye boundaries
-                     cv2.polylines(frame, [le], True, (0, 255, 255), 1)
-                     cv2.polylines(frame, [re], True, (0, 255, 255), 1)
-                     # draw iris centers
-                     cv2.circle(frame, tuple(li_c), 3, (0, 0, 255), -1)
-                     cv2.circle(frame, tuple(ri_c), 3, (0, 0, 255), -1)
-                 else:
-                     pred_label = "closed"
-                     focused = False
-             else:
-                 if model is not None:
-                     pred_label, conf = classify_frame(model, frame)
-                 gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
-                 pupil_center = detect_pupil_center(gray) if pred_label.lower() == "open" else None
-                 focused = False
-                 if pred_label.lower() == "open" and pupil_center is not None:
-                     focused = is_focused(pupil_center, gray.shape)
-
-             if pred_label.lower() != "open":
-                 focused = False
-
-             label = "open_focused" if (pred_label.lower() == "open" and focused) else "open_not_focused"
-             if pred_label.lower() != "open":
-                 label = "closed_not_focused"
-
-             fcsv.write(f"{time_sec:.2f},{label},{int(focused)},{conf:.4f}\n")
-
-             if last_label is None:
-                 last_label = label
-                 seg_start = time_sec
-             elif label != last_label:
-                 segments.append((seg_start, time_sec, last_label))
-                 seg_start = time_sec
-                 last_label = label
-
-             annotated = annotate_frame(frame, label, focused, conf, time_sec)
-             writer.write(annotated)
-             frame_idx += 1
-
-     if last_label is not None:
-         end_time = frame_idx / fps
-         segments.append((seg_start, end_time, last_label))
-     write_segments(seg_path, segments)
-
-     cap.release()
-     writer.release()
-     print(f"Saved: {out_path}")
-     print(f"CSV: {csv_path}")
-     print(f"Segments: {seg_path}")
-
-
- def main():
-     project_root = Path(__file__).resolve().parent.parent
-     weights = find_weights(project_root)
-     model = YOLO(str(weights)) if weights is not None else None
-
-     # Default to 1.mp4 and 2.mp4 in project root
-     videos = []
-     for name in ["1.mp4", "2.mp4"]:
-         p = project_root / name
-         if p.exists():
-             videos.append(p)
-
-     # Also allow passing paths via env var
-     extra = os.getenv("VIDEOS", "")
-     for v in [x.strip() for x in extra.split(",") if x.strip()]:
-         vp = Path(v)
-         if not vp.is_absolute():
-             vp = project_root / vp
-         if vp.exists():
-             videos.append(vp)
-
-     if not videos:
-         print("No videos found. Expected 1.mp4 / 2.mp4 in project root.")
-         return
-
-     for v in videos:
-         process_video(v, model)
-
-
- if __name__ == "__main__":
-     main()
models/cnn/CNN_MODEL/scripts/webcam_live.py DELETED
@@ -1,184 +0,0 @@
- """
- Live webcam: detect face, crop each eye, run open/closed classifier, show on screen.
- Requires: opencv-python, ultralytics, mediapipe (pip install mediapipe).
- Press 'q' to quit.
- """
- import urllib.request
- from pathlib import Path
-
- import cv2
- import numpy as np
- from ultralytics import YOLO
-
- try:
-     import mediapipe as mp
-     _mp_has_solutions = hasattr(mp, "solutions")
- except ImportError:
-     mp = None
-     _mp_has_solutions = False
-
- # New MediaPipe Tasks API (Face Landmarker) eye indices
- LEFT_EYE_INDICES_NEW = [263, 249, 390, 373, 374, 380, 381, 382, 362, 466, 388, 387, 386, 385, 384, 398]
- RIGHT_EYE_INDICES_NEW = [33, 7, 163, 144, 145, 153, 154, 155, 133, 246, 161, 160, 159, 158, 157, 173]
- # Old Face Mesh (solutions) indices
- LEFT_EYE_INDICES_OLD = [33, 160, 158, 133, 153, 144]
- RIGHT_EYE_INDICES_OLD = [362, 385, 387, 263, 373, 380]
- EYE_PADDING = 0.35
-
-
- def find_weights(project_root: Path) -> Path | None:
-     candidates = [
-         project_root / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "best.pt",
-         project_root / "runs" / "classify" / "runs_cls" / "eye_open_closed_cpu" / "weights" / "last.pt",
-     ]
-     return next((p for p in candidates if p.is_file()), None)
-
-
- def get_eye_roi(frame: np.ndarray, landmarks, indices: list[int]) -> np.ndarray | None:
-     h, w = frame.shape[:2]
-     pts = np.array([(int(landmarks[i].x * w), int(landmarks[i].y * h)) for i in indices])
-     x_min, y_min = pts.min(axis=0)
-     x_max, y_max = pts.max(axis=0)
-     dx = max(int((x_max - x_min) * EYE_PADDING), 8)
-     dy = max(int((y_max - y_min) * EYE_PADDING), 8)
-     x0 = max(0, x_min - dx)
-     y0 = max(0, y_min - dy)
-     x1 = min(w, x_max + dx)
-     y1 = min(h, y_max + dy)
-     if x1 <= x0 or y1 <= y0:
-         return None
-     return frame[y0:y1, x0:x1].copy()
-
-
- def _run_with_solutions(mp, model, cap):
-     face_mesh = mp.solutions.face_mesh.FaceMesh(
-         static_image_mode=False,
-         max_num_faces=1,
-         refine_landmarks=True,
-         min_detection_confidence=0.5,
-         min_tracking_confidence=0.5,
-     )
-     while True:
-         ret, frame = cap.read()
-         if not ret:
-             break
-         rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
-         results = face_mesh.process(rgb)
-         left_label, left_conf = "—", 0.0
-         right_label, right_conf = "—", 0.0
-         if results.multi_face_landmarks:
-             lm = results.multi_face_landmarks[0].landmark
-             for roi, indices, side in [
-                 (get_eye_roi(frame, lm, LEFT_EYE_INDICES_OLD), LEFT_EYE_INDICES_OLD, "left"),
-                 (get_eye_roi(frame, lm, RIGHT_EYE_INDICES_OLD), RIGHT_EYE_INDICES_OLD, "right"),
-             ]:
-                 if roi is not None and roi.size > 0:
-                     try:
-                         pred = model.predict(roi, imgsz=224, device="cpu", verbose=False)
-                         if pred:
-                             r = pred[0]
-                             label = model.names[int(r.probs.top1)]
-                             conf = float(r.probs.top1conf)
-                             if side == "left":
-                                 left_label, left_conf = label, conf
-                             else:
-                                 right_label, right_conf = label, conf
-                     except Exception:
-                         pass
-         cv2.putText(frame, f"L: {left_label} ({left_conf:.0%})", (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
-         cv2.putText(frame, f"R: {right_label} ({right_conf:.0%})", (20, 80), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
-         cv2.imshow("Eye open/closed (q to quit)", frame)
-         if cv2.waitKey(1) & 0xFF == ord("q"):
-             break
-
-
- def _run_with_tasks(project_root: Path, model, cap):
-     from mediapipe.tasks.python import BaseOptions
-     from mediapipe.tasks.python.vision import FaceLandmarker, FaceLandmarkerOptions
-     from mediapipe.tasks.python.vision.core import vision_task_running_mode as running_mode
-     from mediapipe.tasks.python.vision.core import image as image_lib
-
-     model_path = project_root / "weights" / "face_landmarker.task"
-     if not model_path.is_file():
-         print("Downloading face_landmarker.task ...")
-         url = "https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task"
-         urllib.request.urlretrieve(url, model_path)
-         print("Done.")
-
-     options = FaceLandmarkerOptions(
-         base_options=BaseOptions(model_asset_path=str(model_path)),
-         running_mode=running_mode.VisionTaskRunningMode.IMAGE,
-         num_faces=1,
-     )
-     face_landmarker = FaceLandmarker.create_from_options(options)
-     ImageFormat = image_lib.ImageFormat
-
-     while True:
-         ret, frame = cap.read()
-         if not ret:
-             break
-         left_label, left_conf = "—", 0.0
-         right_label, right_conf = "—", 0.0
-
-         rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
-         rgb_contiguous = np.ascontiguousarray(rgb)
-         mp_image = image_lib.Image(ImageFormat.SRGB, rgb_contiguous)
-         result = face_landmarker.detect(mp_image)
-
-         if result.face_landmarks:
-             lm = result.face_landmarks[0]
-             for roi, side in [
-                 (get_eye_roi(frame, lm, LEFT_EYE_INDICES_NEW), "left"),
-                 (get_eye_roi(frame, lm, RIGHT_EYE_INDICES_NEW), "right"),
134
- ]:
135
- if roi is not None and roi.size > 0:
136
- try:
137
- pred = model.predict(roi, imgsz=224, device="cpu", verbose=False)
138
- if pred:
139
- r = pred[0]
140
- label = model.names[int(r.probs.top1)]
141
- conf = float(r.probs.top1conf)
142
- if side == "left":
143
- left_label, left_conf = label, conf
144
- else:
145
- right_label, right_conf = label, conf
146
- except Exception:
147
- pass
148
-
149
- cv2.putText(frame, f"L: {left_label} ({left_conf:.0%})", (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
150
- cv2.putText(frame, f"R: {right_label} ({right_conf:.0%})", (20, 80), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
151
- cv2.imshow("Eye open/closed (q to quit)", frame)
152
- if cv2.waitKey(1) & 0xFF == ord("q"):
153
- break
154
-
155
-
156
- def main():
157
- project_root = Path(__file__).resolve().parent.parent
158
- weights = find_weights(project_root)
159
- if weights is None:
160
- print("Weights not found. Put best.pt in weights/ or runs/.../weights/ (from model team).")
161
- return
162
- if mp is None:
163
- print("MediaPipe required. Install: pip install mediapipe")
164
- return
165
-
166
- model = YOLO(str(weights))
167
- cap = cv2.VideoCapture(0)
168
- if not cap.isOpened():
169
- print("Could not open webcam.")
170
- return
171
-
172
- print("Live eye open/closed on your face. Press 'q' to quit.")
173
- try:
174
- if _mp_has_solutions:
175
- _run_with_solutions(mp, model, cap)
176
- else:
177
- _run_with_tasks(project_root, model, cap)
178
- finally:
179
- cap.release()
180
- cv2.destroyAllWindows()
181
-
182
-
183
- if __name__ == "__main__":
184
- main()
 
models/cnn/CNN_MODEL/weights/yolo11s-cls.pt DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:e2b605d1c8c212b434a75a32759a6f7adf1d2b29c35f76bdccd4c794cb653cf2
3
- size 13630112
 
models/cnn/__init__.py DELETED
File without changes
models/cnn/eye_attention/__init__.py DELETED
@@ -1 +0,0 @@
1
-
 
 
models/cnn/eye_attention/classifier.py DELETED
@@ -1,169 +0,0 @@
1
- from __future__ import annotations
2
-
3
- import os
4
- from abc import ABC, abstractmethod
5
-
6
- import numpy as np
7
-
8
-
9
- class EyeClassifier(ABC):
10
- @property
11
- @abstractmethod
12
- def name(self) -> str:
13
- pass
14
-
15
- @abstractmethod
16
- def predict_score(self, crops_bgr: list[np.ndarray]) -> float:
17
- pass
18
-
19
-
20
- class GeometricOnlyClassifier(EyeClassifier):
21
- @property
22
- def name(self) -> str:
23
- return "geometric"
24
-
25
- def predict_score(self, crops_bgr: list[np.ndarray]) -> float:
26
- return 1.0
27
-
28
-
29
- class YOLOv11Classifier(EyeClassifier):
30
- def __init__(self, checkpoint_path: str, device: str = "cpu"):
31
- from ultralytics import YOLO
32
-
33
- self._model = YOLO(checkpoint_path)
34
- self._device = device
35
-
36
- names = self._model.names
37
- self._attentive_idx = None
38
- for idx, cls_name in names.items():
39
- if cls_name in ("open", "attentive"):
40
- self._attentive_idx = idx
41
- break
42
- if self._attentive_idx is None:
43
- self._attentive_idx = max(names.keys())
44
- print(f"[YOLO] Classes: {names}, attentive_idx={self._attentive_idx}")
45
-
46
- @property
47
- def name(self) -> str:
48
- return "yolo"
49
-
50
- def predict_score(self, crops_bgr: list[np.ndarray]) -> float:
51
- if not crops_bgr:
52
- return 1.0
53
- results = self._model.predict(crops_bgr, device=self._device, verbose=False)
54
- scores = [float(r.probs.data[self._attentive_idx]) for r in results]
55
- return sum(scores) / len(scores) if scores else 1.0
56
-
57
-
58
- class EyeCNNClassifier(EyeClassifier):
59
- """Loader for the custom PyTorch EyeCNN (trained on Kaggle eye crops)."""
60
-
61
- def __init__(self, checkpoint_path: str, device: str = "cpu"):
62
- import torch
63
- import torch.nn as nn
64
-
65
- class EyeCNN(nn.Module):
66
- def __init__(self, num_classes=2, dropout_rate=0.3):
67
- super().__init__()
68
- self.conv_layers = nn.Sequential(
69
- nn.Conv2d(3, 32, 3, 1, 1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2, 2),
70
- nn.Conv2d(32, 64, 3, 1, 1), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2, 2),
71
- nn.Conv2d(64, 128, 3, 1, 1), nn.BatchNorm2d(128), nn.ReLU(), nn.MaxPool2d(2, 2),
72
- nn.Conv2d(128, 256, 3, 1, 1), nn.BatchNorm2d(256), nn.ReLU(), nn.MaxPool2d(2, 2),
73
- )
74
- self.fc_layers = nn.Sequential(
75
- nn.AdaptiveAvgPool2d((1, 1)), nn.Flatten(),
76
- nn.Linear(256, 512), nn.ReLU(), nn.Dropout(dropout_rate),
77
- nn.Linear(512, num_classes),
78
- )
79
-
80
- def forward(self, x):
81
- return self.fc_layers(self.conv_layers(x))
82
-
83
- self._device = torch.device(device)
84
- checkpoint = torch.load(checkpoint_path, map_location=self._device, weights_only=False)
85
- dropout_rate = checkpoint.get("config", {}).get("dropout_rate", 0.35)
86
- self._model = EyeCNN(num_classes=2, dropout_rate=dropout_rate)
87
- self._model.load_state_dict(checkpoint["model_state_dict"])
88
- self._model.to(self._device)
89
- self._model.eval()
90
-
91
- self._transform = None # built lazily
92
-
93
- def _get_transform(self):
94
- if self._transform is None:
95
- from torchvision import transforms
96
- self._transform = transforms.Compose([
97
- transforms.ToPILImage(),
98
- transforms.Resize((96, 96)),
99
- transforms.ToTensor(),
100
- transforms.Normalize(
101
- mean=[0.485, 0.456, 0.406],
102
- std=[0.229, 0.224, 0.225],
103
- ),
104
- ])
105
- return self._transform
106
-
107
- @property
108
- def name(self) -> str:
109
- return "eye_cnn"
110
-
111
- def predict_score(self, crops_bgr: list[np.ndarray]) -> float:
112
- if not crops_bgr:
113
- return 1.0
114
-
115
- import torch
116
- import cv2
117
-
118
- transform = self._get_transform()
119
- scores = []
120
- for crop in crops_bgr:
121
- if crop is None or crop.size == 0:
122
- scores.append(1.0)
123
- continue
124
- rgb = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB)
125
- tensor = transform(rgb).unsqueeze(0).to(self._device)
126
- with torch.no_grad():
127
- output = self._model(tensor)
128
- prob = torch.softmax(output, dim=1)[0, 1].item() # prob of "open"
129
- scores.append(prob)
130
- return sum(scores) / len(scores)
131
-
132
-
133
- _EXT_TO_BACKEND = {".pth": "cnn", ".pt": "yolo"}
134
-
135
-
136
- def load_eye_classifier(
137
- path: str | None = None,
138
- backend: str = "yolo",
139
- device: str = "cpu",
140
- ) -> EyeClassifier:
141
- if backend == "geometric":
142
- return GeometricOnlyClassifier()
143
-
144
- if path is None:
145
- print(f"[CLASSIFIER] No model path for backend {backend!r}, falling back to geometric")
146
- return GeometricOnlyClassifier()
147
-
148
- ext = os.path.splitext(path)[1].lower()
149
- inferred = _EXT_TO_BACKEND.get(ext)
150
- if inferred and inferred != backend:
151
- print(f"[CLASSIFIER] File extension {ext!r} implies backend {inferred!r}, "
152
- f"overriding requested {backend!r}")
153
- backend = inferred
154
-
155
- print(f"[CLASSIFIER] backend={backend!r}, path={path!r}")
156
-
157
- if backend == "cnn":
158
- return EyeCNNClassifier(path, device=device)
159
-
160
- if backend == "yolo":
161
- try:
162
- return YOLOv11Classifier(path, device=device)
163
- except ImportError:
164
- print("[CLASSIFIER] ultralytics required for YOLO. pip install ultralytics")
165
- raise
166
-
167
- raise ValueError(
168
- f"Unknown eye backend {backend!r}. Choose from: yolo, cnn, geometric"
169
- )
 
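With the CNN classifiers deleted above, the pipeline's `s_eye` signal comes from the geometric `EyeBehaviourScorer` alone. That scorer's implementation is not part of this diff; purely as an illustration of the geometric approach (and of the `ear_left`/`ear_right`/`ear_avg` features used elsewhere in this commit), the standard eye-aspect-ratio (EAR) over six eye landmarks can be sketched as:

```python
import numpy as np

def eye_aspect_ratio(pts: np.ndarray) -> float:
    """EAR over six (x, y) eye landmarks ordered
    [outer, upper-1, upper-2, inner, lower-2, lower-1].
    Illustrative only; the project's EyeBehaviourScorer may differ."""
    v1 = np.linalg.norm(pts[1] - pts[5])  # first vertical lid distance
    v2 = np.linalg.norm(pts[2] - pts[4])  # second vertical lid distance
    h = np.linalg.norm(pts[0] - pts[3])   # horizontal eye width
    return (v1 + v2) / (2.0 * h)

# Toy open-eye landmarks: lids 2 units apart, corners 6 units apart.
open_eye = np.array([[0, 0], [2, 1], [4, 1], [6, 0], [4, -1], [2, -1]], float)
print(round(eye_aspect_ratio(open_eye), 3))  # → 0.333
```

A closed eye collapses the two vertical distances toward zero, so EAR drops sharply, which is why a simple threshold on it works as an open/closed proxy.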
models/cnn/eye_attention/crop.py DELETED
@@ -1,70 +0,0 @@
1
- import cv2
2
- import numpy as np
3
-
4
- from models.pretrained.face_mesh.face_mesh import FaceMeshDetector
5
-
6
- LEFT_EYE_CONTOUR = FaceMeshDetector.LEFT_EYE_INDICES
7
- RIGHT_EYE_CONTOUR = FaceMeshDetector.RIGHT_EYE_INDICES
8
-
9
- IMAGENET_MEAN = (0.485, 0.456, 0.406)
10
- IMAGENET_STD = (0.229, 0.224, 0.225)
11
-
12
- CROP_SIZE = 96
13
-
14
-
15
- def _bbox_from_landmarks(
16
- landmarks: np.ndarray,
17
- indices: list[int],
18
- frame_w: int,
19
- frame_h: int,
20
- expand: float = 0.4,
21
- ) -> tuple[int, int, int, int]:
22
- pts = landmarks[indices, :2]
23
- px = pts[:, 0] * frame_w
24
- py = pts[:, 1] * frame_h
25
-
26
- x_min, x_max = px.min(), px.max()
27
- y_min, y_max = py.min(), py.max()
28
- w = x_max - x_min
29
- h = y_max - y_min
30
- cx = (x_min + x_max) / 2
31
- cy = (y_min + y_max) / 2
32
-
33
- size = max(w, h) * (1 + expand)
34
- half = size / 2
35
-
36
- x1 = int(max(cx - half, 0))
37
- y1 = int(max(cy - half, 0))
38
- x2 = int(min(cx + half, frame_w))
39
- y2 = int(min(cy + half, frame_h))
40
-
41
- return x1, y1, x2, y2
42
-
43
-
44
- def extract_eye_crops(
45
- frame: np.ndarray,
46
- landmarks: np.ndarray,
47
- expand: float = 0.4,
48
- crop_size: int = CROP_SIZE,
49
- ) -> tuple[np.ndarray, np.ndarray, tuple, tuple]:
50
- h, w = frame.shape[:2]
51
-
52
- left_bbox = _bbox_from_landmarks(landmarks, LEFT_EYE_CONTOUR, w, h, expand)
53
- right_bbox = _bbox_from_landmarks(landmarks, RIGHT_EYE_CONTOUR, w, h, expand)
54
-
55
- left_crop = frame[left_bbox[1] : left_bbox[3], left_bbox[0] : left_bbox[2]]
56
- right_crop = frame[right_bbox[1] : right_bbox[3], right_bbox[0] : right_bbox[2]]
57
-
58
- left_crop = cv2.resize(left_crop, (crop_size, crop_size), interpolation=cv2.INTER_AREA)
59
- right_crop = cv2.resize(right_crop, (crop_size, crop_size), interpolation=cv2.INTER_AREA)
60
-
61
- return left_crop, right_crop, left_bbox, right_bbox
62
-
63
-
64
- def crop_to_tensor(crop_bgr: np.ndarray):
65
- import torch
66
-
67
- rgb = cv2.cvtColor(crop_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
68
- for c in range(3):
69
- rgb[:, :, c] = (rgb[:, :, c] - IMAGENET_MEAN[c]) / IMAGENET_STD[c]
70
- return torch.from_numpy(rgb.transpose(2, 0, 1))
 
models/cnn/eye_attention/train.py DELETED
File without changes
models/cnn/notebooks/EyeCNN.ipynb DELETED
@@ -1,107 +0,0 @@
1
- {
2
- "nbformat": 4,
3
- "nbformat_minor": 0,
4
- "metadata": {
5
- "colab": {
6
- "provenance": [],
7
- "gpuType": "T4"
8
- },
9
- "kernelspec": {
10
- "name": "python3",
11
- "display_name": "Python 3"
12
- },
13
- "language_info": {
14
- "name": "python"
15
- },
16
- "accelerator": "GPU"
17
- },
18
- "cells": [
19
- {
20
- "cell_type": "code",
21
- "source": [
22
- "import os\n",
23
- "import torch\n",
24
- "import torch.nn as nn\n",
25
- "import torch.optim as optim\n",
26
- "from torch.utils.data import DataLoader\n",
27
- "from torchvision import datasets, transforms\n",
28
- "\n",
29
- "from google.colab import drive\n",
30
- "drive.mount('/content/drive')\n",
31
- "!cp -r /content/drive/MyDrive/Dataset_clean /content/\n",
32
- "\n",
33
- "#Verify structure\n",
34
- "for split in ['train', 'val', 'test']:\n",
35
- " path = f'/content/Dataset_clean/{split}'\n",
36
- " classes = os.listdir(path)\n",
37
- " total = sum(len(os.listdir(os.path.join(path, c))) for c in classes)\n",
38
- " print(f'{split}: {total} images | classes: {classes}')"
39
- ],
40
- "metadata": {
41
- "colab": {
42
- "base_uri": "https://localhost:8080/"
43
- },
44
- "id": "sE1F3em-V5go",
45
- "outputId": "2c73a9a6-a198-468c-a2cc-253b2de7cc3f"
46
- },
47
- "execution_count": null,
48
- "outputs": [
49
- {
50
- "output_type": "stream",
51
- "name": "stdout",
52
- "text": [
53
- "Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n"
54
- ]
55
- }
56
- ]
57
- },
58
- {
59
- "cell_type": "code",
60
- "execution_count": null,
61
- "metadata": {
62
- "id": "nG2bh66rQ56G"
63
- },
64
- "outputs": [],
65
- "source": [
66
- "class EyeCNN(nn.Module):\n",
67
- " def __init__(self, num_classes=2):\n",
68
- " super(EyeCNN, self).__init__()\n",
69
- " self.conv_layers = nn.Sequential(\n",
70
- " nn.Conv2d(3, 32, 3, 1, 1),\n",
71
- " nn.BatchNorm2d(32),\n",
72
- " nn.ReLU(),\n",
73
- " nn.MaxPool2d(2, 2),\n",
74
- "\n",
75
- " nn.Conv2d(32, 64, 3, 1, 1),\n",
76
- " nn.BatchNorm2d(64),\n",
77
- " nn.ReLU(),\n",
78
- " nn.MaxPool2d(2, 2),\n",
79
- "\n",
80
- " nn.Conv2d(64, 128, 3, 1, 1),\n",
81
- " nn.BatchNorm2d(128),\n",
82
- " nn.ReLU(),\n",
83
- " nn.MaxPool2d(2, 2),\n",
84
- "\n",
85
- " nn.Conv2d(128, 256, 3, 1, 1),\n",
86
- " nn.BatchNorm2d(256),\n",
87
- " nn.ReLU(),\n",
88
- " nn.MaxPool2d(2, 2)\n",
89
- " )\n",
90
- "\n",
91
- " self.fc_layers = nn.Sequential(\n",
92
- " nn.AdaptiveAvgPool2d((1, 1)),\n",
93
- " nn.Flatten(),\n",
94
- " nn.Linear(256, 512),\n",
95
- " nn.ReLU(),\n",
96
- " nn.Dropout(0.35),\n",
97
- " nn.Linear(512, num_classes)\n",
98
- " )\n",
99
- "\n",
100
- " def forward(self, x):\n",
101
- " x = self.conv_layers(x)\n",
102
- " x = self.fc_layers(x)\n",
103
- " return x"
104
- ]
105
- }
106
- ]
107
- }
 
models/cnn/notebooks/EyeCNN_Train_Evaluate_new.ipynb DELETED
The diff for this file is too large to render. See raw diff
 
models/cnn/notebooks/EyeCNN_Training_Evaluate.ipynb DELETED
The diff for this file is too large to render. See raw diff
 
models/cnn/notebooks/README.md DELETED
@@ -1 +0,0 @@
1
- # GAP Large Project
 
 
models/eye_classifier.py DELETED
@@ -1,69 +0,0 @@
1
- from __future__ import annotations
2
-
3
- from abc import ABC, abstractmethod
4
-
5
- import numpy as np
6
-
7
-
8
- class EyeClassifier(ABC):
9
- @property
10
- @abstractmethod
11
- def name(self) -> str:
12
- pass
13
-
14
- @abstractmethod
15
- def predict_score(self, crops_bgr: list[np.ndarray]) -> float:
16
- pass
17
-
18
-
19
- class GeometricOnlyClassifier(EyeClassifier):
20
- @property
21
- def name(self) -> str:
22
- return "geometric"
23
-
24
- def predict_score(self, crops_bgr: list[np.ndarray]) -> float:
25
- return 1.0
26
-
27
-
28
- class YOLOv11Classifier(EyeClassifier):
29
- def __init__(self, checkpoint_path: str, device: str = "cpu"):
30
- from ultralytics import YOLO
31
-
32
- self._model = YOLO(checkpoint_path)
33
- self._device = device
34
-
35
- names = self._model.names
36
- self._attentive_idx = None
37
- for idx, cls_name in names.items():
38
- if cls_name in ("open", "attentive"):
39
- self._attentive_idx = idx
40
- break
41
- if self._attentive_idx is None:
42
- self._attentive_idx = max(names.keys())
43
- print(f"[YOLO] Classes: {names}, attentive_idx={self._attentive_idx}")
44
-
45
- @property
46
- def name(self) -> str:
47
- return "yolo"
48
-
49
- def predict_score(self, crops_bgr: list[np.ndarray]) -> float:
50
- if not crops_bgr:
51
- return 1.0
52
- results = self._model.predict(crops_bgr, device=self._device, verbose=False)
53
- scores = [float(r.probs.data[self._attentive_idx]) for r in results]
54
- return sum(scores) / len(scores) if scores else 1.0
55
-
56
-
57
- def load_eye_classifier(
58
- path: str | None = None,
59
- backend: str = "yolo",
60
- device: str = "cpu",
61
- ) -> EyeClassifier:
62
- if path is None or backend == "geometric":
63
- return GeometricOnlyClassifier()
64
-
65
- try:
66
- return YOLOv11Classifier(path, device=device)
67
- except ImportError:
68
- print("[CLASSIFIER] ultralytics required for YOLO. pip install ultralytics")
69
- raise
 
models/eye_crop.py DELETED
@@ -1,77 +0,0 @@
1
- import cv2
2
- import numpy as np
3
-
4
- from models.face_mesh import FaceMeshDetector
5
-
6
- LEFT_EYE_CONTOUR = FaceMeshDetector.LEFT_EYE_INDICES
7
- RIGHT_EYE_CONTOUR = FaceMeshDetector.RIGHT_EYE_INDICES
8
-
9
- IMAGENET_MEAN = (0.485, 0.456, 0.406)
10
- IMAGENET_STD = (0.229, 0.224, 0.225)
11
-
12
- CROP_SIZE = 96
13
-
14
-
15
- def _bbox_from_landmarks(
16
- landmarks: np.ndarray,
17
- indices: list[int],
18
- frame_w: int,
19
- frame_h: int,
20
- expand: float = 0.4,
21
- ) -> tuple[int, int, int, int]:
22
- pts = landmarks[indices, :2]
23
- px = pts[:, 0] * frame_w
24
- py = pts[:, 1] * frame_h
25
-
26
- x_min, x_max = px.min(), px.max()
27
- y_min, y_max = py.min(), py.max()
28
- w = x_max - x_min
29
- h = y_max - y_min
30
- cx = (x_min + x_max) / 2
31
- cy = (y_min + y_max) / 2
32
-
33
- size = max(w, h) * (1 + expand)
34
- half = size / 2
35
-
36
- x1 = int(max(cx - half, 0))
37
- y1 = int(max(cy - half, 0))
38
- x2 = int(min(cx + half, frame_w))
39
- y2 = int(min(cy + half, frame_h))
40
-
41
- return x1, y1, x2, y2
42
-
43
-
44
- def extract_eye_crops(
45
- frame: np.ndarray,
46
- landmarks: np.ndarray,
47
- expand: float = 0.4,
48
- crop_size: int = CROP_SIZE,
49
- ) -> tuple[np.ndarray, np.ndarray, tuple, tuple]:
50
- h, w = frame.shape[:2]
51
-
52
- left_bbox = _bbox_from_landmarks(landmarks, LEFT_EYE_CONTOUR, w, h, expand)
53
- right_bbox = _bbox_from_landmarks(landmarks, RIGHT_EYE_CONTOUR, w, h, expand)
54
-
55
- left_crop = frame[left_bbox[1] : left_bbox[3], left_bbox[0] : left_bbox[2]]
56
- right_crop = frame[right_bbox[1] : right_bbox[3], right_bbox[0] : right_bbox[2]]
57
-
58
- if left_crop.size == 0:
59
- left_crop = np.zeros((crop_size, crop_size, 3), dtype=np.uint8)
60
- else:
61
- left_crop = cv2.resize(left_crop, (crop_size, crop_size), interpolation=cv2.INTER_AREA)
62
-
63
- if right_crop.size == 0:
64
- right_crop = np.zeros((crop_size, crop_size, 3), dtype=np.uint8)
65
- else:
66
- right_crop = cv2.resize(right_crop, (crop_size, crop_size), interpolation=cv2.INTER_AREA)
67
-
68
- return left_crop, right_crop, left_bbox, right_bbox
69
-
70
-
71
- def crop_to_tensor(crop_bgr: np.ndarray):
72
- import torch
73
-
74
- rgb = cv2.cvtColor(crop_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
75
- for c in range(3):
76
- rgb[:, :, c] = (rgb[:, :, c] - IMAGENET_MEAN[c]) / IMAGENET_STD[c]
77
- return torch.from_numpy(rgb.transpose(2, 0, 1))
 
models/xgboost/checkpoints/face_orientation_best.json DELETED
The diff for this file is too large to render. See raw diff
 
public/assets/111.jpg DELETED
Binary file (73.4 kB)
 
src/assets/react.svg DELETED
ui/live_demo.py CHANGED
@@ -130,9 +130,6 @@ def main():
130
  parser.add_argument("--camera", type=int, default=0)
131
  parser.add_argument("--mlp-dir", type=str, default=None)
132
  parser.add_argument("--max-angle", type=float, default=22.0)
133
- parser.add_argument("--eye-model", type=str, default=None)
134
- parser.add_argument("--eye-backend", type=str, default="yolo", choices=["yolo", "geometric"])
135
- parser.add_argument("--eye-blend", type=float, default=0.5)
136
  parser.add_argument("--xgb-path", type=str, default=None)
137
  parser.add_argument("--xgb", action="store_true", help="Start in XGBoost mode")
138
  args = parser.parse_args()
@@ -148,9 +145,6 @@ def main():
148
  # 1. Geometric
149
  pipelines[MODE_GEO] = FaceMeshPipeline(
150
  max_angle=args.max_angle,
151
- eye_model_path=args.eye_model,
152
- eye_backend=args.eye_backend,
153
- eye_blend=args.eye_blend,
154
  detector=detector,
155
  )
156
  available_modes.append(MODE_GEO)
@@ -174,9 +168,6 @@ def main():
174
  try:
175
  pipelines[MODE_HYBRID] = HybridFocusPipeline(
176
  model_dir=model_dir,
177
- eye_model_path=args.eye_model,
178
- eye_backend=args.eye_backend,
179
- eye_blend=args.eye_blend,
180
  max_angle=args.max_angle,
181
  detector=detector,
182
  )
@@ -235,11 +226,6 @@ def main():
235
 
236
  if hasattr(pipeline, "head_pose"):
237
  pipeline.head_pose.draw_axes(frame, lm)
238
- if result.get("left_bbox") and result.get("right_bbox"):
239
- lx1, ly1, lx2, ly2 = result["left_bbox"]
240
- rx1, ry1, rx2, ry2 = result["right_bbox"]
241
- cv2.rectangle(frame, (lx1, ly1), (lx2, ly2), YELLOW, 1)
242
- cv2.rectangle(frame, (rx1, ry1), (rx2, ry2), YELLOW, 1)
243
 
244
  # --- HUD ---
245
  status = "FOCUSED" if result["is_focused"] else "NOT FOCUSED"
 
ui/pipeline.py CHANGED
@@ -12,18 +12,19 @@ _PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
12
  if _PROJECT_ROOT not in sys.path:
13
  sys.path.insert(0, _PROJECT_ROOT)
14
 
 
15
  from models.face_mesh import FaceMeshDetector
16
  from models.head_pose import HeadPoseEstimator
17
  from models.eye_scorer import EyeBehaviourScorer, compute_mar, MAR_YAWN_THRESHOLD
18
- from models.eye_crop import extract_eye_crops
19
- from models.eye_classifier import load_eye_classifier, GeometricOnlyClassifier
20
  from models.collect_features import FEATURE_NAMES, TemporalTracker, extract_features
21
 
 
 
 
22
  _FEAT_IDX = {name: i for i, name in enumerate(FEATURE_NAMES)}
23
 
24
 
25
  def _clip_features(vec):
26
- """Clip raw features to the same ranges used during training."""
27
  out = vec.copy()
28
  _i = _FEAT_IDX
29
 
@@ -49,8 +50,6 @@ def _clip_features(vec):
49
 
50
 
51
  class _OutputSmoother:
52
- """EMA smoothing on focus score with no-face grace period."""
53
-
54
  def __init__(self, alpha: float = 0.3, grace_frames: int = 15):
55
  self._alpha = alpha
56
  self._grace = grace_frames
@@ -73,19 +72,17 @@ class _OutputSmoother:
73
 
74
 
75
  DEFAULT_HYBRID_CONFIG = {
76
- "w_mlp": 0.7,
77
- "w_geo": 0.3,
78
- "threshold": 0.55,
79
  "use_yawn_veto": True,
80
- "geo_face_weight": 0.4,
81
- "geo_eye_weight": 0.6,
82
  "mar_yawn_threshold": float(MAR_YAWN_THRESHOLD),
83
  }
84
 
85
 
86
  class _RuntimeFeatureEngine:
87
- """Runtime feature engineering (magnitudes, velocities, variances) with EMA baselines."""
88
-
89
  _MAG_FEATURES = ["pitch", "yaw", "head_deviation", "gaze_offset", "v_gaze", "h_gaze"]
90
  _VEL_FEATURES = ["pitch", "yaw", "h_gaze", "v_gaze", "head_deviation", "gaze_offset"]
91
  _VAR_FEATURES = ["h_gaze", "v_gaze", "pitch"]
@@ -171,12 +168,9 @@ class FaceMeshPipeline:
171
  def __init__(
172
  self,
173
  max_angle: float = 22.0,
174
- alpha: float = 0.4,
175
- beta: float = 0.6,
176
  threshold: float = 0.55,
177
- eye_model_path: str | None = None,
178
- eye_backend: str = "yolo",
179
- eye_blend: float = 0.5,
180
  detector=None,
181
  ):
182
  self.detector = detector or FaceMeshDetector()
@@ -186,16 +180,6 @@ class FaceMeshPipeline:
186
  self.alpha = alpha
187
  self.beta = beta
188
  self.threshold = threshold
189
- self.eye_blend = eye_blend
190
-
191
- self.eye_classifier = load_eye_classifier(
192
- path=eye_model_path if eye_model_path and os.path.exists(eye_model_path) else None,
193
- backend=eye_backend,
194
- device="cpu",
195
- )
196
- self._has_eye_model = not isinstance(self.eye_classifier, GeometricOnlyClassifier)
197
- if self._has_eye_model:
198
- print(f"[PIPELINE] Eye model: {self.eye_classifier.name}")
199
  self._smoother = _OutputSmoother()
200
 
201
  def process_frame(self, bgr_frame: np.ndarray) -> dict:
@@ -227,17 +211,7 @@ class FaceMeshPipeline:
227
  if angles is not None:
228
  out["yaw"], out["pitch"], out["roll"] = angles
229
  out["s_face"] = self.head_pose.score(landmarks, w, h)
230
-
231
- s_eye_geo = self.eye_scorer.score(landmarks)
232
- if self._has_eye_model:
233
- left_crop, right_crop, left_bbox, right_bbox = extract_eye_crops(bgr_frame, landmarks)
234
- out["left_bbox"] = left_bbox
235
- out["right_bbox"] = right_bbox
236
- s_eye_model = self.eye_classifier.predict_score([left_crop, right_crop])
237
- out["s_eye"] = (1.0 - self.eye_blend) * s_eye_geo + self.eye_blend * s_eye_model
238
- else:
239
- out["s_eye"] = s_eye_geo
240
-
241
  out["mar"] = compute_mar(landmarks)
242
  out["is_yawning"] = out["mar"] > MAR_YAWN_THRESHOLD
243
 
@@ -249,10 +223,6 @@ class FaceMeshPipeline:
249
 
250
  return out
251
 
252
- @property
253
- def has_eye_model(self) -> bool:
254
- return self._has_eye_model
255
-
256
  def reset_session(self):
257
  self._smoother.reset()
258
 
@@ -318,7 +288,7 @@ def _load_hybrid_config(model_dir: str, config_path: str | None = None):
318
 
319
 
320
  class MLPPipeline:
321
- def __init__(self, model_dir=None, detector=None, threshold=0.5):
322
  if model_dir is None:
323
  # Check primary location
324
  model_dir = os.path.join(_PROJECT_ROOT, "MLP", "models")
@@ -332,11 +302,7 @@ class MLPPipeline:
332
  self._scaler = joblib.load(scaler_path)
333
  meta = np.load(meta_path, allow_pickle=True)
334
  self._feature_names = list(meta["feature_names"])
335
-
336
- norm_feats = list(meta["norm_features"]) if "norm_features" in meta else []
337
- self._engine = _RuntimeFeatureEngine(FEATURE_NAMES, norm_features=norm_feats)
338
- ext_names = self._engine.extended_names
339
- self._indices = [ext_names.index(n) for n in self._feature_names]
340
 
341
  self._detector = detector or FaceMeshDetector()
342
  self._owns_detector = detector is None
@@ -378,8 +344,7 @@ class MLPPipeline:
378
  out["s_eye"] = float(vec[_FEAT_IDX["s_eye"]])
379
  out["mar"] = float(vec[_FEAT_IDX["mar"]])
380
 
381
- ext_vec = self._engine.transform(vec)
382
- X = ext_vec[self._indices].reshape(1, -1).astype(np.float64)
383
  X_sc = self._scaler.transform(X)
384
  if hasattr(self._mlp, "predict_proba"):
385
  mlp_prob = float(self._mlp.predict_proba(X_sc)[0, 1])
@@ -410,9 +375,6 @@ class HybridFocusPipeline:
410
  self,
411
  model_dir=None,
412
  config_path: str | None = None,
413
- eye_model_path: str | None = None,
414
- eye_backend: str = "yolo",
415
- eye_blend: float = 0.5,
416
  max_angle: float = 22.0,
417
  detector=None,
418
  ):
@@ -426,11 +388,7 @@ class HybridFocusPipeline:
426
  self._scaler = joblib.load(scaler_path)
427
  meta = np.load(meta_path, allow_pickle=True)
428
  self._feature_names = list(meta["feature_names"])
429
-
430
- norm_feats = list(meta["norm_features"]) if "norm_features" in meta else []
431
- self._engine = _RuntimeFeatureEngine(FEATURE_NAMES, norm_features=norm_feats)
432
- ext_names = self._engine.extended_names
433
- self._indices = [ext_names.index(n) for n in self._feature_names]
434
 
435
  self._cfg, self._cfg_path = _load_hybrid_config(model_dir=model_dir, config_path=config_path)
436
 
@@ -439,16 +397,6 @@ class HybridFocusPipeline:
439
  self._head_pose = HeadPoseEstimator(max_angle=max_angle)
440
  self._eye_scorer = EyeBehaviourScorer()
441
  self._temporal = TemporalTracker()
442
- self._eye_blend = eye_blend
443
- self.eye_classifier = load_eye_classifier(
444
- path=eye_model_path if eye_model_path and os.path.exists(eye_model_path) else None,
445
- backend=eye_backend,
446
- device="cpu",
447
- )
448
-        self._has_eye_model = not isinstance(self.eye_classifier, GeometricOnlyClassifier)
-        if self._has_eye_model:
-            print(f"[HYBRID] Eye model: {self.eye_classifier.name}")
-
         self.head_pose = self._head_pose
         self._smoother = _OutputSmoother()

@@ -458,10 +406,6 @@ class HybridFocusPipeline:
             f"threshold={self._cfg['threshold']:.2f}"
         )

-    @property
-    def has_eye_model(self) -> bool:
-        return self._has_eye_model
-
     @property
     def config(self) -> dict:
         return dict(self._cfg)
@@ -498,15 +442,8 @@ class HybridFocusPipeline:
         out["yaw"], out["pitch"], out["roll"] = angles

         out["s_face"] = self._head_pose.score(landmarks, w, h)
-        s_eye_geo = self._eye_scorer.score(landmarks)
-        if self._has_eye_model:
-            left_crop, right_crop, left_bbox, right_bbox = extract_eye_crops(bgr_frame, landmarks)
-            out["left_bbox"] = left_bbox
-            out["right_bbox"] = right_bbox
-            s_eye_model = self.eye_classifier.predict_score([left_crop, right_crop])
-            out["s_eye"] = (1.0 - self._eye_blend) * s_eye_geo + self._eye_blend * s_eye_model
-        else:
-            out["s_eye"] = s_eye_geo

         geo_score = (
             self._cfg["geo_face_weight"] * out["s_face"] +
@@ -528,8 +465,7 @@ class HybridFocusPipeline:
         }
         vec = extract_features(landmarks, w, h, self._head_pose, self._eye_scorer, self._temporal, _pre=pre)
         vec = _clip_features(vec)
-        ext_vec = self._engine.transform(vec)
-        X = ext_vec[self._indices].reshape(1, -1).astype(np.float64)
         X_sc = self._scaler.transform(X)
         if hasattr(self._mlp, "predict_proba"):
             mlp_prob = float(self._mlp.predict_proba(X_sc)[0, 1])
@@ -559,15 +495,12 @@


 class XGBoostPipeline:
-    """Real-time XGBoost inference pipeline using the same feature extraction as MLPPipeline."""
-
-    # Same 10 features used during training (data_preparation.prepare_dataset.SELECTED_FEATURES)
     SELECTED = [
         'head_deviation', 's_face', 's_eye', 'h_gaze', 'pitch',
         'ear_left', 'ear_avg', 'ear_right', 'gaze_offset', 'perclos',
     ]

-    def __init__(self, model_path=None, threshold=0.5):
         from xgboost import XGBClassifier

         if model_path is None:
 
 if _PROJECT_ROOT not in sys.path:
     sys.path.insert(0, _PROJECT_ROOT)

+from data_preparation.prepare_dataset import SELECTED_FEATURES
 from models.face_mesh import FaceMeshDetector
 from models.head_pose import HeadPoseEstimator
 from models.eye_scorer import EyeBehaviourScorer, compute_mar, MAR_YAWN_THRESHOLD
 from models.collect_features import FEATURE_NAMES, TemporalTracker, extract_features

+# Same 10 features used for MLP training (prepare_dataset) and inference
+MLP_FEATURE_NAMES = SELECTED_FEATURES["face_orientation"]
+
 _FEAT_IDX = {name: i for i, name in enumerate(FEATURE_NAMES)}


 def _clip_features(vec):
     out = vec.copy()
     _i = _FEAT_IDX
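The `_FEAT_IDX` map and the `self._indices` lookups introduced in this commit follow one pattern: resolve feature names to positions once, then slice the full per-frame vector with those positions. A minimal sketch of that pattern — the feature names and values below are made up, standing in for `FEATURE_NAMES` and a real frame vector:

```python
import numpy as np

# Hypothetical feature ordering, standing in for models.collect_features.FEATURE_NAMES.
FEATURE_NAMES = ["pitch", "yaw", "ear_left", "ear_right", "s_face", "s_eye"]
FEAT_IDX = {name: i for i, name in enumerate(FEATURE_NAMES)}

# A model's feature subset, resolved to positions once at load time.
selected = ["s_face", "s_eye", "pitch"]
indices = [FEATURE_NAMES.index(n) for n in selected]

# Slice the full per-frame vector into a (1, n_features) model-ready row.
vec = np.array([0.10, 0.20, 0.25, 0.27, 0.90, 0.80])
X = vec[indices].reshape(1, -1).astype(np.float64)
```

Resolving names to indices at load time keeps the per-frame hot path to a single fancy-index slice.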


 class _OutputSmoother:
     def __init__(self, alpha: float = 0.3, grace_frames: int = 15):
         self._alpha = alpha
         self._grace = grace_frames
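`_OutputSmoother` is not shown in full in this diff; the usual shape of such a class is an exponential moving average plus a debounce ("grace") counter so a single bad frame doesn't flip the reported state. A hedged sketch of that idea — the real class may differ in its update rule and return type:

```python
class EmaSmoother:
    """Exponential moving average with a debounce ('grace') counter (sketch)."""

    def __init__(self, alpha: float = 0.3, grace_frames: int = 15):
        self._alpha = alpha          # weight of the newest frame
        self._grace = grace_frames   # consecutive low frames before flipping
        self._value = None
        self._off_streak = 0

    def update(self, raw: float, threshold: float = 0.5) -> tuple[float, bool]:
        # EMA: the newest raw score pulls the smoothed value by a factor alpha.
        self._value = raw if self._value is None else (
            self._alpha * raw + (1 - self._alpha) * self._value
        )
        # Only report "not focused" after `grace_frames` consecutive low frames.
        self._off_streak = self._off_streak + 1 if self._value < threshold else 0
        return self._value, self._off_streak < self._grace

    def reset(self):
        self._value = None
        self._off_streak = 0
```

With `grace_frames=15` at typical webcam rates, roughly half a second of sustained low scores is needed before the focused flag drops.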
 


 DEFAULT_HYBRID_CONFIG = {
+    "w_mlp": 0.3,
+    "w_geo": 0.7,
+    "threshold": 0.35,
     "use_yawn_veto": True,
+    "geo_face_weight": 0.7,
+    "geo_eye_weight": 0.3,
     "mar_yawn_threshold": float(MAR_YAWN_THRESHOLD),
 }
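The new config keys make the hybrid decision rule explicit: a weighted geometric sub-score, a 70/30 blend with the MLP probability, a 0.35 cut, and an optional yawn veto. Using the keys above, the final decision is assembled roughly like this — a sketch, not the pipeline's exact code, and the `mar_yawn_threshold` value here is a placeholder for the repo's `MAR_YAWN_THRESHOLD` constant:

```python
def hybrid_decision(s_face, s_eye, mlp_prob, mar, cfg):
    # Geometric sub-score: weighted head-pose and eye-openness scores.
    geo = cfg["geo_face_weight"] * s_face + cfg["geo_eye_weight"] * s_eye
    # Blend the geometric score with the MLP probability.
    score = cfg["w_geo"] * geo + cfg["w_mlp"] * mlp_prob
    # Yawn veto: a yawning frame is reported as not focused regardless of score.
    if cfg["use_yawn_veto"] and mar > cfg["mar_yawn_threshold"]:
        return score, False
    return score, score >= cfg["threshold"]

cfg = {"w_mlp": 0.3, "w_geo": 0.7, "threshold": 0.35,
       "use_yawn_veto": True, "geo_face_weight": 0.7,
       "geo_eye_weight": 0.3, "mar_yawn_threshold": 0.6}  # 0.6 is a placeholder
```

The veto runs after the blend so the UI can still display the raw score while the focused flag is forced off.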


 class _RuntimeFeatureEngine:
     _MAG_FEATURES = ["pitch", "yaw", "head_deviation", "gaze_offset", "v_gaze", "h_gaze"]
     _VEL_FEATURES = ["pitch", "yaw", "h_gaze", "v_gaze", "head_deviation", "gaze_offset"]
     _VAR_FEATURES = ["h_gaze", "v_gaze", "pitch"]
 
     def __init__(
         self,
         max_angle: float = 22.0,
+        alpha: float = 0.7,
+        beta: float = 0.3,
         threshold: float = 0.55,
         detector=None,
     ):
         self.detector = detector or FaceMeshDetector()
         self.alpha = alpha
         self.beta = beta
         self.threshold = threshold
         self._smoother = _OutputSmoother()
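For the purely geometric pipeline (the class name isn't visible in this hunk), the new `alpha=0.7` / `beta=0.3` defaults weight head pose against eye openness, with the 0.55 threshold deciding focus. The rule reduces to a convex combination — sketched here under that reading:

```python
def geometric_focus(s_face: float, s_eye: float,
                    alpha: float = 0.7, beta: float = 0.3,
                    threshold: float = 0.55) -> tuple[float, bool]:
    # Convex combination of the two geometric sub-scores (alpha + beta == 1),
    # so the blended score stays in [0, 1] when the inputs do.
    score = alpha * s_face + beta * s_eye
    return score, score >= threshold
```

Note the geometric-only threshold (0.55) is stricter than the hybrid one (0.35), since there is no MLP probability to back up borderline frames.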

     def process_frame(self, bgr_frame: np.ndarray) -> dict:
         if angles is not None:
             out["yaw"], out["pitch"], out["roll"] = angles
         out["s_face"] = self.head_pose.score(landmarks, w, h)
+        out["s_eye"] = self.eye_scorer.score(landmarks)
         out["mar"] = compute_mar(landmarks)
         out["is_yawning"] = out["mar"] > MAR_YAWN_THRESHOLD
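`compute_mar` is defined in `models.eye_scorer` and isn't shown in this diff. A common mouth-aspect-ratio formulation (which may differ from this repo's exact landmark choice) divides the vertical mouth opening by the horizontal mouth width, so the ratio is invariant to face scale:

```python
import math

def mouth_aspect_ratio(top, bottom, left, right):
    """Generic MAR sketch: vertical mouth opening over horizontal mouth width.

    Each argument is an (x, y) landmark. The repo's compute_mar may average
    several vertical point pairs, but the ratio idea is the same.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return dist(top, bottom) / dist(left, right)

MAR_YAWN_THRESHOLD = 0.6  # placeholder; the pipeline imports its own constant
mar = mouth_aspect_ratio((0, 0), (0, 0.7), (-0.5, 0.35), (0.5, 0.35))
is_yawning = mar > MAR_YAWN_THRESHOLD
```

Because MAR is a ratio of distances in the same image, it needs no camera calibration, which is why a single threshold works across subjects reasonably well.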

         return out

     def reset_session(self):
         self._smoother.reset()
 


 class MLPPipeline:
+    def __init__(self, model_dir=None, detector=None, threshold=0.23):
         if model_dir is None:
             # Check primary location
             model_dir = os.path.join(_PROJECT_ROOT, "MLP", "models")
         self._scaler = joblib.load(scaler_path)
         meta = np.load(meta_path, allow_pickle=True)
         self._feature_names = list(meta["feature_names"])
+        self._indices = [FEATURE_NAMES.index(n) for n in self._feature_names]

         self._detector = detector or FaceMeshDetector()
         self._owns_detector = detector is None
         out["s_eye"] = float(vec[_FEAT_IDX["s_eye"]])
         out["mar"] = float(vec[_FEAT_IDX["mar"]])

+        X = vec[self._indices].reshape(1, -1).astype(np.float64)
         X_sc = self._scaler.transform(X)
         if hasattr(self._mlp, "predict_proba"):
             mlp_prob = float(self._mlp.predict_proba(X_sc)[0, 1])
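The `hasattr(self._mlp, "predict_proba")` guard lets the pipeline accept any scikit-learn-style estimator: a calibrated probability when the model exposes one, a hard label otherwise. The pattern in isolation, with stub models standing in for the real scaler-plus-MLP pair:

```python
import numpy as np

def positive_score(model, X_scaled):
    # Prefer P(class 1) when the estimator exposes probabilities;
    # otherwise fall back to the hard 0/1 prediction.
    if hasattr(model, "predict_proba"):
        return float(model.predict_proba(X_scaled)[0, 1])
    return float(model.predict(X_scaled)[0])

class ProbModel:                       # stands in for the trained MLP
    def predict_proba(self, X):
        return np.array([[0.2, 0.8]])  # columns: P(class 0), P(class 1)

class HardModel:                       # estimator without predict_proba
    def predict(self, X):
        return np.array([1])

X = np.zeros((1, 10))                  # one scaled 10-feature row
```

The `[0, 1]` index picks the positive-class column of the single-row probability matrix, matching the diff's usage.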
 
         self,
         model_dir=None,
         config_path: str | None = None,
         max_angle: float = 22.0,
         detector=None,
     ):
         self._scaler = joblib.load(scaler_path)
         meta = np.load(meta_path, allow_pickle=True)
         self._feature_names = list(meta["feature_names"])
+        self._indices = [FEATURE_NAMES.index(n) for n in self._feature_names]

         self._cfg, self._cfg_path = _load_hybrid_config(model_dir=model_dir, config_path=config_path)

         self._head_pose = HeadPoseEstimator(max_angle=max_angle)
         self._eye_scorer = EyeBehaviourScorer()
         self._temporal = TemporalTracker()
         self.head_pose = self._head_pose
         self._smoother = _OutputSmoother()

             f"threshold={self._cfg['threshold']:.2f}"
         )

     @property
     def config(self) -> dict:
         return dict(self._cfg)
 
         out["yaw"], out["pitch"], out["roll"] = angles

         out["s_face"] = self._head_pose.score(landmarks, w, h)
+        out["s_eye"] = self._eye_scorer.score(landmarks)
+        s_eye_geo = out["s_eye"]

         geo_score = (
             self._cfg["geo_face_weight"] * out["s_face"] +
         }
         vec = extract_features(landmarks, w, h, self._head_pose, self._eye_scorer, self._temporal, _pre=pre)
         vec = _clip_features(vec)
+        X = vec[self._indices].reshape(1, -1).astype(np.float64)
         X_sc = self._scaler.transform(X)
         if hasattr(self._mlp, "predict_proba"):
             mlp_prob = float(self._mlp.predict_proba(X_sc)[0, 1])
 


 class XGBoostPipeline:
     SELECTED = [
         'head_deviation', 's_face', 's_eye', 'h_gaze', 'pitch',
         'ear_left', 'ear_avg', 'ear_right', 'gaze_offset', 'perclos',
     ]

+    def __init__(self, model_path=None, threshold=0.38):
         from xgboost import XGBClassifier

         if model_path is None:
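The XGBoost default cut moves from 0.5 to 0.38: instead of `predict()` (which implicitly thresholds probabilities at 0.5), the pipeline compares the positive-class probability against a tuned value. The decision step in isolation — the 0.38 figure is the new default above; the probabilities are illustrative:

```python
import numpy as np

def thresholded_label(probs: np.ndarray, threshold: float = 0.38) -> np.ndarray:
    """probs: P(focused) per sample; returns 1 where prob >= threshold.

    Lowering the cut from 0.5 to 0.38 trades precision for recall on the
    'focused' class; the tuned value comes from the evaluation scripts.
    """
    return (probs >= threshold).astype(int)

labels = thresholded_label(np.array([0.30, 0.38, 0.55]))
```

Keeping the threshold as a constructor argument lets the live demo sweep it without retraining the booster.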
yolov8n.pt DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:31e20dde3def09e2cf938c7be6fe23d9150bbbe503982af13345706515f2ef95
-size 6534387