Abdelrahman Almatrooshi committed
Commit e405722 · Parent(s): 964ff95
docs: HF Space README YAML (title, emoji, colors) + merge doc sections
README.md: CHANGED
Diff of README.md (deletions marked `-`):

@@ -1,21 +1,20 @@
-<<<<<<< HEAD
 ---
-title:
 sdk: docker
 app_port: 7860
 ---

-=======
->>>>>>> feature/integration2.0
 # FocusGuard

-Webcam-based focus detection: MediaPipe face mesh

-
-=======
-**Repository:** Add your repo link here (e.g. `https://github.com/your-org/FocusGuard`).

->>>>>>> feature/integration2.0
 ## Project layout

 ```
@@ -41,13 +40,10 @@ Webcam-based focus detection: MediaPipe face mesh -> 17 features (EAR, gaze, hea
 └── package.json
 ```

-<<<<<<< HEAD
-=======
 ## Config

 Hyperparameters and app settings live in `config/default.yaml` (learning rates, batch size, thresholds, L2CS weights, etc.). Override with env `FOCUSGUARD_CONFIG` pointing to another YAML.

->>>>>>> feature/integration2.0
 ## Setup

 ```bash
@@ -95,8 +91,6 @@ python -m models.mlp.train
 python -m models.xgboost.train
 ```

-<<<<<<< HEAD
-=======
 ### ClearML experiment tracking

 All training and evaluation config (from `config/default.yaml`) is exposed as ClearML task parameters. Enable logging with `USE_CLEARML=1`; optionally run on a **remote GPU agent** instead of locally:
@@ -115,23 +109,19 @@ clearml-agent daemon --queue gpu

 Logged to ClearML: **parameters** (full flattened config), **scalars** (loss, accuracy, F1, ROC-AUC, per-class precision/recall/F1, dataset sizes and class counts), **artifacts** (best checkpoint, training log JSON), and **plots** (confusion matrix, ROC curves in evaluation).

->>>>>>> feature/integration2.0
 ## Data

 9 participants, 144,793 samples, 10 features, binary labels. Collect with `python -m models.collect_features --name <name>`. Data lives in `data/collected_<name>/`.

-<<<<<<< HEAD
-=======
 **Train/val/test split:** All pooled training and evaluation use the same split for reproducibility. The test set is held out before any preprocessing; `StandardScaler` is fit on the training set only, then applied to val and test. Split ratios and random seed come from `config/default.yaml` (`data.split_ratios`, `mlp.seed`) via `data_preparation.prepare_dataset.get_default_split_config()`. MLP train, XGBoost train, eval_accuracy scripts, and benchmarks all use this single source so reported test accuracy is on the same held-out set.

->>>>>>> feature/integration2.0
 ## Models

 | Model | What it uses | Best for |
 |-------|-------------|----------|
 | **Geometric** | Head pose angles + eye aspect ratio (EAR) | Fast, no ML needed |
 | **XGBoost** | Trained classifier on head/eye features (600 trees, depth 8) | Balanced accuracy/speed |
-| **MLP** | Neural network on same features (64
 | **Hybrid** | Weighted MLP + Geometric ensemble | Best head-pose accuracy |
 | **L2CS** | Deep gaze estimation (ResNet50, Gaze360 weights) | Detects eye-only gaze shifts |

@@ -140,10 +130,8 @@ Logged to ClearML: **parameters** (full flattened config), **scalars** (loss, ac
 | Model | Accuracy | F1 | ROC-AUC |
 |-------|----------|-----|---------|
 | XGBoost (600 trees, depth 8) | 95.87% | 0.959 | 0.991 |
-| MLP (64

-<<<<<<< HEAD
-=======
 ## Model numbers (LOPO, 9 participants)

 | Model | LOPO AUC | Best threshold (Youden's J) | F1 @ best threshold | F1 @ 0.50 |
@@ -152,6 +140,7 @@ Logged to ClearML: **parameters** (full flattened config), **scalars** (loss, ac
 | XGBoost | 0.8695 | 0.280 | 0.8549 | 0.8324 |

 From the latest `python -m evaluation.justify_thresholds` run:
 - Best geometric face weight (`alpha`) = `0.7` (mean LOPO F1 = `0.8195`)
 - Best hybrid MLP weight (`w_mlp`) = `0.3` (mean LOPO F1 = `0.8409`)

@@ -180,23 +169,24 @@ Latest quick feature-selection run (`python -m evaluation.feature_importance --q
 Top-5 XGBoost gain features: `s_face`, `ear_right`, `head_deviation`, `ear_avg`, `perclos`.
 For full leave-one-feature-out ablation, run `python -m evaluation.feature_importance` (slower).

->>>>>>> feature/integration2.0
 ## L2CS Gaze Tracking

 L2CS-Net predicts where your eyes are looking, not just where your head is pointed. This catches the scenario where your head faces the screen but your eyes wander.

 ### Standalone mode
-Select **L2CS** as the model

 ### Boost mode
 Select any other model, then click the **GAZE** toggle. L2CS runs alongside the base model:
 - Base model handles head pose and eye openness (35% weight)
 - L2CS handles gaze direction (65% weight)
 - If L2CS detects gaze is clearly off-screen, it **vetoes** the base model regardless of score

 ### Calibration
 After enabling L2CS or Gaze Boost, click **Calibrate** while a session is running:
-
 2. Look at each dot as the progress ring fills
 3. The first dot (centre) sets your baseline gaze offset
 4. After all 9 points, a polynomial model maps your gaze angles to screen coordinates
@@ -205,9 +195,9 @@ After enabling L2CS or Gaze Boost, click **Calibrate** while a session is runnin
 ## Pipeline

 1. Face mesh (MediaPipe 478 pts)
-2. Head pose
-3. Eye scorer
-4. Temporal
-5. 10-d vector

 **Stack:** FastAPI, aiosqlite, React/Vite, PyTorch, XGBoost, MediaPipe, OpenCV, L2CS-Net.
The resulting README.md (additions marked `+`):

 ---
+title: Focus Guard Final v2
+emoji: 🎯
+colorFrom: blue
+colorTo: indigo
 sdk: docker
 app_port: 7860
+pinned: false
+short_description: Webcam focus detection – MediaPipe, MLP/XGBoost/L2CS, React + FastAPI.
 ---

 # FocusGuard

+Webcam-based focus detection: MediaPipe face mesh → 17 features (EAR, gaze, head pose, PERCLOS, etc.) → MLP or XGBoost for focused/unfocused. React + FastAPI app with WebSocket video.

+**Repository:** [KCL GAP project](https://github.kcl.ac.uk) (internal) – adjust link if you publish a public mirror.

 ## Project layout

 ```
@@ -41,13 +40,10 @@
 └── package.json
 ```

 ## Config

 Hyperparameters and app settings live in `config/default.yaml` (learning rates, batch size, thresholds, L2CS weights, etc.). Override with env `FOCUSGUARD_CONFIG` pointing to another YAML.

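The override mechanism can be sketched as follows; `resolve_config_path` and the `config/experiment.yaml` filename are illustrative stand-ins, not names from the actual codebase:

```python
import os

# Hypothetical sketch of the override described above: the app would read
# FOCUSGUARD_CONFIG and fall back to config/default.yaml when it is unset.
def resolve_config_path() -> str:
    return os.environ.get("FOCUSGUARD_CONFIG", "config/default.yaml")

# e.g. FOCUSGUARD_CONFIG=config/experiment.yaml python -m models.mlp.train
os.environ["FOCUSGUARD_CONFIG"] = "config/experiment.yaml"
print(resolve_config_path())  # config/experiment.yaml
```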
 ## Setup

 ```bash
@@ -95,8 +91,6 @@
 python -m models.xgboost.train
 ```

 ### ClearML experiment tracking

 All training and evaluation config (from `config/default.yaml`) is exposed as ClearML task parameters. Enable logging with `USE_CLEARML=1`; optionally run on a **remote GPU agent** instead of locally:
@@ -115,23 +109,19 @@

 Logged to ClearML: **parameters** (full flattened config), **scalars** (loss, accuracy, F1, ROC-AUC, per-class precision/recall/F1, dataset sizes and class counts), **artifacts** (best checkpoint, training log JSON), and **plots** (confusion matrix, ROC curves in evaluation).

 ## Data

 9 participants, 144,793 samples, 10 features, binary labels. Collect with `python -m models.collect_features --name <name>`. Data lives in `data/collected_<name>/`.

 **Train/val/test split:** All pooled training and evaluation use the same split for reproducibility. The test set is held out before any preprocessing; `StandardScaler` is fit on the training set only, then applied to val and test. Split ratios and random seed come from `config/default.yaml` (`data.split_ratios`, `mlp.seed`) via `data_preparation.prepare_dataset.get_default_split_config()`. MLP train, XGBoost train, eval_accuracy scripts, and benchmarks all use this single source so reported test accuracy is on the same held-out set.

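A minimal sketch of that leakage-free protocol (the 70/15/15 ratios and seed 42 are placeholders for `data.split_ratios` / `mlp.seed`; the real code goes through `get_default_split_config()` and scikit-learn's `StandardScaler`):

```python
import numpy as np

def split_and_scale(X, y, ratios=(0.7, 0.15, 0.15), seed=42):
    """Shuffle, split into train/val/test, then standardise with
    statistics computed from the training set ONLY (no leakage)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_tr = int(ratios[0] * len(X))
    n_va = int(ratios[1] * len(X))
    tr, va, te = idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]
    mu, sd = X[tr].mean(axis=0), X[tr].std(axis=0) + 1e-8  # train-set stats only
    scale = lambda A: (A - mu) / sd                        # same transform everywhere
    return (scale(X[tr]), y[tr]), (scale(X[va]), y[va]), (scale(X[te]), y[te])
```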
 ## Models

 | Model | What it uses | Best for |
 |-------|-------------|----------|
 | **Geometric** | Head pose angles + eye aspect ratio (EAR) | Fast, no ML needed |
 | **XGBoost** | Trained classifier on head/eye features (600 trees, depth 8) | Balanced accuracy/speed |
+| **MLP** | Neural network on same features (64→32) | Higher accuracy |
 | **Hybrid** | Weighted MLP + Geometric ensemble | Best head-pose accuracy |
 | **L2CS** | Deep gaze estimation (ResNet50, Gaze360 weights) | Detects eye-only gaze shifts |

@@ -140,10 +130,8 @@
 | Model | Accuracy | F1 | ROC-AUC |
 |-------|----------|-----|---------|
 | XGBoost (600 trees, depth 8) | 95.87% | 0.959 | 0.991 |
+| MLP (64→32) | 92.92% | 0.929 | 0.971 |

 ## Model numbers (LOPO, 9 participants)

 | Model | LOPO AUC | Best threshold (Youden's J) | F1 @ best threshold | F1 @ 0.50 |
@@ -152,6 +140,7 @@
 | XGBoost | 0.8695 | 0.280 | 0.8549 | 0.8324 |

 From the latest `python -m evaluation.justify_thresholds` run:
+
 - Best geometric face weight (`alpha`) = `0.7` (mean LOPO F1 = `0.8195`)
 - Best hybrid MLP weight (`w_mlp`) = `0.3` (mean LOPO F1 = `0.8409`)

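Both tuned weights amount to convex combinations. The sketch below is illustrative only: it assumes `alpha` blends a face/head-pose score against an eye score inside the geometric model, and `w_mlp` blends the MLP probability with the geometric score in the hybrid — the score names are not from the codebase:

```python
def geometric_score(s_face: float, s_eyes: float, alpha: float = 0.7) -> float:
    # alpha = tuned "geometric face weight" reported by justify_thresholds
    return alpha * s_face + (1.0 - alpha) * s_eyes

def hybrid_score(p_mlp: float, s_geom: float, w_mlp: float = 0.3) -> float:
    # w_mlp = tuned MLP weight; the remaining 0.7 goes to the geometric score
    return w_mlp * p_mlp + (1.0 - w_mlp) * s_geom

print(hybrid_score(0.9, geometric_score(0.8, 0.5)))
```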
@@ -180,23 +169,24 @@
 Top-5 XGBoost gain features: `s_face`, `ear_right`, `head_deviation`, `ear_avg`, `perclos`.
 For full leave-one-feature-out ablation, run `python -m evaluation.feature_importance` (slower).

 ## L2CS Gaze Tracking

 L2CS-Net predicts where your eyes are looking, not just where your head is pointed. This catches the scenario where your head faces the screen but your eyes wander.

 ### Standalone mode
+Select **L2CS** as the model – it handles everything.

 ### Boost mode
 Select any other model, then click the **GAZE** toggle. L2CS runs alongside the base model:
+
 - Base model handles head pose and eye openness (35% weight)
 - L2CS handles gaze direction (65% weight)
 - If L2CS detects gaze is clearly off-screen, it **vetoes** the base model regardless of score

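The Boost-mode rules above reduce to a fixed-weight average plus a hard veto; this sketch is illustrative, and the `off_screen` flag stands in for whatever off-screen test the app actually applies:

```python
def boosted_score(p_base: float, p_gaze: float, off_screen: bool) -> float:
    """Boost-mode fusion: 35% base model, 65% L2CS gaze, with a hard veto."""
    if off_screen:
        return 0.0  # L2CS veto: clearly off-screen gaze overrides the base score
    return 0.35 * p_base + 0.65 * p_gaze

print(boosted_score(0.8, 0.4, off_screen=False))
print(boosted_score(0.99, 0.9, off_screen=True))  # vetoed regardless of score
```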
 ### Calibration
 After enabling L2CS or Gaze Boost, click **Calibrate** while a session is running:
+
+1. A fullscreen overlay shows 9 target dots (3×3 grid)
 2. Look at each dot as the progress ring fills
 3. The first dot (centre) sets your baseline gaze offset
 4. After all 9 points, a polynomial model maps your gaze angles to screen coordinates
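Step 4 can be sketched as a least-squares polynomial fit over the nine calibration samples. The degree-2 features in (yaw, pitch) and the function names are assumptions for illustration, not FocusGuard's exact formulation:

```python
import numpy as np

def poly_features(yaw, pitch):
    # degree-2 design matrix: [1, yaw, pitch, yaw^2, yaw*pitch, pitch^2]
    return np.stack([np.ones_like(yaw), yaw, pitch,
                     yaw**2, yaw * pitch, pitch**2], axis=1)

def fit_gaze_map(yaw, pitch, screen_xy):
    """Fit one coefficient vector per screen axis from the 9 calibration points."""
    A = poly_features(yaw, pitch)
    coeffs, *_ = np.linalg.lstsq(A, screen_xy, rcond=None)
    return coeffs

def predict_screen(coeffs, yaw, pitch):
    # map new gaze angles to screen coordinates with the fitted polynomial
    return poly_features(np.atleast_1d(yaw), np.atleast_1d(pitch)) @ coeffs
```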
@@ -205,9 +195,9 @@
 ## Pipeline

 1. Face mesh (MediaPipe 478 pts)
+2. Head pose → yaw, pitch, roll, scores, gaze offset
+3. Eye scorer → EAR, gaze ratio, MAR
+4. Temporal → PERCLOS, blink rate, yawn
+5. 10-d vector → MLP or XGBoost → focused / unfocused

 **Stack:** FastAPI, aiosqlite, React/Vite, PyTorch, XGBoost, MediaPipe, OpenCV, L2CS-Net.
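Two of the features named in pipeline steps 3–4 can be made concrete. The sketch follows the standard six-landmark EAR definition; the 0.2 closed-eye threshold and 30-frame window are illustrative, not FocusGuard's configured values:

```python
import math
from collections import deque

def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    # Standard 6-landmark EAR: vertical eyelid gaps over horizontal eye width;
    # drops toward 0 as the eye closes.
    return (math.dist(p2, p6) + math.dist(p3, p5)) / (2.0 * math.dist(p1, p4))

class Perclos:
    """PERCLOS: fraction of recent frames in which the eye was closed."""
    def __init__(self, window: int = 30, closed_below: float = 0.2):
        self.ears = deque(maxlen=window)
        self.closed_below = closed_below

    def update(self, ear: float) -> float:
        self.ears.append(ear)
        return sum(e < self.closed_below for e in self.ears) / len(self.ears)
```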