arnavam committed
Commit b53629f · 1 Parent(s): 3a06bce

added readme
.DS_Store CHANGED
Binary files a/.DS_Store and b/.DS_Store differ
 
.dockerignore ADDED
@@ -0,0 +1,54 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ *.egg-info/
+ dist/
+ build/
+
+ # Virtual environments
+ .venv/
+ venv/
+ ENV/
+ env/
+
+ # Environment variables
+ .env
+ .env.local
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Testing
+ test/
+ *.coverage
+ htmlcov/
+ .pytest_cache/
+
+ # Git
+ .git/
+ .gitignore
+
+ # UV lock file (runtime dependency)
+ uv.lock
+
+ # Python version file
+ .python-version
+
+ # Documentation (if not needed in container)
+ API_DOCS.md
+ REFACTORING_SUMMARY.md
+ README.md
+
+ # Logs
+ *.log
.env ADDED
@@ -0,0 +1,11 @@
+ # JWT Configuration
+ # IMPORTANT: Generate a secure secret key for production
+ # You can generate one using: openssl rand -hex 32
+ JWT_SECRET_KEY=your-secret-key-change-in-production
+
+ # MongoDB Configuration
+ # Connection string for MongoDB database
+ MONGODB_URI=mongodb+srv://arnavjagadeesh09_db_user:aP4x5QkUdSThzpxT@cluster0.uo44a8g.mongodb.net/?appName=Cluster0
+
+ # MongoDB database name
+ MONGODB_DB=afs
.env.example ADDED
@@ -0,0 +1,11 @@
+ # JWT Configuration
+ # IMPORTANT: Generate a secure secret key for production
+ # You can generate one using: openssl rand -hex 32
+ JWT_SECRET_KEY=your-secret-key-change-in-production
+
+ # MongoDB Configuration
+ # Connection string for MongoDB database
+ MONGODB_URI=mongodb://localhost:27017
+
+ # MongoDB database name
+ MONGODB_DB=afs
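A quick sketch of how these three variables are consumed (stdlib only; in the real app `python-dotenv`'s `load_dotenv()` first populates `os.environ` from this file, and the defaults below mirror the fallbacks used in `server.py`):

```python
import os

def load_config(environ=os.environ):
    # Read the settings defined in .env.example, with the same fallback
    # values server.py uses when a variable is unset.
    return {
        "jwt_secret": environ.get("JWT_SECRET_KEY", "your-secret-key-change-in-production"),
        "mongodb_uri": environ.get("MONGODB_URI", "mongodb://localhost:27017"),
        "mongodb_db": environ.get("MONGODB_DB", "afs"),
    }

print(load_config({})["mongodb_db"])  # → afs (fallback when unset)
```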
.python-version ADDED
@@ -0,0 +1 @@
+ 3.13
API_DOCS.md ADDED
@@ -0,0 +1,245 @@
+ # API Documentation
+
+ ## Overview
+ This document describes the new API endpoints added to the AFS backend for face recognition and audio streaming.
+
+ ## Face Recognition APIs
+
+ ### 1. Upload 360-Degree Reference Video
+ **Endpoint:** `POST /api/face/upload-video`
+
+ **Description:** Upload a 360-degree reference video for face recognition training. The video will be processed to extract high-quality face embeddings.
+
+ **Authentication:** Required (JWT token)
+
+ **Request:**
+ - Content-Type: `multipart/form-data`
+ - Body: `file` (video file - .mp4, .avi, .mov, .mkv)
+
+ **Response:**
+ ```json
+ {
+   "ok": true,
+   "message": "Video processed successfully",
+   "frames_used": 10,
+   "embeddings_count": 1
+ }
+ ```
+
+ ### 2. Upload Reference Image
+ **Endpoint:** `POST /api/face/upload-image`
+
+ **Description:** Upload a single reference image for face recognition.
+
+ **Authentication:** Required (JWT token)
+
+ **Request:**
+ - Content-Type: `multipart/form-data`
+ - Body: `file` (image file - .jpg, .jpeg, .png)
+
+ **Response:**
+ ```json
+ {
+   "ok": true,
+   "message": "Image processed successfully",
+   "embeddings_count": 1,
+   "saved_path": "/path/to/Model/ref_image.jpg"
+ }
+ ```
+
+ ### 3. Get Cache Status
+ **Endpoint:** `GET /api/face/cache-status`
+
+ **Description:** Check if face recognition embeddings are cached and ready to use.
+
+ **Authentication:** Required (JWT token)
+
+ **Response (Cached):**
+ ```json
+ {
+   "ok": true,
+   "cached": true,
+   "video_path": "my_scan.mp4",
+   "model_name": "ArcFace",
+   "num_frames_used": 10,
+   "version": 2
+ }
+ ```
+
+ **Response (Not Cached):**
+ ```json
+ {
+   "ok": true,
+   "cached": false,
+   "message": "No cache found. Please upload a reference video or image."
+ }
+ ```
+
+ ## Audio Streaming APIs
+
+ ### 1. Start Audio Stream
+ **Endpoint:** `POST /api/audio/start-stream`
+
+ **Description:** Start a new audio recording session. Returns a session ID for streaming.
+
+ **Authentication:** Required (JWT token)
+
+ **Request:**
+ - Content-Type: `multipart/form-data`
+ - Body:
+   - `sample_rate` (optional, default: 16000)
+   - `channels` (optional, default: 1 for mono, 2 for stereo)
+
+ **Response:**
+ ```json
+ {
+   "ok": true,
+   "session_id": "uuid-here",
+   "filename": "/path/to/Model/audio_recordings/audio_uuid_timestamp.wav",
+   "sample_rate": 16000,
+   "channels": 1
+ }
+ ```
+
+ ### 2. Audio WebSocket Stream
+ **Endpoint:** `WebSocket /ws/audio/{session_id}`
+
+ **Description:** WebSocket endpoint for streaming audio data with optional angle information.
+
+ **Authentication:** Not required at WebSocket level (use session_id from start-stream)
+
+ **Send (Binary Audio Data):**
+ ```
+ WebSocket Binary Message: raw audio bytes (16-bit PCM)
+ ```
+
+ **Send (JSON with Angle):**
+ ```json
+ {
+   "audio_data": "base64-encoded-audio-bytes",
+   "angle": 45.5
+ }
+ ```
+
+ **Send (Stop Command):**
+ ```json
+ {
+   "command": "stop"
+ }
+ ```
+
+ **Receive:**
+ ```json
+ {
+   "status": "received",
+   "bytes": 1024
+ }
+ ```
+
+ or
+
+ ```json
+ {
+   "status": "received",
+   "angle": 45.5
+ }
+ ```
+
+ ### 3. Stop Audio Stream
+ **Endpoint:** `POST /api/audio/stop-stream/{session_id}`
+
+ **Description:** Stop an active audio recording stream.
+
+ **Authentication:** Required (JWT token)
+
+ **Response:**
+ ```json
+ {
+   "ok": true,
+   "message": "Audio stream stopped successfully"
+ }
+ ```
+
+ ### 4. List Audio Recordings
+ **Endpoint:** `GET /api/audio/recordings`
+
+ **Description:** Get a list of all audio recordings.
+
+ **Authentication:** Required (JWT token)
+
+ **Response:**
+ ```json
+ {
+   "ok": true,
+   "recordings": [
+     "/path/to/Model/audio_recordings/audio_uuid1_timestamp1.wav",
+     "/path/to/Model/audio_recordings/audio_uuid2_timestamp2.wav"
+   ],
+   "count": 2
+ }
+ ```
+
+ ## File Storage
+
+ All uploaded files and processed data are stored in the `/Model/` directory:
+
+ - **Reference Videos:** `/Model/my_scan.mp4` (overwritten on each upload)
+ - **Reference Images:** `/Model/ref_{filename}` (named after the uploaded file)
+ - **Embeddings Cache:** `/Model/embeddings_cache.pkl`
+ - **Audio Recordings:** `/Model/audio_recordings/audio_{session_id}_{timestamp}.wav`
+ - **Audio Metadata:** `/Model/audio_recordings/audio_{session_id}_{timestamp}_metadata.txt`
+
+ ## Metadata Format
+
+ Audio metadata files contain timestamp and angle data in CSV format:
+ ```
+ timestamp,angle
+ 0.000,45.50
+ 0.064,46.20
+ 0.128,47.00
+ ```
+
+ ## Usage Example (Python)
+
+ ```python
+ import requests
+ import websockets
+ import asyncio
+ import base64
+ import json
+
+ # 1. Upload reference video
+ with open("my_360_scan.mp4", "rb") as f:
+     response = requests.post(
+         "http://localhost:8000/api/face/upload-video",
+         files={"file": f},
+         headers={"Authorization": f"Bearer {token}"}
+     )
+ print(response.json())
+
+ # 2. Start audio stream
+ response = requests.post(
+     "http://localhost:8000/api/audio/start-stream",
+     data={"sample_rate": 16000, "channels": 1},
+     headers={"Authorization": f"Bearer {token}"}
+ )
+ session_id = response.json()["session_id"]
+
+ # 3. Stream audio via WebSocket
+ async def stream_audio():
+     uri = f"ws://localhost:8000/ws/audio/{session_id}"
+     async with websockets.connect(uri) as websocket:
+         # Send audio chunk with angle
+         await websocket.send(json.dumps({
+             "audio_data": base64.b64encode(audio_bytes).decode(),
+             "angle": 45.5
+         }))
+
+         # Or send raw binary
+         await websocket.send(audio_bytes)
+
+         # Stop when done
+         await websocket.send(json.dumps({"command": "stop"}))
+
+ asyncio.run(stream_audio())
+ ```
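The JSON framing used in the WebSocket example above can be isolated into a tiny stdlib helper (a sketch for illustration; `make_frame` is a hypothetical name, not part of the API):

```python
import base64
import json

def make_frame(audio_bytes: bytes, angle: float) -> str:
    # Package one audio chunk plus its angle in the documented JSON shape.
    return json.dumps({
        "audio_data": base64.b64encode(audio_bytes).decode("ascii"),
        "angle": angle,
    })

frame = json.loads(make_frame(b"\x00\x01\x02\x03", 45.5))
print(frame["angle"])  # → 45.5
```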
Dockerfile ADDED
@@ -0,0 +1,22 @@
+ FROM python:3.11-slim
+
+ WORKDIR /app
+
+ RUN apt-get update && apt-get install -y \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     libsm6 \
+     libxext6 \
+     libxrender-dev \
+     libgomp1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ COPY requirements.txt .
+
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ EXPOSE 8000
+
+ CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8000"]
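Building and running this image locally would look roughly like the following (the image tag is illustrative; `--env-file .env` supplies the variables from the `.env` added in this commit):

```shell
# Build the image from the repo root, then run it with the env file mounted.
docker build -t afs-backend .
docker run --rm -p 8000:8000 --env-file .env afs-backend
```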
README.md ADDED
@@ -0,0 +1,10 @@
+ ---
+ title: AFS Backend
+ emoji: 🚀
+ colorFrom: blue
+ colorTo: green
+ sdk: gradio
+ sdk_version: "4.31.0"
+ app_file: app.py
+ pinned: false
+ ---
REFACTORING_SUMMARY.md ADDED
@@ -0,0 +1,127 @@
+ # Refactoring Summary
+
+ ## What Was Done
+
+ ### 1. **Model Directory Usage Analysis**
+ The backend uses the following files from the `/Model/` directory:
+ - `embeddings_cache.pkl` - Face recognition embeddings cache
+ - `yolov8n-face.pt` - YOLO face detection model
+ - `my_scan.mp4` - Reference 360-degree scan video
+ - `Adi.jpg` - Reference image
+
+ Both `single_tracker.py` and `multi_tracker.py` access the Model directory.
+
+ ### 2. **Created New Services**
+
+ #### `services/face_recognition.py`
+ - Extracted face recognition logic from `Model/face_model.py`
+ - Class: `FaceRecognitionService`
+ - Methods:
+   - `extract_embeddings_from_video()` - Process 360° video with quality filtering
+   - `extract_embeddings_from_image()` - Process single reference image
+   - `save_embeddings_cache()` - Save processed embeddings
+   - `load_embeddings_cache()` - Load cached embeddings
+   - `calculate_blur_score()` - Image sharpness detection
+   - `calculate_frontal_score()` - Face frontality score
+
+ #### `services/audio_processing.py`
+ - New service for audio streaming with angle data
+ - Class: `AudioProcessor`
+ - Methods:
+   - `create_audio_stream()` - Start new recording session
+   - `write_audio_chunk()` - Write audio with optional angle metadata
+   - `close_audio_stream()` - Finalize recording
+   - `get_audio_files()` - List all recordings
+
+ ### 3. **Added API Endpoints to `server.py`**
+
+ #### Face Recognition APIs:
+ - `POST /api/face/upload-video` - Upload 360° reference video
+ - `POST /api/face/upload-image` - Upload reference image
+ - `GET /api/face/cache-status` - Check embeddings cache status
+
+ #### Audio Streaming APIs:
+ - `POST /api/audio/start-stream` - Start audio recording session
+ - `WebSocket /ws/audio/{session_id}` - Stream audio with angle data
+ - `POST /api/audio/stop-stream/{session_id}` - Stop recording
+ - `GET /api/audio/recordings` - List all recordings
+
+ ### 4. **File Storage Structure**
+ ```
+ /Model/
+ ├── my_scan.mp4              # Reference video (uploaded via API)
+ ├── ref_*.jpg                # Reference images (uploaded via API)
+ ├── embeddings_cache.pkl     # Processed face embeddings
+ ├── yolov8n-face.pt          # YOLO model (static)
+ └── audio_recordings/
+     ├── audio_{uuid}_{timestamp}.wav           # Audio recording
+     └── audio_{uuid}_{timestamp}_metadata.txt  # Angle metadata (CSV)
+ ```
+
+ ### 5. **Audio Metadata Format**
+ The metadata file stores timestamp and angle in CSV format:
+ ```csv
+ timestamp,angle
+ 0.000,45.50
+ 0.064,46.20
+ 0.128,47.00
+ ```
+
+ ## How to Use
+
+ ### Upload 360-Degree Video:
+ ```bash
+ curl -X POST "http://localhost:8000/api/face/upload-video" \
+   -H "Authorization: Bearer YOUR_TOKEN" \
+   -F "file=@my_360_scan.mp4"
+ ```
+
+ ### Upload Reference Image:
+ ```bash
+ curl -X POST "http://localhost:8000/api/face/upload-image" \
+   -H "Authorization: Bearer YOUR_TOKEN" \
+   -F "file=@reference.jpg"
+ ```
+
+ ### Start Audio Stream:
+ ```bash
+ # 1. Start stream (get session_id)
+ curl -X POST "http://localhost:8000/api/audio/start-stream" \
+   -H "Authorization: Bearer YOUR_TOKEN" \
+   -F "sample_rate=16000" \
+   -F "channels=1"
+
+ # 2. Connect via WebSocket and stream
+ # ws://localhost:8000/ws/audio/{session_id}
+
+ # 3. Send audio chunks (binary or JSON with angle)
+ # Binary: raw 16-bit PCM audio bytes
+ # JSON: {"audio_data": "base64...", "angle": 45.5}
+
+ # 4. Stop: {"command": "stop"}
+ ```
+
+ ## Key Features
+
+ 1. **Quality Filtering**: Video processing uses blur detection and frontal face scoring to select the best frames
+ 2. **Temporal Spacing**: Selects frames evenly distributed across the video for comprehensive coverage
+ 3. **Angle Tracking**: Audio streams can include direction/angle metadata for spatial audio analysis
+ 4. **Mono/Stereo Support**: Configurable audio channels (1 or 2)
+ 5. **Authentication**: All endpoints protected with JWT tokens
+ 6. **Async Processing**: CPU-intensive tasks run in a thread pool executor
+
+ ## Original face_model.py
+
+ The original file at `/Model/face_model.py` remains unchanged and can still be run standalone for testing or manual processing. The new API provides the same functionality, but in a service-oriented architecture accessible via HTTP/WebSocket.
+
+ ## Dependencies
+
+ All required packages are already in `requirements.txt`:
+ - FastAPI, Uvicorn
+ - OpenCV (cv2)
+ - DeepFace
+ - Ultralytics (YOLO)
+ - NumPy
+ - Wave (stdlib)
+
+ No additional dependencies needed!
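The angle metadata described in section 5 above is plain CSV, so a few lines of stdlib code read it back (a sketch; `parse_angle_metadata` is an illustrative helper, not part of the codebase):

```python
import csv
import io

def parse_angle_metadata(text: str):
    # Each row pairs an audio timestamp (seconds) with a direction angle
    # (degrees), matching the "timestamp,angle" CSV format documented above.
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["timestamp"]), float(row["angle"])) for row in reader]

sample = "timestamp,angle\n0.000,45.50\n0.064,46.20\n0.128,47.00\n"
print(parse_angle_metadata(sample)[0])  # → (0.0, 45.5)
```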
pyproject.toml ADDED
@@ -0,0 +1,23 @@
+ [project]
+ name = "backend"
+ version = "0.1.0"
+ description = "Add your description here"
+ readme = "README.md"
+ requires-python = ">=3.13"
+ dependencies = [
+     "deepface>=0.0.99",
+     "fastapi>=0.135.3",
+     "pymongo>=4.9.0",
+     "numpy>=2.4.4",
+     "opencv-python>=4.13.0.92",
+     "passlib[bcrypt]>=1.7.4",
+     "python-jose[cryptography]>=3.5.0",
+     "tf-keras>=2.21.0",
+     "ultralytics>=8.4.33",
+     "uvicorn>=0.43.0",
+     "websockets>=16.0",
+     "lap>=0.5.13",
+     "bcrypt>=5.0.0",
+     "python-dotenv>=1.2.2",
+     "python-multipart>=0.0.22",
+ ]
requirements.txt CHANGED
@@ -5,3 +5,8 @@ opencv-python
  numpy
  ultralytics
  deepface
+ pymongo>=4.9.0
+ passlib[bcrypt]
+ bcrypt
+ python-jose[cryptography]
+ python-dotenv
server.py CHANGED
@@ -1,19 +1,35 @@
1
- from fastapi import FastAPI, WebSocket, WebSocketDisconnect
2
  from fastapi.middleware.cors import CORSMiddleware
3
  from fastapi.responses import StreamingResponse
 
4
  import uvicorn
5
  import cv2
6
  import numpy as np
7
- import base64
8
  import json
9
  import logging
10
  import asyncio
11
  from concurrent.futures import ThreadPoolExecutor
12
- from datetime import datetime
13
  import threading
 
 
 
 
 
 
 
 
 
 
 
14
 
15
  from services.single_tracker import SingleTracker
16
  from services.multi_tracker import MultiTracker
 
 
 
 
 
17
 
18
  # Configure logging
19
  logging.basicConfig(level=logging.INFO)
@@ -49,9 +65,125 @@ app.add_middleware(
49
  allow_headers=["*"],
50
  )
51
 
52
- # Initialize trackers
 
53
  single_tracker = SingleTracker()
54
  multi_tracker = MultiTracker()
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
55
 
56
  def decode_binary_image(img_data: bytes):
57
  """Decodes raw JPEG bytes into an OpenCV numpy array."""
@@ -186,8 +318,95 @@ async def vcam_generator_loop():
186
 
187
  @app.on_event("startup")
188
  async def startup_event():
 
 
 
 
 
 
 
 
 
189
  asyncio.create_task(vcam_generator_loop())
190
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
191
  @app.websocket("/ws")
192
  async def websocket_endpoint(websocket: WebSocket):
193
  global is_recording, video_writer, recording_filename, latest_obs_frame, is_obs_active
@@ -312,5 +531,184 @@ async def websocket_endpoint(websocket: WebSocket):
312
  video_writer = None
313
  is_recording = False
314
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
315
  if __name__ == "__main__":
316
  uvicorn.run("server:app", host="0.0.0.0", port=8000, reload=True)
 
1
+ from fastapi import FastAPI, WebSocket, WebSocketDisconnect, HTTPException, status, Depends, UploadFile, File, Form
2
  from fastapi.middleware.cors import CORSMiddleware
3
  from fastapi.responses import StreamingResponse
4
+ from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
5
  import uvicorn
6
  import cv2
7
  import numpy as np
 
8
  import json
9
  import logging
10
  import asyncio
11
  from concurrent.futures import ThreadPoolExecutor
12
+ from datetime import datetime, timedelta
13
  import threading
14
+ import os
15
+ import base64
16
+ import hashlib
17
+ from pydantic import BaseModel, Field
18
+ from pymongo import AsyncMongoClient
19
+ from passlib.context import CryptContext
20
+ from jose import JWTError, jwt
21
+ from dotenv import load_dotenv
22
+ from pathlib import Path
23
+ import shutil
24
+ import uuid
25
 
26
  from services.single_tracker import SingleTracker
27
  from services.multi_tracker import MultiTracker
28
+ from services.face_recognition import FaceRecognitionService
29
+ from services.audio_processing import AudioProcessor
30
+
31
+ # Load environment variables from .env file
32
+ load_dotenv()
33
 
34
  # Configure logging
35
  logging.basicConfig(level=logging.INFO)
 
65
  allow_headers=["*"],
66
  )
67
 
68
+ # Initialize trackers and services
69
+ MODEL_DIR = Path(__file__).parent.parent / "Model"
70
  single_tracker = SingleTracker()
71
  multi_tracker = MultiTracker()
72
+ face_service = FaceRecognitionService(str(MODEL_DIR))
73
+ audio_processor = AudioProcessor(str(MODEL_DIR))
74
+
75
+ # MongoDB state
76
+ mongo_client: AsyncMongoClient | None = None
77
+ users_collection = None
78
+
79
+ pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
80
+
81
+ # JWT Configuration
82
+ SECRET_KEY = os.getenv("JWT_SECRET_KEY", "your-secret-key-change-in-production")
83
+ ALGORITHM = "HS256"
84
+ ACCESS_TOKEN_EXPIRE_MINUTES = 60 * 24 * 7 # 7 days
85
+
86
+ security = HTTPBearer()
87
+
88
+
89
+ class RegisterRequest(BaseModel):
90
+ full_name: str = Field(min_length=2, max_length=80)
91
+ email: str = Field(min_length=5, max_length=254)
92
+ password: str = Field(min_length=8, max_length=128)
93
+
94
+
95
+ class LoginRequest(BaseModel):
96
+ email: str = Field(min_length=5, max_length=254)
97
+ password: str = Field(min_length=8, max_length=128)
98
+
99
+
100
+ class UserPublic(BaseModel):
101
+ id: str
102
+ full_name: str
103
+ email: str
104
+
105
+
106
+ class AuthResponse(BaseModel):
107
+ ok: bool
108
+ message: str
109
+ user: UserPublic
110
+ token: str
111
+
112
+
113
+ def normalize_email(email: str) -> str:
114
+ return email.strip().lower()
115
+
116
+
117
+ def get_password_hash(password: str) -> str:
118
+ # Hash password with SHA256 first to handle any length, then use bcrypt
119
+ password_hash = hashlib.sha256(password.encode('utf-8')).hexdigest()
120
+ return pwd_context.hash(password_hash)
121
+
122
+
123
+ def verify_password(plain_password: str, hashed_password: str) -> bool:
124
+ # Apply same SHA256 transformation before verifying
125
+ password_hash = hashlib.sha256(plain_password.encode('utf-8')).hexdigest()
126
+ return pwd_context.verify(password_hash, hashed_password)
127
+
128
+
129
+ def require_users_collection():
130
+ if users_collection is None:
131
+ raise HTTPException(
132
+ status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
133
+ detail="Database is not initialized yet. Please retry.",
134
+ )
135
+ return users_collection
136
+
137
+
138
+ def create_access_token(data: dict, expires_delta: timedelta | None = None):
139
+ to_encode = data.copy()
140
+ if expires_delta:
141
+ expire = datetime.utcnow() + expires_delta
142
+ else:
143
+ expire = datetime.utcnow() + timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
144
+ to_encode.update({"exp": expire})
145
+ encoded_jwt = jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)
146
+ return encoded_jwt
147
+
148
+
149
+ async def get_current_user(credentials: HTTPAuthorizationCredentials = Depends(security)):
150
+ collection = require_users_collection()
151
+ token = credentials.credentials
152
+
153
+ try:
154
+ payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
155
+ user_id: str = payload.get("sub")
156
+ if user_id is None:
157
+ raise HTTPException(
158
+ status_code=status.HTTP_401_UNAUTHORIZED,
159
+ detail="Invalid authentication credentials",
160
+ )
161
+ except JWTError:
162
+ raise HTTPException(
163
+ status_code=status.HTTP_401_UNAUTHORIZED,
164
+ detail="Invalid or expired token",
165
+ )
166
+
167
+ from bson import ObjectId
168
+ try:
169
+ user_doc = await collection.find_one({"_id": ObjectId(user_id)})
170
+ except:
171
+ raise HTTPException(
172
+ status_code=status.HTTP_401_UNAUTHORIZED,
173
+ detail="User not found",
174
+ )
175
+
176
+ if user_doc is None:
177
+ raise HTTPException(
178
+ status_code=status.HTTP_401_UNAUTHORIZED,
179
+ detail="User not found",
180
+ )
181
+
182
+ return UserPublic(
183
+ id=str(user_doc["_id"]),
184
+ full_name=user_doc["full_name"],
185
+ email=user_doc["email"],
186
+ )
187
 
188
  def decode_binary_image(img_data: bytes):
189
  """Decodes raw JPEG bytes into an OpenCV numpy array."""
 
318
 
319
  @app.on_event("startup")
320
  async def startup_event():
321
+ global mongo_client, users_collection
322
+ mongo_uri = os.getenv("MONGODB_URI", "mongodb://localhost:27017")
323
+ mongo_db_name = os.getenv("MONGODB_DB", "afs")
324
+
325
+ mongo_client = AsyncMongoClient(mongo_uri)
326
+ users_collection = mongo_client[mongo_db_name]["users"]
327
+ await users_collection.create_index("email", unique=True)
328
+ logger.info("Connected to MongoDB and initialized users index.")
329
+
330
  asyncio.create_task(vcam_generator_loop())
331
 
332
+
333
+ @app.on_event("shutdown")
334
+ async def shutdown_event():
335
+ global mongo_client
336
+ if mongo_client is not None:
337
+ mongo_client.close()
338
+ logger.info("MongoDB connection closed.")
339
+
340
+
341
+ @app.post("/auth/register", response_model=AuthResponse)
342
+ async def register(payload: RegisterRequest):
343
+ collection = require_users_collection()
344
+ email = normalize_email(payload.email)
345
+
346
+ existing_user = await collection.find_one({"email": email})
347
+ if existing_user:
348
+ raise HTTPException(
349
+ status_code=status.HTTP_409_CONFLICT,
350
+ detail="An account with this email already exists.",
351
+ )
352
+
353
+ now = datetime.utcnow()
354
+ user_doc = {
355
+ "full_name": payload.full_name.strip(),
356
+ "email": email,
357
+ "password_hash": get_password_hash(payload.password),
358
+ "created_at": now,
359
+ "updated_at": now,
360
+ }
361
+ insert_result = await collection.insert_one(user_doc)
362
+
363
+ user_id = str(insert_result.inserted_id)
364
+ access_token = create_access_token(data={"sub": user_id})
365
+
366
+ return AuthResponse(
367
+ ok=True,
368
+ message="Account created successfully.",
369
+ user=UserPublic(
370
+ id=user_id,
371
+ full_name=user_doc["full_name"],
372
+ email=user_doc["email"],
373
+ ),
374
+ token=access_token,
375
+ )
376
+
377
+
378
+ @app.post("/auth/login", response_model=AuthResponse)
379
+ async def login(payload: LoginRequest):
380
+ collection = require_users_collection()
381
+ email = normalize_email(payload.email)
382
+
383
+ user_doc = await collection.find_one({"email": email})
384
+ if not user_doc or not verify_password(payload.password, user_doc["password_hash"]):
385
+ raise HTTPException(
386
+ status_code=status.HTTP_401_UNAUTHORIZED,
387
+ detail="Invalid email or password.",
388
+ )
389
+
390
+ user_id = str(user_doc["_id"])
391
+ access_token = create_access_token(data={"sub": user_id})
392
+
393
+ return AuthResponse(
394
+ ok=True,
395
+ message="Login successful.",
396
+ user=UserPublic(
397
+ id=user_id,
398
+ full_name=user_doc["full_name"],
399
+ email=user_doc["email"],
400
+ ),
401
+ token=access_token,
402
+ )
403
+
404
+
405
+ @app.get("/auth/verify", response_model=UserPublic)
406
+ async def verify_token(current_user: UserPublic = Depends(get_current_user)):
407
+ """Verify JWT token and return user info"""
408
+ return current_user
409
+
410
  @app.websocket("/ws")
411
  async def websocket_endpoint(websocket: WebSocket):
412
  global is_recording, video_writer, recording_filename, latest_obs_frame, is_obs_active
 
531
  video_writer = None
532
  is_recording = False
533
 
534
+ # === FACE RECOGNITION ENDPOINTS ===
535
+
536
+ @app.post("/api/face/upload-video")
537
+ async def upload_reference_video(
538
+ file: UploadFile = File(...),
539
+ current_user: UserPublic = Depends(get_current_user)
540
+ ):
541
+ """Upload a 360-degree reference video for face recognition training."""
542
+ if not file.filename.endswith(('.mp4', '.avi', '.mov', '.mkv')):
543
+ raise HTTPException(status_code=400, detail="Invalid video format. Use mp4, avi, mov, or mkv")
544
+
545
+ video_path = MODEL_DIR / "my_scan.mp4"
546
+
547
+ try:
548
+ with open(video_path, 'wb') as f:
549
+ shutil.copyfileobj(file.file, f)
550
+
551
+ embeddings, num_frames = await asyncio.get_event_loop().run_in_executor(
552
+ executor, face_service.extract_embeddings_from_video, str(video_path)
553
+ )
554
+
555
+ face_service.save_embeddings_cache(embeddings, str(video_path), num_frames)
556
+
557
+ return {
558
+ "ok": True,
559
+ "message": "Video processed successfully",
560
+ "frames_used": num_frames,
561
+ "embeddings_count": len(embeddings)
562
+ }
563
+ except Exception as e:
564
+ logger.error(f"Error processing video: {e}")
565
+ raise HTTPException(status_code=500, detail=str(e))
566
+
567
+ @app.post("/api/face/upload-image")
568
+ async def upload_reference_image(
569
+ file: UploadFile = File(...),
570
+ current_user: UserPublic = Depends(get_current_user)
571
+ ):
572
+ """Upload a reference image for face recognition."""
573
+ if not file.filename.endswith(('.jpg', '.jpeg', '.png')):
574
+ raise HTTPException(status_code=400, detail="Invalid image format. Use jpg, jpeg, or png")
575
+
576
+ image_path = MODEL_DIR / f"ref_{file.filename}"
577
+
578
+ try:
579
+ with open(image_path, 'wb') as f:
580
+ shutil.copyfileobj(file.file, f)
581
+
582
+ embeddings = await asyncio.get_event_loop().run_in_executor(
583
+ executor, face_service.extract_embeddings_from_image, str(image_path)
584
+ )
585
+
586
+ return {
587
+ "ok": True,
588
+ "message": "Image processed successfully",
589
+ "embeddings_count": len(embeddings),
590
+ "saved_path": str(image_path)
591
+ }
592
+ except Exception as e:
593
+ logger.error(f"Error processing image: {e}")
594
+ raise HTTPException(status_code=500, detail=str(e))
595
+
596
+ @app.get("/api/face/cache-status")
597
+ async def get_cache_status(current_user: UserPublic = Depends(get_current_user)):
598
+ """Get the current face recognition cache status."""
599
+ cache_data = face_service.load_embeddings_cache()
600
+
601
+ if cache_data:
602
+ return {
603
+ "ok": True,
604
+ "cached": True,
605
+ "video_path": cache_data.get('video_path'),
606
+ "model_name": cache_data.get('model_name'),
607
+ "num_frames_used": cache_data.get('num_frames_used'),
608
+ "version": cache_data.get('version')
609
+ }
610
+ else:
611
+ return {
612
+ "ok": True,
613
+ "cached": False,
614
+ "message": "No cache found. Please upload a reference video or image."
615
+ }
616
+
617
+ # === AUDIO STREAMING ENDPOINTS ===
618
+
619
+ @app.post("/api/audio/start-stream")
620
+ async def start_audio_stream(
621
+ sample_rate: int = Form(16000),
622
+ channels: int = Form(1),
623
+ current_user: UserPublic = Depends(get_current_user)
624
+ ):
625
+ """Start a new audio recording stream."""
626
+ session_id = str(uuid.uuid4())
627
+
628
+ try:
629
+ filename = audio_processor.create_audio_stream(session_id, sample_rate, channels)
630
+ return {
631
+ "ok": True,
632
+ "session_id": session_id,
633
+ "filename": filename,
634
+ "sample_rate": sample_rate,
635
+            "channels": channels
+        }
+    except Exception as e:
+        logger.error(f"Error starting audio stream: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+@app.websocket("/ws/audio/{session_id}")
+async def websocket_audio_stream(websocket: WebSocket, session_id: str):
+    """WebSocket endpoint for streaming audio with angle data."""
+    await websocket.accept()
+    logger.info(f"Audio WebSocket connection established for session {session_id}")
+
+    try:
+        while True:
+            message = await websocket.receive()
+
+            if "bytes" in message:
+                audio_data = message["bytes"]
+                audio_processor.write_audio_chunk(session_id, audio_data)
+                await websocket.send_json({"status": "received", "bytes": len(audio_data)})
+
+            elif "text" in message:
+                try:
+                    payload = json.loads(message["text"])
+
+                    if "audio_data" in payload and "angle" in payload:
+                        audio_bytes = base64.b64decode(payload["audio_data"])
+                        angle = float(payload["angle"])
+                        audio_processor.write_audio_chunk(session_id, audio_bytes, angle)
+                        await websocket.send_json({"status": "received", "angle": angle})
+
+                    elif payload.get("command") == "stop":
+                        audio_processor.close_audio_stream(session_id)
+                        await websocket.send_json({"status": "stopped", "message": "Stream closed"})
+                        break
+
+                except json.JSONDecodeError:
+                    logger.error("Invalid JSON in audio stream")
+
+    except WebSocketDisconnect:
+        logger.info(f"Audio WebSocket client disconnected for session {session_id}")
+        if session_id in audio_processor.active_streams:
+            audio_processor.close_audio_stream(session_id)
+    except Exception as e:
+        logger.error(f"Audio WebSocket error: {e}")
+        if session_id in audio_processor.active_streams:
+            audio_processor.close_audio_stream(session_id)
+
+@app.post("/api/audio/stop-stream/{session_id}")
+async def stop_audio_stream(
+    session_id: str,
+    current_user: UserPublic = Depends(get_current_user)
+):
+    """Stop an active audio recording stream."""
+    try:
+        audio_processor.close_audio_stream(session_id)
+        return {
+            "ok": True,
+            "message": "Audio stream stopped successfully"
+        }
+    except Exception as e:
+        logger.error(f"Error stopping audio stream: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+@app.get("/api/audio/recordings")
+async def list_audio_recordings(current_user: UserPublic = Depends(get_current_user)):
+    """List all audio recordings."""
+    try:
+        recordings = audio_processor.get_audio_files()
+        return {
+            "ok": True,
+            "recordings": recordings,
+            "count": len(recordings)
+        }
+    except Exception as e:
+        logger.error(f"Error listing recordings: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
 if __name__ == "__main__":
     uvicorn.run("server:app", host="0.0.0.0", port=8000, reload=True)
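For reference (not part of the commit), the text-message protocol that `/ws/audio/{session_id}` parses can be sketched from the client side. The helper names below are hypothetical; only the payload shape (base64 `audio_data` plus `angle`, or a `stop` command) comes from the handler above.

```python
import base64
import json

def build_audio_message(audio_bytes: bytes, angle: float) -> str:
    """Build the JSON text frame the endpoint expects: base64-encoded
    PCM audio plus the capture angle in degrees."""
    return json.dumps({
        "audio_data": base64.b64encode(audio_bytes).decode("ascii"),
        "angle": angle,
    })

def build_stop_message() -> str:
    """Ask the server to finalize and close the stream."""
    return json.dumps({"command": "stop"})

# Example: two 16-bit silent samples captured at 45 degrees
msg = build_audio_message(b"\x00\x00\x00\x00", 45.0)
decoded = json.loads(msg)
assert base64.b64decode(decoded["audio_data"]) == b"\x00\x00\x00\x00"
```

A binary WebSocket frame would skip the JSON entirely and hit the `"bytes"` branch, which records audio without angle metadata.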
services/audio_processing.py ADDED
@@ -0,0 +1,69 @@
+import wave
+import numpy as np
+import logging
+from pathlib import Path
+from datetime import datetime
+
+logger = logging.getLogger(__name__)
+
+class AudioProcessor:
+    def __init__(self, model_dir: str):
+        self.model_dir = Path(model_dir)
+        self.audio_dir = self.model_dir / "audio_recordings"
+        self.audio_dir.mkdir(exist_ok=True)
+
+        self.active_streams = {}
+
+    def create_audio_stream(self, session_id: str, sample_rate: int = 16000, channels: int = 1):
+        """Create a new audio recording stream."""
+        timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
+        filename = self.audio_dir / f"audio_{session_id}_{timestamp}.wav"
+        metadata_file = self.audio_dir / f"audio_{session_id}_{timestamp}_metadata.txt"
+
+        wav_file = wave.open(str(filename), 'wb')
+        wav_file.setnchannels(channels)
+        wav_file.setsampwidth(2)  # 16-bit PCM
+        wav_file.setframerate(sample_rate)
+
+        self.active_streams[session_id] = {
+            'wav_file': wav_file,
+            'metadata_file': metadata_file,
+            'metadata_handle': open(metadata_file, 'w'),
+            'sample_rate': sample_rate,
+            'channels': channels,
+            'frame_count': 0
+        }
+
+        logger.info(f"Created audio stream {session_id} -> {filename}")
+        return str(filename)
+
+    def write_audio_chunk(self, session_id: str, audio_data: bytes, angle: float = None):
+        """Write audio chunk with optional angle metadata."""
+        if session_id not in self.active_streams:
+            raise ValueError(f"No active stream for session {session_id}")
+
+        stream = self.active_streams[session_id]
+        stream['wav_file'].writeframes(audio_data)
+
+        if angle is not None:
+            timestamp = stream['frame_count'] / stream['sample_rate']
+            stream['metadata_handle'].write(f"{timestamp:.3f},{angle:.2f}\n")
+
+        stream['frame_count'] += len(audio_data) // (2 * stream['channels'])
+
+    def close_audio_stream(self, session_id: str):
+        """Close and finalize audio stream."""
+        if session_id not in self.active_streams:
+            raise ValueError(f"No active stream for session {session_id}")
+
+        stream = self.active_streams[session_id]
+        stream['wav_file'].close()
+        stream['metadata_handle'].close()
+
+        logger.info(f"Closed audio stream {session_id}")
+        del self.active_streams[session_id]
+
+    def get_audio_files(self):
+        """List all audio recordings."""
+        wav_files = list(self.audio_dir.glob("*.wav"))
+        return [str(f) for f in sorted(wav_files, reverse=True)]
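The frame accounting in `write_audio_chunk` (2 bytes per sample per channel; metadata timestamp = frames written so far / sample rate) can be checked in isolation. This standalone sketch is illustrative, not part of the commit; the helper names are made up.

```python
import io
import wave

def frames_in_chunk(chunk: bytes, channels: int = 1, sample_width: int = 2) -> int:
    """Number of PCM frames in a raw byte chunk (16-bit audio by default)."""
    return len(chunk) // (sample_width * channels)

def timestamp_for(frame_count: int, sample_rate: int = 16000) -> float:
    """Seconds into the recording at which this frame count lands."""
    return frame_count / sample_rate

# Write one second of 16 kHz mono silence to an in-memory WAV and check the math.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    chunk = b"\x00\x00" * 16000  # 16000 frames = 1.0 s
    w.writeframes(chunk)

assert frames_in_chunk(chunk) == 16000
assert timestamp_for(frames_in_chunk(chunk)) == 1.0
```

This is why an angle logged with a chunk lines up with the audio: the metadata row records the stream position *before* the chunk, in seconds.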
services/face_recognition.py ADDED
@@ -0,0 +1,160 @@
+import cv2
+from ultralytics import YOLO
+from deepface import DeepFace
+import numpy as np
+import pickle
+import os
+import logging
+from pathlib import Path
+
+logger = logging.getLogger(__name__)
+
+class FaceRecognitionService:
+    def __init__(self, model_dir: str):
+        self.model_dir = Path(model_dir)
+        self.model_name = "ArcFace"
+        self.detector_model = "yolov8n-face.pt"
+        self.cache_file = self.model_dir / "embeddings_cache.pkl"
+
+        self.num_best_frames = 10
+        self.min_blur_threshold = 10
+        self.blur_weight = 0.6
+        self.frontal_weight = 0.4
+
+    def calculate_blur_score(self, image):
+        """Calculate sharpness using Laplacian variance."""
+        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+        laplacian_var = cv2.Laplacian(gray, cv2.CV_64F).var()
+        return laplacian_var
+
+    def calculate_frontal_score(self, face_data):
+        """Calculate how frontal the face is based on facial area."""
+        try:
+            facial_area = face_data.get('facial_area', {})
+            area = facial_area.get('w', 0) * facial_area.get('h', 0)
+            frontal_score = min(area / 50000.0, 1.0) * 100
+            return frontal_score
+        except Exception:
+            return 50.0
+
+    def extract_embeddings_from_video(self, video_path: str):
+        """Extract high-quality face embeddings from a 360-degree scan video."""
+        logger.info(f"Processing reference video scan: {video_path}")
+        logger.info("Phase 1: Analyzing frame quality...")
+
+        cap_ref = cv2.VideoCapture(video_path)
+        total_frames = int(cap_ref.get(cv2.CAP_PROP_FRAME_COUNT))
+
+        candidate_frames = []
+        frame_idx = 0
+
+        while cap_ref.isOpened():
+            ret, frame = cap_ref.read()
+            if not ret:
+                break
+
+            if frame_idx % 15 == 0:
+                blur_score = self.calculate_blur_score(frame)
+
+                if blur_score <= self.min_blur_threshold:
+                    logger.debug(f"Frame {frame_idx}: Blur={blur_score:.1f} (too blurry, skipped)")
+                else:
+                    try:
+                        face_data = DeepFace.represent(frame, model_name=self.model_name, enforce_detection=True)[0]
+                        frontal_score = self.calculate_frontal_score(face_data)
+                        quality_score = (blur_score * self.blur_weight) + (frontal_score * self.frontal_weight)
+
+                        candidate_frames.append({
+                            'frame_idx': frame_idx,
+                            'frame': frame.copy(),
+                            'blur_score': blur_score,
+                            'frontal_score': frontal_score,
+                            'quality_score': quality_score,
+                            'embedding': face_data['embedding']
+                        })
+
+                        logger.debug(f"Frame {frame_idx}: Quality={quality_score:.1f}")
+                    except Exception as e:
+                        logger.debug(f"Frame {frame_idx}: No face detected - {e}")
+
+            frame_idx += 1
+
+        cap_ref.release()
+
+        if not candidate_frames:
+            raise ValueError("No valid frames found in video")
+
+        logger.info(f"Phase 2: Selecting top {self.num_best_frames} frames with temporal spacing...")
+        candidate_frames.sort(key=lambda x: x['quality_score'], reverse=True)
+
+        # Guard against zero-length segments for very short videos
+        segment_size = max(1, total_frames // self.num_best_frames)
+        selected_frames = []
+
+        for segment_idx in range(self.num_best_frames):
+            segment_start = segment_idx * segment_size
+            segment_end = (segment_idx + 1) * segment_size
+
+            best_in_segment = None
+            best_quality = -1
+
+            for candidate in candidate_frames:
+                if segment_start <= candidate['frame_idx'] < segment_end:
+                    if candidate['quality_score'] > best_quality:
+                        best_quality = candidate['quality_score']
+                        best_in_segment = candidate
+
+            if best_in_segment:
+                selected_frames.append(best_in_segment)
+                logger.debug(f"Segment {segment_idx+1}: Frame {best_in_segment['frame_idx']}")
+
+        if len(selected_frames) < self.num_best_frames:
+            for candidate in candidate_frames:
+                if candidate not in selected_frames:
+                    selected_frames.append(candidate)
+                if len(selected_frames) >= self.num_best_frames:
+                    break
+
+        logger.info(f"Phase 3: Averaging {len(selected_frames)} embeddings...")
+        embeddings_to_average = [frame['embedding'] for frame in selected_frames]
+        master_embedding = np.mean(embeddings_to_average, axis=0).tolist()
+
+        return [master_embedding], len(selected_frames)
+
+    def extract_embeddings_from_image(self, image_path: str):
+        """Extract face embedding from a single image."""
+        try:
+            embedding = DeepFace.represent(img_path=image_path, model_name=self.model_name)[0]["embedding"]
+            return [embedding]
+        except Exception as e:
+            logger.error(f"Could not extract embedding from {image_path}: {e}")
+            raise
+
+    def save_embeddings_cache(self, embeddings, video_path: str, num_frames_used: int):
+        """Save embeddings to cache file."""
+        cache_data = {
+            'video_path': video_path,
+            'video_mtime': os.path.getmtime(video_path) if os.path.exists(video_path) else None,
+            'model_name': self.model_name,
+            'embeddings': embeddings,
+            'version': 2,
+            'num_frames_used': num_frames_used
+        }
+
+        with open(self.cache_file, 'wb') as f:
+            pickle.dump(cache_data, f)
+
+        logger.info(f"Saved embeddings cache to {self.cache_file}")
+
+    def load_embeddings_cache(self):
+        """Load embeddings from cache file."""
+        if not os.path.exists(self.cache_file):
+            return None
+
+        try:
+            with open(self.cache_file, 'rb') as f:
+                cache_data = pickle.load(f)
+            return cache_data
+        except Exception as e:
+            logger.error(f"Could not load cache: {e}")
+            return None
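The Phase 2 temporal-spacing selection above can be illustrated without OpenCV or DeepFace. This standalone sketch (a hypothetical helper, not part of the commit) applies the same segment-then-top-up logic to plain dicts carrying only `frame_idx` and `quality_score`.

```python
def select_spaced_frames(candidates, total_frames, num_best):
    """Pick the highest-quality candidate from each equal-length segment of
    the video, so selections are spread across the full 360-degree rotation."""
    segment_size = max(1, total_frames // num_best)
    selected = []
    for seg in range(num_best):
        start, end = seg * segment_size, (seg + 1) * segment_size
        in_segment = [c for c in candidates if start <= c["frame_idx"] < end]
        if in_segment:
            selected.append(max(in_segment, key=lambda c: c["quality_score"]))
    # Top up from the global quality ranking if some segments were empty.
    for c in sorted(candidates, key=lambda c: c["quality_score"], reverse=True):
        if len(selected) >= num_best:
            break
        if c not in selected:
            selected.append(c)
    return selected

# Synthetic candidates: one every 15 frames, pseudo-random quality scores
cands = [{"frame_idx": i, "quality_score": (i * 37) % 100} for i in range(0, 300, 15)]
picked = select_spaced_frames(cands, total_frames=300, num_best=4)
assert len(picked) == 4
```

The point of the segmentation is diversity: a purely global top-k would tend to cluster on a few adjacent sharp frames, while per-segment winners cover distinct head poses before the embeddings are averaged.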
test/test_mongodb.py ADDED
@@ -0,0 +1,38 @@
+"""Test MongoDB connection to verify configuration"""
+import os
+import asyncio
+from dotenv import load_dotenv
+from pymongo import AsyncMongoClient
+
+async def test_mongodb_connection():
+    # Load environment variables
+    load_dotenv()
+
+    mongo_uri = os.getenv("MONGODB_URI", "mongodb://localhost:27017")
+    mongo_db_name = os.getenv("MONGODB_DB", "afs")
+
+    print("Testing MongoDB connection...")
+    print(f"URI: {mongo_uri[:20]}... (truncated for security)")
+    print(f"Database: {mongo_db_name}")
+
+    try:
+        client = AsyncMongoClient(mongo_uri)
+        # Test the connection
+        await client.admin.command('ping')
+        print("✓ Successfully connected to MongoDB!")
+
+        # Test database access
+        db = client[mongo_db_name]
+        collections = await db.list_collection_names()
+        print(f"✓ Database '{mongo_db_name}' accessible")
+        print(f"  Collections: {collections if collections else '(none)'}")
+
+        await client.close()
+        print("✓ Connection closed successfully")
+        return True
+    except Exception as e:
+        print(f"✗ Connection failed: {e}")
+        return False
+
+if __name__ == "__main__":
+    asyncio.run(test_mongodb_connection())
test/test_ndi.py ADDED
@@ -0,0 +1,21 @@
+from ctypes import cdll, c_int
+import sys
+
+print("Loading NDI dylib directly...")
+try:
+    libndi = cdll.LoadLibrary("/usr/local/lib/libndi.dylib")
+    print("Found NDI in /usr/local/lib")
+except OSError:
+    try:
+        libndi = cdll.LoadLibrary("/Library/NDI SDK for Apple/lib/macOS/libndi.dylib")
+        print("Found NDI in SDK path")
+    except OSError as e:
+        print("NDI NOT FOUND:", e)
+        sys.exit(1)
+
+# Make sure basic initialization works
+libndi.NDIlib_initialize.restype = c_int
+result = libndi.NDIlib_initialize()
+print(f"NDIlib_initialize returned: {result}")
+if result:
+    print("NDI initialized successfully via ctypes bindings!")
test/test_vcam.py ADDED
@@ -0,0 +1,9 @@
+import pyvirtualcam
+import numpy as np
+
+try:
+    # pyvirtualcam defaults to RGB frames, so the test frame must be (height, width, 3)
+    with pyvirtualcam.Camera(width=1280, height=720, fps=30) as cam:
+        print(f"Virtual camera started: {cam.device} ({cam.width}x{cam.height} @ {cam.fps}fps)")
+        cam.send(np.zeros((720, 1280, 3), np.uint8))
+except Exception as e:
+    print(f"FAILED VCAM: {e}")
test/test_writer.py ADDED
@@ -0,0 +1,12 @@
+import cv2
+import numpy as np
+try:
+    fourcc = cv2.VideoWriter_fourcc(*'avc1')
+    out = cv2.VideoWriter('test_avc1.mp4', fourcc, 5.0, (640, 480))
+    for i in range(10):
+        frame = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
+        out.write(frame)
+    out.release()
+    print("avc1 SUCCESS")
+except Exception as e:
+    print(f"FAILED: {e}")