MacBook pro committed on
Commit
ba8225a
·
1 Parent(s): 32226d2

Pivot: remove GFPGAN + reenactment stack; CodeFormer-only enhancement, purge legacy files, update docs & downloader

.github/copilot-instructions.md ADDED
@@ -0,0 +1,24 @@
1
+ Prime Directive:
2
+ Deliver production-ready, systemic solutions to root causes. Prioritize core utility and absolute system integrity. There is zero tolerance for surface patches, brittle fixes, or non-functional code.
3
+ Mandatory Protocol:
4
+ Map the System: Before acting, map all relevant logic flows, data transformations, and dependencies. Identify all side effects.
5
+ Isolate Root Cause: Diagnose the fundamental issue with code-based evidence. Ensure the fix is systemic and permanent.
6
+ Align with Utility: Every change must advance the project's core objective. Reject low-impact optimizations.
7
+ Implementation Mandates:
8
+ Code Integrity: All code must be robust, generalizable, and directly executable. Prohibit all hardcoding, duplicated functionality, and placeholder logic.
9
+ Quality & Security: Enforce static typing, descriptive naming, and strict linting. Validate all I/O, eliminate unsafe calls, and add regression guards.
10
+ Testing: Test coverage must target both the symptom and its root cause. The full test suite must pass without warnings.
11
+ Execution Workflow:
12
+ Analyze system flow.
13
+ Confirm root cause.
14
+ Plan solution.
15
+ Implement the robust fix.
16
+ Validate with all tests.
17
+ Document systemic insights.
18
+
19
+ Project: Implements an AI avatar by streaming a user's local audio and video to a Hugging Face GPU server for immediate processing. In the cloud, the system performs simultaneous generative face swapping—animating a source image's identity with the user's live motion—and real-time voice conversion, which morphs the user's speech to a target profile while preserving the original prosody. The fully synchronized audio-visual output is then streamed back to the local machine, functioning as an integrated virtual camera and microphone for seamless use in communication platforms like Zoom and WhatsApp.
20
+
21
+ Operational instructions:
22
+ - All implementations must be architected for the Hugging Face Space located at https://huggingface.co/spaces/Islamckennon/mirage
23
+ - After every change, push to GitHub and Hugging Face, then await user feedback for next steps.
24
+ - All code must be architected toward the project's real-world functionality only.
.github/instructions.md ADDED
@@ -0,0 +1,17 @@
1
+ Prime Directive:
2
+ Deliver production-ready, systemic solutions to root causes. Prioritize core utility and absolute system integrity. There is zero tolerance for surface patches, brittle fixes, or non-functional code.
3
+ Mandatory Protocol:
4
+ Map the System: Before acting, map all relevant logic flows, data transformations, and dependencies. Identify all side effects.
5
+ Isolate Root Cause: Diagnose the fundamental issue with code-based evidence. Ensure the fix is systemic and permanent.
6
+ Align with Utility: Every change must advance the project's core objective. Reject low-impact optimizations.
7
+ Implementation Mandates:
8
+ Code Integrity: All code must be robust, generalizable, and directly executable. Prohibit all hardcoding, duplication, and placeholder logic.
9
+ Quality & Security: Enforce static typing, descriptive naming, and strict linting. Validate all I/O, eliminate unsafe calls, and add regression guards.
10
+ Testing: Test coverage must target both the symptom and its root cause. The full test suite must pass without warnings.
11
+ Execution Workflow:
12
+ Analyze system flow.
13
+ Confirm root cause.
14
+ Plan solution.
15
+ Implement the robust fix.
16
+ Validate with all tests.
17
+ Document systemic insights.
README.md CHANGED
@@ -10,63 +10,61 @@ pinned: false
10
  license: mit
11
  hardware: a10g-large
12
  python_version: "3.10"
13
- models:
14
- - KwaiVGI/LivePortrait
15
- - RVC-Project/Retrieval-based-Voice-Conversion-WebUI
16
  tags:
17
  - real-time
18
  - ai-avatar
19
- - face-animation
20
  - voice-conversion
21
- - live-portrait
22
- - rvc
23
  - virtual-camera
24
- short_description: "Real-time AI avatar with face animation and voice conversion"
25
  ---
26
 
27
  # 🎭 Mirage: Real-time AI Avatar System
28
 
29
- Transform yourself into an AI avatar in real-time with sub-250ms latency! Perfect for video calls, streaming, and virtual meetings.
30
 
31
  ## 🚀 Features
32
 
33
- - **Real-time Face Animation**: Live portrait animation using state-of-the-art AI
34
- - **Voice Conversion**: Real-time voice transformation with RVC
35
- - **Ultra-low Latency**: <250ms end-to-end latency optimized for A10G GPU
36
- - **Virtual Camera**: Direct integration with Zoom, Teams, Discord, and more
37
- - **Adaptive Quality**: Automatic quality adjustment to maintain real-time performance
38
- - **GPU Optimized**: Efficient memory management and CUDA acceleration
 
39
 
40
  ## 🎯 Use Cases
41
 
42
- - **Video Conferencing**: Use AI avatars in Zoom, Google Meet, Microsoft Teams
43
- - **Content Creation**: Streaming with animated avatars on Twitch, YouTube
44
- - **Virtual Meetings**: Professional presentations with consistent avatar appearance
45
- - **Privacy Protection**: Maintain anonymity while participating in video calls
46
 
47
  ## 🛠️ Technology Stack
48
 
49
- - **Face Animation**: LivePortrait (KwaiVGI)
50
- - **Voice Conversion**: RVC (Retrieval-based Voice Conversion)
51
- - **Face Detection**: SCRFD with optimized inference
52
- - **Backend**: FastAPI with WebRTC (aiortc)
53
- - **Frontend**: WebRTC-enabled real-time client
54
- - **GPU**: NVIDIA A10G with CUDA optimization
 
55
 
56
- ## 📊 Performance Specs
57
 
58
- - **Video Resolution**: 512x512 @ 20 FPS (adaptive)
59
- - **Audio Processing**: 160ms chunks @ 16kHz
60
- - **End-to-end Latency**: <250ms target
61
- - **GPU Memory**: ~8GB peak usage on A10G
62
- - **Face Detection**: SCRFD every 5 frames for efficiency
63
 
64
- ## 🚀 Quick Start
65
 
66
- 1. **Initialize Pipeline**: Click "Initialize AI Pipeline" to load models
67
- 2. **Set Reference**: Upload your reference image for avatar creation
68
- 3. **Start Capture**: Begin real-time avatar generation
69
- 4. **Enable Virtual Camera**: Use avatar output in third-party apps
 
 
70
 
71
  ## 🔧 Technical Details
72
 
@@ -76,16 +74,18 @@ Transform yourself into an AI avatar in real-time with sub-250ms latency! Perfec
76
  - GPU memory management and cleanup
77
  - Audio-video synchronization within 150ms
78
 
79
- ### Model Architecture
80
- - **LivePortrait**: Efficient portrait animation with stitching control
81
- - **RVC**: High-quality voice conversion with minimal latency
82
- - **SCRFD**: Fast face detection with confidence thresholding
 
 
83
 
84
  ### Real-time Features
85
- - WebSocket streaming for minimal overhead
86
- - Adaptive resolution (512x512 → 384x384 → 256x256)
87
- - Quality degradation order: Quality → FPS → Resolution
88
- - Automatic recovery when performance improves
89
 
90
  ## 📱 Virtual Camera Integration
91
 
@@ -96,45 +96,81 @@ The system creates a virtual camera device that can be used in:
96
  - **Social Media**: WhatsApp Desktop, Skype, Facebook Messenger
97
  - **Gaming**: Steam, Discord voice channels
98
 
99
- ## ⚡ Performance Monitoring
100
-
101
- Real-time metrics include:
102
- - Video FPS and latency
103
- - GPU memory usage
104
- - Audio processing time
105
- - Frame drop statistics
106
- - System resource utilization
107
 
108
  ## 🔒 Privacy & Security
109
 
110
- - All processing happens locally on the GPU
111
- - No data is stored or transmitted to external servers
112
- - Reference images are processed in memory only
113
- - WebSocket connections use secure protocols
114
 
115
- ## 🔧 Advanced Configuration
116
 
117
- The system automatically adapts quality based on performance:
118
-
119
- - **High Performance**: 512x512 @ 20 FPS, full quality
120
- - **Medium Performance**: 384x384 @ 18 FPS, reduced quality
121
- - **Low Performance**: 256x256 @ 15 FPS, minimum quality
122
 
123
  ## 📋 Requirements
124
 
125
- - **GPU**: NVIDIA A10G or equivalent (RTX 3080+ recommended)
126
- - **Memory**: 16GB+ RAM, 8GB+ VRAM
127
- - **Browser**: Chrome/Edge with WebRTC support
128
- - **Camera**: Any USB webcam or built-in camera
 
129
 
130
- ## 🛠️ Development
131
 
132
- Built with modern technologies:
133
- - FastAPI for high-performance backend (Docker entrypoint: uvicorn original_fastapi_app:app)
134
- - PyTorch with CUDA acceleration
135
- - OpenCV for image processing
136
- - WebRTC (aiortc) for real-time media transport
137
- - Docker for consistent deployment
 
 
138
 
139
  ## 📄 License
140
 
@@ -142,30 +178,19 @@ MIT License - Feel free to use and modify for your projects!
142
 
143
  ## 🙏 Acknowledgments
144
 
145
- - [LivePortrait](https://github.com/KwaiVGI/LivePortrait) for face animation
146
- - [RVC Project](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI) for voice conversion
147
- - [InsightFace](https://github.com/deepinsight/insightface) for face detection
148
- - HuggingFace for providing A10G GPU infrastructure
149
-
150
- ## Metrics Endpoints
151
- - `GET /metrics` – JSON with audio/video counters, EMAs (loop interval, inference), rolling FPS, frame interval EMA.
152
- - `GET /gpu` – GPU availability & memory (torch or `nvidia-smi` fallback).
153
- - `GET /metrics/async` – Async worker stats (frames submitted/processed, queue depth, last latency ms).
154
- - `GET /metrics/stage_histogram` – Histogram buckets of recent inference stage latencies (snapshot window).
155
- - `GET /metrics/motion` – Recent motion magnitudes (normalized) plus tail statistics.
156
- - `GET /metrics/pacing` – Latency EMA and pacing hint multiplier ( >1.0 suggests you can raise FPS, <1.0 suggests throttling ).
157
- - `POST /smoothing/update` – Runtime update of One Euro keypoint smoothing params. JSON body keys: `min_cutoff`, `beta`, `d_cutoff` (all optional floats).
158
-
159
- Example:
160
- ```bash
161
- curl -s http://localhost:7860/metrics | jq '.video_fps_rolling, .audio_infer_time_ema_ms'
162
- ```
163
 
164
  ## Voice Stub Activation
165
- Set `MIRAGE_VOICE_ENABLE=1` to activate the voice processor stub. Behavior:
166
- - Audio chunks are routed through `voice_processor.process_pcm_int16` (pass-through now).
167
- - `audio_infer_time_ema_ms` becomes > 0 after a few chunks.
168
- - When disabled, inference EMA remains 0.0.
169
 
170
  ## Future Parameterization
171
  - Frontend will fetch a `/config` endpoint to align `chunk_ms` and `video_max_fps` dynamically.
@@ -202,10 +227,8 @@ If the Space shows a perpetual "Restarting" badge:
202
 
203
  If problems persist, capture the Container log stack trace and open an issue.
204
 
205
- ## Enable ONNX Model Downloads (Safe LivePortrait)
206
- ## Advanced Real-time Metrics & Control
207
-
208
- New runtime observability & control surfaces were added to tune real-time performance:
209
 
210
  ### Endpoints Recap
211
  See Metrics Endpoints section above. Typical usage examples:
@@ -225,57 +248,23 @@ curl -s http://localhost:7860/metrics/motion | jq '.recent_motion[-5:]'
225
  ### Motion Magnitude
226
  Aggregated from per-frame keypoint motion vectors; higher values trigger more frequent face detection to avoid drift. Low motion stretches automatically reduce detection frequency to save compute.
227
 
228
- ### One Euro Smoothing Parameters
229
- You can initialize or override smoothing parameters via environment variables:
230
-
231
- | Variable | Default | Meaning |
232
- |----------|---------|---------|
233
- | `MIRAGE_ONEEURO_MIN_CUTOFF` | 1.0 | Base cutoff frequency controlling overall smoothing strength |
234
- | `MIRAGE_ONEEURO_BETA` | 0.05 | Speed coefficient (higher reduces lag during fast motion) |
235
- | `MIRAGE_ONEEURO_D_CUTOFF` | 1.0 | Derivative cutoff for velocity filtering |
236
-
237
- Runtime adjustments:
238
- ```bash
239
- curl -X POST http://localhost:7860/smoothing/update \
240
- -H 'Content-Type: application/json' \
241
- -d '{"min_cutoff":0.8, "beta":0.07}'
242
- ```
243
- Missing keys leave existing values unchanged. The response echoes the active parameters.
244
 
245
  ### Latency Histogram Snapshots
246
  `/metrics/stage_histogram` exposes periodic snapshots (e.g. every N frames) of stage latency distribution to help identify tail regressions. Use to tune pacing thresholds or decide on model quantization.
247
 
248
- ## Environment Variables Summary (New Additions)
249
-
250
- | Name | Purpose | Default |
251
- |------|---------|---------|
252
- | `MIRAGE_ONEEURO_MIN_CUTOFF` | One Euro base cutoff | 1.0 |
253
- | `MIRAGE_ONEEURO_BETA` | One Euro speed coefficient | 0.05 |
254
- | `MIRAGE_ONEEURO_D_CUTOFF` | One Euro derivative cutoff | 1.0 |
255
-
256
-
257
- To pull LivePortrait ONNX files into the container at runtime and enable the safe animation path:
258
-
259
- 1) Set these Space secrets/variables in the Settings → Variables panel:
260
-
261
- - `MIRAGE_ENABLE_SCRFD=1` (already default in Dockerfile)
262
- - `MIRAGE_ENABLE_LIVEPORTRAIT=1`
263
- - `MIRAGE_DOWNLOAD_MODELS=1`
264
- - `MIRAGE_LP_APPEARANCE_URL=https://huggingface.co/myn0908/Live-Portrait-ONNX/resolve/main/appearance_feature_extractor.onnx`
265
- - `MIRAGE_LP_MOTION_URL=https://huggingface.co/myn0908/Live-Portrait-ONNX/resolve/main/motion_extractor.onnx` (optional)
266
-
267
- 2) Restart the Space. The server will download models in the background on startup, and also sync once when you hit "Initialize AI Pipeline".
268
-
269
- 3) Check `/pipeline_status` or the in-UI metrics to see:
270
- - `ai_pipeline.animator_available: true`
271
- - `ai_pipeline.reference_set: true` (after uploading a reference)
272
-
273
- Notes:
274
- - The safe loader uses onnxruntime-gpu if available, otherwise CPU. This path provides a visible transformation placeholder and validates end-to-end integration.
275
- - Keep model URLs only to assets you have permission to download.
276
 
277
- ## Model Weights (Planned Voice Pipeline)
278
- The codebase now contains placeholder directories for upcoming audio feature extraction and conversion models.
279
 
280
  ```
281
  models/
 
10
  license: mit
11
  hardware: a10g-large
12
  python_version: "3.10"
 
 
 
13
  tags:
14
  - real-time
15
  - ai-avatar
16
+ - face-swap
17
  - voice-conversion
 
 
18
  - virtual-camera
19
+ short_description: "Real-time AI avatar with face swap + voice conversion"
20
  ---
21
 
22
  # 🎭 Mirage: Real-time AI Avatar System
23
 
24
+ Mirage performs real-time identity-preserving face swap plus optional facial enhancement and (stub) voice conversion, streaming back a virtual camera + microphone feed with a sub-250 ms target latency. Designed for live calls, streaming overlays, and privacy scenarios where you want a consistent alternate appearance.
25
 
26
  ## 🚀 Features
27
 
28
+ - **Real-time Face Swap (InSwapper)**: Identity transfer from a single reference image to your live video.
29
+ - **Enhancement (Optional)**: CodeFormer restoration (fidelity‑controllable) if weights present.
30
+ - **Low Latency WebRTC**: Bi-directional streaming via aiortc (camera + mic) with adaptive frame scaling.
31
+ - **Voice Conversion Stub**: Pluggable path ready for RVC / HuBERT integration (currently pass-through by default).
32
+ - **Virtual Camera**: Output suitable for Zoom, Meet, Discord, OBS (via local virtual camera module).
33
+ - **Model Auto-Provisioning**: Deterministic downloader for required swap + enhancer weights.
34
+ - **Metrics & Health**: JSON endpoints for latency, FPS, GPU memory, and pipeline stats.
35
 
36
  ## 🎯 Use Cases
37
 
38
+ - **Video Conferencing Privacy**: Appear as a consistent alternate identity.
39
+ - **Streaming / VTubing**: Lightweight swap + enhancement pipeline for overlays.
40
+ - **A/B Creative Experiments**: Rapid prototyping of face identity transforms.
41
+ - **Data Minimization**: Keep original face private while communicating.
42
 
43
  ## 🛠️ Technology Stack
44
 
45
+ - **Face Detection & Embedding**: InsightFace `buffalo_l` (SCRFD + embedding).
46
+ - **Face Swap Core**: `inswapper_128_fp16.onnx` (InSwapper) via InsightFace model zoo.
47
+ - **Enhancer (optional)**: CodeFormer 0.1 (fidelity controllable).
48
+ - **Backend**: FastAPI + aiortc (WebRTC) + asyncio.
49
+ - **Metrics**: Custom endpoints (`/metrics`, `/gpu`) with rolling latency/FPS stats.
50
+ - **Downloader**: Atomic, lock-protected model fetcher (`model_downloader.py`).
51
+ - **Frontend**: Minimal WebRTC client (`static/`).
52
 
53
+ ## 📊 Performance Targets
54
 
55
+ - **Processing Window**: <50ms typical swap @ 512px (A10G) w/ single face.
56
+ - **End-to-end Latency Goal**: <250ms (capture → swap → enhancement → return).
57
+ - **Adaptive Scale**: Frames >512px longest side are downscaled before inference.
58
+ - **Enhancement Overhead**: CodeFormer ~18–35ms (A10G, single face, 512px) – approximate; adjust fidelity to trade quality vs latency.
 
59
 
60
+ ## 🚀 Quick Start (Hugging Face Space)
61
 
62
+ 1. Open the Space UI and allow camera/microphone.
63
+ 2. Click **Initialize** to trigger the model download (if not already cached) and pipeline load.
63
+ 3. Upload a clear, front-facing reference image (only the largest face is used).
64
+ 4. Start streaming; swapped frames appear in the preview.
66
+ 5. (Optional) Provide CodeFormer weights (`models/codeformer/codeformer.pth`) for enhancement.
67
+ 6. Use the virtual camera integration locally (if running self-hosted) to broadcast swapped output to Zoom/OBS.
68
 
69
  ## 🔧 Technical Details
70
 
 
74
  - GPU memory management and cleanup
75
  - Audio-video synchronization within 150ms
76
 
77
+ ### Model Flow
78
+ 1. Capture frame optional downscale to <=512 max side
79
+ 2. InsightFace detector+embedding obtains face bboxes + identity vectors
80
+ 3. InSwapper ONNX performs identity replacement using source embedding
81
+ 4. Optional CodeFormer enhancer refines facial region
82
+ 5. Frame returned to WebRTC outbound track
83
 
84
  ### Real-time Features
85
+ - WebRTC (aiortc) low-latency transport.
86
+ - Asynchronous frame processing (background tasks) to avoid blocking capture.
87
+ - Adaptive pre-inference downscale heuristic (cap largest dimension to 512).
88
+ - Metrics-driven latency tracking for dynamic future pacing.
89
 
90
  ## 📱 Virtual Camera Integration
91
 
 
96
  - **Social Media**: WhatsApp Desktop, Skype, Facebook Messenger
97
  - **Gaming**: Steam, Discord voice channels
98
 
99
+ ## ⚡ Metrics & Observability
100
+
101
+ Key endpoints (base URL: running server root):
102
+
103
+ | Endpoint | Description |
104
+ |----------|-------------|
105
+ | `/metrics` | Core video/audio latency & FPS stats |
106
+ | `/gpu` | GPU presence + memory usage (torch / nvidia-smi) |
107
+ | `/webrtc/ping` | WebRTC router availability & TURN status |
108
+ | `/pipeline_status` (if implemented) | High-level pipeline readiness |
109
+
110
+ Pipeline stats (subset) from swap pipeline:
111
+ ```json
112
+ {
113
+ "frames": 240,
114
+ "avg_latency_ms": 42.7,
115
+ "swap_faces_last": 1,
116
+ "enhanced_frames": 180,
117
+ "enhancer": "codeformer",
118
+ "codeformer_fidelity": 0.75,
119
+ "codeformer_loaded": true
120
+ }
121
+ ```
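+
+ A quick way to poll these stats from a script (the field names follow the sample JSON above; the exact nesting returned by `/metrics` may differ per deployment):
+
+ ```python
+ import requests
+
+ stats = requests.get("http://localhost:7860/metrics", timeout=5).json()
+ print(stats.get("avg_latency_ms"), stats.get("frames"), stats.get("enhancer"))
+ ```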
122
 
123
  ## 🔒 Privacy & Security
124
 
125
+ - No reference image persisted to disk (processed in-memory).
126
+ - Only model weights are cached; media frames are transient.
127
+ - Optional API key enforcement via `MIRAGE_API_KEY` + `MIRAGE_REQUIRE_API_KEY=1`.
 
128
 
129
+ ## 🔧 Environment Variables (Face Swap & Enhancers)
130
 
131
+ | Variable | Purpose | Default |
132
+ |----------|---------|---------|
133
+ | `MIRAGE_DOWNLOAD_MODELS` | Auto download required models on startup | `1` |
134
+ | `MIRAGE_INSWAPPER_URL` | Override InSwapper ONNX URL | internal default |
135
+ | `MIRAGE_CODEFORMER_URL` | Override CodeFormer weight URL | 0.1 release |
136
+ | `MIRAGE_CODEFORMER_FIDELITY` | 0.0=more detail recovery, 1.0=preserve input | `0.75` |
137
+ | `MIRAGE_MAX_FACES` | Swap up to N largest faces per frame | `1` |
138
+ | `MIRAGE_CUDA_ONLY` | Restrict ONNX to CUDA EP + CPU fallback | unset |
139
+ | `MIRAGE_API_KEY` | Shared secret for control / TURN token | unset |
140
+ | `MIRAGE_REQUIRE_API_KEY` | Enforce API key if set | `0` |
141
+ | `MIRAGE_TOKEN_TTL` | Signed token lifetime (seconds) | `300` |
142
+ | `MIRAGE_STUN_URLS` | Comma list of STUN servers | Google defaults |
143
+ | `MIRAGE_TURN_URL` | TURN URI(s) (comma separated) | unset |
144
+ | `MIRAGE_TURN_USER` | TURN username | unset |
145
+ | `MIRAGE_TURN_PASS` | TURN credential | unset |
146
+ | `MIRAGE_FORCE_RELAY` | Force relay-only traffic | `0` |
147
+ | `MIRAGE_TURN_TLS_ONLY` | Filter TURN to TLS/TCP | `1` |
148
+ | `MIRAGE_PREFER_H264` | Prefer H264 codec in SDP munging | `0` |
149
+ | `MIRAGE_VOICE_ENABLE` | Enable voice processor stub | `0` |
150
+
151
+ CodeFormer fidelity example:
152
+ ```bash
153
+ MIRAGE_CODEFORMER_FIDELITY=0.6
154
+ ```
155
 
156
  ## 📋 Requirements
157
 
158
+ - **GPU**: NVIDIA (Ampere+ recommended). CPU-only will be extremely slow.
159
+ - **VRAM**: ~3–4GB baseline (swap + detector) + optional enhancer overhead.
160
+ - **RAM**: 8GB+ (12–16GB recommended for multitasking).
161
+ - **Browser**: Chromium-based / Firefox with WebRTC.
162
+ - **Reference Image**: Clear, frontal, good lighting, minimal occlusions.
163
 
164
+ ## 🛠️ Development / Running Locally
165
 
166
+ Download models & start server:
167
+ ```bash
168
+ python model_downloader.py  # or set MIRAGE_DOWNLOAD_MODELS=1 and let startup handle it
169
+ uvicorn app:app --port 7860 --host 0.0.0.0
170
+ ```
171
+ Open the browser client at `http://localhost:7860`.
172
+
173
+ Set a reference image via the UI (Base64 upload path), then begin a WebRTC session. Inspect `/metrics` for swap latency and `/webrtc/debug_state` for connection internals.
174
 
175
  ## 📄 License
176
 
 
178
 
179
  ## 🙏 Acknowledgments
180
 
181
+ - [InsightFace](https://github.com/deepinsight/insightface) (detection + swap)
182
+ - [CodeFormer](https://github.com/sczhou/CodeFormer) (fidelity-controllable enhancement)
183
+ - Hugging Face (inference infra)
184
+
185
+ ## Metrics Endpoints (Current Subset)
186
+ - `GET /metrics`
187
+ - `GET /gpu`
188
+ - `GET /webrtc/ping`
189
+ - `GET /webrtc/debug_state`
190
+ - (Legacy endpoints referenced in SPEC may be pruned in future refactors.)
191
 
192
  ## Voice Stub Activation
193
+ Set `MIRAGE_VOICE_ENABLE=1` to route audio through the placeholder voice processor. Current behavior is pass‑through while preserving structural hooks for future RVC model integration.
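+
+ The stub amounts to an identity transform on PCM chunks; a sketch (the `process_pcm_int16` name follows the legacy README and is illustrative):
+
+ ```python
+ import numpy as np
+
+ def process_pcm_int16(chunk: np.ndarray) -> np.ndarray:
+     """Pass-through until a real voice-conversion model is wired in."""
+     return chunk
+ ```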
 
 
 
194
 
195
  ## Future Parameterization
196
  - Frontend will fetch a `/config` endpoint to align `chunk_ms` and `video_max_fps` dynamically.
 
227
 
228
  If problems persist, capture the Container log stack trace and open an issue.
229
 
230
+ ## Model Auto-Download
231
+ `model_downloader.py` manages required weights with atomic file locks. It supports overriding sources via env variables and gracefully continues if optional enhancers fail to download.
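+
+ Programmatic use mirrors the module's `__main__` block:
+
+ ```python
+ from model_downloader import maybe_download
+
+ if not maybe_download():  # False if a required download failed or downloads are disabled
+     raise SystemExit("required model downloads failed")
+ ```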
 
 
232
 
233
  ### Endpoints Recap
234
  See Metrics Endpoints section above. Typical usage examples:
 
248
  ### Motion Magnitude
249
  Aggregated from per-frame keypoint motion vectors; higher values trigger more frequent face detection to avoid drift. Low motion stretches automatically reduce detection frequency to save compute.
250
 
251
+ ### Enhancer Fidelity (CodeFormer)
252
+ Fidelity weight (`w`):
253
+ - Lower (e.g. 0.3–0.5): More aggressive restoration, may alter identity details.
254
+ - Higher (0.7–0.9): Preserve more original swapped structure, less smoothing.
255
+ Tune with `MIRAGE_CODEFORMER_FIDELITY`.
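+
+ The pipeline reads and clamps this value at startup (mirroring `swap_pipeline.py` in this commit):
+
+ ```python
+ import os
+
+ w = float(os.getenv('MIRAGE_CODEFORMER_FIDELITY', '0.75'))
+ w = min(max(w, 0.0), 1.0)  # clamp to [0, 1]; lower = stronger restoration, higher = closer to input
+ ```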
256
 
257
  ### Latency Histogram Snapshots
258
  `/metrics/stage_histogram` exposes periodic snapshots (e.g. every N frames) of stage latency distribution to help identify tail regressions. Use to tune pacing thresholds or decide on model quantization.
259
 
260
+ ## Security Notes
261
+ If exposing publicly:
262
+ - Set `MIRAGE_API_KEY` and `MIRAGE_REQUIRE_API_KEY=1`.
263
+ - Serve behind TLS (reverse proxy like Caddy / Nginx for certificate management).
264
+ - Optionally restrict TURN server usage or enforce relay only for stricter NAT traversal control.
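+
+ A hypothetical sketch of such an API-key gate (the real check lives in the server code; the dependency name below is illustrative):
+
+ ```python
+ import os
+ from fastapi import Header, HTTPException
+
+ def require_api_key(x_api_key: str | None = Header(default=None)) -> None:
+     if os.getenv("MIRAGE_REQUIRE_API_KEY", "0") == "1":
+         if x_api_key != os.getenv("MIRAGE_API_KEY"):
+             raise HTTPException(status_code=401, detail="invalid API key")
+ ```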
265
 
266
+ ## Planned Voice Pipeline (Future)
267
+ Placeholder directories exist for future real-time voice conversion integration.
268
 
269
  ```
270
  models/
Repair/INCREMENTAL_INTEGRATION.md DELETED
@@ -1,119 +0,0 @@
1
- # Incremental Model Integration Guide
2
-
3
- ## Respecting Your Existing Architecture
4
-
5
- Your team made excellent decisions to avoid wholesale replacement. Here's how to safely integrate AI models:
6
-
7
- ## Phase 1: Add Feature Flags (Zero Risk)
8
-
9
- Add to your environment or startup:
10
- ```bash
11
- # Start with models disabled
12
- export MIRAGE_ENABLE_SCRFD=0
13
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
14
-
15
- # Enable gradually
16
- export MIRAGE_ENABLE_SCRFD=1 # Enable face detection first
17
- export MIRAGE_ENABLE_LIVEPORTRAIT=1 # Enable animation second
18
- ```
19
-
20
- ## Phase 2: Integrate Safe Model Loader
21
-
22
- In your existing `avatar_pipeline.py`, add:
23
- ```python
24
- # At the top
25
- from safe_model_integration import get_safe_model_loader
26
-
27
- class RealTimeAvatarPipeline:
28
- def __init__(self):
29
- # Your existing code...
30
-
31
- # Add safe model loader
32
- self.safe_loader = get_safe_model_loader()
33
-
34
- async def initialize(self):
35
- # Your existing initialization...
36
-
37
- # Add safe model loading
38
- await self.safe_loader.safe_load_scrfd()
39
- await self.safe_loader.safe_load_liveportrait()
40
-
41
- def process_video_frame(self, frame, frame_idx):
42
- # Your existing code...
43
-
44
- # Enhanced face detection (graceful fallback)
45
- bbox = self.safe_loader.safe_detect_face(frame)
46
-
47
- # Enhanced animation (graceful fallback to pass-through)
48
- if self.reference_frame is not None:
49
- result = self.safe_loader.safe_animate_face(self.reference_frame, frame)
50
- else:
51
- result = frame # Keep existing pass-through logic
52
-
53
- return result
54
- ```
55
-
56
- ## Phase 3: Enhanced Metrics (Drop-in)
57
-
58
- In your existing `get_performance_stats()`:
59
- ```python
60
- from enhanced_metrics import enhance_existing_stats
61
-
62
- def get_performance_stats(self):
63
- # Your existing stats collection...
64
- base_stats = {
65
- "models_loaded": self.loaded,
66
- # ... your existing metrics
67
- }
68
-
69
- # Enhance with percentiles
70
- return enhance_existing_stats(base_stats)
71
- ```
72
-
73
- ## Phase 4: Optional Model Download
74
-
75
- When you want models:
76
- ```bash
77
- # Check what's needed
78
- python3 scripts/optional_download_models.py --status
79
-
80
- # Download only when features are enabled
81
- MIRAGE_ENABLE_SCRFD=1 python3 scripts/optional_download_models.py --download-needed
82
- ```
83
-
84
- ## Phase 5: WebRTC Monitoring (Optional)
85
-
86
- In your existing `webrtc_server.py`:
87
- ```python
88
- from webrtc_connection_monitoring import add_connection_monitoring
89
-
90
- # After creating your router
91
- add_connection_monitoring(router, _peer_state)
92
- ```
93
-
94
- ## Validation Steps
95
-
96
- 1. **Feature Flags Off**: System works exactly as before
97
- 2. **SCRFD Enabled**: Face detection works, falls back gracefully
98
- 3. **LivePortrait Enabled**: Animation works, falls back to pass-through
99
- 4. **Metrics Enhanced**: More detailed latency tracking
100
- 5. **Models Optional**: Download only when needed
101
-
102
- ## Rollback Strategy
103
-
104
- At any point:
105
- ```bash
106
- # Disable all features
107
- export MIRAGE_ENABLE_SCRFD=0
108
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
109
-
110
- # System returns to existing pass-through behavior
111
- ```
112
-
113
- This approach:
114
- - ✅ Keeps your token auth intact
115
- - ✅ Preserves existing WebRTC message schema
116
- - ✅ Maintains Docker compatibility
117
- - ✅ Allows gradual rollout with instant rollback
118
- - ✅ No background tasks at import time
119
- - ✅ Compatible with your A10G + CUDA 12.1 setup
Repair/QUICK_CHECKLIST.md DELETED
@@ -1,77 +0,0 @@
1
- # 🚀 QUICK ACTION CHECKLIST
2
-
3
- ## Immediate Actions (Today - 30 minutes)
4
-
5
- ### ✅ Step 1: Add Enhanced Metrics (5 minutes)
6
- ```python
7
- # Copy enhanced_metrics.py to your project
8
- # In your existing avatar_pipeline.py get_performance_stats():
9
- from enhanced_metrics import enhance_existing_stats
10
- return enhance_existing_stats(base_stats)
11
- ```
12
-
13
- ### ✅ Step 2: Add Safe Model Integration (10 minutes)
14
- ```python
15
- # Copy safe_model_integration.py to your project
16
- # In your existing avatar_pipeline.py __init__():
17
- from safe_model_integration import get_safe_model_loader
18
- self.safe_loader = get_safe_model_loader()
19
-
20
- # In your existing initialize():
21
- await self.safe_loader.safe_load_scrfd()
22
- await self.safe_loader.safe_load_liveportrait()
23
-
24
- # In your process_video_frame():
25
- bbox = self.safe_loader.safe_detect_face(frame)
26
- if self.reference_frame is not None:
27
- result = self.safe_loader.safe_animate_face(self.reference_frame, frame)
28
- else:
29
- result = frame
30
- ```
31
-
32
- ### ✅ Step 3: Test with Features Disabled (5 minutes)
33
- ```bash
34
- export MIRAGE_ENABLE_SCRFD=0
35
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
36
- # Verify system works exactly as before
37
- curl /health && curl /metrics
38
- ```
39
-
40
- ### ✅ Step 4: Enable SCRFD Gradually (5 minutes)
41
- ```bash
42
- export MIRAGE_ENABLE_SCRFD=1
43
- # Test face detection
44
- curl -X POST /initialize
45
- curl /metrics # Check for face detection timing
46
- ```
47
-
48
- ### ✅ Step 5: Enable LivePortrait (5 minutes)
49
- ```bash
50
- export MIRAGE_ENABLE_LIVEPORTRAIT=1
51
- # Test animation
52
- curl /metrics # Check for animation timing
53
- ```
54
-
55
- ## Success Indicators
56
-
57
- - [ ] Enhanced metrics show P50/P95 latency percentiles
58
- - [ ] SCRFD=1 enables face detection, fallback works on errors
59
- - [ ] LIVEPORTRAIT=1 enables animation, fallback works on errors
60
- - [ ] System maintains existing pass-through behavior
61
- - [ ] /health endpoint shows models_loaded status
62
- - [ ] Token auth and message schemas unchanged
63
-
64
- ## Instant Rollback
65
- ```bash
66
- export MIRAGE_ENABLE_SCRFD=0
67
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
68
- # Returns to exact previous behavior
69
- ```
70
-
71
- ## Files to Copy
72
- - [x] `safe_model_integration.py` → Your project root
73
- - [x] `enhanced_metrics.py` → Your project root
74
- - [x] `scripts/optional_download_models.py` → Your scripts/ folder
75
- - [x] `webrtc_connection_monitoring.py` → Optional for /webrtc/connections
76
-
77
- **🎯 Total Time: 30 minutes for safe AI integration**
Repair/TARGETED_RECOMMENDATIONS.md DELETED
@@ -1,207 +0,0 @@
1
- # 🎯 TARGETED RECOMMENDATIONS - SAFE AI INTEGRATION
2
-
3
- ## Assessment: Your Dev Team is Absolutely Right
4
-
5
- Your team's analysis shows **excellent engineering judgment**. Wholesale replacement would introduce unnecessary risks to a working system. Here are targeted improvements that respect your architecture:
6
-
7
- ## ✅ IMMEDIATE WINS (Zero Risk)
8
-
9
- ### 1. Enhanced Metrics (Drop-in Compatible)
10
- **File**: `enhanced_metrics.py`
11
- **Integration**: Add to existing `get_performance_stats()`
12
- ```python
13
- from enhanced_metrics import enhance_existing_stats
14
- return enhance_existing_stats(existing_stats)
15
- ```
16
- **Benefits**:
17
- - P50/P95/P99 latency percentiles
18
- - Component-level timing breakdown
19
- - GPU memory monitoring
20
- - **Zero breaking changes**
21
-
22
- ### 2. Feature-Flagged Model Loading
23
- **File**: `safe_model_integration.py`
24
- **Integration**: Import in existing pipeline
25
- ```bash
26
- export MIRAGE_ENABLE_SCRFD=0 # Start disabled
27
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
28
- ```
29
- **Benefits**:
30
- - Graceful fallback to pass-through
31
- - Enable/disable models instantly
32
- - No changes to existing message schemas
33
- - **Complete rollback capability**
34
-
35
- ## 🚀 MEDIUM-TERM ADDITIONS (Low Risk)
36
-
37
- ### 3. Connection Monitoring Endpoint
38
- **File**: `webrtc_connection_monitoring.py`
39
- **Integration**: Add to existing WebRTC router
40
- ```python
41
- add_connection_monitoring(router, _peer_state)
42
- ```
43
- **Benefits**:
44
- - `/webrtc/connections` diagnostic endpoint
45
- - Works with single-peer architecture
46
- - **No auth changes required**
47
-
48
- ### 4. Optional Model Download Utility
49
- **File**: `scripts/optional_download_models.py`
50
- **Usage**: On-demand only (not in Docker build)
51
- ```bash
52
- python3 scripts/optional_download_models.py --status
53
- ```
54
- **Benefits**:
55
- - Download models when features are enabled
56
- - Conservative model list (SCRFD + LivePortrait basics)
57
- - **Not baked into Docker build**
58
-
59
- ## 🎯 RESPECTS YOUR ARCHITECTURE DECISIONS
60
-
61
- ### ✅ What We're NOT Changing
62
- - **Docker Base**: Keep your CUDA 12.1.1 + cuDNN 8 runtime
63
- - **Token Auth**: Preserve your WebRTC authentication system
64
- - **Message Schema**: Keep `image_jpeg_base64` format
65
- - **Entry Point**: Keep your `original_fastapi_app.py`
66
- - **Background Tasks**: No import-time tasks
67
- - **Router Integration**: Keep your existing WebRTC setup
68
-
69
- ### ✅ What We're Safely Adding
70
- - **Feature flags** for gradual AI model rollout
71
- - **Enhanced metrics** for better observability
72
- - **Graceful fallbacks** that maintain pass-through behavior
73
- - **Optional utilities** for model management
74
- - **Diagnostic endpoints** for connection monitoring
75
-
76
- ## 📊 EXPECTED RESULTS WITH SAFE INTEGRATION
77
-
78
- ### Phase 1: Metrics Enhanced (Day 1)
79
- ```
80
- Before: Basic latency averages
81
- After: P50/P95/P99 percentiles + component breakdown
82
- Risk: Zero (pure addition)
83
- ```
84
-
85
- ### Phase 2: SCRFD Enabled (Day 2-3)
86
- ```
87
- Before: No face detection
88
- After: Real face detection with pass-through fallback
89
- Risk: Low (feature flag controlled)
90
- Command: MIRAGE_ENABLE_SCRFD=1
91
- ```
92
-
93
- ### Phase 3: LivePortrait Enabled (Day 4-7)
94
- ```
95
- Before: Pass-through video
96
- After: Real face animation with pass-through fallback
97
- Risk: Low (feature flag controlled)
98
- Command: MIRAGE_ENABLE_LIVEPORTRAIT=1
99
- ```
100
-
101
- ## 🔧 INTEGRATION SEQUENCE
102
-
103
- ### Step 1: Add Enhanced Metrics (5 minutes)
104
- ```python
105
- # In your existing pipeline get_performance_stats()
106
- from enhanced_metrics import enhance_existing_stats
107
- return enhance_existing_stats(base_stats)
108
- ```
109
-
110
- ### Step 2: Add Safe Model Loader (10 minutes)
111
- ```python
112
- # In your existing pipeline __init__()
113
- from safe_model_integration import get_safe_model_loader
114
- self.safe_loader = get_safe_model_loader()
115
-
116
- # In your existing initialize()
117
- await self.safe_loader.safe_load_scrfd()
118
- await self.safe_loader.safe_load_liveportrait()
119
- ```
120
-
121
- ### Step 3: Enable Features Gradually
122
- ```bash
123
- # Test SCRFD first
124
- export MIRAGE_ENABLE_SCRFD=1
125
- # Verify face detection works, fallback to pass-through on errors
126
-
127
- # Test LivePortrait second
128
- export MIRAGE_ENABLE_LIVEPORTRAIT=1
129
- # Verify animation works, fallback to pass-through on errors
130
- ```
131
-
132
- ### Step 4: Monitor and Validate
133
- ```bash
134
- curl /metrics # Check enhanced metrics
135
- curl /webrtc/connections # Check connection status
136
- curl /health # Verify system health
137
- ```
138
-
139
- ## ⚡ INSTANT ROLLBACK STRATEGY
140
-
141
- At any point, disable features:
142
- ```bash
143
- export MIRAGE_ENABLE_SCRFD=0
144
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
145
- # System immediately returns to existing pass-through behavior
146
- ```
147
-
148
- ## 🎉 BENEFITS OF THIS APPROACH
149
-
150
- ### Technical Benefits
151
- - **Zero breaking changes** to existing working code
152
- - **Instant rollback** capability with feature flags
153
- - **Incremental validation** of each AI component
154
- - **Enhanced observability** with detailed metrics
155
- - **Compatible** with your CUDA 12.1 + A10G setup
156
-
157
- ### Business Benefits
158
- - **Reduced risk** of system downtime
159
- - **Faster iteration** with safe feature toggles
160
- - **Better debugging** with component-level metrics
161
- - **Proven stability** before full AI rollout
162
-
163
- ## 📋 FILES PROVIDED
164
-
165
- | File | Purpose | Integration Risk |
166
- |------|---------|------------------|
167
- | `safe_model_integration.py` | Feature-flagged AI models | **Low** - Graceful fallbacks |
168
- | `enhanced_metrics.py` | P50/P95 performance tracking | **Zero** - Pure addition |
169
- | `webrtc_connection_monitoring.py` | Connection diagnostics | **Low** - Read-only endpoint |
170
- | `scripts/optional_download_models.py` | On-demand model utility | **Zero** - Manual use only |
171
- | `INCREMENTAL_INTEGRATION.md` | Step-by-step guide | **Zero** - Documentation |
172
-
173
- ## 🚀 RECOMMENDED NEXT STEPS
174
-
175
- ### Today (30 minutes)
176
- 1. Add `enhanced_metrics.py` to your pipeline
177
- 2. Verify metrics show P50/P95 latencies
178
- 3. Add `safe_model_integration.py` with flags disabled
179
- 4. Test that system works exactly as before
180
-
181
- ### This Week
182
- 1. Enable SCRFD: `MIRAGE_ENABLE_SCRFD=1`
183
- 2. Verify face detection works with fallbacks
184
- 3. Monitor enhanced metrics for performance impact
185
- 4. Enable LivePortrait: `MIRAGE_ENABLE_LIVEPORTRAIT=1`
186
-
187
- ### Next Week
188
- 1. Validate end-to-end AI pipeline performance
189
- 2. Fine-tune model parameters if needed
190
- 3. Consider adding connection monitoring endpoint
191
- 4. Plan gradual rollout to production users
192
-
193
- ---
194
-
195
- ## 🎯 CONCLUSION
196
-
197
- Your team's approach is **architecturally sound**. These targeted improvements provide:
198
-
199
- - ✅ **Real AI model integration** with safety guardrails
200
- - ✅ **Enhanced observability** for performance debugging
201
- - ✅ **Zero risk** to existing stability and auth systems
202
- - ✅ **Instant rollback** capability at any point
203
- - ✅ **Incremental validation** of each component
204
-
205
- **This is the right way to add AI to a production system.**
206
-
207
- Your current working foundation + these safe additions = Production-ready AI avatar system with <200ms latency.
SPEC.md CHANGED
@@ -1,3 +1,44 @@
1
  ## Goals
2
  - End-to-end audio latency < 250 ms (capture -> inference -> playback)
3
  - Video pipeline: 512x512 @ ≥20 FPS target under load
 
1
+ ## Architectural Reassessment (September 2025)
2
+
3
+ The initial implementation adopted a motion-driven portrait reenactment stack (LivePortrait ONNX models + custom alignment & smoothing) which is misaligned with the updated product goal: low-latency real-time face swapping with optional enhancement.
4
+
5
+ ### Misalignment Summary
6
+
7
+ | Target Need | LivePortrait Path | Impact |
8
+ |-------------|-------------------|--------|
9
+ | Direct identity substitution | Motion reenactment of a canonicalized reference | Unnecessary motion keypoint pipeline |
10
+ | Minimal per-frame latency (<80ms) | ~500–600ms generator stages logged | Fails real-time threshold |
11
+ | Simple detector→swap flow | Multi-stage appearance + motion + generator | Complexity & fragile compositing |
12
+ | Artifact cleanup (optional) | No enhancement stage | Lower visual fidelity |
13
+ | Multi-face capability | Single-face canonical reenactment focus | Limits scalability |
14
+
15
+ ### New Model Stack
16
+ 1. Detector / embeddings: insightface FaceAnalysis (buffalo_l pack → SCRFD_10G_KPS + recognition)
17
+ 2. Swapper: inswapper_128_fp16.onnx
18
+ 3. Enhancement (optional):
19
+ - CodeFormer (codeformer.pth) for fidelity‑controllable restoration
20
+
21
+ ### New Processing Loop
22
+ 1. Capture frame
23
+ 2. Detect faces (FaceAnalysis)
24
+ 3. For each target face (top-N): apply InSwapper with pre-extracted source identity
25
+ 4. (Optional) Run CodeFormer enhancer on final composited frame (if weights present)
26
+ 5. Emit frame to WebRTC
27
+
28
+ ### Environment Variables (Video / Enhancer)
29
+ | Variable | Values | Description |
30
+ |----------|--------|-------------|
31
+ | MIRAGE_MAX_FACES | int (default 1) | Swap up to N largest faces |
32
+ | MIRAGE_CODEFORMER_FIDELITY | 0.0–1.0 (default 0.75) | Balance identity (1.0) vs reconstruction sharpness |
33
+ | MIRAGE_INSWAPPER_URL | URL | Override InSwapper model source |
34
+ | MIRAGE_CODEFORMER_URL | URL | Override CodeFormer model source |
35
+
36
+ ### Deprecated / To Remove
37
+ liveportrait_engine.py, avatar_pipeline.py, alignment.py, smoothing.py, realtime_optimizer.py, virtual_camera.py (currently unused), enhanced_metrics.py, landmark_reenactor.py, safe_model_integration.py, debug_mediapipe.py
38
+
39
+ These abstractions are reenactment-specific (appearance feature caching, keypoint smoothing, inverse warp compositing) and will be replaced by a concise `swap_pipeline.py`.
40
+
41
+ ---
42
  ## Goals
43
  - End-to-end audio latency < 250 ms (capture -> inference -> playback)
44
  - Video pipeline: 512x512 @ ≥20 FPS target under load
model_downloader.py CHANGED
@@ -1,18 +1,20 @@
1
- """
2
- Optional model downloader for deterministic builds.
3
- - Controlled with env MIRAGE_DOWNLOAD_MODELS=1
4
- - LivePortrait ONNX URLs controlled via env:
5
- * MIRAGE_LP_APPEARANCE_URL
6
- * MIRAGE_LP_MOTION_URL (required for motion)
7
- * MIRAGE_LP_GENERATOR_URL (optional; enables full neural synthesis)
8
- * MIRAGE_LP_STITCHING_URL (optional; some pipelines include extra stitching stage)
9
- - InsightFace models will still use the package cache; SCRFD will populate on first run.
 
10
 
11
- More robust with retries and alternative download methods (requests, huggingface_hub).
12
  """
13
  import os
14
  import sys
15
  import shutil
 
16
  from pathlib import Path
17
  import time
18
  from typing import Optional
@@ -39,7 +41,8 @@ try:
39
  except Exception:
40
  hf_hub_download = None
41
 
42
- LP_DIR = Path(__file__).parent / 'models' / 'liveportrait'
 
43
  HF_HOME = Path(os.getenv('HF_HOME', Path(__file__).parent / '.cache' / 'huggingface'))
44
  HF_HOME.mkdir(parents=True, exist_ok=True)
45
 
@@ -177,9 +180,9 @@ class _FileLock:
177
 
178
  def _audit(event: str, **extra):
179
  try:
180
- lp_dir = LP_DIR
181
- lp_dir.mkdir(parents=True, exist_ok=True)
182
- audit_path = lp_dir / '_download_audit.jsonl'
183
  payload = {
184
  'ts': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
185
  'event': event,
@@ -193,166 +196,60 @@ def _audit(event: str, **extra):
193
 
194
 
195
  def maybe_download() -> bool:
196
- if os.getenv('MIRAGE_DOWNLOAD_MODELS', '1').lower() not in ('1', 'true', 'yes', 'on'):
197
  print('[downloader] MIRAGE_DOWNLOAD_MODELS disabled')
198
  _audit('disabled')
199
  return False
200
-
201
- app_url = os.getenv('MIRAGE_LP_APPEARANCE_URL')
202
- motion_url = os.getenv('MIRAGE_LP_MOTION_URL')
203
- success = True
204
  _audit('start')
205
-
206
- # Download LivePortrait appearance extractor
207
- if app_url:
208
- dest = LP_DIR / 'appearance_feature_extractor.onnx'
209
- if not dest.exists():
210
- try:
211
- print(f'[downloader] Downloading appearance extractor...')
212
- with _FileLock(dest):
213
- if not dest.exists():
214
- _download(app_url, dest)
215
- converted = _maybe_convert_opset_to_19(dest)
216
- if converted != dest:
217
- try:
218
- shutil.copyfile(converted, dest)
219
- print(f"[downloader] Replaced appearance with opset19: {converted.name}")
220
- except Exception:
221
- pass
222
- print(f'[downloader] Downloaded: {dest}')
223
- _audit('download_ok', model='appearance', path=str(dest))
224
- except Exception as e:
225
- print(f'[downloader] Failed to download appearance extractor: {e}')
226
- _audit('download_error', model='appearance', error=str(e))
227
- success = False
228
- else:
229
- converted = _maybe_convert_opset_to_19(dest)
230
- if converted != dest:
231
- try:
232
- shutil.copyfile(converted, dest)
233
- print(f"[downloader] Updated cached appearance to opset19")
234
- except Exception:
235
- pass
236
- print(f'[downloader] ✅ Appearance extractor already exists: {dest}')
237
- _audit('exists', model='appearance', path=str(dest))
238
-
239
- # Download LivePortrait motion extractor
240
- if motion_url:
241
- dest = LP_DIR / 'motion_extractor.onnx'
242
- if not dest.exists():
243
- try:
244
- print(f'[downloader] Downloading motion extractor...')
245
- with _FileLock(dest):
246
- if not dest.exists():
247
- _download(motion_url, dest)
248
- converted = _maybe_convert_opset_to_19(dest)
249
- if converted != dest:
250
- try:
251
- shutil.copyfile(converted, dest)
252
- print(f"[downloader] Replaced motion with opset19: {converted.name}")
253
- except Exception:
254
- pass
255
- print(f'[downloader] ✅ Downloaded: {dest}')
256
- _audit('download_ok', model='motion', path=str(dest))
257
- except Exception as e:
258
- print(f'[downloader] ❌ Failed to download motion extractor: {e}')
259
- _audit('download_error', model='motion', error=str(e))
260
- success = False
261
- else:
262
- converted = _maybe_convert_opset_to_19(dest)
263
- if converted != dest:
264
- try:
265
- shutil.copyfile(converted, dest)
266
- print(f"[downloader] Updated cached motion to opset19")
267
- except Exception:
268
- pass
269
- print(f'[downloader] ✅ Motion extractor already exists: {dest}')
270
- _audit('exists', model='motion', path=str(dest))
271
-
272
- # Download additional models (generator required in neural-only mode)
273
- generator_url = os.getenv('MIRAGE_LP_GENERATOR_URL')
274
- if generator_url:
275
- dest = LP_DIR / 'generator.onnx'
276
- if not dest.exists():
277
- try:
278
- print(f'[downloader] Downloading generator model...')
279
- with _FileLock(dest):
280
- if not dest.exists():
281
- _download(generator_url, dest)
282
- if not _is_valid_onnx(dest):
283
- print(f"[downloader] ❌ Generator ONNX validation failed for {generator_url}")
284
- try:
285
- dest.unlink()
286
- except Exception:
287
- pass
288
- raise RuntimeError('generator download invalid')
289
- print(f'[downloader] ✅ Downloaded: {dest}')
290
- _audit('download_ok', model='generator', path=str(dest))
291
- except Exception as e:
292
- print(f'[downloader] ❌ Failed to download generator (required): {e}')
293
- _audit('download_error', model='generator', error=str(e))
294
- success = False
295
- else:
296
- if not _is_valid_onnx(dest):
297
- try:
298
- print(f"[downloader] Existing generator is invalid, removing and retrying download")
299
- dest.unlink()
300
- except Exception:
301
- pass
302
- try:
303
- print(f'[downloader] Downloading generator model...')
304
- with _FileLock(dest):
305
- if not dest.exists():
306
- _download(generator_url, dest)
307
- if not _is_valid_onnx(dest):
308
- raise RuntimeError(f'generator invalid after re-download: {generator_url}')
309
- print(f'[downloader] ✅ Downloaded: {dest}')
310
- _audit('download_ok', model='generator', path=str(dest), refreshed=True)
311
- except Exception as e2:
312
- print(f'[downloader] ❌ Failed to refresh invalid generator: {e2}')
313
- _audit('download_error', model='generator', error=str(e2), refreshed=True)
314
- success = False
315
- else:
316
- print(f'[downloader] ✅ Generator already exists: {dest}')
317
- _audit('exists', model='generator', path=str(dest))
318
- # Optional stitching model
319
- stitching_url = os.getenv('MIRAGE_LP_STITCHING_URL')
320
- if stitching_url:
321
- dest = LP_DIR / 'stitching.onnx'
322
- if not dest.exists():
323
- try:
324
- print(f'[downloader] Downloading stitching model...')
325
- _download(stitching_url, dest)
326
- print(f'[downloader] ✅ Downloaded: {dest}')
327
- _audit('download_ok', model='stitching', path=str(dest))
328
- except Exception as e:
329
- print(f'[downloader] ⚠️ Failed to download stitching (optional): {e}')
330
- _audit('download_error', model='stitching', error=str(e))
331
-
332
- # Optional custom ops plugin for GridSample 3D used by some generator variants
333
- grid_plugin_url = os.getenv('MIRAGE_LP_GRID_PLUGIN_URL')
334
- if grid_plugin_url:
335
- dest = LP_DIR / 'libgrid_sample_3d_plugin.so'
336
- if not dest.exists():
337
- try:
338
- print(f'[downloader] Downloading grid sample plugin...')
339
- _download(grid_plugin_url, dest)
340
- print(f'[downloader] ✅ Downloaded: {dest}')
341
- _audit('download_ok', model='grid_plugin', path=str(dest))
342
- except Exception as e:
343
- print(f'[downloader] ⚠️ Failed to download grid plugin (optional): {e}')
344
- _audit('download_error', model='grid_plugin', error=str(e))
345
-
346
  _audit('complete', success=success)
347
  return success
348
 
349
 
350
  if __name__ == '__main__':
351
- """Direct execution for debugging"""
352
- print("=== LivePortrait Model Downloader ===")
353
- success = maybe_download()
354
- if success:
355
- print("✅ All required models downloaded successfully")
356
  else:
357
- print("❌ Some model downloads failed")
358
  sys.exit(1)
 
1
+ """Model downloader for face swap stack (InSwapper + CodeFormer).
2
+
3
+ Environment:
4
+ MIRAGE_DOWNLOAD_MODELS=1|0
5
+ MIRAGE_INSWAPPER_URL (default HF inswapper 128)
6
+ MIRAGE_CODEFORMER_URL (default CodeFormer official release)
7
+
8
+ Models are stored under:
9
+ models/inswapper/inswapper_128_fp16.onnx
10
+ models/codeformer/codeformer.pth
11
 
12
+ Download priority: requests -> huggingface_hub heuristic. Safe across parallel processes via file locks.
13
  """
14
  import os
15
  import sys
16
  import shutil
17
+ import json
18
  from pathlib import Path
19
  import time
20
  from typing import Optional
 
41
  except Exception:
42
  hf_hub_download = None
43
 
44
+ INSWAPPER_DIR = Path(__file__).parent / 'models' / 'inswapper'
45
+ CODEFORMER_DIR = Path(__file__).parent / 'models' / 'codeformer'
46
  HF_HOME = Path(os.getenv('HF_HOME', Path(__file__).parent / '.cache' / 'huggingface'))
47
  HF_HOME.mkdir(parents=True, exist_ok=True)
48
 
 
180
 
181
  def _audit(event: str, **extra):
182
  try:
183
+ audit_dir = Path(__file__).parent / 'models' / '_logs'
184
+ audit_dir.mkdir(parents=True, exist_ok=True)
185
+ audit_path = audit_dir / 'download_audit.jsonl'
186
  payload = {
187
  'ts': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
188
  'event': event,
 
196
 
197
 
198
  def maybe_download() -> bool:
199
+ if os.getenv('MIRAGE_DOWNLOAD_MODELS', '1').lower() not in ('1','true','yes','on'):
200
  print('[downloader] MIRAGE_DOWNLOAD_MODELS disabled')
201
  _audit('disabled')
202
  return False
 
 
 
 
203
  _audit('start')
204
+ success = True
205
+
206
+ inswapper_url = os.getenv('MIRAGE_INSWAPPER_URL', 'https://huggingface.co/deepinsight/inswapper/resolve/main/inswapper_128_fp16.onnx')
207
+ codeformer_url = os.getenv('MIRAGE_CODEFORMER_URL', 'https://github.com/TencentARC/CodeFormer/releases/download/v0.1.0/codeformer.pth')
208
+
209
+ # InSwapper
210
+ inswapper_dest = INSWAPPER_DIR / 'inswapper_128_fp16.onnx'
211
+ if not inswapper_dest.exists():
212
+ try:
213
+ print('[downloader] Downloading InSwapper model...')
214
+ with _FileLock(inswapper_dest):
215
+ if not inswapper_dest.exists():
216
+ _download(inswapper_url, inswapper_dest)
217
+ print(f'[downloader] ✅ InSwapper ready: {inswapper_dest}')
218
+ _audit('download_ok', model='inswapper', path=str(inswapper_dest))
219
+ except Exception as e:
220
+ print(f'[downloader] ❌ InSwapper download failed: {e}')
221
+ _audit('download_error', model='inswapper', error=str(e))
222
+ success = False
223
+ else:
224
+ print(f'[downloader] InSwapper exists: {inswapper_dest}')
225
+ _audit('exists', model='inswapper', path=str(inswapper_dest))
226
+
227
+ # CodeFormer (optional)
228
+ codef_dest = CODEFORMER_DIR / 'codeformer.pth'
229
+ if not codef_dest.exists():
230
+ try:
231
+ print('[downloader] Downloading CodeFormer model...')
232
+ with _FileLock(codef_dest):
233
+ if not codef_dest.exists():
234
+ _download(codeformer_url, codef_dest)
235
+ print(f'[downloader] ✅ CodeFormer ready: {codef_dest}')
236
+ _audit('download_ok', model='codeformer', path=str(codef_dest))
237
+ except Exception as e:
238
+ print(f'[downloader] ⚠️ CodeFormer download failed (continuing): {e}')
239
+ _audit('download_error', model='codeformer', error=str(e))
240
+ else:
241
+ print(f'[downloader] CodeFormer exists: {codef_dest}')
242
+ _audit('exists', model='codeformer', path=str(codef_dest))
243
+
244
  _audit('complete', success=success)
245
  return success
246
 
247
 
248
  if __name__ == '__main__':
249
+ print("=== Model Downloader (InSwapper + CodeFormer) ===")
250
+ ok = maybe_download()
251
+ if ok:
252
+ print("✅ All required models downloaded successfully (some optional)")
 
253
  else:
254
+ print("❌ Some required model downloads failed")
255
  sys.exit(1)
requirements.txt CHANGED
@@ -1,21 +1,14 @@
1
  fastapi==0.104.1
2
  uvicorn[standard]==0.24.0
3
- # Torch packages are installed via Dockerfile with CUDA 11.8 wheels; avoid conflicting pins here
4
  aiortc==1.6.0
5
  websockets==11.0.3
6
  numpy==1.24.4
7
  opencv-python==4.8.1.78
8
  Pillow==10.0.1
9
- librosa==0.10.1
10
- soundfile==0.12.1
11
  insightface==0.7.3
12
- transformers==4.44.2
13
- onnx==1.16.1
14
- # ORT GPU pinned to a CUDA 11.x-friendly build (supports opset 20). 1.18.1 has CUDA 12 wheels too; ensure Docker CUDA base matches.
15
- onnxruntime-gpu==1.18.1
16
- onnxruntime-extensions==0.12.0
17
- huggingface-hub==0.24.5
18
  python-multipart==0.0.9
19
  av==11.0.0
20
  psutil==5.9.8
21
- mediapipe==0.10.7
 
1
  fastapi==0.104.1
2
  uvicorn[standard]==0.24.0
 
3
  aiortc==1.6.0
4
  websockets==11.0.3
5
  numpy==1.24.4
6
  opencv-python==4.8.1.78
7
  Pillow==10.0.1
 
 
8
  insightface==0.7.3
9
+ basicsr==1.4.2
10
+ timm==0.9.12
 
 
 
 
11
  python-multipart==0.0.9
12
  av==11.0.0
13
  psutil==5.9.8
14
+ huggingface-hub==0.24.5
swap_pipeline.py ADDED
@@ -0,0 +1,210 @@
1
+ import os
2
+ import io
3
+ import time
4
+ import logging
5
+ from typing import Optional, Dict, Any, List
6
+
7
+ import numpy as np
8
+ import cv2
9
+ from PIL import Image
10
+
11
+ import insightface  # provides model_zoo access for loading the InSwapper ONNX model
12
+ from insightface.app import FaceAnalysis
13
+
14
+ logger = logging.getLogger(__name__)
15
+
16
+ INSWAPPER_ONNX_PATH = os.path.join('models', 'inswapper', 'inswapper_128_fp16.onnx')
17
+ CODEFORMER_PATH = os.path.join('models', 'codeformer', 'codeformer.pth')
18
+
19
+ class FaceSwapPipeline:
20
+ """Direct face swap + optional enhancement pipeline.
21
+
22
+ Lifecycle:
23
+ 1. initialize() -> loads detector/recognizer (buffalo_l) and inswapper onnx
24
+ 2. set_source_image(image_bytes|np.array) -> extracts source identity face object
25
+ 3. process_frame(frame) -> swap all or top-N faces using source face
26
+ 4. (optional) CodeFormer enhancement (always attempted if model present)
27
+ """
28
+ def __init__(self):
29
+ self.initialized = False
30
+ self.source_face = None
31
+ self.source_img_meta = {}
32
+ # Single enhancer path: CodeFormer (optional)
33
+ self.max_faces = int(os.getenv('MIRAGE_MAX_FACES', '1'))
34
+ self._stats = {
35
+ 'frames': 0,
36
+ 'last_latency_ms': None,
37
+ 'avg_latency_ms': None,
38
+ 'swap_faces_last': 0,
39
+ 'enhanced_frames': 0
40
+ }
41
+ self._lat_hist: List[float] = []
42
+ self.app: Optional[FaceAnalysis] = None
43
+ self.swapper = None
44
+ self.codeformer = None
45
+ self.codeformer_fidelity = float(os.getenv('MIRAGE_CODEFORMER_FIDELITY', '0.75'))
46
+ self.codeformer_loaded = False
47
+
48
+ def initialize(self):
49
+ if self.initialized:
50
+ return True
51
+ providers = None
52
+ try:
53
+ # Let insightface choose; can restrict with env MIRAGE_CUDA_ONLY
54
+ if os.getenv('MIRAGE_CUDA_ONLY'):
55
+ providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
56
+ except Exception:
57
+ providers = None
58
+ self.app = FaceAnalysis(name='buffalo_l', providers=providers)
59
+ self.app.prepare(ctx_id=0, det_size=(640,640))
60
+ # Load swapper
61
+ if not os.path.isfile(INSWAPPER_ONNX_PATH):
62
+ raise FileNotFoundError(f"Missing inswapper model at {INSWAPPER_ONNX_PATH}")
63
+ self.swapper = insightface.model_zoo.get_model(INSWAPPER_ONNX_PATH, providers=providers)
64
+ # Optional CodeFormer enhancer
65
+ try:
66
+ # Probe CodeFormer dependencies up front; any ImportError below disables enhancement
67
+ from basicsr.utils import imwrite # noqa: F401
68
+ from basicsr.archs.rrdbnet_arch import RRDBNet # noqa: F401
69
+ import torch
70
+ from torchvision import transforms # noqa: F401
71
+ from collections import OrderedDict
72
+ # Lazily import the packaged CodeFormer architecture (the model file is expected to be mounted or downloaded)
73
+ if not os.path.isfile(CODEFORMER_PATH):
74
+ logger.warning(f"CodeFormer selected but model file missing: {CODEFORMER_PATH}")
75
+ else:
76
+ # Minimal inline loader (avoid full repo clone)
77
+ from torch import nn
78
+ class CodeFormerWrapper:
79
+ def __init__(self, model_path: str, fidelity: float):
80
+ from codeformer.archs.codeformer_arch import CodeFormer # type: ignore
81
+ self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
82
+ self.net = CodeFormer(dim_embd=512, codebook_size=1024, n_head=8, n_layers=9,
83
+ connect_list=['32','64','128','256']).to(self.device)
84
+ ckpt = torch.load(model_path, map_location='cpu')
85
+ if 'params_ema' in ckpt:
86
+ self.net.load_state_dict(ckpt['params_ema'], strict=False)
87
+ else:
88
+ self.net.load_state_dict(ckpt['state_dict'], strict=False)
89
+ self.net.eval()
90
+ self.fidelity = min(max(fidelity, 0.0), 1.0)
91
+ @torch.no_grad()
92
+ def enhance(self, img_bgr: np.ndarray) -> np.ndarray:
93
+ import torch.nn.functional as F
94
+ img = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
95
+ tensor = torch.from_numpy(img).float().to(self.device) / 255.0
96
+ tensor = tensor.permute(2,0,1).unsqueeze(0)
97
+ # CodeFormer forward expects (B,C,H,W)
98
+ try:
99
+ out = self.net(tensor, w=self.fidelity, adain=True)[0]
100
+ except Exception:
101
+ # Some variants return tuple
102
+ out = self.net(tensor, w=self.fidelity)[0]
103
+ out = (out.clamp(0,1) * 255.0).byte().permute(1,2,0).cpu().numpy()
104
+ return cv2.cvtColor(out, cv2.COLOR_RGB2BGR)
105
+ self.codeformer = CodeFormerWrapper(CODEFORMER_PATH, self.codeformer_fidelity)
106
+ self.codeformer_loaded = True
107
+ logger.info('CodeFormer loaded')
108
+ except Exception as e:
109
+ logger.warning(f"CodeFormer init failed, disabling: {e}")
110
+ self.codeformer = None
111
+ self.initialized = True
112
+ logger.info('FaceSwapPipeline initialized')
113
+ return True
114
+
115
+ def _decode_image(self, data) -> np.ndarray:
116
+ if isinstance(data, bytes):
117
+ arr = np.frombuffer(data, np.uint8)
118
+ img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
119
+ return img
120
+ if isinstance(data, np.ndarray):
121
+ return data
122
+ if hasattr(data, 'read'):
123
+ buff = data.read()
124
+ arr = np.frombuffer(buff, np.uint8)
125
+ return cv2.imdecode(arr, cv2.IMREAD_COLOR)
126
+ raise TypeError('Unsupported image input type')
127
+
128
+ def set_source_image(self, image_input) -> bool:
129
+ if not self.initialized:
130
+ self.initialize()
131
+ img = self._decode_image(image_input)
132
+ if img is None:
133
+ logger.error('Failed to decode source image')
134
+ return False
135
+ faces = self.app.get(img)
136
+ if not faces:
137
+ logger.error('No face detected in source image')
138
+ return False
139
+ # Choose the largest face by bbox area
140
+ def _area(face):
141
+ x1,y1,x2,y2 = face.bbox.astype(int)
142
+ return (x2-x1)*(y2-y1)
143
+ faces.sort(key=_area, reverse=True)
144
+ self.source_face = faces[0]
145
+ self.source_img_meta = {'resolution': img.shape[:2], 'num_faces': len(faces)}
146
+ logger.info('Source face set')
147
+ return True
148
+
149
+ def process_frame(self, frame: np.ndarray) -> np.ndarray:
150
+ if not self.initialized or self.swapper is None or self.app is None or self.source_face is None:
151
+ return frame
152
+ t0 = time.time()
153
+ faces = self.app.get(frame)
154
+ if not faces:
155
+ self._record_latency(time.time() - t0)
156
+ self._stats['swap_faces_last'] = 0
157
+ return frame
158
+ # Sort faces by area and keep top-N
159
+ def _area(face):
160
+ x1,y1,x2,y2 = face.bbox.astype(int)
161
+ return (x2-x1)*(y2-y1)
162
+ faces.sort(key=_area, reverse=True)
163
+ out = frame
164
+ count = 0
165
+ for f in faces[:self.max_faces]:
166
+ try:
167
+ out = self.swapper.get(out, f, self.source_face, paste_back=True)
168
+ count += 1
169
+ except Exception as e:
170
+ logger.debug(f"Swap failed for face: {e}")
171
+ if count > 0 and self.codeformer is not None:
172
+ try:
173
+ out = self.codeformer.enhance(out)
174
+ self._stats['enhanced_frames'] += 1
175
+ except Exception as e:
176
+ logger.debug(f"CodeFormer enhancement failed: {e}")
177
+ self._record_latency(time.time() - t0)
178
+ self._stats['swap_faces_last'] = count
179
+ self._stats['frames'] += 1
180
+ return out
181
+
182
+ def _record_latency(self, dt: float):
183
+ ms = dt * 1000.0
184
+ self._stats['last_latency_ms'] = ms
185
+ self._lat_hist.append(ms)
186
+ if len(self._lat_hist) > 200:
187
+ self._lat_hist.pop(0)
188
+ self._stats['avg_latency_ms'] = float(np.mean(self._lat_hist)) if self._lat_hist else None
189
+
190
+ def get_stats(self) -> Dict[str, Any]:
191
+ return dict(
192
+ self._stats,
193
+ initialized=self.initialized,
194
+ codeformer_fidelity=self.codeformer_fidelity if self.codeformer is not None else None,
195
+ codeformer_loaded=self.codeformer_loaded,
196
+ )
197
+
198
+ # Backwards compatibility for earlier server expecting process_video_frame
199
+ def process_video_frame(self, frame: np.ndarray, frame_idx: int | None = None) -> np.ndarray:
200
+ return self.process_frame(frame)
201
+
202
+ # Singleton access similar to previous pattern
203
+ _pipeline_instance: Optional[FaceSwapPipeline] = None
204
+
205
+ def get_pipeline() -> FaceSwapPipeline:
206
+ global _pipeline_instance
207
+ if _pipeline_instance is None:
208
+ _pipeline_instance = FaceSwapPipeline()
209
+ _pipeline_instance.initialize()
210
+ return _pipeline_instance
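
Taken together with the lifecycle in the class docstring (initialize, set_source_image, process_frame), swap_pipeline.py can be exercised offline before wiring it into WebRTC. The sketch below follows the public API exactly as added above; the image paths are placeholders, and the models are assumed to already sit under models/ as the downloader expects.

# Offline smoke test for FaceSwapPipeline (the image paths are placeholders).
import cv2
from swap_pipeline import get_pipeline

pipeline = get_pipeline()  # loads buffalo_l + inswapper, and CodeFormer if present

with open('source_face.jpg', 'rb') as fh:
    if not pipeline.set_source_image(fh.read()):
        raise SystemExit('no face detected in source image')

frame = cv2.imread('target_frame.jpg')    # BGR frame, as process_frame() expects
swapped = pipeline.process_frame(frame)   # swaps up to MIRAGE_MAX_FACES faces
cv2.imwrite('swapped_frame.jpg', swapped)
print(pipeline.get_stats())               # latency and enhancement counters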
webrtc_server.py CHANGED
@@ -66,8 +66,7 @@ class _PassThroughPipeline:
66
  def initialize(self):
67
  return True
68
 
69
- def set_reference_frame(self, img):
70
- # No-op reference; return False to indicate not used
71
  return False
72
 
73
  def process_video_frame(self, img, frame_idx=None):
@@ -86,10 +85,10 @@ def get_pipeline(): # type: ignore
86
  if _pipeline_singleton is not None:
87
  return _pipeline_singleton
88
  try:
89
- from avatar_pipeline import get_pipeline as _real_get_pipeline
90
  _pipeline_singleton = _real_get_pipeline()
91
  except Exception as e:
92
- logger.error(f"avatar_pipeline unavailable, using pass-through: {e}")
93
  _pipeline_singleton = _PassThroughPipeline()
94
  return _pipeline_singleton
95
 
 
66
  def initialize(self):
67
  return True
68
 
69
+ def set_source_image(self, img):
 
70
  return False
71
 
72
  def process_video_frame(self, img, frame_idx=None):
 
85
  if _pipeline_singleton is not None:
86
  return _pipeline_singleton
87
  try:
88
+ from swap_pipeline import get_pipeline as _real_get_pipeline
89
  _pipeline_singleton = _real_get_pipeline()
90
  except Exception as e:
91
+ logger.error(f"swap_pipeline unavailable, using pass-through: {e}")
92
  _pipeline_singleton = _PassThroughPipeline()
93
  return _pipeline_singleton
94
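
The hunk above renames the stub's set_reference_frame to set_source_image (still returning False) and retargets the lazy import from avatar_pipeline to swap_pipeline, so the server keeps serving pass-through video whenever the real pipeline cannot load. A caller can use that boolean to report which path is active; the handler below is a hypothetical illustration, not code from this commit.

# Hypothetical caller-side check: a False return means either the pass-through
# stub is active or no face was found in the uploaded source image.
from webrtc_server import get_pipeline

def handle_source_upload(image_bytes: bytes) -> dict:
    pipeline = get_pipeline()  # real FaceSwapPipeline or _PassThroughPipeline fallback
    accepted = bool(pipeline.set_source_image(image_bytes))
    return {'accepted': accepted, 'pipeline': type(pipeline).__name__}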