Commit ba8225a (parent: 32226d2), committed by MacBook pro

Pivot: remove GFPGAN + reenactment stack; CodeFormer-only enhancement, purge legacy files, update docs & downloader

Files changed:
- .github/copilot-instructions.md +24 -0
- .github/instructions.md +17 -0
- README.md +133 -144
- Repair/INCREMENTAL_INTEGRATION.md +0 -119
- Repair/QUICK_CHECKLIST.md +0 -77
- Repair/TARGETED_RECOMMENDATIONS.md +0 -207
- SPEC.md +41 -0
- model_downloader.py +63 -166
- requirements.txt +3 -10
- swap_pipeline.py +210 -0
- webrtc_server.py +3 -4
.github/copilot-instructions.md
ADDED
@@ -0,0 +1,24 @@
+Prime Directive:
+Deliver production-ready, systemic solutions to root causes. Prioritize core utility and absolute system integrity. There is zero tolerance for surface patches, brittle fixes, or non-functional code.
+Mandatory Protocol:
+Map the System: Before acting, map all relevant logic flows, data transformations, and dependencies. Identify all side effects.
+Isolate Root Cause: Diagnose the fundamental issue with code-based evidence. Ensure the fix is systemic and permanent.
+Align with Utility: Every change must advance the project's core objective. Reject low-impact optimizations.
+Implementation Mandates:
+Code Integrity: All code must be robust, generalizable, and directly executable. Prohibit all hardcoding, duplicated functionality, and placeholder logic.
+Quality & Security: Enforce static typing, descriptive naming, and strict linting. Validate all I/O, eliminate unsafe calls, and add regression guards.
+Testing: Test coverage must target both the symptom and its root cause. The full test suite must pass without warnings.
+Execution Workflow:
+Analyze system flow.
+Confirm root cause.
+Plan solution.
+Implement the robust fix.
+Validate with all tests.
+Document systemic insights.
+
+Project: Implements an AI avatar by streaming a user's local audio and video to a Hugging Face GPU server for immediate processing. In the cloud, the system performs simultaneous generative face swapping—animating a source image's identity with the user's live motion—and real-time voice conversion, which morphs the user's speech to a target profile while preserving the original prosody. The fully synchronized audio-visual output is then streamed back to the local machine, functioning as an integrated virtual camera and microphone for seamless use in communication platforms like Zoom and WhatsApp.
+
+Operational instructions:
+- All implementations must be architected for the Hugging Face Space located at https://huggingface.co/spaces/Islamckennon/mirage
+- After every change, push to GitHub and Hugging Face, then await user feedback for next steps.
+- All code must be architected toward real-world project functionality only.
.github/instructions.md
ADDED
@@ -0,0 +1,17 @@
+Prime Directive:
+Deliver production-ready, systemic solutions to root causes. Prioritize core utility and absolute system integrity. There is zero tolerance for surface patches, brittle fixes, or non-functional code.
+Mandatory Protocol:
+Map the System: Before acting, map all relevant logic flows, data transformations, and dependencies. Identify all side effects.
+Isolate Root Cause: Diagnose the fundamental issue with code-based evidence. Ensure the fix is systemic and permanent.
+Align with Utility: Every change must advance the project's core objective. Reject low-impact optimizations.
+Implementation Mandates:
+Code Integrity: All code must be robust, generalizable, and directly executable. Prohibit all hardcoding, duplication, and placeholder logic.
+Quality & Security: Enforce static typing, descriptive naming, and strict linting. Validate all I/O, eliminate unsafe calls, and add regression guards.
+Testing: Test coverage must target both the symptom and its root cause. The full test suite must pass without warnings.
+Execution Workflow:
+Analyze system flow.
+Confirm root cause.
+Plan solution.
+Implement the robust fix.
+Validate with all tests.
+Document systemic insights.
README.md
CHANGED
@@ -10,63 +10,61 @@ pinned: false
 license: mit
 hardware: a10g-large
 python_version: "3.10"
-models:
-- KwaiVGI/LivePortrait
-- RVC-Project/Retrieval-based-Voice-Conversion-WebUI
 tags:
 - real-time
 - ai-avatar
-- face-
 - voice-conversion
-- live-portrait
-- rvc
 - virtual-camera
-short_description: "Real-time AI avatar with face
 ---

 # 🎭 Mirage: Real-time AI Avatar System

-

 ## 🚀 Features

-- **Real-time Face
-- **
-- **
-- **
-- **
-- **

 ## 🎯 Use Cases

-- **Video Conferencing**:
-- **
-- **
-- **

 ## 🛠️ Technology Stack

-- **Face
-- **
-- **
-- **Backend**: FastAPI
-- **
-- **

-## 📊 Performance

-- **
-- **
-- **
-- **
-- **Face Detection**: SCRFD every 5 frames for efficiency

-## 🚀 Quick Start

-1.
-2. **
-3.
-4.

 ## 🔧 Technical Details

@@ -76,16 +74,18 @@ Transform yourself into an AI avatar in real-time with sub-250ms latency! Perfec
 - GPU memory management and cleanup
 - Audio-video synchronization within 150ms

-### Model
-
-
-

 ### Real-time Features
--
--
--
--

 ## 📱 Virtual Camera Integration

@@ -96,45 +96,81 @@ The system creates a virtual camera device that can be used in:
 - **Social Media**: WhatsApp Desktop, Skype, Facebook Messenger
 - **Gaming**: Steam, Discord voice channels

-## ⚡
-
-
-
-
-
-
-

 ## 🔒 Privacy & Security

--
--
--
-- WebSocket connections use secure protocols

-## 🔧

-
-
-
-
-

 ## 📋 Requirements

-- **GPU**: NVIDIA
-- **
-- **
-- **

-## 🛠️ Development

-
-
-
-
-
-

 ## 📄 License

@@ -142,30 +178,19 @@ MIT License - Feel free to use and modify for your projects!

 ## 🙏 Acknowledgments

-- [
-- [
--
-
-
-
-- `GET /
-- `GET /
-- `GET /
--
-- `GET /metrics/motion` – Recent motion magnitudes (normalized) plus tail statistics.
-- `GET /metrics/pacing` – Latency EMA and pacing hint multiplier ( >1.0 suggests you can raise FPS, <1.0 suggests throttling ).
-- `POST /smoothing/update` – Runtime update of One Euro keypoint smoothing params. JSON body keys: `min_cutoff`, `beta`, `d_cutoff` (all optional floats).
-
-Example:
-```bash
-curl -s http://localhost:7860/metrics | jq '.video_fps_rolling, .audio_infer_time_ema_ms'
-```

 ## Voice Stub Activation
-Set `MIRAGE_VOICE_ENABLE=1` to
-- Audio chunks are routed through `voice_processor.process_pcm_int16` (pass-through now).
-- `audio_infer_time_ema_ms` becomes > 0 after a few chunks.
-- When disabled, inference EMA remains 0.0.

 ## Future Parameterization
 - Frontend will fetch a `/config` endpoint to align `chunk_ms` and `video_max_fps` dynamically.

@@ -202,10 +227,8 @@ If the Space shows a perpetual "Restarting" badge:

 If problems persist, capture the Container log stack trace and open an issue.

-##
-
-
-New runtime observability & control surfaces were added to tune real-time performance:

 ### Endpoints Recap
 See Metrics Endpoints section above. Typical usage examples:

@@ -225,57 +248,23 @@ curl -s http://localhost:7860/metrics/motion | jq '.recent_motion[-5:]'
 ### Motion Magnitude
 Aggregated from per-frame keypoint motion vectors; higher values trigger more frequent face detection to avoid drift. Low motion stretches automatically reduce detection frequency to save compute.

-###
-
-
-
-
-| `MIRAGE_ONEEURO_MIN_CUTOFF` | 1.0 | Base cutoff frequency controlling overall smoothing strength |
-| `MIRAGE_ONEEURO_BETA` | 0.05 | Speed coefficient (higher reduces lag during fast motion) |
-| `MIRAGE_ONEEURO_D_CUTOFF` | 1.0 | Derivative cutoff for velocity filtering |
-
-Runtime adjustments:
-```bash
-curl -X POST http://localhost:7860/smoothing/update \
-  -H 'Content-Type: application/json' \
-  -d '{"min_cutoff":0.8, "beta":0.07}'
-```
-Missing keys leave existing values unchanged. The response echoes the active parameters.

 ### Latency Histogram Snapshots
 `/metrics/stage_histogram` exposes periodic snapshots (e.g. every N frames) of stage latency distribution to help identify tail regressions. Use to tune pacing thresholds or decide on model quantization.

-##
-
-
-
-
-| `MIRAGE_ONEEURO_BETA` | One Euro speed coefficient | 0.05 |
-| `MIRAGE_ONEEURO_D_CUTOFF` | One Euro derivative cutoff | 1.0 |
-
-
-To pull LivePortrait ONNX files into the container at runtime and enable the safe animation path:
-
-1) Set these Space secrets/variables in the Settings → Variables panel:
-
-- `MIRAGE_ENABLE_SCRFD=1` (already default in Dockerfile)
-- `MIRAGE_ENABLE_LIVEPORTRAIT=1`
-- `MIRAGE_DOWNLOAD_MODELS=1`
-- `MIRAGE_LP_APPEARANCE_URL=https://huggingface.co/myn0908/Live-Portrait-ONNX/resolve/main/appearance_feature_extractor.onnx`
-- `MIRAGE_LP_MOTION_URL=https://huggingface.co/myn0908/Live-Portrait-ONNX/resolve/main/motion_extractor.onnx` (optional)
-
-2) Restart the Space. The server will download models in the background on startup, and also sync once when you hit "Initialize AI Pipeline".
-
-3) Check `/pipeline_status` or the in-UI metrics to see:
-- `ai_pipeline.animator_available: true`
-- `ai_pipeline.reference_set: true` (after uploading a reference)
-
-Notes:
-- The safe loader uses onnxruntime-gpu if available, otherwise CPU. This path provides a visible transformation placeholder and validates end-to-end integration.
-- Keep model URLs only to assets you have permission to download.

-##
-

 ```
 models/
|
|
| 10 |
license: mit
|
| 11 |
hardware: a10g-large
|
| 12 |
python_version: "3.10"
|
|
|
|
|
|
|
|
|
|
| 13 |
tags:
|
| 14 |
- real-time
|
| 15 |
- ai-avatar
|
| 16 |
+
- face-swap
|
| 17 |
- voice-conversion
|
|
|
|
|
|
|
| 18 |
- virtual-camera
|
| 19 |
+
short_description: "Real-time AI avatar with face swap + voice conversion"
|
| 20 |
---
|
| 21 |
|
| 22 |
# 🎭 Mirage: Real-time AI Avatar System
|
| 23 |
|
| 24 |
+
Mirage performs real-time identity-preserving face swap plus optional facial enhancement and (stub) voice conversion, streaming back a virtual camera + microphone feed with sub‑250ms target latency. Designed for live calls, streaming overlays, and privacy where you want a consistent alternate appearance.
|
| 25 |
|
| 26 |
## 🚀 Features
|
| 27 |
|
| 28 |
+
- **Real-time Face Swap (InSwapper)**: Identity transfer from a single reference image to your live video.
|
| 29 |
+
- **Enhancement (Optional)**: CodeFormer restoration (fidelity‑controllable) if weights present.
|
| 30 |
+
- **Low Latency WebRTC**: Bi-directional streaming via aiortc (camera + mic) with adaptive frame scaling.
|
| 31 |
+
- **Voice Conversion Stub**: Pluggable path ready for RVC / HuBERT integration (currently pass-through by default).
|
| 32 |
+
- **Virtual Camera**: Output suitable for Zoom, Meet, Discord, OBS (via local virtual camera module).
|
| 33 |
+
- **Model Auto-Provisioning**: Deterministic downloader for required swap + enhancer weights.
|
| 34 |
+
- **Metrics & Health**: JSON endpoints for latency, FPS, GPU memory, and pipeline stats.
|
| 35 |
|
| 36 |
## 🎯 Use Cases
|
| 37 |
|
| 38 |
+
- **Video Conferencing Privacy**: Appear as a consistent alternate identity.
|
| 39 |
+
- **Streaming / VTubing**: Lightweight swap + enhancement pipeline for overlays.
|
| 40 |
+
- **A/B Creative Experiments**: Rapid prototyping of face identity transforms.
|
| 41 |
+
- **Data Minimization**: Keep original face private while communicating.
|
| 42 |
|
| 43 |
## 🛠️ Technology Stack
|
| 44 |
|
| 45 |
+
- **Face Detection & Embedding**: InsightFace `buffalo_l` (SCRFD + embedding).
|
| 46 |
+
- **Face Swap Core**: `inswapper_128_fp16.onnx` (InSwapper) via InsightFace model zoo.
|
| 47 |
+
- **Enhancer (optional)**: CodeFormer 0.1 (fidelity controllable).
|
| 48 |
+
- **Backend**: FastAPI + aiortc (WebRTC) + asyncio.
|
| 49 |
+
- **Metrics**: Custom endpoints (`/metrics`, `/gpu`) with rolling latency/FPS stats.
|
| 50 |
+
- **Downloader**: Atomic, lock-protected model fetcher (`model_downloader.py`).
|
| 51 |
+
- **Frontend**: Minimal WebRTC client (`static/`).
|
| 52 |
|
| 53 |
+
## 📊 Performance Targets
|
| 54 |
|
| 55 |
+
- **Processing Window**: <50ms typical swap @ 512px (A10G) w/ single face.
|
| 56 |
+
- **End-to-end Latency Goal**: <250ms (capture → swap → enhancement → return).
|
| 57 |
+
- **Adaptive Scale**: Frames >512px longest side are downscaled before inference.
|
| 58 |
+
- **Enhancement Overhead**: CodeFormer ~18–35ms (A10G, single face, 512px) – approximate; adjust fidelity to trade quality vs latency.
|
|
|
|
| 59 |
|
| 60 |
+
## 🚀 Quick Start (Hugging Face Space)
|
| 61 |
|
| 62 |
+
1. Open the Space UI and allow camera/microphone.
|
| 63 |
+
2. Click **Initialize** – triggers model download (if not already cached) & pipeline load.
|
| 64 |
+
3. Upload a clear, front-facing reference image (only largest face is used).
|
| 65 |
+
4. Start streaming – swapped frames appear in the preview.
|
| 66 |
+
5. (Optional) Provide CodeFormer weights (`models/codeformer/codeformer.pth`) for enhancement.
|
| 67 |
+
6. Use the virtual camera integration locally (if running self-hosted) to broadcast swapped output to Zoom/OBS.
|
| 68 |
|
| 69 |
## 🔧 Technical Details
|
| 70 |
|
|
|
|
| 74 |
- GPU memory management and cleanup
|
| 75 |
- Audio-video synchronization within 150ms
|
| 76 |
|
| 77 |
+
### Model Flow
|
| 78 |
+
1. Capture frame → optional downscale to <=512 max side
|
| 79 |
+
2. InsightFace detector+embedding obtains face bboxes + identity vectors
|
| 80 |
+
3. InSwapper ONNX performs identity replacement using source embedding
|
| 81 |
+
4. Optional CodeFormer enhancer refines facial region
|
| 82 |
+
5. Frame returned to WebRTC outbound track
|
| 83 |
|
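A minimal sketch of this flow using InsightFace's public API, assuming the `buffalo_l` pack and the downloaded InSwapper weights; enhancement, downscaling, and WebRTC wiring are omitted, and the reference image path is illustrative:

```python
import cv2
import insightface
from insightface.app import FaceAnalysis

analyzer = FaceAnalysis(name='buffalo_l')        # SCRFD detector + identity embedding
analyzer.prepare(ctx_id=0, det_size=(640, 640))  # ctx_id=0 -> first GPU, -1 -> CPU
swapper = insightface.model_zoo.get_model('models/inswapper/inswapper_128_fp16.onnx')

# One-time: extract the source identity from the reference image
# (assumes at least one face is detected in it).
ref = cv2.imread('reference.jpg')
source_face = max(
    analyzer.get(ref),
    key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]),
)

def swap_frame(frame):
    """Per-frame path: detect target faces, replace identity, return frame."""
    for target_face in analyzer.get(frame):
        frame = swapper.get(frame, target_face, source_face, paste_back=True)
    return frame
```

`paste_back=True` composites the swapped region back into the original frame, which is why no explicit warping step appears in the flow above.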
 ### Real-time Features
+- WebRTC (aiortc) low-latency transport.
+- Asynchronous frame processing (background tasks) to avoid blocking capture.
+- Adaptive pre-inference downscale heuristic (cap largest dimension to 512), sketched below.
+- Metrics-driven latency tracking for dynamic future pacing.

|
| 91 |
|
|
|
|
| 96 |
- **Social Media**: WhatsApp Desktop, Skype, Facebook Messenger
|
| 97 |
- **Gaming**: Steam, Discord voice channels
|
| 98 |
|
| 99 |
+
## ⚡ Metrics & Observability
|
| 100 |
+
|
| 101 |
+
Key endpoints (base URL: running server root):
|
| 102 |
+
|
| 103 |
+
| Endpoint | Description |
|
| 104 |
+
|----------|-------------|
|
| 105 |
+
| `/metrics` | Core video/audio latency & FPS stats |
|
| 106 |
+
| `/gpu` | GPU presence + memory usage (torch / nvidia-smi) |
|
| 107 |
+
| `/webrtc/ping` | WebRTC router availability & TURN status |
|
| 108 |
+
| `/pipeline_status` (if implemented) | High-level pipeline readiness |
|
| 109 |
+
|
| 110 |
+
Pipeline stats (subset) from swap pipeline:
|
| 111 |
+
```json
|
| 112 |
+
{
|
| 113 |
+
"frames": 240,
|
| 114 |
+
"avg_latency_ms": 42.7,
|
| 115 |
+
"swap_faces_last": 1,
|
| 116 |
+
"enhanced_frames": 180,
|
| 117 |
+
"enhancer": "codeformer",
|
| 118 |
+
"codeformer_fidelity": 0.75,
|
| 119 |
+
"codeformer_loaded": true
|
| 120 |
+
}
|
| 121 |
+
```
|
| 122 |
|
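A hypothetical polling loop against these endpoints; it assumes the pipeline stats above are reachable via `/metrics` and that field names match the sample JSON:

```python
import time
import requests

BASE = 'http://localhost:7860'

for _ in range(12):                        # roughly one minute of samples
    stats = requests.get(f'{BASE}/metrics', timeout=5).json()
    print(f"avg_latency_ms={stats.get('avg_latency_ms')} "
          f"frames={stats.get('frames')} "
          f"enhancer={stats.get('enhancer')}")
    time.sleep(5)
```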
 ## 🔒 Privacy & Security

+- No reference image persisted to disk (processed in-memory).
+- Only model weights are cached; media frames are transient.
+- Optional API key enforcement via `MIRAGE_API_KEY` + `MIRAGE_REQUIRE_API_KEY=1`.

+## 🔧 Environment Variables (Face Swap & Enhancers)

+| Variable | Purpose | Default |
+|----------|---------|---------|
+| `MIRAGE_DOWNLOAD_MODELS` | Auto download required models on startup | `1` |
+| `MIRAGE_INSWAPPER_URL` | Override InSwapper ONNX URL | internal default |
+| `MIRAGE_CODEFORMER_URL` | Override CodeFormer weight URL | 0.1 release |
+| `MIRAGE_CODEFORMER_FIDELITY` | 0.0=more detail recovery, 1.0=preserve input | `0.75` |
+| `MIRAGE_MAX_FACES` | Swap up to N largest faces per frame | `1` |
+| `MIRAGE_CUDA_ONLY` | Restrict ONNX to CUDA EP + CPU fallback | unset |
+| `MIRAGE_API_KEY` | Shared secret for control / TURN token | unset |
+| `MIRAGE_REQUIRE_API_KEY` | Enforce API key if set | `0` |
+| `MIRAGE_TOKEN_TTL` | Signed token lifetime (seconds) | `300` |
+| `MIRAGE_STUN_URLS` | Comma list of STUN servers | Google defaults |
+| `MIRAGE_TURN_URL` | TURN URI(s) (comma separated) | unset |
+| `MIRAGE_TURN_USER` | TURN username | unset |
+| `MIRAGE_TURN_PASS` | TURN credential | unset |
+| `MIRAGE_FORCE_RELAY` | Force relay-only traffic | `0` |
+| `MIRAGE_TURN_TLS_ONLY` | Filter TURN to TLS/TCP | `1` |
+| `MIRAGE_PREFER_H264` | Prefer H264 codec in SDP munging | `0` |
+| `MIRAGE_VOICE_ENABLE` | Enable voice processor stub | `0` |
+
+CodeFormer fidelity example (parsing sketched below):
+```bash
+MIRAGE_CODEFORMER_FIDELITY=0.6
+```

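A sketch of how this variable might be read and clamped at startup; only the name, range, and default come from the table above, the clamping behavior itself is an assumption:

```python
import os

def codeformer_fidelity() -> float:
    """Read MIRAGE_CODEFORMER_FIDELITY, falling back to the 0.75 default."""
    try:
        w = float(os.getenv('MIRAGE_CODEFORMER_FIDELITY', '0.75'))
    except ValueError:
        w = 0.75                       # ignore malformed input
    return min(max(w, 0.0), 1.0)       # clamp into the documented 0.0-1.0 range
```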
## 📋 Requirements
|
| 157 |
|
| 158 |
+
- **GPU**: NVIDIA (Ampere+ recommended). CPU-only will be extremely slow.
|
| 159 |
+
- **VRAM**: ~3–4GB baseline (swap + detector) + optional enhancer overhead.
|
| 160 |
+
- **RAM**: 8GB+ (12–16GB recommended for multitasking).
|
| 161 |
+
- **Browser**: Chromium-based / Firefox with WebRTC.
|
| 162 |
+
- **Reference Image**: Clear, frontal, good lighting, minimal occlusions.
|
| 163 |
|
| 164 |
+
## 🛠️ Development / Running Locally
|
| 165 |
|
| 166 |
+
Download models & start server:
|
| 167 |
+
```bash
|
| 168 |
+
python model_downloader.py # or set MIRAGE_DOWNLOAD_MODELS=1 and let startup handle
|
| 169 |
+
uvicorn app:app --port 7860 --host 0.0.0.0
|
| 170 |
+
```
|
| 171 |
+
Open the browser client at `http://localhost:7860`.
|
| 172 |
+
|
| 173 |
+
Set a reference image via UI (Base64 upload path) then begin WebRTC session. Inspect `/metrics` for swap latency and `webrtc/debug_state` for connection internals.
|
| 174 |
|
| 175 |
## 📄 License
|
| 176 |
|
|
|
|
| 178 |
|
| 179 |
## 🙏 Acknowledgments
|
| 180 |
|
| 181 |
+
- [InsightFace](https://github.com/deepinsight/insightface) (detection + swap)
|
| 182 |
+
- [CodeFormer](https://github.com/sczhou/CodeFormer) (fidelity-controllable enhancement)
|
| 183 |
+
- Hugging Face (inference infra)
|
| 184 |
+
|
| 185 |
+
## Metrics Endpoints (Current Subset)
|
| 186 |
+
- `GET /metrics`
|
| 187 |
+
- `GET /gpu`
|
| 188 |
+
- `GET /webrtc/ping`
|
| 189 |
+
- `GET /webrtc/debug_state`
|
| 190 |
+
- (Legacy endpoints referenced in SPEC may be pruned in future refactors.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 191 |
|
| 192 |
## Voice Stub Activation
|
| 193 |
+
Set `MIRAGE_VOICE_ENABLE=1` to route audio through the placeholder voice processor. Current behavior is pass‑through while preserving structural hooks for future RVC model integration.
|
|
|
|
|
|
|
|
|
|
| 194 |
|
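A sketch of the pass-through stub just described; the `process_pcm_int16` name follows the earlier README wording, and the class shape is illustrative rather than the actual module:

```python
import numpy as np

class VoiceProcessor:
    """Pass-through voice stub: int16 PCM in, unchanged int16 PCM out."""

    def __init__(self, enabled: bool = False):
        self.enabled = enabled

    def process_pcm_int16(self, chunk: np.ndarray) -> np.ndarray:
        if not self.enabled:
            return chunk              # disabled: bypass entirely
        # Future hook: resample -> RVC inference -> resample back.
        return chunk                  # pass-through for now
```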
 ## Future Parameterization
 - Frontend will fetch a `/config` endpoint to align `chunk_ms` and `video_max_fps` dynamically.

@@ -202,10 +227,8 @@

 If problems persist, capture the Container log stack trace and open an issue.

+## Model Auto-Download
+`model_downloader.py` manages required weights with atomic file locks. It supports overriding sources via env variables and gracefully continues if optional enhancers fail to download. The locking pattern is sketched below.

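A sketch of the atomic locking idea, assuming a `.lock` sentinel file created with `O_CREAT | O_EXCL` (atomic across processes); the real `_FileLock` may differ in detail:

```python
import os
import time
from pathlib import Path

class FileLock:
    """Cross-process mutual exclusion around a download target."""

    def __init__(self, target: Path):
        self.lock_path = target.with_suffix(target.suffix + '.lock')

    def __enter__(self):
        while True:
            try:
                # O_EXCL makes creation fail if the lock file already exists.
                fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                os.close(fd)
                return self
            except FileExistsError:
                time.sleep(0.1)       # another process holds the lock

    def __exit__(self, *exc):
        try:
            os.unlink(self.lock_path)
        except FileNotFoundError:
            pass
```

Combined with re-checking `dest.exists()` after acquiring the lock (as the downloader does), this avoids duplicate downloads when several workers start at once.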
 ### Endpoints Recap
 See Metrics Endpoints section above. Typical usage examples:

@@ -225,57 +248,23 @@
 ### Motion Magnitude
 Aggregated from per-frame keypoint motion vectors; higher values trigger more frequent face detection to avoid drift. Low motion stretches automatically reduce detection frequency to save compute.

+### Enhancer Fidelity (CodeFormer)
+Fidelity weight (`w`):
+- Lower (e.g. 0.3–0.5): more aggressive restoration; may alter identity details.
+- Higher (0.7–0.9): preserves more of the original swapped structure; less smoothing.
+Tune with `MIRAGE_CODEFORMER_FIDELITY`.

 ### Latency Histogram Snapshots
 `/metrics/stage_histogram` exposes periodic snapshots (e.g. every N frames) of stage latency distribution to help identify tail regressions. Use to tune pacing thresholds or decide on model quantization.

+## Security Notes
+If exposing the server publicly:
+- Set `MIRAGE_API_KEY` and `MIRAGE_REQUIRE_API_KEY=1` (enforcement sketched below).
+- Serve behind TLS (a reverse proxy such as Caddy or Nginx for certificate management).
+- Optionally restrict TURN server usage, or enforce relay-only traffic for stricter NAT traversal control.

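A hypothetical FastAPI dependency enforcing that scheme; the `X-API-Key` header name and the guarded route are assumptions, not taken from the actual server code:

```python
import os
from typing import Optional

from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

def require_api_key(x_api_key: Optional[str] = Header(default=None)) -> None:
    """Reject requests lacking the shared secret when enforcement is enabled."""
    expected = os.getenv('MIRAGE_API_KEY')
    if os.getenv('MIRAGE_REQUIRE_API_KEY', '0') == '1':
        if not expected or x_api_key != expected:
            raise HTTPException(status_code=401, detail='invalid or missing API key')

@app.get('/metrics', dependencies=[Depends(require_api_key)])
def metrics() -> dict:
    return {'frames': 0}  # placeholder payload for the sketch
```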
+## Planned Voice Pipeline (Future)
+Placeholder directories exist for future real-time voice conversion integration.

 ```
 models/
Repair/INCREMENTAL_INTEGRATION.md
DELETED
@@ -1,119 +0,0 @@
-# Incremental Model Integration Guide
-
-## Respecting Your Existing Architecture
-
-Your team made excellent decisions to avoid wholesale replacement. Here's how to safely integrate AI models:
-
-## Phase 1: Add Feature Flags (Zero Risk)
-
-Add to your environment or startup:
-```bash
-# Start with models disabled
-export MIRAGE_ENABLE_SCRFD=0
-export MIRAGE_ENABLE_LIVEPORTRAIT=0
-
-# Enable gradually
-export MIRAGE_ENABLE_SCRFD=1        # Enable face detection first
-export MIRAGE_ENABLE_LIVEPORTRAIT=1 # Enable animation second
-```
-
-## Phase 2: Integrate Safe Model Loader
-
-In your existing `avatar_pipeline.py`, add:
-```python
-# At the top
-from safe_model_integration import get_safe_model_loader
-
-class RealTimeAvatarPipeline:
-    def __init__(self):
-        # Your existing code...
-
-        # Add safe model loader
-        self.safe_loader = get_safe_model_loader()
-
-    async def initialize(self):
-        # Your existing initialization...
-
-        # Add safe model loading
-        await self.safe_loader.safe_load_scrfd()
-        await self.safe_loader.safe_load_liveportrait()
-
-    def process_video_frame(self, frame, frame_idx):
-        # Your existing code...
-
-        # Enhanced face detection (graceful fallback)
-        bbox = self.safe_loader.safe_detect_face(frame)
-
-        # Enhanced animation (graceful fallback to pass-through)
-        if self.reference_frame is not None:
-            result = self.safe_loader.safe_animate_face(self.reference_frame, frame)
-        else:
-            result = frame  # Keep existing pass-through logic
-
-        return result
-```
-
-## Phase 3: Enhanced Metrics (Drop-in)
-
-In your existing `get_performance_stats()`:
-```python
-from enhanced_metrics import enhance_existing_stats
-
-def get_performance_stats(self):
-    # Your existing stats collection...
-    base_stats = {
-        "models_loaded": self.loaded,
-        # ... your existing metrics
-    }
-
-    # Enhance with percentiles
-    return enhance_existing_stats(base_stats)
-```
-
-## Phase 4: Optional Model Download
-
-When you want models:
-```bash
-# Check what's needed
-python3 scripts/optional_download_models.py --status
-
-# Download only when features are enabled
-MIRAGE_ENABLE_SCRFD=1 python3 scripts/optional_download_models.py --download-needed
-```
-
-## Phase 5: WebRTC Monitoring (Optional)
-
-In your existing `webrtc_server.py`:
-```python
-from webrtc_connection_monitoring import add_connection_monitoring
-
-# After creating your router
-add_connection_monitoring(router, _peer_state)
-```
-
-## Validation Steps
-
-1. **Feature Flags Off**: System works exactly as before
-2. **SCRFD Enabled**: Face detection works, falls back gracefully
-3. **LivePortrait Enabled**: Animation works, falls back to pass-through
-4. **Metrics Enhanced**: More detailed latency tracking
-5. **Models Optional**: Download only when needed
-
-## Rollback Strategy
-
-At any point:
-```bash
-# Disable all features
-export MIRAGE_ENABLE_SCRFD=0
-export MIRAGE_ENABLE_LIVEPORTRAIT=0
-
-# System returns to existing pass-through behavior
-```
-
-This approach:
-- ✅ Keeps your token auth intact
-- ✅ Preserves existing WebRTC message schema
-- ✅ Maintains Docker compatibility
-- ✅ Allows gradual rollout with instant rollback
-- ✅ No background tasks at import time
-- ✅ Compatible with your A10G + CUDA 12.1 setup
Repair/QUICK_CHECKLIST.md
DELETED
@@ -1,77 +0,0 @@
-# 🚀 QUICK ACTION CHECKLIST
-
-## Immediate Actions (Today - 30 minutes)
-
-### ✅ Step 1: Add Enhanced Metrics (5 minutes)
-```python
-# Copy enhanced_metrics.py to your project
-# In your existing avatar_pipeline.py get_performance_stats():
-from enhanced_metrics import enhance_existing_stats
-return enhance_existing_stats(base_stats)
-```
-
-### ✅ Step 2: Add Safe Model Integration (10 minutes)
-```python
-# Copy safe_model_integration.py to your project
-# In your existing avatar_pipeline.py __init__():
-from safe_model_integration import get_safe_model_loader
-self.safe_loader = get_safe_model_loader()
-
-# In your existing initialize():
-await self.safe_loader.safe_load_scrfd()
-await self.safe_loader.safe_load_liveportrait()
-
-# In your process_video_frame():
-bbox = self.safe_loader.safe_detect_face(frame)
-if self.reference_frame is not None:
-    result = self.safe_loader.safe_animate_face(self.reference_frame, frame)
-else:
-    result = frame
-```
-
-### ✅ Step 3: Test with Features Disabled (5 minutes)
-```bash
-export MIRAGE_ENABLE_SCRFD=0
-export MIRAGE_ENABLE_LIVEPORTRAIT=0
-# Verify system works exactly as before
-curl /health && curl /metrics
-```
-
-### ✅ Step 4: Enable SCRFD Gradually (5 minutes)
-```bash
-export MIRAGE_ENABLE_SCRFD=1
-# Test face detection
-curl -X POST /initialize
-curl /metrics  # Check for face detection timing
-```
-
-### ✅ Step 5: Enable LivePortrait (5 minutes)
-```bash
-export MIRAGE_ENABLE_LIVEPORTRAIT=1
-# Test animation
-curl /metrics  # Check for animation timing
-```
-
-## Success Indicators
-
-- [ ] Enhanced metrics show P50/P95 latency percentiles
-- [ ] SCRFD=1 enables face detection, fallback works on errors
-- [ ] LIVEPORTRAIT=1 enables animation, fallback works on errors
-- [ ] System maintains existing pass-through behavior
-- [ ] /health endpoint shows models_loaded status
-- [ ] Token auth and message schemas unchanged
-
-## Instant Rollback
-```bash
-export MIRAGE_ENABLE_SCRFD=0
-export MIRAGE_ENABLE_LIVEPORTRAIT=0
-# Returns to exact previous behavior
-```
-
-## Files to Copy
-- [x] `safe_model_integration.py` → Your project root
-- [x] `enhanced_metrics.py` → Your project root
-- [x] `scripts/optional_download_models.py` → Your scripts/ folder
-- [x] `webrtc_connection_monitoring.py` → Optional for /webrtc/connections
-
-**🎯 Total Time: 30 minutes for safe AI integration**
Repair/TARGETED_RECOMMENDATIONS.md
DELETED
@@ -1,207 +0,0 @@
-# 🎯 TARGETED RECOMMENDATIONS - SAFE AI INTEGRATION
-
-## Assessment: Your Dev Team is Absolutely Right
-
-Your team's analysis shows **excellent engineering judgment**. Wholesale replacement would introduce unnecessary risks to a working system. Here are targeted improvements that respect your architecture:
-
-## ✅ IMMEDIATE WINS (Zero Risk)
-
-### 1. Enhanced Metrics (Drop-in Compatible)
-**File**: `enhanced_metrics.py`
-**Integration**: Add to existing `get_performance_stats()`
-```python
-from enhanced_metrics import enhance_existing_stats
-return enhance_existing_stats(existing_stats)
-```
-**Benefits**:
-- P50/P95/P99 latency percentiles
-- Component-level timing breakdown
-- GPU memory monitoring
-- **Zero breaking changes**
-
-### 2. Feature-Flagged Model Loading
-**File**: `safe_model_integration.py`
-**Integration**: Import in existing pipeline
-```bash
-export MIRAGE_ENABLE_SCRFD=0  # Start disabled
-export MIRAGE_ENABLE_LIVEPORTRAIT=0
-```
-**Benefits**:
-- Graceful fallback to pass-through
-- Enable/disable models instantly
-- No changes to existing message schemas
-- **Complete rollback capability**
-
-## 🚀 MEDIUM-TERM ADDITIONS (Low Risk)
-
-### 3. Connection Monitoring Endpoint
-**File**: `webrtc_connection_monitoring.py`
-**Integration**: Add to existing WebRTC router
-```python
-add_connection_monitoring(router, _peer_state)
-```
-**Benefits**:
-- `/webrtc/connections` diagnostic endpoint
-- Works with single-peer architecture
-- **No auth changes required**
-
-### 4. Optional Model Download Utility
-**File**: `scripts/optional_download_models.py`
-**Usage**: On-demand only (not in Docker build)
-```bash
-python3 scripts/optional_download_models.py --status
-```
-**Benefits**:
-- Download models when features are enabled
-- Conservative model list (SCRFD + LivePortrait basics)
-- **Not baked into Docker build**
-
-## 🎯 RESPECTS YOUR ARCHITECTURE DECISIONS
-
-### ✅ What We're NOT Changing
-- **Docker Base**: Keep your CUDA 12.1.1 + cuDNN 8 runtime
-- **Token Auth**: Preserve your WebRTC authentication system
-- **Message Schema**: Keep `image_jpeg_base64` format
-- **Entry Point**: Keep your `original_fastapi_app.py`
-- **Background Tasks**: No import-time tasks
-- **Router Integration**: Keep your existing WebRTC setup
-
-### ✅ What We're Safely Adding
-- **Feature flags** for gradual AI model rollout
-- **Enhanced metrics** for better observability
-- **Graceful fallbacks** that maintain pass-through behavior
-- **Optional utilities** for model management
-- **Diagnostic endpoints** for connection monitoring
-
-## 📊 EXPECTED RESULTS WITH SAFE INTEGRATION
-
-### Phase 1: Metrics Enhanced (Day 1)
-```
-Before: Basic latency averages
-After: P50/P95/P99 percentiles + component breakdown
-Risk: Zero (pure addition)
-```
-
-### Phase 2: SCRFD Enabled (Day 2-3)
-```
-Before: No face detection
-After: Real face detection with pass-through fallback
-Risk: Low (feature flag controlled)
-Command: MIRAGE_ENABLE_SCRFD=1
-```
-
-### Phase 3: LivePortrait Enabled (Day 4-7)
-```
-Before: Pass-through video
-After: Real face animation with pass-through fallback
-Risk: Low (feature flag controlled)
-Command: MIRAGE_ENABLE_LIVEPORTRAIT=1
-```
-
-## 🔧 INTEGRATION SEQUENCE
-
-### Step 1: Add Enhanced Metrics (5 minutes)
-```python
-# In your existing pipeline get_performance_stats()
-from enhanced_metrics import enhance_existing_stats
-return enhance_existing_stats(base_stats)
-```
-
-### Step 2: Add Safe Model Loader (10 minutes)
-```python
-# In your existing pipeline __init__()
-from safe_model_integration import get_safe_model_loader
-self.safe_loader = get_safe_model_loader()
-
-# In your existing initialize()
-await self.safe_loader.safe_load_scrfd()
-await self.safe_loader.safe_load_liveportrait()
-```
-
-### Step 3: Enable Features Gradually
-```bash
-# Test SCRFD first
-export MIRAGE_ENABLE_SCRFD=1
-# Verify face detection works, fallback to pass-through on errors
-
-# Test LivePortrait second
-export MIRAGE_ENABLE_LIVEPORTRAIT=1
-# Verify animation works, fallback to pass-through on errors
-```
-
-### Step 4: Monitor and Validate
-```bash
-curl /metrics             # Check enhanced metrics
-curl /webrtc/connections  # Check connection status
-curl /health              # Verify system health
-```
-
-## ⚡ INSTANT ROLLBACK STRATEGY
-
-At any point, disable features:
-```bash
-export MIRAGE_ENABLE_SCRFD=0
-export MIRAGE_ENABLE_LIVEPORTRAIT=0
-# System immediately returns to existing pass-through behavior
-```
-
-## 🎉 BENEFITS OF THIS APPROACH
-
-### Technical Benefits
-- **Zero breaking changes** to existing working code
-- **Instant rollback** capability with feature flags
-- **Incremental validation** of each AI component
-- **Enhanced observability** with detailed metrics
-- **Compatible** with your CUDA 12.1 + A10G setup
-
-### Business Benefits
-- **Reduced risk** of system downtime
-- **Faster iteration** with safe feature toggles
-- **Better debugging** with component-level metrics
-- **Proven stability** before full AI rollout
-
-## 📋 FILES PROVIDED
-
-| File | Purpose | Integration Risk |
-|------|---------|------------------|
-| `safe_model_integration.py` | Feature-flagged AI models | **Low** - Graceful fallbacks |
-| `enhanced_metrics.py` | P50/P95 performance tracking | **Zero** - Pure addition |
-| `webrtc_connection_monitoring.py` | Connection diagnostics | **Low** - Read-only endpoint |
-| `scripts/optional_download_models.py` | On-demand model utility | **Zero** - Manual use only |
-| `INCREMENTAL_INTEGRATION.md` | Step-by-step guide | **Zero** - Documentation |
-
-## 🚀 RECOMMENDED NEXT STEPS
-
-### Today (30 minutes)
-1. Add `enhanced_metrics.py` to your pipeline
-2. Verify metrics show P50/P95 latencies
-3. Add `safe_model_integration.py` with flags disabled
-4. Test that system works exactly as before
-
-### This Week
-1. Enable SCRFD: `MIRAGE_ENABLE_SCRFD=1`
-2. Verify face detection works with fallbacks
-3. Monitor enhanced metrics for performance impact
-4. Enable LivePortrait: `MIRAGE_ENABLE_LIVEPORTRAIT=1`
-
-### Next Week
-1. Validate end-to-end AI pipeline performance
-2. Fine-tune model parameters if needed
-3. Consider adding connection monitoring endpoint
-4. Plan gradual rollout to production users
-
----
-
-## 🎯 CONCLUSION
-
-Your team's approach is **architecturally sound**. These targeted improvements provide:
-
-- ✅ **Real AI model integration** with safety guardrails
-- ✅ **Enhanced observability** for performance debugging
-- ✅ **Zero risk** to existing stability and auth systems
-- ✅ **Instant rollback** capability at any point
-- ✅ **Incremental validation** of each component
-
-**This is the right way to add AI to a production system.**
-
-Your current working foundation + these safe additions = Production-ready AI avatar system with <200ms latency.
SPEC.md
CHANGED
@@ -1,3 +1,44 @@
+## Architectural Reassessment (September 2025)
+
+The initial implementation adopted a motion-driven portrait reenactment stack (LivePortrait ONNX models + custom alignment & smoothing), which is misaligned with the updated product goal: low-latency real-time face swapping with optional enhancement.
+
+### Misalignment Summary
+
+| Target Need | LivePortrait Path | Impact |
+|-------------|-------------------|--------|
+| Direct identity substitution | Motion reenactment of a canonicalized reference | Unnecessary motion keypoint pipeline |
+| Minimal per-frame latency (<80ms) | ~500–600ms generator stages logged | Fails real-time threshold |
+| Simple detector→swap flow | Multi-stage appearance + motion + generator | Complexity & fragile compositing |
+| Artifact cleanup (optional) | No enhancement stage | Lower visual fidelity |
+| Multi-face capability | Single-face canonical reenactment focus | Limits scalability |
+
+### New Model Stack
+1. Detector / embeddings: insightface FaceAnalysis (buffalo_l pack → SCRFD_10G_KPS + recognition)
+2. Swapper: inswapper_128_fp16.onnx
+3. Enhancement (optional):
+   - CodeFormer (codeformer.pth) for fidelity‑controllable restoration
+
+### New Processing Loop
+1. Capture frame
+2. Detect faces (FaceAnalysis)
+3. For each target face (top-N): apply InSwapper with the pre-extracted source identity (see the selection sketch below)
+4. (Optional) Run CodeFormer enhancer on the final composited frame (if weights present)
+5. Emit frame to WebRTC
+
+### Environment Variables (Video / Enhancer)
+| Variable | Values | Description |
+|----------|--------|-------------|
+| MIRAGE_MAX_FACES | int (default 1) | Swap up to N largest faces |
+| MIRAGE_CODEFORMER_FIDELITY | 0.0–1.0 (default 0.75) | Balance identity (1.0) vs reconstruction sharpness |
+| MIRAGE_INSWAPPER_URL | URL | Override InSwapper model source |
+| MIRAGE_CODEFORMER_URL | URL | Override CodeFormer model source |
+
+### Deprecated / To Remove
+liveportrait_engine.py, avatar_pipeline.py, alignment.py, smoothing.py, realtime_optimizer.py, virtual_camera.py (currently unused), enhanced_metrics.py, landmark_reenactor.py, safe_model_integration.py, debug_mediapipe.py
+
+These abstractions are reenactment-specific (appearance feature caching, keypoint smoothing, inverse warp compositing) and will be replaced by a concise `swap_pipeline.py`.
+
+---
 ## Goals
 - End-to-end audio latency < 250 ms (capture -> inference -> playback)
 - Video pipeline: 512x512 @ ≥20 FPS target under load
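The top-N selection in step 3 of the loop above can be sketched as ranking detections by bounding-box area; the `.bbox` attribute follows insightface's Face object, while the helper itself is illustrative:

```python
import os

def largest_faces(faces, max_faces=None):
    """Return the N largest detected faces, largest first."""
    n = max_faces if max_faces is not None else int(os.getenv('MIRAGE_MAX_FACES', '1'))

    def area(face):
        x1, y1, x2, y2 = face.bbox   # [x1, y1, x2, y2] from the detector
        return (x2 - x1) * (y2 - y1)

    return sorted(faces, key=area, reverse=True)[:n]
```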
model_downloader.py
CHANGED
@@ -1,18 +1,20 @@
-"""
-
-
-
-
-
-
-
-

-
 """
 import os
 import sys
 import shutil
 from pathlib import Path
 import time
 from typing import Optional

@@ -39,7 +41,8 @@ try:
 except Exception:
     hf_hub_download = None

-
 HF_HOME = Path(os.getenv('HF_HOME', Path(__file__).parent / '.cache' / 'huggingface'))
 HF_HOME.mkdir(parents=True, exist_ok=True)

@@ -177,9 +180,9 @@ class _FileLock:

 def _audit(event: str, **extra):
     try:
-
-
-        audit_path =
         payload = {
             'ts': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
             'event': event,

@@ -193,166 +196,60 @@ def _audit(event: str, **extra):

 def maybe_download() -> bool:
-    if os.getenv('MIRAGE_DOWNLOAD_MODELS', '1').lower() not in ('1',
         print('[downloader] MIRAGE_DOWNLOAD_MODELS disabled')
         _audit('disabled')
         return False
-
-    app_url = os.getenv('MIRAGE_LP_APPEARANCE_URL')
-    motion_url = os.getenv('MIRAGE_LP_MOTION_URL')
-    success = True
     _audit('start')
-
-
-
-
-
-
-
-
-
-
-
-    if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-            print(f'[downloader] ✅
-            _audit('
-
-
-
-
-
-
-            with _FileLock(dest):
-                if not dest.exists():
-                    _download(motion_url, dest)
-                converted = _maybe_convert_opset_to_19(dest)
-                if converted != dest:
-                    try:
-                        shutil.copyfile(converted, dest)
-                        print(f"[downloader] Replaced motion with opset19: {converted.name}")
-                    except Exception:
-                        pass
-            print(f'[downloader] ✅ Downloaded: {dest}')
-            _audit('download_ok', model='motion', path=str(dest))
-        except Exception as e:
-            print(f'[downloader] ❌ Failed to download motion extractor: {e}')
-            _audit('download_error', model='motion', error=str(e))
-            success = False
-    else:
-        converted = _maybe_convert_opset_to_19(dest)
-        if converted != dest:
-            try:
-                shutil.copyfile(converted, dest)
-                print(f"[downloader] Updated cached motion to opset19")
-            except Exception:
-                pass
-        print(f'[downloader] ✅ Motion extractor already exists: {dest}')
-        _audit('exists', model='motion', path=str(dest))
-
-    # Download additional models (generator required in neural-only mode)
-    generator_url = os.getenv('MIRAGE_LP_GENERATOR_URL')
-    if generator_url:
-        dest = LP_DIR / 'generator.onnx'
-        if not dest.exists():
-            try:
-                print(f'[downloader] Downloading generator model...')
-                with _FileLock(dest):
-                    if not dest.exists():
-                        _download(generator_url, dest)
-                if not _is_valid_onnx(dest):
-                    print(f"[downloader] ❌ Generator ONNX validation failed for {generator_url}")
-                    try:
-                        dest.unlink()
-                    except Exception:
-                        pass
-                    raise RuntimeError('generator download invalid')
-                print(f'[downloader] ✅ Downloaded: {dest}')
-                _audit('download_ok', model='generator', path=str(dest))
-            except Exception as e:
-                print(f'[downloader] ❌ Failed to download generator (required): {e}')
-                _audit('download_error', model='generator', error=str(e))
-                success = False
-        else:
-            if not _is_valid_onnx(dest):
-                try:
-                    print(f"[downloader] Existing generator is invalid, removing and retrying download")
-                    dest.unlink()
-                except Exception:
-                    pass
-                try:
-                    print(f'[downloader] Downloading generator model...')
-                    with _FileLock(dest):
-                        if not dest.exists():
-                            _download(generator_url, dest)
-                    if not _is_valid_onnx(dest):
-                        raise RuntimeError(f'generator invalid after re-download: {generator_url}')
-                    print(f'[downloader] ✅ Downloaded: {dest}')
-                    _audit('download_ok', model='generator', path=str(dest), refreshed=True)
-                except Exception as e2:
-                    print(f'[downloader] ❌ Failed to refresh invalid generator: {e2}')
-                    _audit('download_error', model='generator', error=str(e2), refreshed=True)
-                    success = False
-            else:
-                print(f'[downloader] ✅ Generator already exists: {dest}')
-                _audit('exists', model='generator', path=str(dest))
-    # Optional stitching model
-    stitching_url = os.getenv('MIRAGE_LP_STITCHING_URL')
-    if stitching_url:
-        dest = LP_DIR / 'stitching.onnx'
-        if not dest.exists():
-            try:
-                print(f'[downloader] Downloading stitching model...')
-                _download(stitching_url, dest)
-                print(f'[downloader] ✅ Downloaded: {dest}')
-                _audit('download_ok', model='stitching', path=str(dest))
-            except Exception as e:
-                print(f'[downloader] ⚠️ Failed to download stitching (optional): {e}')
-                _audit('download_error', model='stitching', error=str(e))
-
-    # Optional custom ops plugin for GridSample 3D used by some generator variants
-    grid_plugin_url = os.getenv('MIRAGE_LP_GRID_PLUGIN_URL')
-    if grid_plugin_url:
-        dest = LP_DIR / 'libgrid_sample_3d_plugin.so'
-        if not dest.exists():
-            try:
-                print(f'[downloader] Downloading grid sample plugin...')
-                _download(grid_plugin_url, dest)
-                print(f'[downloader] ✅ Downloaded: {dest}')
-                _audit('download_ok', model='grid_plugin', path=str(dest))
-            except Exception as e:
-                print(f'[downloader] ⚠️ Failed to download grid plugin (optional): {e}')
-                _audit('download_error', model='grid_plugin', error=str(e))

     _audit('complete', success=success)
     return success


 if __name__ == '__main__':
-    "
-
-
-
-        print("✅ All required models downloaded successfully")
     else:
-        print("❌ Some model downloads failed")
         sys.exit(1)
@@ -1,18 +1,20 @@ (resulting lines)
+"""Model downloader for face swap stack (InSwapper + CodeFormer).
+
+Environment:
+  MIRAGE_DOWNLOAD_MODELS=1|0
+  MIRAGE_INSWAPPER_URL (default HF inswapper 128)
+  MIRAGE_CODEFORMER_URL (default CodeFormer official release)
+
+Models are stored under:
+  models/inswapper/inswapper_128_fp16.onnx
+  models/codeformer/codeformer.pth
+
+Download priority: requests -> huggingface_hub heuristic (sketched below). Safe across parallel processes via file locks.
 """
 import os
 import sys
 import shutil
+import json
 from pathlib import Path
 import time
 from typing import Optional
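The "requests -> huggingface_hub heuristic" from the docstring could look like the following sketch; the function name and URL parsing here are illustrative, not the module's exact `_download`:

```python
import shutil
from pathlib import Path

import requests

def download(url: str, dest: Path) -> None:
    """Stream via requests; fall back to hf_hub_download for huggingface.co URLs."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_suffix(dest.suffix + '.part')
    try:
        with requests.get(url, stream=True, timeout=60) as r:
            r.raise_for_status()
            with open(tmp, 'wb') as f:
                for chunk in r.iter_content(chunk_size=1 << 20):
                    f.write(chunk)
        tmp.replace(dest)                  # atomic rename into place
    except Exception:
        if 'huggingface.co' not in url:
            raise
        from huggingface_hub import hf_hub_download
        # Heuristic for https://huggingface.co/<repo_id>/resolve/<rev>/<file>
        repo_id, rest = url.split('huggingface.co/')[1].split('/resolve/')
        revision, filename = rest.split('/', 1)
        cached = hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)
        shutil.copyfile(cached, dest)      # copy out of the HF cache
```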
| 41 |
except Exception:
|
| 42 |
hf_hub_download = None
|
| 43 |
|
| 44 |
+
INSWAPPER_DIR = Path(__file__).parent / 'models' / 'inswapper'
|
| 45 |
+
CODEFORMER_DIR = Path(__file__).parent / 'models' / 'codeformer'
|
| 46 |
HF_HOME = Path(os.getenv('HF_HOME', Path(__file__).parent / '.cache' / 'huggingface'))
|
| 47 |
HF_HOME.mkdir(parents=True, exist_ok=True)
|
| 48 |
|
|
|
|
| 180 |
|
| 181 |
def _audit(event: str, **extra):
|
| 182 |
try:
|
| 183 |
+
audit_dir = Path(__file__).parent / 'models' / '_logs'
|
| 184 |
+
audit_dir.mkdir(parents=True, exist_ok=True)
|
| 185 |
+
audit_path = audit_dir / 'download_audit.jsonl'
|
| 186 |
payload = {
|
| 187 |
'ts': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
|
| 188 |
'event': event,
|
|
|
|
| 196 |
|
| 197 |
|
| 198 |
def maybe_download() -> bool:
|
| 199 |
+
if os.getenv('MIRAGE_DOWNLOAD_MODELS', '1').lower() not in ('1','true','yes','on'):
|
| 200 |
print('[downloader] MIRAGE_DOWNLOAD_MODELS disabled')
|
| 201 |
_audit('disabled')
|
| 202 |
return False
|
|
|
|
|
|
|
|
|
|
|
|
|
| 203 |
_audit('start')
|
| 204 |
+
success = True
|
| 205 |
+
|
| 206 |
+
inswapper_url = os.getenv('MIRAGE_INSWAPPER_URL', 'https://huggingface.co/deepinsight/inswapper/resolve/main/inswapper_128_fp16.onnx')
|
| 207 |
+
codeformer_url = os.getenv('MIRAGE_CODEFORMER_URL', 'https://github.com/TencentARC/CodeFormer/releases/download/v0.1.0/codeformer.pth')
|
| 208 |
+
|
| 209 |
+
# InSwapper
|
| 210 |
+
inswapper_dest = INSWAPPER_DIR / 'inswapper_128_fp16.onnx'
|
| 211 |
+
if not inswapper_dest.exists():
|
| 212 |
+
try:
|
| 213 |
+
print('[downloader] Downloading InSwapper model...')
|
| 214 |
+
with _FileLock(inswapper_dest):
|
| 215 |
+
if not inswapper_dest.exists():
|
| 216 |
+
_download(inswapper_url, inswapper_dest)
|
| 217 |
+
print(f'[downloader] ✅ InSwapper ready: {inswapper_dest}')
|
| 218 |
+
_audit('download_ok', model='inswapper', path=str(inswapper_dest))
|
| 219 |
+
except Exception as e:
|
| 220 |
+
print(f'[downloader] ❌ InSwapper download failed: {e}')
|
| 221 |
+
_audit('download_error', model='inswapper', error=str(e))
|
| 222 |
+
success = False
|
| 223 |
+
else:
|
| 224 |
+
print(f'[downloader] ✅ InSwapper exists: {inswapper_dest}')
|
| 225 |
+
_audit('exists', model='inswapper', path=str(inswapper_dest))
|
| 226 |
+
|
| 227 |
+
# CodeFormer (optional)
|
| 228 |
+
codef_dest = CODEFORMER_DIR / 'codeformer.pth'
|
| 229 |
+
if not codef_dest.exists():
|
| 230 |
+
try:
|
| 231 |
+
print('[downloader] Downloading CodeFormer model...')
|
| 232 |
+
with _FileLock(codef_dest):
|
| 233 |
+
if not codef_dest.exists():
|
| 234 |
+
_download(codeformer_url, codef_dest)
|
| 235 |
+
print(f'[downloader] ✅ CodeFormer ready: {codef_dest}')
|
| 236 |
+
_audit('download_ok', model='codeformer', path=str(codef_dest))
|
| 237 |
+
except Exception as e:
|
| 238 |
+
print(f'[downloader] ⚠️ CodeFormer download failed (continuing): {e}')
|
| 239 |
+
_audit('download_error', model='codeformer', error=str(e))
|
| 240 |
+
else:
|
| 241 |
+
print(f'[downloader] ✅ CodeFormer exists: {codef_dest}')
|
| 242 |
+
_audit('exists', model='codeformer', path=str(codef_dest))
|
| 243 |
+
|
     _audit('complete', success=success)
     return success


 if __name__ == '__main__':
+    print("=== Model Downloader (InSwapper + CodeFormer) ===")
+    ok = maybe_download()
+    if ok:
+        print("✅ Required models ready (optional models may be absent)")
     else:
+        print("❌ Some required model downloads failed")
         sys.exit(1)
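The module doubles as a library: besides `python model_downloader.py`, the same logic can run in-process at server startup. A minimal sketch under that assumption; the mirror URL shown is a hypothetical override, not a real endpoint:

    import os

    # Overrides must be set before maybe_download() reads them
    os.environ.setdefault('MIRAGE_DOWNLOAD_MODELS', '1')
    os.environ.setdefault('MIRAGE_CODEFORMER_URL', 'https://example.com/mirror/codeformer.pth')  # hypothetical mirror

    from model_downloader import maybe_download

    if not maybe_download():
        # Only InSwapper is hard-required; CodeFormer failures are tolerated above
        raise SystemExit('InSwapper download failed; face swap cannot start')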
requirements.txt
CHANGED

@@ -1,21 +1,14 @@
 fastapi==0.104.1
 uvicorn[standard]==0.24.0
-# Torch packages are installed via Dockerfile with CUDA 11.8 wheels; avoid conflicting pins here
 aiortc==1.6.0
 websockets==11.0.3
 numpy==1.24.4
 opencv-python==4.8.1.78
 Pillow==10.0.1
-librosa==0.10.1
-soundfile==0.12.1
 insightface==0.7.3
-
-
-# ORT GPU pinned to a CUDA 11.x-friendly build (supports opset 20). 1.18.1 has CUDA 12 wheels too; ensure Docker CUDA base matches.
-onnxruntime-gpu==1.18.1
-onnxruntime-extensions==0.12.0
-huggingface-hub==0.24.5
+basicsr==1.4.2
+timm==0.9.12
 python-multipart==0.0.9
 av==11.0.0
 psutil==5.9.8
-
+huggingface-hub==0.24.5
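With the `onnxruntime-gpu` pin removed from requirements.txt (presumably installed by the Dockerfile alongside torch, as the deleted comment implies), a quick startup probe can confirm that InSwapper will actually get CUDA; a minimal sketch:

    import onnxruntime as ort

    providers = ort.get_available_providers()
    print('ONNX Runtime providers:', providers)
    if 'CUDAExecutionProvider' not in providers:
        # ORT typically falls back to CPU, which is far below real-time for swapping
        print('warning: CUDAExecutionProvider missing; swap will run on CPU')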
swap_pipeline.py
ADDED

@@ -0,0 +1,210 @@
+import os
+import io
+import time
+import logging
+from typing import Optional, Dict, Any, List
+
+import numpy as np
+import cv2
+from PIL import Image
+
+import insightface  # ensures model pack loading
+from insightface.app import FaceAnalysis
+
+logger = logging.getLogger(__name__)
+
+INSWAPPER_ONNX_PATH = os.path.join('models', 'inswapper', 'inswapper_128_fp16.onnx')
+CODEFORMER_PATH = os.path.join('models', 'codeformer', 'codeformer.pth')
+
+class FaceSwapPipeline:
+    """Direct face swap + optional enhancement pipeline.
+
+    Lifecycle:
+    1. initialize() -> loads detector/recognizer (buffalo_l) and inswapper onnx
+    2. set_source_image(image_bytes|np.ndarray) -> extracts source identity face object
+    3. process_frame(frame) -> swap all or top-N faces using source face
+    4. (optional) CodeFormer enhancement (always attempted if model present)
+    """
+    def __init__(self):
+        self.initialized = False
+        self.source_face = None
+        self.source_img_meta = {}
+        # Single enhancer path: CodeFormer (optional)
+        self.max_faces = int(os.getenv('MIRAGE_MAX_FACES', '1'))
+        self._stats = {
+            'frames': 0,
+            'last_latency_ms': None,
+            'avg_latency_ms': None,
+            'swap_faces_last': 0,
+            'enhanced_frames': 0
+        }
+        self._lat_hist: List[float] = []
+        self.app: Optional[FaceAnalysis] = None
+        self.swapper = None
+        self.codeformer = None
+        self.codeformer_fidelity = float(os.getenv('MIRAGE_CODEFORMER_FIDELITY', '0.75'))
+        self.codeformer_loaded = False
+
+    def initialize(self):
+        if self.initialized:
+            return True
+        providers = None
+        try:
+            # Let insightface choose; can restrict with env MIRAGE_CUDA_ONLY
+            if os.getenv('MIRAGE_CUDA_ONLY'):
+                providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
+        except Exception:
+            providers = None
+        self.app = FaceAnalysis(name='buffalo_l', providers=providers)
+        self.app.prepare(ctx_id=0, det_size=(640, 640))
+        # Load swapper
+        if not os.path.isfile(INSWAPPER_ONNX_PATH):
+            raise FileNotFoundError(f"Missing inswapper model at {INSWAPPER_ONNX_PATH}")
+        self.swapper = insightface.model_zoo.get_model(INSWAPPER_ONNX_PATH, providers=providers)
+        # Optional CodeFormer enhancer
+        try:
+            # CodeFormer dependencies
+            from basicsr.utils import imwrite  # noqa: F401
+            from basicsr.archs.rrdbnet_arch import RRDBNet  # noqa: F401
+            import torch
+            from torchvision import transforms  # noqa: F401
+            from collections import OrderedDict
+            # Lazy import of the packaged CodeFormer architecture (model file expected to be mounted)
+            if not os.path.isfile(CODEFORMER_PATH):
+                logger.warning(f"CodeFormer selected but model file missing: {CODEFORMER_PATH}")
+            else:
+                # Minimal inline loader (avoid full repo clone)
+                from torch import nn
+                class CodeFormerWrapper:
+                    def __init__(self, model_path: str, fidelity: float):
+                        from codeformer.archs.codeformer_arch import CodeFormer  # type: ignore
+                        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+                        self.net = CodeFormer(dim_embd=512, codebook_size=1024, n_head=8, n_layers=9,
+                                              connect_list=['32', '64', '128', '256']).to(self.device)
+                        ckpt = torch.load(model_path, map_location='cpu')
+                        if 'params_ema' in ckpt:
+                            self.net.load_state_dict(ckpt['params_ema'], strict=False)
+                        else:
+                            self.net.load_state_dict(ckpt['state_dict'], strict=False)
+                        self.net.eval()
+                        self.fidelity = min(max(fidelity, 0.0), 1.0)
+                    @torch.no_grad()
+                    def enhance(self, img_bgr: np.ndarray) -> np.ndarray:
+                        import torch.nn.functional as F
+                        img = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
+                        tensor = torch.from_numpy(img).float().to(self.device) / 255.0
+                        tensor = tensor.permute(2, 0, 1).unsqueeze(0)
+                        # CodeFormer forward expects (B,C,H,W)
+                        try:
+                            out = self.net(tensor, w=self.fidelity, adain=True)[0]
+                        except Exception:
+                            # Some variants return a tuple; retry without adain
+                            out = self.net(tensor, w=self.fidelity)[0]
+                        # squeeze(0) drops a leading batch dim when the tuple path returns (B,C,H,W)
+                        out = (out.clamp(0, 1) * 255.0).byte().squeeze(0).permute(1, 2, 0).cpu().numpy()
+                        return cv2.cvtColor(out, cv2.COLOR_RGB2BGR)
+                self.codeformer = CodeFormerWrapper(CODEFORMER_PATH, self.codeformer_fidelity)
+                self.codeformer_loaded = True
+                logger.info('CodeFormer loaded')
+        except Exception as e:
+            logger.warning(f"CodeFormer init failed, disabling: {e}")
+            self.codeformer = None
+        self.initialized = True
+        logger.info('FaceSwapPipeline initialized')
+        return True
+
+    def _decode_image(self, data) -> np.ndarray:
+        if isinstance(data, bytes):
+            arr = np.frombuffer(data, np.uint8)
+            img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
+            return img
+        if isinstance(data, np.ndarray):
+            return data
+        if hasattr(data, 'read'):
+            buff = data.read()
+            arr = np.frombuffer(buff, np.uint8)
+            return cv2.imdecode(arr, cv2.IMREAD_COLOR)
+        raise TypeError('Unsupported image input type')
+
+    def set_source_image(self, image_input) -> bool:
+        if not self.initialized:
+            self.initialize()
+        img = self._decode_image(image_input)
+        if img is None:
+            logger.error('Failed to decode source image')
+            return False
+        faces = self.app.get(img)
+        if not faces:
+            logger.error('No face detected in source image')
+            return False
+        # Choose the largest face by bbox area
+        def _area(face):
+            x1, y1, x2, y2 = face.bbox.astype(int)
+            return (x2 - x1) * (y2 - y1)
+        faces.sort(key=_area, reverse=True)
+        self.source_face = faces[0]
+        self.source_img_meta = {'resolution': img.shape[:2], 'num_faces': len(faces)}
+        logger.info('Source face set')
+        return True
+
+    def process_frame(self, frame: np.ndarray) -> np.ndarray:
+        if not self.initialized or self.swapper is None or self.app is None or self.source_face is None:
+            return frame
+        t0 = time.time()
+        faces = self.app.get(frame)
+        if not faces:
+            self._record_latency(time.time() - t0)
+            self._stats['swap_faces_last'] = 0
+            return frame
+        # Sort faces by area and keep top-N
+        def _area(face):
+            x1, y1, x2, y2 = face.bbox.astype(int)
+            return (x2 - x1) * (y2 - y1)
+        faces.sort(key=_area, reverse=True)
+        out = frame
+        count = 0
+        for f in faces[:self.max_faces]:
+            try:
+                out = self.swapper.get(out, f, self.source_face, paste_back=True)
+                count += 1
+            except Exception as e:
+                logger.debug(f"Swap failed for face: {e}")
+        if count > 0 and self.codeformer is not None:
+            try:
+                out = self.codeformer.enhance(out)
+                self._stats['enhanced_frames'] += 1
+            except Exception as e:
+                logger.debug(f"CodeFormer enhancement failed: {e}")
+        self._record_latency(time.time() - t0)
+        self._stats['swap_faces_last'] = count
+        self._stats['frames'] += 1
+        return out
+
+    def _record_latency(self, dt: float):
+        ms = dt * 1000.0
+        self._stats['last_latency_ms'] = ms
+        self._lat_hist.append(ms)
+        if len(self._lat_hist) > 200:
+            self._lat_hist.pop(0)
+        self._stats['avg_latency_ms'] = float(np.mean(self._lat_hist)) if self._lat_hist else None
+
+    def get_stats(self) -> Dict[str, Any]:
+        return dict(
+            self._stats,
+            initialized=self.initialized,
+            codeformer_fidelity=self.codeformer_fidelity if self.codeformer is not None else None,
+            codeformer_loaded=self.codeformer_loaded,
+        )
+
+    # Backwards compatibility for earlier server expecting process_video_frame
+    def process_video_frame(self, frame: np.ndarray, frame_idx: int | None = None) -> np.ndarray:
+        return self.process_frame(frame)
+
+# Singleton access similar to previous pattern
+_pipeline_instance: Optional[FaceSwapPipeline] = None
+
+def get_pipeline() -> FaceSwapPipeline:
+    global _pipeline_instance
+    if _pipeline_instance is None:
+        _pipeline_instance = FaceSwapPipeline()
+        _pipeline_instance.initialize()
+    return _pipeline_instance
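End to end, the lifecycle in the class docstring reduces to a short driver. A minimal sketch, assuming the models were already fetched by model_downloader.py and using hypothetical local files source.jpg / frame.jpg:

    import cv2
    from swap_pipeline import get_pipeline

    pipeline = get_pipeline()  # initialize() runs once inside

    # Steps 1-2: load the source identity (hypothetical file)
    with open('source.jpg', 'rb') as f:
        assert pipeline.set_source_image(f.read()), 'no face found in source image'

    # Steps 3-4: swap (and enhance, if CodeFormer loaded) a single BGR frame
    frame = cv2.imread('frame.jpg')
    out = pipeline.process_frame(frame)
    cv2.imwrite('swapped.jpg', out)
    print(pipeline.get_stats())

MIRAGE_CODEFORMER_FIDELITY maps to CodeFormer's w in [0, 1], trading restoration strength (lower w) against identity fidelity (higher w).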
webrtc_server.py
CHANGED

@@ -66,8 +66,7 @@ class _PassThroughPipeline:
     def initialize(self):
         return True

-    def
-        # No-op reference; return False to indicate not used
+    def set_source_image(self, img):
         return False

     def process_video_frame(self, img, frame_idx=None):
@@ -86,10 +85,10 @@ def get_pipeline():  # type: ignore
     if _pipeline_singleton is not None:
         return _pipeline_singleton
     try:
-        from
+        from swap_pipeline import get_pipeline as _real_get_pipeline
         _pipeline_singleton = _real_get_pipeline()
     except Exception as e:
-        logger.error(f"
+        logger.error(f"swap_pipeline unavailable, using pass-through: {e}")
         _pipeline_singleton = _PassThroughPipeline()
     return _pipeline_singleton
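The fix restores the contract that makes the fallback safe: `_PassThroughPipeline` mirrors the real pipeline's surface, so callers never branch on which implementation they received. A condensed sketch of the pattern (`load_pipeline` is an illustrative name, not the server's actual function):

    # Both objects expose the same three methods, so the server code stays identical
    class _PassThrough:
        def initialize(self):
            return True
        def set_source_image(self, img):
            return False  # signals "no identity loaded"
        def process_video_frame(self, img, frame_idx=None):
            return img    # echo frames unchanged

    def load_pipeline():
        try:
            from swap_pipeline import get_pipeline
            return get_pipeline()
        except Exception:
            return _PassThrough()  # degrade to pass-through rather than crash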