MacBook pro committed on
Commit
ba8225a
·
1 Parent(s): 32226d2

Pivot: remove GFPGAN + reenactment stack; CodeFormer-only enhancement, purge legacy files, update docs & downloader

.github/copilot-instructions.md ADDED
@@ -0,0 +1,24 @@
1
+ Prime Directive:
2
+ Deliver production-ready, systemic solutions to root causes. Prioritize core utility and absolute system integrity. There is zero tolerance for surface patches, brittle fixes, or non-functional code.
3
+ Mandatory Protocol:
4
+ Map the System: Before acting, map all relevant logic flows, data transformations, and dependencies. Identify all side effects.
5
+ Isolate Root Cause: Diagnose the fundamental issue with code-based evidence. Ensure the fix is systemic and permanent.
6
+ Align with Utility: Every change must advance the project's core objective. Reject low-impact optimizations.
7
+ Implementation Mandates:
8
+ Code Integrity: All code must be robust, generalizable, and directly executable. Prohibit all hardcoding, duplicated functionality, and placeholder logic.
9
+ Quality & Security: Enforce static typing, descriptive naming, and strict linting. Validate all I/O, eliminate unsafe calls, and add regression guards.
10
+ Testing: Test coverage must target both the symptom and its root cause. The full test suite must pass without warnings.
11
+ Execution Workflow:
12
+ Analyze system flow.
13
+ Confirm root cause.
14
+ Plan solution.
15
+ Implement the robust fix.
16
+ Validate with all tests.
17
+ Document systemic insights.
18
+
19
+ Project: Implements an AI avatar by streaming a user's local audio and video to a Hugging Face GPU server for immediate processing. In the cloud, the system performs simultaneous generative face swapping—animating a source image's identity with the user's live motion—and real-time voice conversion, which morphs the user's speech to a target profile while preserving the original prosody. The fully synchronized audio-visual output is then streamed back to the local machine, functioning as an integrated virtual camera and microphone for seamless use in communication platforms like Zoom and WhatsApp.
20
+
21
+ Operational instructions:
22
+ - All implementations must be architected for the Hugging Face Space located at https://huggingface.co/spaces/Islamckennon/mirage
23
+ - After every change, push to GitHub and Hugging Face, then await user feedback for next steps.
24
+ - All code must be architected toward the project's real-world functionality only.
.github/instructions.md ADDED
@@ -0,0 +1,17 @@
1
+ Prime Directive:
2
+ Deliver production-ready, systemic solutions to root causes. Prioritize core utility and absolute system integrity. There is zero tolerance for surface patches, brittle fixes, or non-functional code.
3
+ Mandatory Protocol:
4
+ Map the System: Before acting, map all relevant logic flows, data transformations, and dependencies. Identify all side effects.
5
+ Isolate Root Cause: Diagnose the fundamental issue with code-based evidence. Ensure the fix is systemic and permanent.
6
+ Align with Utility: Every change must advance the project's core objective. Reject low-impact optimizations.
7
+ Implementation Mandates:
8
+ Code Integrity: All code must be robust, generalizable, and directly executable. Prohibit all hardcoding, duplication, and placeholder logic.
9
+ Quality & Security: Enforce static typing, descriptive naming, and strict linting. Validate all I/O, eliminate unsafe calls, and add regression guards.
10
+ Testing: Test coverage must target both the symptom and its root cause. The full test suite must pass without warnings.
11
+ Execution Workflow:
12
+ Analyze system flow.
13
+ Confirm root cause.
14
+ Plan solution.
15
+ Implement the robust fix.
16
+ Validate with all tests.
17
+ Document systemic insights.
README.md CHANGED
@@ -10,63 +10,61 @@ pinned: false
10
  license: mit
11
  hardware: a10g-large
12
  python_version: "3.10"
13
- models:
14
- - KwaiVGI/LivePortrait
15
- - RVC-Project/Retrieval-based-Voice-Conversion-WebUI
16
  tags:
17
  - real-time
18
  - ai-avatar
19
- - face-animation
20
  - voice-conversion
21
- - live-portrait
22
- - rvc
23
  - virtual-camera
24
- short_description: "Real-time AI avatar with face animation and voice conversion"
25
  ---
26
 
27
  # 🎭 Mirage: Real-time AI Avatar System
28
 
29
- Transform yourself into an AI avatar in real-time with sub-250ms latency! Perfect for video calls, streaming, and virtual meetings.
30
 
31
  ## 🚀 Features
32
 
33
- - **Real-time Face Animation**: Live portrait animation using state-of-the-art AI
34
- - **Voice Conversion**: Real-time voice transformation with RVC
35
- - **Ultra-low Latency**: <250ms end-to-end latency optimized for A10G GPU
36
- - **Virtual Camera**: Direct integration with Zoom, Teams, Discord, and more
37
- - **Adaptive Quality**: Automatic quality adjustment to maintain real-time performance
38
- - **GPU Optimized**: Efficient memory management and CUDA acceleration
 
39
 
40
  ## 🎯 Use Cases
41
 
42
- - **Video Conferencing**: Use AI avatars in Zoom, Google Meet, Microsoft Teams
43
- - **Content Creation**: Streaming with animated avatars on Twitch, YouTube
44
- - **Virtual Meetings**: Professional presentations with consistent avatar appearance
45
- - **Privacy Protection**: Maintain anonymity while participating in video calls
46
 
47
  ## 🛠️ Technology Stack
48
 
49
- - **Face Animation**: LivePortrait (KwaiVGI)
50
- - **Voice Conversion**: RVC (Retrieval-based Voice Conversion)
51
- - **Face Detection**: SCRFD with optimized inference
52
- - **Backend**: FastAPI with WebRTC (aiortc)
53
- - **Frontend**: WebRTC-enabled real-time client
54
- - **GPU**: NVIDIA A10G with CUDA optimization
 
55
 
56
- ## 📊 Performance Specs
57
 
58
- - **Video Resolution**: 512x512 @ 20 FPS (adaptive)
59
- - **Audio Processing**: 160ms chunks @ 16kHz
60
- - **End-to-end Latency**: <250ms target
61
- - **GPU Memory**: ~8GB peak usage on A10G
62
- - **Face Detection**: SCRFD every 5 frames for efficiency
63
 
64
- ## 🚀 Quick Start
65
 
66
- 1. **Initialize Pipeline**: Click "Initialize AI Pipeline" to load models
67
- 2. **Set Reference**: Upload your reference image for avatar creation
68
- 3. **Start Capture**: Begin real-time avatar generation
69
- 4. **Enable Virtual Camera**: Use avatar output in third-party apps
 
 
70
 
71
  ## 🔧 Technical Details
72
 
@@ -76,16 +74,18 @@ Transform yourself into an AI avatar in real-time with sub-250ms latency! Perfec
76
  - GPU memory management and cleanup
77
  - Audio-video synchronization within 150ms
78
 
79
- ### Model Architecture
80
- - **LivePortrait**: Efficient portrait animation with stitching control
81
- - **RVC**: High-quality voice conversion with minimal latency
82
- - **SCRFD**: Fast face detection with confidence thresholding
 
 
83
 
84
  ### Real-time Features
85
- - WebSocket streaming for minimal overhead
86
- - Adaptive resolution (512x512 → 384x384 → 256x256)
87
- - Quality degradation order: Quality → FPS → Resolution
88
- - Automatic recovery when performance improves
89
 
90
  ## 📱 Virtual Camera Integration
91
 
@@ -96,45 +96,81 @@ The system creates a virtual camera device that can be used in:
96
  - **Social Media**: WhatsApp Desktop, Skype, Facebook Messenger
97
  - **Gaming**: Steam, Discord voice channels
98
 
99
- ## ⚡ Performance Monitoring
100
-
101
- Real-time metrics include:
102
- - Video FPS and latency
103
- - GPU memory usage
104
- - Audio processing time
105
- - Frame drop statistics
106
- - System resource utilization
107
 
108
  ## 🔒 Privacy & Security
109
 
110
- - All processing happens locally on the GPU
111
- - No data is stored or transmitted to external servers
112
- - Reference images are processed in memory only
113
- - WebSocket connections use secure protocols
114
 
115
- ## 🔧 Advanced Configuration
116
 
117
- The system automatically adapts quality based on performance:
118
-
119
- - **High Performance**: 512x512 @ 20 FPS, full quality
120
- - **Medium Performance**: 384x384 @ 18 FPS, reduced quality
121
- - **Low Performance**: 256x256 @ 15 FPS, minimum quality
122
 
123
  ## 📋 Requirements
124
 
125
- - **GPU**: NVIDIA A10G or equivalent (RTX 3080+ recommended)
126
- - **Memory**: 16GB+ RAM, 8GB+ VRAM
127
- - **Browser**: Chrome/Edge with WebRTC support
128
- - **Camera**: Any USB webcam or built-in camera
 
129
 
130
- ## 🛠️ Development
131
 
132
- Built with modern technologies:
133
- - FastAPI for high-performance backend (Docker entrypoint: uvicorn original_fastapi_app:app)
134
- - PyTorch with CUDA acceleration
135
- - OpenCV for image processing
136
- - WebRTC (aiortc) for real-time media transport
137
- - Docker for consistent deployment
 
 
138
 
139
  ## 📄 License
140
 
@@ -142,30 +178,19 @@ MIT License - Feel free to use and modify for your projects!
142
 
143
  ## 🙏 Acknowledgments
144
 
145
- - [LivePortrait](https://github.com/KwaiVGI/LivePortrait) for face animation
146
- - [RVC Project](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI) for voice conversion
147
- - [InsightFace](https://github.com/deepinsight/insightface) for face detection
148
- - HuggingFace for providing A10G GPU infrastructure
149
-
150
- ## Metrics Endpoints
151
- - `GET /metrics` – JSON with audio/video counters, EMAs (loop interval, inference), rolling FPS, frame interval EMA.
152
- - `GET /gpu` – GPU availability & memory (torch or `nvidia-smi` fallback).
153
- - `GET /metrics/async` – Async worker stats (frames submitted/processed, queue depth, last latency ms).
154
- - `GET /metrics/stage_histogram` – Histogram buckets of recent inference stage latencies (snapshot window).
155
- - `GET /metrics/motion` – Recent motion magnitudes (normalized) plus tail statistics.
156
- - `GET /metrics/pacing` – Latency EMA and pacing hint multiplier ( >1.0 suggests you can raise FPS, <1.0 suggests throttling ).
157
- - `POST /smoothing/update` – Runtime update of One Euro keypoint smoothing params. JSON body keys: `min_cutoff`, `beta`, `d_cutoff` (all optional floats).
158
-
159
- Example:
160
- ```bash
161
- curl -s http://localhost:7860/metrics | jq '.video_fps_rolling, .audio_infer_time_ema_ms'
162
- ```
163
 
164
  ## Voice Stub Activation
165
- Set `MIRAGE_VOICE_ENABLE=1` to activate the voice processor stub. Behavior:
166
- - Audio chunks are routed through `voice_processor.process_pcm_int16` (pass-through now).
167
- - `audio_infer_time_ema_ms` becomes > 0 after a few chunks.
168
- - When disabled, inference EMA remains 0.0.
169
 
170
  ## Future Parameterization
171
  - Frontend will fetch a `/config` endpoint to align `chunk_ms` and `video_max_fps` dynamically.
@@ -202,10 +227,8 @@ If the Space shows a perpetual "Restarting" badge:
202
 
203
  If problems persist, capture the Container log stack trace and open an issue.
204
 
205
- ## Enable ONNX Model Downloads (Safe LivePortrait)
206
- ## Advanced Real-time Metrics & Control
207
-
208
- New runtime observability & control surfaces were added to tune real-time performance:
209
 
210
  ### Endpoints Recap
211
  See Metrics Endpoints section above. Typical usage examples:
@@ -225,57 +248,23 @@ curl -s http://localhost:7860/metrics/motion | jq '.recent_motion[-5:]'
225
  ### Motion Magnitude
226
  Aggregated from per-frame keypoint motion vectors; higher values trigger more frequent face detection to avoid drift. Low motion stretches automatically reduce detection frequency to save compute.
227
 
228
- ### One Euro Smoothing Parameters
229
- You can initialize or override smoothing parameters via environment variables:
230
-
231
- | Variable | Default | Meaning |
232
- |----------|---------|---------|
233
- | `MIRAGE_ONEEURO_MIN_CUTOFF` | 1.0 | Base cutoff frequency controlling overall smoothing strength |
234
- | `MIRAGE_ONEEURO_BETA` | 0.05 | Speed coefficient (higher reduces lag during fast motion) |
235
- | `MIRAGE_ONEEURO_D_CUTOFF` | 1.0 | Derivative cutoff for velocity filtering |
236
-
237
- Runtime adjustments:
238
- ```bash
239
- curl -X POST http://localhost:7860/smoothing/update \
240
- -H 'Content-Type: application/json' \
241
- -d '{"min_cutoff":0.8, "beta":0.07}'
242
- ```
243
- Missing keys leave existing values unchanged. The response echoes the active parameters.
244
 
245
  ### Latency Histogram Snapshots
246
  `/metrics/stage_histogram` exposes periodic snapshots (e.g. every N frames) of stage latency distribution to help identify tail regressions. Use to tune pacing thresholds or decide on model quantization.
247
 
248
- ## Environment Variables Summary (New Additions)
249
-
250
- | Name | Purpose | Default |
251
- |------|---------|---------|
252
- | `MIRAGE_ONEEURO_MIN_CUTOFF` | One Euro base cutoff | 1.0 |
253
- | `MIRAGE_ONEEURO_BETA` | One Euro speed coefficient | 0.05 |
254
- | `MIRAGE_ONEEURO_D_CUTOFF` | One Euro derivative cutoff | 1.0 |
255
-
256
-
257
- To pull LivePortrait ONNX files into the container at runtime and enable the safe animation path:
258
-
259
- 1) Set these Space secrets/variables in the Settings → Variables panel:
260
-
261
- - `MIRAGE_ENABLE_SCRFD=1` (already default in Dockerfile)
262
- - `MIRAGE_ENABLE_LIVEPORTRAIT=1`
263
- - `MIRAGE_DOWNLOAD_MODELS=1`
264
- - `MIRAGE_LP_APPEARANCE_URL=https://huggingface.co/myn0908/Live-Portrait-ONNX/resolve/main/appearance_feature_extractor.onnx`
265
- - `MIRAGE_LP_MOTION_URL=https://huggingface.co/myn0908/Live-Portrait-ONNX/resolve/main/motion_extractor.onnx` (optional)
266
-
267
- 2) Restart the Space. The server will download models in the background on startup, and also sync once when you hit "Initialize AI Pipeline".
268
-
269
- 3) Check `/pipeline_status` or the in-UI metrics to see:
270
- - `ai_pipeline.animator_available: true`
271
- - `ai_pipeline.reference_set: true` (after uploading a reference)
272
-
273
- Notes:
274
- - The safe loader uses onnxruntime-gpu if available, otherwise CPU. This path provides a visible transformation placeholder and validates end-to-end integration.
275
- - Keep model URLs only to assets you have permission to download.
276
 
277
- ## Model Weights (Planned Voice Pipeline)
278
- The codebase now contains placeholder directories for upcoming audio feature extraction and conversion models.
279
 
280
  ```
281
  models/
 
10
  license: mit
11
  hardware: a10g-large
12
  python_version: "3.10"
 
 
 
13
  tags:
14
  - real-time
15
  - ai-avatar
16
+ - face-swap
17
  - voice-conversion
 
 
18
  - virtual-camera
19
+ short_description: "Real-time AI avatar with face swap + voice conversion"
20
  ---
21
 
22
  # 🎭 Mirage: Real-time AI Avatar System
23
 
24
+ Mirage performs real-time identity-preserving face swap plus optional facial enhancement and (stub) voice conversion, streaming back a virtual camera + microphone feed with a sub-250 ms target latency. Designed for live calls, streaming overlays, and privacy scenarios where you want a consistent alternate appearance.
25
 
26
  ## 🚀 Features
27
 
28
+ - **Real-time Face Swap (InSwapper)**: Identity transfer from a single reference image to your live video.
29
+ - **Enhancement (Optional)**: CodeFormer restoration (fidelity‑controllable) if weights present.
30
+ - **Low Latency WebRTC**: Bi-directional streaming via aiortc (camera + mic) with adaptive frame scaling.
31
+ - **Voice Conversion Stub**: Pluggable path ready for RVC / HuBERT integration (currently pass-through by default).
32
+ - **Virtual Camera**: Output suitable for Zoom, Meet, Discord, OBS (via local virtual camera module).
33
+ - **Model Auto-Provisioning**: Deterministic downloader for required swap + enhancer weights.
34
+ - **Metrics & Health**: JSON endpoints for latency, FPS, GPU memory, and pipeline stats.
35
 
36
  ## 🎯 Use Cases
37
 
38
+ - **Video Conferencing Privacy**: Appear as a consistent alternate identity.
39
+ - **Streaming / VTubing**: Lightweight swap + enhancement pipeline for overlays.
40
+ - **A/B Creative Experiments**: Rapid prototyping of face identity transforms.
41
+ - **Data Minimization**: Keep original face private while communicating.
42
 
43
  ## 🛠️ Technology Stack
44
 
45
+ - **Face Detection & Embedding**: InsightFace `buffalo_l` (SCRFD + embedding).
46
+ - **Face Swap Core**: `inswapper_128_fp16.onnx` (InSwapper) via InsightFace model zoo.
47
+ - **Enhancer (optional)**: CodeFormer 0.1 (fidelity controllable).
48
+ - **Backend**: FastAPI + aiortc (WebRTC) + asyncio.
49
+ - **Metrics**: Custom endpoints (`/metrics`, `/gpu`) with rolling latency/FPS stats.
50
+ - **Downloader**: Atomic, lock-protected model fetcher (`model_downloader.py`).
51
+ - **Frontend**: Minimal WebRTC client (`static/`).
52
 
53
+ ## 📊 Performance Targets
54
 
55
+ - **Processing Window**: <50ms typical swap @ 512px (A10G) w/ single face.
56
+ - **End-to-end Latency Goal**: <250ms (capture → swap → enhancement → return).
57
+ - **Adaptive Scale**: Frames >512px longest side are downscaled before inference.
58
+ - **Enhancement Overhead**: CodeFormer ~18–35ms (A10G, single face, 512px) – approximate; adjust fidelity to trade quality vs latency.
 
59
 
60
+ ## 🚀 Quick Start (Hugging Face Space)
61
 
62
+ 1. Open the Space UI and allow camera/microphone.
63
+ 2. Click **Initialize** to trigger the model download (if not already cached) and pipeline load.
63
+ 3. Upload a clear, front-facing reference image (only the largest face is used).
64
+ 4. Start streaming; swapped frames appear in the preview.
66
+ 5. (Optional) Provide CodeFormer weights (`models/codeformer/codeformer.pth`) for enhancement.
67
+ 6. Use the virtual camera integration locally (if running self-hosted) to broadcast swapped output to Zoom/OBS.
68
 
69
  ## 🔧 Technical Details
70
 
 
74
  - GPU memory management and cleanup
75
  - Audio-video synchronization within 150ms
76
 
77
+ ### Model Flow
78
+ 1. Capture frame optional downscale to <=512 max side
79
+ 2. InsightFace detector+embedding obtains face bboxes + identity vectors
80
+ 3. InSwapper ONNX performs identity replacement using source embedding
81
+ 4. Optional CodeFormer enhancer refines facial region
82
+ 5. Frame returned to WebRTC outbound track
83
 
84
  ### Real-time Features
85
+ - WebRTC (aiortc) low-latency transport.
86
+ - Asynchronous frame processing (background tasks) to avoid blocking capture.
87
+ - Adaptive pre-inference downscale heuristic (cap largest dimension to 512).
88
+ - Metrics-driven latency tracking for dynamic future pacing.
89
 
90
  ## 📱 Virtual Camera Integration
91
 
 
96
  - **Social Media**: WhatsApp Desktop, Skype, Facebook Messenger
97
  - **Gaming**: Steam, Discord voice channels
98
 
99
+ ## ⚡ Metrics & Observability
100
+
101
+ Key endpoints (base URL: running server root):
102
+
103
+ | Endpoint | Description |
104
+ |----------|-------------|
105
+ | `/metrics` | Core video/audio latency & FPS stats |
106
+ | `/gpu` | GPU presence + memory usage (torch / nvidia-smi) |
107
+ | `/webrtc/ping` | WebRTC router availability & TURN status |
108
+ | `/pipeline_status` (if implemented) | High-level pipeline readiness |
109
+
110
+ Pipeline stats (subset) from swap pipeline:
111
+ ```json
112
+ {
113
+ "frames": 240,
114
+ "avg_latency_ms": 42.7,
115
+ "swap_faces_last": 1,
116
+ "enhanced_frames": 180,
117
+ "enhancer": "codeformer",
118
+ "codeformer_fidelity": 0.75,
119
+ "codeformer_loaded": true
120
+ }
121
+ ```
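+
+ A quick way to poll these stats from a script (the field names follow the sample JSON above; the exact nesting returned by `/metrics` may differ per deployment):
+
+ ```python
+ import requests
+
+ stats = requests.get("http://localhost:7860/metrics", timeout=5).json()
+ print(stats.get("avg_latency_ms"), stats.get("frames"), stats.get("enhancer"))
+ ```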
122
 
123
  ## 🔒 Privacy & Security
124
 
125
+ - No reference image persisted to disk (processed in-memory).
126
+ - Only model weights are cached; media frames are transient.
127
+ - Optional API key enforcement via `MIRAGE_API_KEY` + `MIRAGE_REQUIRE_API_KEY=1`.
 
128
 
129
+ ## 🔧 Environment Variables (Face Swap & Enhancers)
130
 
131
+ | Variable | Purpose | Default |
132
+ |----------|---------|---------|
133
+ | `MIRAGE_DOWNLOAD_MODELS` | Auto download required models on startup | `1` |
134
+ | `MIRAGE_INSWAPPER_URL` | Override InSwapper ONNX URL | internal default |
135
+ | `MIRAGE_CODEFORMER_URL` | Override CodeFormer weight URL | 0.1 release |
136
+ | `MIRAGE_CODEFORMER_FIDELITY` | 0.0=more detail recovery, 1.0=preserve input | `0.75` |
137
+ | `MIRAGE_MAX_FACES` | Swap up to N largest faces per frame | `1` |
138
+ | `MIRAGE_CUDA_ONLY` | Restrict ONNX to CUDA EP + CPU fallback | unset |
139
+ | `MIRAGE_API_KEY` | Shared secret for control / TURN token | unset |
140
+ | `MIRAGE_REQUIRE_API_KEY` | Enforce API key if set | `0` |
141
+ | `MIRAGE_TOKEN_TTL` | Signed token lifetime (seconds) | `300` |
142
+ | `MIRAGE_STUN_URLS` | Comma list of STUN servers | Google defaults |
143
+ | `MIRAGE_TURN_URL` | TURN URI(s) (comma separated) | unset |
144
+ | `MIRAGE_TURN_USER` | TURN username | unset |
145
+ | `MIRAGE_TURN_PASS` | TURN credential | unset |
146
+ | `MIRAGE_FORCE_RELAY` | Force relay-only traffic | `0` |
147
+ | `MIRAGE_TURN_TLS_ONLY` | Filter TURN to TLS/TCP | `1` |
148
+ | `MIRAGE_PREFER_H264` | Prefer H264 codec in SDP munging | `0` |
149
+ | `MIRAGE_VOICE_ENABLE` | Enable voice processor stub | `0` |
150
+
151
+ CodeFormer fidelity example:
152
+ ```bash
153
+ MIRAGE_CODEFORMER_FIDELITY=0.6
154
+ ```
155
 
156
  ## 📋 Requirements
157
 
158
+ - **GPU**: NVIDIA (Ampere+ recommended). CPU-only will be extremely slow.
159
+ - **VRAM**: ~3–4GB baseline (swap + detector) + optional enhancer overhead.
160
+ - **RAM**: 8GB+ (12–16GB recommended for multitasking).
161
+ - **Browser**: Chromium-based / Firefox with WebRTC.
162
+ - **Reference Image**: Clear, frontal, good lighting, minimal occlusions.
163
 
164
+ ## 🛠️ Development / Running Locally
165
 
166
+ Download models & start server:
167
+ ```bash
168
+ python model_downloader.py  # or set MIRAGE_DOWNLOAD_MODELS=1 and let startup handle it
169
+ uvicorn app:app --port 7860 --host 0.0.0.0
170
+ ```
171
+ Open the browser client at `http://localhost:7860`.
172
+
173
+ Set a reference image via the UI (Base64 upload path), then begin a WebRTC session. Inspect `/metrics` for swap latency and `/webrtc/debug_state` for connection internals.
174
 
175
  ## 📄 License
176
 
 
178
 
179
  ## 🙏 Acknowledgments
180
 
181
+ - [InsightFace](https://github.com/deepinsight/insightface) (detection + swap)
182
+ - [CodeFormer](https://github.com/sczhou/CodeFormer) (fidelity-controllable enhancement)
183
+ - Hugging Face (inference infra)
184
+
185
+ ## Metrics Endpoints (Current Subset)
186
+ - `GET /metrics`
187
+ - `GET /gpu`
188
+ - `GET /webrtc/ping`
189
+ - `GET /webrtc/debug_state`
190
+ - (Legacy endpoints referenced in SPEC may be pruned in future refactors.)
191
 
192
  ## Voice Stub Activation
193
+ Set `MIRAGE_VOICE_ENABLE=1` to route audio through the placeholder voice processor. Current behavior is pass‑through while preserving structural hooks for future RVC model integration.
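+
+ The stub amounts to an identity transform on PCM chunks; a sketch (the `process_pcm_int16` name follows the legacy README and is illustrative):
+
+ ```python
+ import numpy as np
+
+ def process_pcm_int16(chunk: np.ndarray) -> np.ndarray:
+     """Pass-through until a real voice-conversion model is wired in."""
+     return chunk
+ ```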
 
 
 
194
 
195
  ## Future Parameterization
196
  - Frontend will fetch a `/config` endpoint to align `chunk_ms` and `video_max_fps` dynamically.
 
227
 
228
  If problems persist, capture the Container log stack trace and open an issue.
229
 
230
+ ## Model Auto-Download
231
+ `model_downloader.py` manages required weights with atomic file locks. It supports overriding sources via env variables and gracefully continues if optional enhancers fail to download.
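+
+ Programmatic use mirrors the module's `__main__` block:
+
+ ```python
+ from model_downloader import maybe_download
+
+ if not maybe_download():  # False if a required download failed or downloads are disabled
+     raise SystemExit("required model downloads failed")
+ ```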
 
 
232
 
233
  ### Endpoints Recap
234
  See Metrics Endpoints section above. Typical usage examples:
 
248
  ### Motion Magnitude
249
  Aggregated from per-frame keypoint motion vectors; higher values trigger more frequent face detection to avoid drift. Low motion stretches automatically reduce detection frequency to save compute.
250
 
251
+ ### Enhancer Fidelity (CodeFormer)
252
+ Fidelity weight (`w`):
253
+ - Lower (e.g. 0.3–0.5): More aggressive restoration, may alter identity details.
254
+ - Higher (0.7–0.9): Preserve more original swapped structure, less smoothing.
255
+ Tune with `MIRAGE_CODEFORMER_FIDELITY`.
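+
+ The pipeline reads and clamps this value at startup (mirroring `swap_pipeline.py` in this commit):
+
+ ```python
+ import os
+
+ w = float(os.getenv('MIRAGE_CODEFORMER_FIDELITY', '0.75'))
+ w = min(max(w, 0.0), 1.0)  # clamp to [0, 1]; lower = stronger restoration, higher = closer to input
+ ```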
256
 
257
  ### Latency Histogram Snapshots
258
  `/metrics/stage_histogram` exposes periodic snapshots (e.g. every N frames) of stage latency distribution to help identify tail regressions. Use to tune pacing thresholds or decide on model quantization.
259
 
260
+ ## Security Notes
261
+ If exposing publicly:
262
+ - Set `MIRAGE_API_KEY` and `MIRAGE_REQUIRE_API_KEY=1`.
263
+ - Serve behind TLS (reverse proxy like Caddy / Nginx for certificate management).
264
+ - Optionally restrict TURN server usage or enforce relay only for stricter NAT traversal control.
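+
+ A hypothetical sketch of such an API-key gate (the real check lives in the server code; the dependency name below is illustrative):
+
+ ```python
+ import os
+ from fastapi import Header, HTTPException
+
+ def require_api_key(x_api_key: str | None = Header(default=None)) -> None:
+     if os.getenv("MIRAGE_REQUIRE_API_KEY", "0") == "1":
+         if x_api_key != os.getenv("MIRAGE_API_KEY"):
+             raise HTTPException(status_code=401, detail="invalid API key")
+ ```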
265
 
266
+ ## Planned Voice Pipeline (Future)
267
+ Placeholder directories exist for future real-time voice conversion integration.
268
 
269
  ```
270
  models/
Repair/INCREMENTAL_INTEGRATION.md DELETED
@@ -1,119 +0,0 @@
1
- # Incremental Model Integration Guide
2
-
3
- ## Respecting Your Existing Architecture
4
-
5
- Your team made excellent decisions to avoid wholesale replacement. Here's how to safely integrate AI models:
6
-
7
- ## Phase 1: Add Feature Flags (Zero Risk)
8
-
9
- Add to your environment or startup:
10
- ```bash
11
- # Start with models disabled
12
- export MIRAGE_ENABLE_SCRFD=0
13
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
14
-
15
- # Enable gradually
16
- export MIRAGE_ENABLE_SCRFD=1 # Enable face detection first
17
- export MIRAGE_ENABLE_LIVEPORTRAIT=1 # Enable animation second
18
- ```
19
-
20
- ## Phase 2: Integrate Safe Model Loader
21
-
22
- In your existing `avatar_pipeline.py`, add:
23
- ```python
24
- # At the top
25
- from safe_model_integration import get_safe_model_loader
26
-
27
- class RealTimeAvatarPipeline:
28
- def __init__(self):
29
- # Your existing code...
30
-
31
- # Add safe model loader
32
- self.safe_loader = get_safe_model_loader()
33
-
34
- async def initialize(self):
35
- # Your existing initialization...
36
-
37
- # Add safe model loading
38
- await self.safe_loader.safe_load_scrfd()
39
- await self.safe_loader.safe_load_liveportrait()
40
-
41
- def process_video_frame(self, frame, frame_idx):
42
- # Your existing code...
43
-
44
- # Enhanced face detection (graceful fallback)
45
- bbox = self.safe_loader.safe_detect_face(frame)
46
-
47
- # Enhanced animation (graceful fallback to pass-through)
48
- if self.reference_frame is not None:
49
- result = self.safe_loader.safe_animate_face(self.reference_frame, frame)
50
- else:
51
- result = frame # Keep existing pass-through logic
52
-
53
- return result
54
- ```
55
-
56
- ## Phase 3: Enhanced Metrics (Drop-in)
57
-
58
- In your existing `get_performance_stats()`:
59
- ```python
60
- from enhanced_metrics import enhance_existing_stats
61
-
62
- def get_performance_stats(self):
63
- # Your existing stats collection...
64
- base_stats = {
65
- "models_loaded": self.loaded,
66
- # ... your existing metrics
67
- }
68
-
69
- # Enhance with percentiles
70
- return enhance_existing_stats(base_stats)
71
- ```
72
-
73
- ## Phase 4: Optional Model Download
74
-
75
- When you want models:
76
- ```bash
77
- # Check what's needed
78
- python3 scripts/optional_download_models.py --status
79
-
80
- # Download only when features are enabled
81
- MIRAGE_ENABLE_SCRFD=1 python3 scripts/optional_download_models.py --download-needed
82
- ```
83
-
84
- ## Phase 5: WebRTC Monitoring (Optional)
85
-
86
- In your existing `webrtc_server.py`:
87
- ```python
88
- from webrtc_connection_monitoring import add_connection_monitoring
89
-
90
- # After creating your router
91
- add_connection_monitoring(router, _peer_state)
92
- ```
93
-
94
- ## Validation Steps
95
-
96
- 1. **Feature Flags Off**: System works exactly as before
97
- 2. **SCRFD Enabled**: Face detection works, falls back gracefully
98
- 3. **LivePortrait Enabled**: Animation works, falls back to pass-through
99
- 4. **Metrics Enhanced**: More detailed latency tracking
100
- 5. **Models Optional**: Download only when needed
101
-
102
- ## Rollback Strategy
103
-
104
- At any point:
105
- ```bash
106
- # Disable all features
107
- export MIRAGE_ENABLE_SCRFD=0
108
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
109
-
110
- # System returns to existing pass-through behavior
111
- ```
112
-
113
- This approach:
114
- - ✅ Keeps your token auth intact
115
- - ✅ Preserves existing WebRTC message schema
116
- - ✅ Maintains Docker compatibility
117
- - ✅ Allows gradual rollout with instant rollback
118
- - ✅ No background tasks at import time
119
- - ✅ Compatible with your A10G + CUDA 12.1 setup
Repair/QUICK_CHECKLIST.md DELETED
@@ -1,77 +0,0 @@
1
- # 🚀 QUICK ACTION CHECKLIST
2
-
3
- ## Immediate Actions (Today - 30 minutes)
4
-
5
- ### ✅ Step 1: Add Enhanced Metrics (5 minutes)
6
- ```python
7
- # Copy enhanced_metrics.py to your project
8
- # In your existing avatar_pipeline.py get_performance_stats():
9
- from enhanced_metrics import enhance_existing_stats
10
- return enhance_existing_stats(base_stats)
11
- ```
12
-
13
- ### ✅ Step 2: Add Safe Model Integration (10 minutes)
14
- ```python
15
- # Copy safe_model_integration.py to your project
16
- # In your existing avatar_pipeline.py __init__():
17
- from safe_model_integration import get_safe_model_loader
18
- self.safe_loader = get_safe_model_loader()
19
-
20
- # In your existing initialize():
21
- await self.safe_loader.safe_load_scrfd()
22
- await self.safe_loader.safe_load_liveportrait()
23
-
24
- # In your process_video_frame():
25
- bbox = self.safe_loader.safe_detect_face(frame)
26
- if self.reference_frame is not None:
27
- result = self.safe_loader.safe_animate_face(self.reference_frame, frame)
28
- else:
29
- result = frame
30
- ```
31
-
32
- ### ✅ Step 3: Test with Features Disabled (5 minutes)
33
- ```bash
34
- export MIRAGE_ENABLE_SCRFD=0
35
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
36
- # Verify system works exactly as before
37
- curl /health && curl /metrics
38
- ```
39
-
40
- ### ✅ Step 4: Enable SCRFD Gradually (5 minutes)
41
- ```bash
42
- export MIRAGE_ENABLE_SCRFD=1
43
- # Test face detection
44
- curl -X POST /initialize
45
- curl /metrics # Check for face detection timing
46
- ```
47
-
48
- ### ✅ Step 5: Enable LivePortrait (5 minutes)
49
- ```bash
50
- export MIRAGE_ENABLE_LIVEPORTRAIT=1
51
- # Test animation
52
- curl /metrics # Check for animation timing
53
- ```
54
-
55
- ## Success Indicators
56
-
57
- - [ ] Enhanced metrics show P50/P95 latency percentiles
58
- - [ ] SCRFD=1 enables face detection, fallback works on errors
59
- - [ ] LIVEPORTRAIT=1 enables animation, fallback works on errors
60
- - [ ] System maintains existing pass-through behavior
61
- - [ ] /health endpoint shows models_loaded status
62
- - [ ] Token auth and message schemas unchanged
63
-
64
- ## Instant Rollback
65
- ```bash
66
- export MIRAGE_ENABLE_SCRFD=0
67
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
68
- # Returns to exact previous behavior
69
- ```
70
-
71
- ## Files to Copy
72
- - [x] `safe_model_integration.py` → Your project root
73
- - [x] `enhanced_metrics.py` → Your project root
74
- - [x] `scripts/optional_download_models.py` → Your scripts/ folder
75
- - [x] `webrtc_connection_monitoring.py` → Optional for /webrtc/connections
76
-
77
- **🎯 Total Time: 30 minutes for safe AI integration**
Repair/TARGETED_RECOMMENDATIONS.md DELETED
@@ -1,207 +0,0 @@
1
- # 🎯 TARGETED RECOMMENDATIONS - SAFE AI INTEGRATION
2
-
3
- ## Assessment: Your Dev Team is Absolutely Right
4
-
5
- Your team's analysis shows **excellent engineering judgment**. Wholesale replacement would introduce unnecessary risks to a working system. Here are targeted improvements that respect your architecture:
6
-
7
- ## ✅ IMMEDIATE WINS (Zero Risk)
8
-
9
- ### 1. Enhanced Metrics (Drop-in Compatible)
10
- **File**: `enhanced_metrics.py`
11
- **Integration**: Add to existing `get_performance_stats()`
12
- ```python
13
- from enhanced_metrics import enhance_existing_stats
14
- return enhance_existing_stats(existing_stats)
15
- ```
16
- **Benefits**:
17
- - P50/P95/P99 latency percentiles
18
- - Component-level timing breakdown
19
- - GPU memory monitoring
20
- - **Zero breaking changes**
21
-
22
- ### 2. Feature-Flagged Model Loading
23
- **File**: `safe_model_integration.py`
24
- **Integration**: Import in existing pipeline
25
- ```bash
26
- export MIRAGE_ENABLE_SCRFD=0 # Start disabled
27
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
28
- ```
29
- **Benefits**:
30
- - Graceful fallback to pass-through
31
- - Enable/disable models instantly
32
- - No changes to existing message schemas
33
- - **Complete rollback capability**
34
-
35
- ## 🚀 MEDIUM-TERM ADDITIONS (Low Risk)
36
-
37
- ### 3. Connection Monitoring Endpoint
38
- **File**: `webrtc_connection_monitoring.py`
39
- **Integration**: Add to existing WebRTC router
40
- ```python
41
- add_connection_monitoring(router, _peer_state)
42
- ```
43
- **Benefits**:
44
- - `/webrtc/connections` diagnostic endpoint
45
- - Works with single-peer architecture
46
- - **No auth changes required**
47
-
48
- ### 4. Optional Model Download Utility
49
- **File**: `scripts/optional_download_models.py`
50
- **Usage**: On-demand only (not in Docker build)
51
- ```bash
52
- python3 scripts/optional_download_models.py --status
53
- ```
54
- **Benefits**:
55
- - Download models when features are enabled
56
- - Conservative model list (SCRFD + LivePortrait basics)
57
- - **Not baked into Docker build**
58
-
59
- ## 🎯 RESPECTS YOUR ARCHITECTURE DECISIONS
60
-
61
- ### ✅ What We're NOT Changing
62
- - **Docker Base**: Keep your CUDA 12.1.1 + cuDNN 8 runtime
63
- - **Token Auth**: Preserve your WebRTC authentication system
64
- - **Message Schema**: Keep `image_jpeg_base64` format
65
- - **Entry Point**: Keep your `original_fastapi_app.py`
66
- - **Background Tasks**: No import-time tasks
67
- - **Router Integration**: Keep your existing WebRTC setup
68
-
69
- ### ✅ What We're Safely Adding
70
- - **Feature flags** for gradual AI model rollout
71
- - **Enhanced metrics** for better observability
72
- - **Graceful fallbacks** that maintain pass-through behavior
73
- - **Optional utilities** for model management
74
- - **Diagnostic endpoints** for connection monitoring
75
-
76
- ## 📊 EXPECTED RESULTS WITH SAFE INTEGRATION
77
-
78
- ### Phase 1: Metrics Enhanced (Day 1)
79
- ```
80
- Before: Basic latency averages
81
- After: P50/P95/P99 percentiles + component breakdown
82
- Risk: Zero (pure addition)
83
- ```
84
-
85
- ### Phase 2: SCRFD Enabled (Day 2-3)
86
- ```
87
- Before: No face detection
88
- After: Real face detection with pass-through fallback
89
- Risk: Low (feature flag controlled)
90
- Command: MIRAGE_ENABLE_SCRFD=1
91
- ```
92
-
93
- ### Phase 3: LivePortrait Enabled (Day 4-7)
94
- ```
95
- Before: Pass-through video
96
- After: Real face animation with pass-through fallback
97
- Risk: Low (feature flag controlled)
98
- Command: MIRAGE_ENABLE_LIVEPORTRAIT=1
99
- ```
100
-
101
- ## 🔧 INTEGRATION SEQUENCE
102
-
103
- ### Step 1: Add Enhanced Metrics (5 minutes)
104
- ```python
105
- # In your existing pipeline get_performance_stats()
106
- from enhanced_metrics import enhance_existing_stats
107
- return enhance_existing_stats(base_stats)
108
- ```
109
-
110
- ### Step 2: Add Safe Model Loader (10 minutes)
111
- ```python
112
- # In your existing pipeline __init__()
113
- from safe_model_integration import get_safe_model_loader
114
- self.safe_loader = get_safe_model_loader()
115
-
116
- # In your existing initialize()
117
- await self.safe_loader.safe_load_scrfd()
118
- await self.safe_loader.safe_load_liveportrait()
119
- ```
120
-
121
- ### Step 3: Enable Features Gradually
122
- ```bash
123
- # Test SCRFD first
124
- export MIRAGE_ENABLE_SCRFD=1
125
- # Verify face detection works, fallback to pass-through on errors
126
-
127
- # Test LivePortrait second
128
- export MIRAGE_ENABLE_LIVEPORTRAIT=1
129
- # Verify animation works, fallback to pass-through on errors
130
- ```
131
-
132
- ### Step 4: Monitor and Validate
133
- ```bash
134
- curl /metrics # Check enhanced metrics
135
- curl /webrtc/connections # Check connection status
136
- curl /health # Verify system health
137
- ```
138
-
139
- ## ⚡ INSTANT ROLLBACK STRATEGY
140
-
141
- At any point, disable features:
142
- ```bash
143
- export MIRAGE_ENABLE_SCRFD=0
144
- export MIRAGE_ENABLE_LIVEPORTRAIT=0
145
- # System immediately returns to existing pass-through behavior
146
- ```
147
-
148
- ## 🎉 BENEFITS OF THIS APPROACH
149
-
150
- ### Technical Benefits
151
- - **Zero breaking changes** to existing working code
152
- - **Instant rollback** capability with feature flags
153
- - **Incremental validation** of each AI component
154
- - **Enhanced observability** with detailed metrics
155
- - **Compatible** with your CUDA 12.1 + A10G setup
156
-
157
- ### Business Benefits
158
- - **Reduced risk** of system downtime
159
- - **Faster iteration** with safe feature toggles
160
- - **Better debugging** with component-level metrics
161
- - **Proven stability** before full AI rollout
162
-
163
- ## 📋 FILES PROVIDED
164
-
165
- | File | Purpose | Integration Risk |
166
- |------|---------|------------------|
167
- | `safe_model_integration.py` | Feature-flagged AI models | **Low** - Graceful fallbacks |
168
- | `enhanced_metrics.py` | P50/P95 performance tracking | **Zero** - Pure addition |
169
- | `webrtc_connection_monitoring.py` | Connection diagnostics | **Low** - Read-only endpoint |
170
- | `scripts/optional_download_models.py` | On-demand model utility | **Zero** - Manual use only |
171
- | `INCREMENTAL_INTEGRATION.md` | Step-by-step guide | **Zero** - Documentation |
172
-
173
- ## 🚀 RECOMMENDED NEXT STEPS
174
-
175
- ### Today (30 minutes)
176
- 1. Add `enhanced_metrics.py` to your pipeline
177
- 2. Verify metrics show P50/P95 latencies
178
- 3. Add `safe_model_integration.py` with flags disabled
179
- 4. Test that system works exactly as before
180
-
181
- ### This Week
182
- 1. Enable SCRFD: `MIRAGE_ENABLE_SCRFD=1`
183
- 2. Verify face detection works with fallbacks
184
- 3. Monitor enhanced metrics for performance impact
185
- 4. Enable LivePortrait: `MIRAGE_ENABLE_LIVEPORTRAIT=1`
186
-
187
- ### Next Week
188
- 1. Validate end-to-end AI pipeline performance
189
- 2. Fine-tune model parameters if needed
190
- 3. Consider adding connection monitoring endpoint
191
- 4. Plan gradual rollout to production users
192
-
193
- ---
194
-
195
- ## 🎯 CONCLUSION
196
-
197
- Your team's approach is **architecturally sound**. These targeted improvements provide:
198
-
199
- - ✅ **Real AI model integration** with safety guardrails
200
- - ✅ **Enhanced observability** for performance debugging
201
- - ✅ **Zero risk** to existing stability and auth systems
202
- - ✅ **Instant rollback** capability at any point
203
- - ✅ **Incremental validation** of each component
204
-
205
- **This is the right way to add AI to a production system.**
206
-
207
- Your current working foundation + these safe additions = Production-ready AI avatar system with <200ms latency.
SPEC.md CHANGED
@@ -1,3 +1,44 @@
1
  ## Goals
2
  - End-to-end audio latency < 250 ms (capture -> inference -> playback)
3
  - Video pipeline: 512x512 @ ≥20 FPS target under load
 
1
+ ## Architectural Reassessment (September 2025)
2
+
3
+ The initial implementation adopted a motion-driven portrait reenactment stack (LivePortrait ONNX models + custom alignment & smoothing) which is misaligned with the updated product goal: low-latency real-time face swapping with optional enhancement.
4
+
5
+ ### Misalignment Summary
6
+
7
+ | Target Need | LivePortrait Path | Impact |
8
+ |-------------|-------------------|--------|
9
+ | Direct identity substitution | Motion reenactment of a canonicalized reference | Unnecessary motion keypoint pipeline |
10
+ | Minimal per-frame latency (<80ms) | ~500–600ms generator stages logged | Fails real-time threshold |
11
+ | Simple detector→swap flow | Multi-stage appearance + motion + generator | Complexity & fragile compositing |
12
+ | Artifact cleanup (optional) | No enhancement stage | Lower visual fidelity |
13
+ | Multi-face capability | Single-face canonical reenactment focus | Limits scalability |
14
+
15
+ ### New Model Stack
16
+ 1. Detector / embeddings: insightface FaceAnalysis (buffalo_l pack → SCRFD_10G_KPS + recognition)
17
+ 2. Swapper: inswapper_128_fp16.onnx
18
+ 3. Enhancement (optional):
19
+ - CodeFormer (codeformer.pth) for fidelity‑controllable restoration
20
+
21
+ ### New Processing Loop
22
+ 1. Capture frame
23
+ 2. Detect faces (FaceAnalysis)
24
+ 3. For each target face (top-N): apply InSwapper with pre-extracted source identity
25
+ 4. (Optional) Run CodeFormer enhancer on final composited frame (if weights present)
26
+ 5. Emit frame to WebRTC
27
+
28
+ ### Environment Variables (Video / Enhancer)
29
+ | Variable | Values | Description |
30
+ |----------|--------|-------------|
31
+ | MIRAGE_MAX_FACES | int (default 1) | Swap up to N largest faces |
32
+ | MIRAGE_CODEFORMER_FIDELITY | 0.0–1.0 (default 0.75) | Balance identity (1.0) vs reconstruction sharpness |
33
+ | MIRAGE_INSWAPPER_URL | URL | Override InSwapper model source |
34
+ | MIRAGE_CODEFORMER_URL | URL | Override CodeFormer model source |
35
+
36
+ ### Deprecated / To Remove
37
+ liveportrait_engine.py, avatar_pipeline.py, alignment.py, smoothing.py, realtime_optimizer.py, virtual_camera.py (currently unused), enhanced_metrics.py, landmark_reenactor.py, safe_model_integration.py, debug_mediapipe.py
38
+
39
+ These abstractions are reenactment-specific (appearance feature caching, keypoint smoothing, inverse warp compositing) and will be replaced by a concise `swap_pipeline.py`.
40
+
41
+ ---
42
  ## Goals
43
  - End-to-end audio latency < 250 ms (capture -> inference -> playback)
44
  - Video pipeline: 512x512 @ ≥20 FPS target under load
model_downloader.py CHANGED
@@ -1,18 +1,20 @@
1
- """
2
- Optional model downloader for deterministic builds.
3
- - Controlled with env MIRAGE_DOWNLOAD_MODELS=1
4
- - LivePortrait ONNX URLs controlled via env:
5
- * MIRAGE_LP_APPEARANCE_URL
6
- * MIRAGE_LP_MOTION_URL (required for motion)
7
- * MIRAGE_LP_GENERATOR_URL (optional; enables full neural synthesis)
8
- * MIRAGE_LP_STITCHING_URL (optional; some pipelines include extra stitching stage)
9
- - InsightFace models will still use the package cache; SCRFD will populate on first run.
 
10
 
11
- More robust with retries and alternative download methods (requests, huggingface_hub).
12
  """
13
  import os
14
  import sys
15
  import shutil
 
16
  from pathlib import Path
17
  import time
18
  from typing import Optional
@@ -39,7 +41,8 @@ try:
39
  except Exception:
40
  hf_hub_download = None
41
 
42
- LP_DIR = Path(__file__).parent / 'models' / 'liveportrait'
 
43
  HF_HOME = Path(os.getenv('HF_HOME', Path(__file__).parent / '.cache' / 'huggingface'))
44
  HF_HOME.mkdir(parents=True, exist_ok=True)
45
 
@@ -177,9 +180,9 @@ class _FileLock:
177
 
178
  def _audit(event: str, **extra):
179
  try:
180
- lp_dir = LP_DIR
181
- lp_dir.mkdir(parents=True, exist_ok=True)
182
- audit_path = lp_dir / '_download_audit.jsonl'
183
  payload = {
184
  'ts': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
185
  'event': event,
@@ -193,166 +196,60 @@ def _audit(event: str, **extra):
193
 
194
 
195
  def maybe_download() -> bool:
196
- if os.getenv('MIRAGE_DOWNLOAD_MODELS', '1').lower() not in ('1', 'true', 'yes', 'on'):
197
  print('[downloader] MIRAGE_DOWNLOAD_MODELS disabled')
198
  _audit('disabled')
199
  return False
200
-
201
- app_url = os.getenv('MIRAGE_LP_APPEARANCE_URL')
202
- motion_url = os.getenv('MIRAGE_LP_MOTION_URL')
203
- success = True
204
  _audit('start')
205
-
206
- # Download LivePortrait appearance extractor
207
- if app_url:
208
- dest = LP_DIR / 'appearance_feature_extractor.onnx'
209
- if not dest.exists():
210
- try:
211
- print(f'[downloader] Downloading appearance extractor...')
212
- with _FileLock(dest):
213
- if not dest.exists():
214
- _download(app_url, dest)
215
- converted = _maybe_convert_opset_to_19(dest)
216
- if converted != dest:
217
- try:
218
- shutil.copyfile(converted, dest)
219
- print(f"[downloader] Replaced appearance with opset19: {converted.name}")
220
- except Exception:
221
- pass
222
- print(f'[downloader] Downloaded: {dest}')
223
- _audit('download_ok', model='appearance', path=str(dest))
224
- except Exception as e:
225
- print(f'[downloader] Failed to download appearance extractor: {e}')
226
- _audit('download_error', model='appearance', error=str(e))
227
- success = False
228
- else:
229
- converted = _maybe_convert_opset_to_19(dest)
230
- if converted != dest:
231
- try:
232
- shutil.copyfile(converted, dest)
233
- print(f"[downloader] Updated cached appearance to opset19")
234
- except Exception:
235
- pass
236
- print(f'[downloader] ✅ Appearance extractor already exists: {dest}')
237
- _audit('exists', model='appearance', path=str(dest))
238
-
239
- # Download LivePortrait motion extractor
240
- if motion_url:
241
- dest = LP_DIR / 'motion_extractor.onnx'
242
- if not dest.exists():
243
- try:
244
- print(f'[downloader] Downloading motion extractor...')
245
- with _FileLock(dest):
246
- if not dest.exists():
247
- _download(motion_url, dest)
248
- converted = _maybe_convert_opset_to_19(dest)
249
- if converted != dest:
250
- try:
251
- shutil.copyfile(converted, dest)
252
- print(f"[downloader] Replaced motion with opset19: {converted.name}")
253
- except Exception:
254
- pass
255
- print(f'[downloader] ✅ Downloaded: {dest}')
256
- _audit('download_ok', model='motion', path=str(dest))
257
- except Exception as e:
258
- print(f'[downloader] ❌ Failed to download motion extractor: {e}')
259
- _audit('download_error', model='motion', error=str(e))
260
- success = False
261
- else:
262
- converted = _maybe_convert_opset_to_19(dest)
263
- if converted != dest:
264
- try:
265
- shutil.copyfile(converted, dest)
266
- print(f"[downloader] Updated cached motion to opset19")
267
- except Exception:
268
- pass
269
- print(f'[downloader] ✅ Motion extractor already exists: {dest}')
270
- _audit('exists', model='motion', path=str(dest))
271
-
272
- # Download additional models (generator required in neural-only mode)
273
- generator_url = os.getenv('MIRAGE_LP_GENERATOR_URL')
274
- if generator_url:
275
- dest = LP_DIR / 'generator.onnx'
276
- if not dest.exists():
277
- try:
278
- print(f'[downloader] Downloading generator model...')
279
- with _FileLock(dest):
280
- if not dest.exists():
281
- _download(generator_url, dest)
282
- if not _is_valid_onnx(dest):
283
- print(f"[downloader] ❌ Generator ONNX validation failed for {generator_url}")
284
- try:
285
- dest.unlink()
286
- except Exception:
287
- pass
288
- raise RuntimeError('generator download invalid')
289
- print(f'[downloader] ✅ Downloaded: {dest}')
290
- _audit('download_ok', model='generator', path=str(dest))
291
- except Exception as e:
292
- print(f'[downloader] ❌ Failed to download generator (required): {e}')
293
- _audit('download_error', model='generator', error=str(e))
294
- success = False
295
- else:
296
- if not _is_valid_onnx(dest):
297
- try:
298
- print(f"[downloader] Existing generator is invalid, removing and retrying download")
299
- dest.unlink()
300
- except Exception:
301
- pass
302
- try:
303
- print(f'[downloader] Downloading generator model...')
304
- with _FileLock(dest):
305
- if not dest.exists():
306
- _download(generator_url, dest)
307
- if not _is_valid_onnx(dest):
308
- raise RuntimeError(f'generator invalid after re-download: {generator_url}')
309
- print(f'[downloader] ✅ Downloaded: {dest}')
310
- _audit('download_ok', model='generator', path=str(dest), refreshed=True)
311
- except Exception as e2:
312
- print(f'[downloader] ❌ Failed to refresh invalid generator: {e2}')
313
- _audit('download_error', model='generator', error=str(e2), refreshed=True)
314
- success = False
315
- else:
316
- print(f'[downloader] ✅ Generator already exists: {dest}')
317
- _audit('exists', model='generator', path=str(dest))
318
- # Optional stitching model
319
- stitching_url = os.getenv('MIRAGE_LP_STITCHING_URL')
320
- if stitching_url:
321
- dest = LP_DIR / 'stitching.onnx'
322
- if not dest.exists():
323
- try:
324
- print(f'[downloader] Downloading stitching model...')
325
- _download(stitching_url, dest)
326
- print(f'[downloader] ✅ Downloaded: {dest}')
327
- _audit('download_ok', model='stitching', path=str(dest))
328
- except Exception as e:
329
- print(f'[downloader] ⚠️ Failed to download stitching (optional): {e}')
330
- _audit('download_error', model='stitching', error=str(e))
331
-
332
- # Optional custom ops plugin for GridSample 3D used by some generator variants
333
- grid_plugin_url = os.getenv('MIRAGE_LP_GRID_PLUGIN_URL')
334
- if grid_plugin_url:
335
- dest = LP_DIR / 'libgrid_sample_3d_plugin.so'
336
- if not dest.exists():
337
- try:
338
- print(f'[downloader] Downloading grid sample plugin...')
339
- _download(grid_plugin_url, dest)
340
- print(f'[downloader] ✅ Downloaded: {dest}')
341
- _audit('download_ok', model='grid_plugin', path=str(dest))
342
- except Exception as e:
343
- print(f'[downloader] ⚠️ Failed to download grid plugin (optional): {e}')
344
- _audit('download_error', model='grid_plugin', error=str(e))
345
-
346
  _audit('complete', success=success)
347
  return success
348
 
349
 
350
  if __name__ == '__main__':
351
- """Direct execution for debugging"""
352
- print("=== LivePortrait Model Downloader ===")
353
- success = maybe_download()
354
- if success:
355
- print("✅ All required models downloaded successfully")
356
  else:
357
- print("❌ Some model downloads failed")
358
  sys.exit(1)
 
1
+ """Model downloader for face swap stack (InSwapper + CodeFormer).
2
+
3
+ Environment:
4
+ MIRAGE_DOWNLOAD_MODELS=1|0
5
+ MIRAGE_INSWAPPER_URL (default HF inswapper 128)
6
+ MIRAGE_CODEFORMER_URL (default CodeFormer official release)
7
+
8
+ Models are stored under:
9
+ models/inswapper/inswapper_128_fp16.onnx
10
+ models/codeformer/codeformer.pth
11
 
12
+ Download priority: requests -> huggingface_hub heuristic. Safe across parallel processes via file locks.
13
  """
14
  import os
15
  import sys
16
  import shutil
17
+ import json
18
  from pathlib import Path
19
  import time
20
  from typing import Optional
 
41
  except Exception:
42
  hf_hub_download = None
43
 
44
+ INSWAPPER_DIR = Path(__file__).parent / 'models' / 'inswapper'
45
+ CODEFORMER_DIR = Path(__file__).parent / 'models' / 'codeformer'
46
  HF_HOME = Path(os.getenv('HF_HOME', Path(__file__).parent / '.cache' / 'huggingface'))
47
  HF_HOME.mkdir(parents=True, exist_ok=True)
48
 
 
180
 
181
  def _audit(event: str, **extra):
182
  try:
183
+ audit_dir = Path(__file__).parent / 'models' / '_logs'
184
+ audit_dir.mkdir(parents=True, exist_ok=True)
185
+ audit_path = audit_dir / 'download_audit.jsonl'
186
  payload = {
187
  'ts': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
188
  'event': event,
 
196
 
197
 
198
  def maybe_download() -> bool:
199
+ if os.getenv('MIRAGE_DOWNLOAD_MODELS', '1').lower() not in ('1','true','yes','on'):
200
  print('[downloader] MIRAGE_DOWNLOAD_MODELS disabled')
201
  _audit('disabled')
202
  return False
 
 
 
 
203
  _audit('start')
204
+ success = True
205
+
206
+ inswapper_url = os.getenv('MIRAGE_INSWAPPER_URL', 'https://huggingface.co/deepinsight/inswapper/resolve/main/inswapper_128_fp16.onnx')
207
+ codeformer_url = os.getenv('MIRAGE_CODEFORMER_URL', 'https://github.com/TencentARC/CodeFormer/releases/download/v0.1.0/codeformer.pth')
208
+
209
+ # InSwapper
210
+ inswapper_dest = INSWAPPER_DIR / 'inswapper_128_fp16.onnx'
211
+ if not inswapper_dest.exists():
212
+ try:
213
+ print('[downloader] Downloading InSwapper model...')
214
+ with _FileLock(inswapper_dest):
215
+ if not inswapper_dest.exists():
216
+ _download(inswapper_url, inswapper_dest)
217
+ print(f'[downloader] ✅ InSwapper ready: {inswapper_dest}')
218
+ _audit('download_ok', model='inswapper', path=str(inswapper_dest))
219
+ except Exception as e:
220
+ print(f'[downloader] ❌ InSwapper download failed: {e}')
221
+ _audit('download_error', model='inswapper', error=str(e))
222
+ success = False
223
+ else:
224
+ print(f'[downloader] InSwapper exists: {inswapper_dest}')
225
+ _audit('exists', model='inswapper', path=str(inswapper_dest))
226
+
227
+ # CodeFormer (optional)
228
+ codef_dest = CODEFORMER_DIR / 'codeformer.pth'
229
+ if not codef_dest.exists():
230
+ try:
231
+ print('[downloader] Downloading CodeFormer model...')
232
+ with _FileLock(codef_dest):
233
+ if not codef_dest.exists():
234
+ _download(codeformer_url, codef_dest)
235
+ print(f'[downloader] ✅ CodeFormer ready: {codef_dest}')
236
+ _audit('download_ok', model='codeformer', path=str(codef_dest))
237
+ except Exception as e:
238
+ print(f'[downloader] ⚠️ CodeFormer download failed (continuing): {e}')
239
+ _audit('download_error', model='codeformer', error=str(e))
240
+ else:
241
+ print(f'[downloader] CodeFormer exists: {codef_dest}')
242
+ _audit('exists', model='codeformer', path=str(codef_dest))
243
+
244
  _audit('complete', success=success)
245
  return success
246
 
247
 
248
  if __name__ == '__main__':
249
+ print("=== Model Downloader (InSwapper + CodeFormer) ===")
250
+ ok = maybe_download()
251
+ if ok:
252
+ print("✅ All required models downloaded successfully (some optional)")
 
253
  else:
254
+ print("❌ Some required model downloads failed")
255
  sys.exit(1)
requirements.txt CHANGED
@@ -1,21 +1,14 @@
1
  fastapi==0.104.1
2
  uvicorn[standard]==0.24.0
3
- # Torch packages are installed via Dockerfile with CUDA 11.8 wheels; avoid conflicting pins here
4
  aiortc==1.6.0
5
  websockets==11.0.3
6
  numpy==1.24.4
7
  opencv-python==4.8.1.78
8
  Pillow==10.0.1
9
- librosa==0.10.1
10
- soundfile==0.12.1
11
  insightface==0.7.3
12
- transformers==4.44.2
13
- onnx==1.16.1
14
- # ORT GPU pinned to a CUDA 11.x-friendly build (supports opset 20). 1.18.1 has CUDA 12 wheels too; ensure Docker CUDA base matches.
15
- onnxruntime-gpu==1.18.1
16
- onnxruntime-extensions==0.12.0
17
- huggingface-hub==0.24.5
18
  python-multipart==0.0.9
19
  av==11.0.0
20
  psutil==5.9.8
21
- mediapipe==0.10.7
 
1
  fastapi==0.104.1
2
  uvicorn[standard]==0.24.0
 
3
  aiortc==1.6.0
4
  websockets==11.0.3
5
  numpy==1.24.4
6
  opencv-python==4.8.1.78
7
  Pillow==10.0.1
 
 
8
  insightface==0.7.3
9
+ basicsr==1.4.2
10
+ timm==0.9.12
 
 
 
 
11
  python-multipart==0.0.9
12
  av==11.0.0
13
  psutil==5.9.8
14
+ huggingface-hub==0.24.5
swap_pipeline.py ADDED
@@ -0,0 +1,210 @@
1
+ import os
2
+ import io
3
+ import time
4
+ import logging
5
+ from typing import Optional, Dict, Any, List
6
+
7
+ import numpy as np
8
+ import cv2
9
+ from PIL import Image
10
+
11
+ import insightface  # provides model_zoo access for loading the InSwapper ONNX model
12
+ from insightface.app import FaceAnalysis
13
+
14
+ logger = logging.getLogger(__name__)
15
+
16
+ INSWAPPER_ONNX_PATH = os.path.join('models', 'inswapper', 'inswapper_128_fp16.onnx')
17
+ CODEFORMER_PATH = os.path.join('models', 'codeformer', 'codeformer.pth')
18
+
19
+ class FaceSwapPipeline:
20
+ """Direct face swap + optional enhancement pipeline.
21
+
22
+ Lifecycle:
23
+ 1. initialize() -> loads detector/recognizer (buffalo_l) and inswapper onnx
24
+ 2. set_source_image(image_bytes|np.array) -> extracts source identity face object
25
+ 3. process_frame(frame) -> swap all or top-N faces using source face
26
+ 4. (optional) CodeFormer enhancement (always attempted if model present)
27
+ """
28
+ def __init__(self):
29
+ self.initialized = False
30
+ self.source_face = None
31
+ self.source_img_meta = {}
32
+ # Single enhancer path: CodeFormer (optional)
33
+ self.max_faces = int(os.getenv('MIRAGE_MAX_FACES', '1'))
34
+ self._stats = {
35
+ 'frames': 0,
36
+ 'last_latency_ms': None,
37
+ 'avg_latency_ms': None,
38
+ 'swap_faces_last': 0,
39
+ 'enhanced_frames': 0
40
+ }
41
+ self._lat_hist: List[float] = []
42
+ self.app: Optional[FaceAnalysis] = None
43
+ self.swapper = None
44
+ self.codeformer = None
45
+ self.codeformer_fidelity = float(os.getenv('MIRAGE_CODEFORMER_FIDELITY', '0.75'))
46
+ self.codeformer_loaded = False
47
+
48
+ def initialize(self):
49
+ if self.initialized:
50
+ return True
51
+ providers = None
52
+ try:
53
+ # Let insightface choose; can restrict with env MIRAGE_CUDA_ONLY
54
+ if os.getenv('MIRAGE_CUDA_ONLY'):
55
+ providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
56
+ except Exception:
57
+ providers = None
58
+ self.app = FaceAnalysis(name='buffalo_l', providers=providers)
59
+ self.app.prepare(ctx_id=0, det_size=(640,640))
60
+ # Load swapper
61
+ if not os.path.isfile(INSWAPPER_ONNX_PATH):
62
+ raise FileNotFoundError(f"Missing inswapper model at {INSWAPPER_ONNX_PATH}")
63
+ self.swapper = insightface.model_zoo.get_model(INSWAPPER_ONNX_PATH, providers=providers)
64
+ # Optional CodeFormer enhancer
65
+ try:
66
+ # Probe CodeFormer dependencies up front; any ImportError below disables enhancement
67
+ from basicsr.utils import imwrite # noqa: F401
68
+ from basicsr.archs.rrdbnet_arch import RRDBNet # noqa: F401
69
+ import torch
70
+ from torchvision import transforms # noqa: F401
71
+ from collections import OrderedDict
72
+ # Lazily import the packaged CodeFormer architecture (the model file is expected to be mounted or downloaded)
73
+ if not os.path.isfile(CODEFORMER_PATH):
74
+ logger.warning(f"CodeFormer selected but model file missing: {CODEFORMER_PATH}")
75
+ else:
76
+ # Minimal inline loader (avoid full repo clone)
77
+ from torch import nn
78
+ class CodeFormerWrapper:
79
+ def __init__(self, model_path: str, fidelity: float):
80
+ from codeformer.archs.codeformer_arch import CodeFormer # type: ignore
81
+ self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
82
+ self.net = CodeFormer(dim_embd=512, codebook_size=1024, n_head=8, n_layers=9,
83
+ connect_list=['32','64','128','256']).to(self.device)
84
+ ckpt = torch.load(model_path, map_location='cpu')
85
+ if 'params_ema' in ckpt:
86
+ self.net.load_state_dict(ckpt['params_ema'], strict=False)
87
+ else:
88
+ self.net.load_state_dict(ckpt['state_dict'], strict=False)
89
+ self.net.eval()
90
+ self.fidelity = min(max(fidelity, 0.0), 1.0)
91
+ @torch.no_grad()
92
+ def enhance(self, img_bgr: np.ndarray) -> np.ndarray:
93
+ import torch.nn.functional as F
94
+ img = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
95
+ tensor = torch.from_numpy(img).float().to(self.device) / 255.0
96
+ tensor = tensor.permute(2,0,1).unsqueeze(0)
97
+ # CodeFormer forward expects (B,C,H,W)
98
+ try:
99
+ out = self.net(tensor, w=self.fidelity, adain=True)[0]
100
+ except Exception:
101
+ # Some variants return tuple
102
+ out = self.net(tensor, w=self.fidelity)[0]
103
+ out = (out.clamp(0,1) * 255.0).byte().permute(1,2,0).cpu().numpy()
104
+ return cv2.cvtColor(out, cv2.COLOR_RGB2BGR)
105
+ self.codeformer = CodeFormerWrapper(CODEFORMER_PATH, self.codeformer_fidelity)
106
+ self.codeformer_loaded = True
107
+ logger.info('CodeFormer loaded')
108
+ except Exception as e:
109
+ logger.warning(f"CodeFormer init failed, disabling: {e}")
110
+ self.codeformer = None
111
+ self.initialized = True
112
+ logger.info('FaceSwapPipeline initialized')
113
+ return True
114
+
115
+ def _decode_image(self, data) -> np.ndarray:
116
+ if isinstance(data, bytes):
117
+ arr = np.frombuffer(data, np.uint8)
118
+ img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
119
+ return img
120
+ if isinstance(data, np.ndarray):
121
+ return data
122
+ if hasattr(data, 'read'):
123
+ buff = data.read()
124
+ arr = np.frombuffer(buff, np.uint8)
125
+ return cv2.imdecode(arr, cv2.IMREAD_COLOR)
126
+ raise TypeError('Unsupported image input type')
127
+
128
+ def set_source_image(self, image_input) -> bool:
129
+ if not self.initialized:
130
+ self.initialize()
131
+ img = self._decode_image(image_input)
132
+ if img is None:
133
+ logger.error('Failed to decode source image')
134
+ return False
135
+ faces = self.app.get(img)
136
+ if not faces:
137
+ logger.error('No face detected in source image')
138
+ return False
139
+ # Choose the largest face by bbox area
140
+ def _area(face):
141
+ x1,y1,x2,y2 = face.bbox.astype(int)
142
+ return (x2-x1)*(y2-y1)
143
+ faces.sort(key=_area, reverse=True)
144
+ self.source_face = faces[0]
145
+ self.source_img_meta = {'resolution': img.shape[:2], 'num_faces': len(faces)}
146
+ logger.info('Source face set')
147
+ return True
148
+
149
+ def process_frame(self, frame: np.ndarray) -> np.ndarray:
150
+ if not self.initialized or self.swapper is None or self.app is None or self.source_face is None:
151
+ return frame
152
+ t0 = time.time()
153
+ faces = self.app.get(frame)
154
+ if not faces:
155
+ self._record_latency(time.time() - t0)
156
+ self._stats['swap_faces_last'] = 0
157
+ return frame
158
+ # Sort faces by area and keep top-N
159
+ def _area(face):
160
+ x1,y1,x2,y2 = face.bbox.astype(int)
161
+ return (x2-x1)*(y2-y1)
162
+ faces.sort(key=_area, reverse=True)
163
+ out = frame
164
+ count = 0
165
+ for f in faces[:self.max_faces]:
166
+ try:
167
+ out = self.swapper.get(out, f, self.source_face, paste_back=True)
168
+ count += 1
169
+ except Exception as e:
170
+ logger.debug(f"Swap failed for face: {e}")
171
+ if count > 0 and self.codeformer is not None:
172
+ try:
173
+ out = self.codeformer.enhance(out)
174
+ self._stats['enhanced_frames'] += 1
175
+ except Exception as e:
176
+ logger.debug(f"CodeFormer enhancement failed: {e}")
177
+ self._record_latency(time.time() - t0)
178
+ self._stats['swap_faces_last'] = count
179
+ self._stats['frames'] += 1
180
+ return out
181
+
182
+ def _record_latency(self, dt: float):
183
+ ms = dt * 1000.0
184
+ self._stats['last_latency_ms'] = ms
185
+ self._lat_hist.append(ms)
186
+ if len(self._lat_hist) > 200:
187
+ self._lat_hist.pop(0)
188
+ self._stats['avg_latency_ms'] = float(np.mean(self._lat_hist)) if self._lat_hist else None
189
+
190
+ def get_stats(self) -> Dict[str, Any]:
191
+ return dict(
192
+ self._stats,
193
+ initialized=self.initialized,
194
+ codeformer_fidelity=self.codeformer_fidelity if self.codeformer is not None else None,
195
+ codeformer_loaded=self.codeformer_loaded,
196
+ )
197
+
198
+ # Backwards compatibility for earlier server expecting process_video_frame
199
+ def process_video_frame(self, frame: np.ndarray, frame_idx: int | None = None) -> np.ndarray:
200
+ return self.process_frame(frame)
201
+
202
+ # Singleton access similar to previous pattern
203
+ _pipeline_instance: Optional[FaceSwapPipeline] = None
204
+
205
+ def get_pipeline() -> FaceSwapPipeline:
206
+ global _pipeline_instance
207
+ if _pipeline_instance is None:
208
+ _pipeline_instance = FaceSwapPipeline()
209
+ _pipeline_instance.initialize()
210
+ return _pipeline_instance
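
Taken together with the lifecycle in the class docstring (initialize, set_source_image, process_frame), swap_pipeline.py can be exercised offline before wiring it into WebRTC. The sketch below follows the public API exactly as added above; the image paths are placeholders, and the models are assumed to already sit under models/ as the downloader expects.

# Offline smoke test for FaceSwapPipeline (the image paths are placeholders).
import cv2
from swap_pipeline import get_pipeline

pipeline = get_pipeline()  # loads buffalo_l + inswapper, and CodeFormer if present

with open('source_face.jpg', 'rb') as fh:
    if not pipeline.set_source_image(fh.read()):
        raise SystemExit('no face detected in source image')

frame = cv2.imread('target_frame.jpg')    # BGR frame, as process_frame() expects
swapped = pipeline.process_frame(frame)   # swaps up to MIRAGE_MAX_FACES faces
cv2.imwrite('swapped_frame.jpg', swapped)
print(pipeline.get_stats())               # latency and enhancement counters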
webrtc_server.py CHANGED
@@ -66,8 +66,7 @@ class _PassThroughPipeline:
66
  def initialize(self):
67
  return True
68
 
69
- def set_reference_frame(self, img):
70
- # No-op reference; return False to indicate not used
71
  return False
72
 
73
  def process_video_frame(self, img, frame_idx=None):
@@ -86,10 +85,10 @@ def get_pipeline(): # type: ignore
86
  if _pipeline_singleton is not None:
87
  return _pipeline_singleton
88
  try:
89
- from avatar_pipeline import get_pipeline as _real_get_pipeline
90
  _pipeline_singleton = _real_get_pipeline()
91
  except Exception as e:
92
- logger.error(f"avatar_pipeline unavailable, using pass-through: {e}")
93
  _pipeline_singleton = _PassThroughPipeline()
94
  return _pipeline_singleton
95
 
 
66
  def initialize(self):
67
  return True
68
 
69
+ def set_source_image(self, img):
 
70
  return False
71
 
72
  def process_video_frame(self, img, frame_idx=None):
 
85
  if _pipeline_singleton is not None:
86
  return _pipeline_singleton
87
  try:
88
+ from swap_pipeline import get_pipeline as _real_get_pipeline
89
  _pipeline_singleton = _real_get_pipeline()
90
  except Exception as e:
91
+ logger.error(f"swap_pipeline unavailable, using pass-through: {e}")
92
  _pipeline_singleton = _PassThroughPipeline()
93
  return _pipeline_singleton
94
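
The hunk above renames the stub's set_reference_frame to set_source_image (still returning False) and retargets the lazy import from avatar_pipeline to swap_pipeline, so the server keeps serving pass-through video whenever the real pipeline cannot load. A caller can use that boolean to report which path is active; the handler below is a hypothetical illustration, not code from this commit.

# Hypothetical caller-side check: a False return means either the pass-through
# stub is active or no face was found in the uploaded source image.
from webrtc_server import get_pipeline

def handle_source_upload(image_bytes: bytes) -> dict:
    pipeline = get_pipeline()  # real FaceSwapPipeline or _PassThroughPipeline fallback
    accepted = bool(pipeline.set_source_image(image_bytes))
    return {'accepted': accepted, 'pipeline': type(pipeline).__name__}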