BoxOfColors Claude Opus 4.7 (1M context) committed on
Commit b859009 · 1 Parent(s): 5d79cd0

fix: list-audit fixes — LaMa duration, multi-stream audio, prewarm wait

Acted on the items from the gap-list audit that were real bugs/risks:

C. LaMa @spaces.GPU duration was too tight for high-fps clips
The old 180 s budget didn't cover the worst case: 60 fps × 15 s =
900 frames at ~0.3 s/frame ≈ 270 s. Bumped to 240 s, which covers
typical 60 fps loads with ~30 s headroom. (VACE stays at 300 s; the
per-mode budget split is intact.)
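
A quick back-of-envelope check of those numbers (the ~0.3 s/frame
figure is the audit's rough estimate, not a measured constant):

```python
# Worst-case LaMa wall-clock estimate for a high-fps clip.
fps, clip_seconds = 60, 15
sec_per_frame = 0.3                      # rough per-frame LaMa cost (estimate)

frames = fps * clip_seconds              # 900 frames
worst_case = frames * sec_per_frame      # ~270 s, past the old 180 s budget
print(f"{frames} frames -> ~{worst_case:.0f} s worst case")
```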

H. attach_audio dropped extra audio streams
``-map 1:a:0`` only kept the first audio stream — sources with
commentary tracks / alternate languages / 5.1 + stereo dual-mix
silently lost everything but the main mix. Now uses ``-map 1:a``
(preserves all audio streams). Re-encode fallback path applies
AAC@192k to each independently.
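
The real change lives in attach_audio (see the pipeline/video.py diff
below); as a standalone illustration of the mapping behaviour, with
hypothetical file names and the stream-copy-first strategy the helper
uses:

```python
import subprocess

# Hypothetical paths, purely to illustrate the -map behaviour.
silent_video, source_video, out_path = "silent.mp4", "source.mkv", "out.mkv"

cmd = [
    "ffmpeg", "-y",
    "-i", silent_video,        # stream 0: re-rendered (silent) video
    "-i", source_video,        # stream 1: audio donor
    "-map", "0:v:0",           # video from input 0
    "-map", "1:a",             # ALL audio streams from input 1 (old: "1:a:0")
    "-c", "copy",              # try stream-copy first; re-encode is the fallback
    "-shortest",
    out_path,
]
subprocess.run(cmd, check=True)
```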

N. Prewarm wait happened inside the GPU lease
_get_pipe()'s ``_prewarm_thread.join()`` could block for several
minutes if the user clicked Quality before the 75 GB prewarm
finished — eating into the @spaces.GPU(duration=300) budget and
risking timeout. New wait_for_prewarm() / is_prewarm_done() helpers
let run_pipeline do the wait CPU-side before acquiring the GPU,
surfacing a "Waiting for VACE checkpoint cache to finish prewarming"
progress message instead of an apparent hang. Backstop join inside
_get_pipe is kept for direct-call safety.
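
Condensed, the ordering the fix establishes looks like this (a sketch,
not the verbatim run_pipeline code; the function name and the
gpu_inpaint parameter are illustrative):

```python
from pipeline.vace import is_prewarm_done, wait_for_prewarm

def run_quality_mode(frame_paths, progress, gpu_inpaint):
    """gpu_inpaint stands in for the @spaces.GPU(duration=300) VACE call."""
    # CPU-side wait: join the prewarm thread *before* the GPU lease is
    # acquired, so a fresh-deploy download no longer burns the GPU budget.
    if not is_prewarm_done():
        progress(0.16, desc="Waiting for VACE checkpoint cache to finish prewarming…")
        wait_for_prewarm()
    # Only now does the GPU duration clock start.
    gpu_inpaint(frame_paths)
```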

AC. README upstream-protection section
Updated to reflect the local_files_only=True enforcement and the
hardcoded LAMA_MODEL_URL default added in 5d79cd0. Now explicitly
states that runtime fetches never reach upstream Wan-AI / lightx2v /
GitHub releases regardless of their state.
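
The enforcement boils down to every runtime fetch hitting the local
cache only. A minimal sketch of that behaviour, using the mirror paths
named in the README (the Space's actual loader code may differ):

```python
from huggingface_hub import hf_hub_download
from huggingface_hub.utils import LocalEntryNotFoundError

try:
    lora_path = hf_hub_download(
        "JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints",
        "loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors",
        local_files_only=True,   # cache hit or loud error, never the network
    )
except LocalEntryNotFoundError:
    raise RuntimeError("LoRA not in the local cache; did prewarm run?")
```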

Skipped from the audit list: theoretical concurrency races (ZeroGPU
serializes GPU calls), purely-cosmetic items (font sizes, color-only
signaling), missing UI features (this is a personal-use Space),
and items that can only be tested with real GPU + 75 GB cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (4)
  1. README.md +7 -5
  2. app.py +13 -2
  3. pipeline/vace.py +21 -4
  4. pipeline/video.py +4 -1
README.md CHANGED
@@ -49,13 +49,15 @@ V-Log / HDR colour metadata is preserved via FFmpeg flag passthrough (10-bit H.2
 
 ## Upstream protection
 
-Both models are served from a private mirror at [JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints](https://huggingface.co/JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints) so the Space stays functional even if upstream Wan-AI / lightx2v / GitHub releases disappear:
+All model files come from a private mirror at [JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints](https://huggingface.co/JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints):
 
-- LaMa: `lama/big-lama.pt` — pre-fetched into `torch.hub` cache via `LAMA_MODEL_URL`
-- VACE-14B: `vace-14b/` — full diffusers package
-- Distill LoRA: `loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors`
+- LaMa: `lama/big-lama.pt` — `LAMA_MODEL_URL` defaults to the mirror, prefetched into `torch.hub` cache before `simple_lama_inpainting` can reach for its hardcoded GitHub release URL
+- VACE-14B: `vace-14b/` — full diffusers package, loaded with `local_files_only=True` so any cache miss errors loudly instead of silently fetching from upstream HF Hub
+- Distill LoRA: `loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors` — same `local_files_only=True` enforcement
 
-The first time the Space starts, ~75 GB of VACE weights are downloaded to the persistent cache in a background thread (Quality mode is unavailable until that finishes; Fast mode works immediately).
+The Space stays functional even if upstream Wan-AI / lightx2v / GitHub release sources are deleted: at runtime nothing reaches for them.
+
+On the first deploy, ~75 GB of VACE weights are downloaded from the mirror to the persistent cache in a background thread. Fast mode works immediately; Quality mode blocks until prewarm finishes (the UI shows a progress message during the wait).
 
 ## License
 
app.py CHANGED
@@ -45,7 +45,10 @@ from pipeline.crop import (
     mask_to_bbox,
 )
 from pipeline.lama import inpaint_frames_lama_stream
-from pipeline.vace import inpaint_frames_vace_stream, prewarm_vace_cache
+from pipeline.vace import (
+    inpaint_frames_vace_stream, is_prewarm_done, prewarm_vace_cache,
+    wait_for_prewarm,
+)
 from pipeline.video import (
     VideoMeta, VideoWorkspace,
     attach_audio, extract_first_frame_array, extract_frames, frames_to_video, probe,
@@ -395,7 +398,7 @@ def on_snap_to_rectangle(editor_value: dict | None):
     )
 
 
-@spaces.GPU(duration=180)
+@spaces.GPU(duration=240)
 def _gpu_inpaint_lama(
     frame_paths: list,
     crop_region: CropRegion,
@@ -508,6 +511,14 @@ def run_pipeline(
             ws.out_frames_dir, total, progress,
         )
     else:  # MODE_QUALITY (already validated above)
+        # If the prewarm thread is still downloading, wait for it
+        # CPU-side rather than burning the @spaces.GPU(duration=300)
+        # budget on the wait. On a fresh deploy where the user clicks
+        # Quality before prewarm finishes, this could be several
+        # minutes; the progress message tells them what's happening.
+        if not is_prewarm_done():
+            progress(0.16, desc="Waiting for VACE checkpoint cache to finish prewarming…")
+            wait_for_prewarm()
         _gpu_inpaint_vace(
             frame_paths, crop_region, inpaint_mask,
             ws.out_frames_dir, progress,
pipeline/vace.py CHANGED
@@ -167,6 +167,22 @@ def prewarm_vace_cache() -> None:
     _prewarm_thread.start()
 
 
+def wait_for_prewarm(timeout: float | None = None) -> None:
+    """Block (CPU-side) until the prewarm thread finishes.
+
+    Call this from the orchestrator *before* acquiring the GPU lease, so the
+    download time isn't billed against the @spaces.GPU duration budget.
+    No-op if prewarm wasn't started or already finished.
+    """
+    if _prewarm_thread is not None and _prewarm_thread.is_alive():
+        _prewarm_thread.join(timeout=timeout)
+
+
+def is_prewarm_done() -> bool:
+    """True if prewarm wasn't started, or has already finished."""
+    return _prewarm_thread is None or not _prewarm_thread.is_alive()
+
+
 # ---------------------------------------------------------------------------
 # Pipeline singleton (cold load is expensive — keep it warm across calls)
 # ---------------------------------------------------------------------------
@@ -183,11 +199,12 @@ def _get_pipe():
     if _vace_pipe is not None and _vace_device == current_device:
         return _vace_pipe
 
-    # If a prewarm thread is in flight, wait for it before loading. The
-    # download is the expensive part; the from_pretrained calls below
-    # become near-instant disk reads once the cache is populated.
+    # Backstop: app.run_pipeline already calls wait_for_prewarm() on the
+    # CPU side before invoking _gpu_inpaint_vace, so this should be a
+    # no-op in practice. Kept defensively in case someone calls _get_pipe
+    # directly without going through the orchestrator.
     if _prewarm_thread is not None and _prewarm_thread.is_alive():
-        print("[VACE] Waiting for prewarm thread to finish…")
+        print("[VACE] _get_pipe waiting on prewarm (should have been done CPU-side)…")
         _prewarm_thread.join()
 
     from diffusers import AutoencoderKLWan, WanVACEPipeline
pipeline/video.py CHANGED
@@ -376,7 +376,10 @@ def attach_audio(
         "-i", str(silent_video),  # stream 0: video
         "-i", str(source_video),  # stream 1: audio donor
     ]
-    map_flags = ["-map", "0:v:0", "-map", "1:a:0", "-shortest"]
+    # ``-map 1:a`` (no stream index) preserves *all* audio streams from the
+    # source — main mix, commentary, alternate language tracks, etc. The
+    # earlier ``-map 1:a:0`` silently dropped everything past the first.
+    map_flags = ["-map", "0:v:0", "-map", "1:a", "-shortest"]
     out = [str(out_path)]
 
     # First try stream-copy (no re-encode) — fast and lossless when the