fix: list-audit fixes — LaMa duration, multi-stream audio, prewarm wait
Acted on the items from the gap-list audit that were real bugs/risks:
C. LaMa @spaces.GPU duration was tight for high-fps clips
The 180 s budget could not cover the worst case: 60 fps × 15 s = 900 frames
at ~0.3 s/frame ≈ 270 s. Bumped to 240 s, which covers typical 60 fps loads
with ~30 s of headroom. (VACE stays at 300 s; the per-mode budget split is intact.)
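The budget arithmetic above, as a quick sanity check (constants taken from the commit text; the 0.3 s/frame figure is approximate):

```python
# Sanity check of the @spaces.GPU duration-budget estimate for LaMa.
FPS = 60              # worst-case clip frame rate
CLIP_SECONDS = 15     # clip length used in the estimate
SECS_PER_FRAME = 0.3  # approximate LaMa per-frame inpaint time

frames = FPS * CLIP_SECONDS              # 900 frames
worst_case_s = frames * SECS_PER_FRAME   # ~270 s, well over the old 180 s budget
print(frames, round(worst_case_s))       # 900 270
```

The new 240 s budget sits between the old 180 s and the ~270 s worst case, which is why the commit claims headroom only for typical loads.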
H. attach_audio dropped extra audio streams
``-map 1:a:0`` kept only the first audio stream, so sources with commentary
tracks, alternate-language tracks, or a 5.1 + stereo dual mix silently lost
everything but the main mix. Now uses ``-map 1:a``, which preserves all audio
streams; the re-encode fallback path applies AAC@192k to each stream
independently.
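A sketch of the mapping change as attach_audio would assemble it (file names are placeholders; only the ``-map`` values come from the commit):

```python
# Placeholder paths, illustrative only.
silent_video = "inpainted_silent.mp4"   # stream 0: video
source_video = "original.mp4"           # stream 1: audio donor

# Old behaviour: "-map", "1:a:0" kept only the first audio stream.
# Fixed: "-map", "1:a" selects every audio stream from input 1.
map_flags = ["-map", "0:v:0", "-map", "1:a", "-shortest"]
cmd = ["ffmpeg", "-y", "-i", silent_video, "-i", source_video,
       *map_flags, "-c", "copy", "out.mp4"]
print(" ".join(cmd))
```

The real function first tries this stream-copy form and only falls back to re-encoding (AAC@192k per stream) when the copy fails.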
N. Prewarm wait happened inside the GPU lease
_get_pipe()'s ``_prewarm_thread.join()`` could block for several minutes if
the user clicked Quality before the 75 GB prewarm finished — eating into the
@spaces.GPU(duration=300) budget and risking a timeout. New
wait_for_prewarm() / is_prewarm_done() helpers let run_pipeline do the wait
CPU-side before acquiring the GPU, surfacing a "Waiting for VACE checkpoint
cache to finish prewarming" progress message instead of an apparent hang. A
backstop join inside _get_pipe is kept for direct-call safety.
AC. README upstream-protection section
Updated to reflect the local_files_only=True enforcement and the
hardcoded LAMA_MODEL_URL default added in 5d79cd0. Now explicitly
states that runtime fetches never reach upstream Wan-AI / lightx2v /
GitHub releases regardless of their state.
Skipped from the audit list: theoretical concurrency races (ZeroGPU
serializes GPU calls), purely-cosmetic items (font sizes, color-only
signaling), missing UI features (this is a personal-use Space),
and items that can only be tested with real GPU + 75 GB cache.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README.md +7 -5
- app.py +13 -2
- pipeline/vace.py +21 -4
- pipeline/video.py +4 -1
README.md:

```diff
@@ -49,13 +49,15 @@ V-Log / HDR colour metadata is preserved via FFmpeg flag passthrough (10-bit H.2
 
 ## Upstream protection
 
-…
+All model files come from a private mirror at [JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints](https://huggingface.co/JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints):
 
-- LaMa: `lama/big-lama.pt` — …
+- LaMa: `lama/big-lama.pt` — `LAMA_MODEL_URL` defaults to the mirror, prefetched into `torch.hub` cache before `simple_lama_inpainting` can reach for its hardcoded GitHub release URL
 
-- VACE-14B: `vace-14b/` — full diffusers package …
+- VACE-14B: `vace-14b/` — full diffusers package, loaded with `local_files_only=True` so any cache miss errors loudly instead of silently fetching from upstream HF Hub
 
-- Distill LoRA: `loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors` …
+- Distill LoRA: `loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors` — same `local_files_only=True` enforcement
 
-The …
+The Space stays functional even if upstream Wan-AI / lightx2v / GitHub release sources are deleted: at runtime nothing reaches for them.
+
+On the first deploy, ~75 GB of VACE weights are downloaded from the mirror to the persistent cache in a background thread. Fast mode works immediately; Quality mode blocks until prewarm finishes (the UI shows a progress message during the wait).
 
 ## License
 
```
app.py:

```diff
@@ -45,7 +45,10 @@ from pipeline.crop import (
     mask_to_bbox,
 )
 from pipeline.lama import inpaint_frames_lama_stream
-from pipeline.vace import …
+from pipeline.vace import (
+    inpaint_frames_vace_stream, is_prewarm_done, prewarm_vace_cache,
+    wait_for_prewarm,
+)
 from pipeline.video import (
     VideoMeta, VideoWorkspace,
     attach_audio, extract_first_frame_array, extract_frames, frames_to_video, probe,
```

```diff
@@ -395,7 +398,7 @@ def on_snap_to_rectangle(editor_value: dict | None):
 )
 
 
-@spaces.GPU(duration=180)
+@spaces.GPU(duration=240)
 def _gpu_inpaint_lama(
     frame_paths: list,
     crop_region: CropRegion,
```

```diff
@@ -508,6 +511,14 @@ def run_pipeline(
             ws.out_frames_dir, total, progress,
         )
     else:  # MODE_QUALITY (already validated above)
+        # If the prewarm thread is still downloading, wait for it
+        # CPU-side rather than burning the @spaces.GPU(duration=300)
+        # budget on the wait. On a fresh deploy where the user clicks
+        # Quality before prewarm finishes, this could be several
+        # minutes; the progress message tells them what's happening.
+        if not is_prewarm_done():
+            progress(0.16, desc="Waiting for VACE checkpoint cache to finish prewarming…")
+            wait_for_prewarm()
         _gpu_inpaint_vace(
             frame_paths, crop_region, inpaint_mask,
             ws.out_frames_dir, progress,
```
pipeline/vace.py:

```diff
@@ -167,6 +167,22 @@ def prewarm_vace_cache() -> None:
     _prewarm_thread.start()
 
 
+def wait_for_prewarm(timeout: float | None = None) -> None:
+    """Block (CPU-side) until the prewarm thread finishes.
+
+    Call this from the orchestrator *before* acquiring the GPU lease, so the
+    download time isn't billed against the @spaces.GPU duration budget.
+    No-op if prewarm wasn't started or already finished.
+    """
+    if _prewarm_thread is not None and _prewarm_thread.is_alive():
+        _prewarm_thread.join(timeout=timeout)
+
+
+def is_prewarm_done() -> bool:
+    """True if prewarm wasn't started, or has already finished."""
+    return _prewarm_thread is None or not _prewarm_thread.is_alive()
+
+
 # ---------------------------------------------------------------------------
 # Pipeline singleton (cold load is expensive — keep it warm across calls)
 # ---------------------------------------------------------------------------
```

```diff
@@ -183,11 +199,12 @@ def _get_pipe():
     if _vace_pipe is not None and _vace_device == current_device:
         return _vace_pipe
 
-    # …
-    # …
-    # …
+    # Backstop: app.run_pipeline already calls wait_for_prewarm() on the
+    # CPU side before invoking _gpu_inpaint_vace, so this should be a
+    # no-op in practice. Kept defensively in case someone calls _get_pipe
+    # directly without going through the orchestrator.
     if _prewarm_thread is not None and _prewarm_thread.is_alive():
-        print("[VACE] …")
+        print("[VACE] _get_pipe waiting on prewarm (should have been done CPU-side)…")
         _prewarm_thread.join()
 
     from diffusers import AutoencoderKLWan, WanVACEPipeline
```
pipeline/video.py:

```diff
@@ -376,7 +376,10 @@ def attach_audio(
         "-i", str(silent_video),  # stream 0: video
         "-i", str(source_video),  # stream 1: audio donor
     ]
-    map_flags = ["-map", "0:v:0", "-map", "1:a:0", "-shortest"]
+    # ``-map 1:a`` (no stream index) preserves *all* audio streams from the
+    # source — main mix, commentary, alternate language tracks, etc. The
+    # earlier ``-map 1:a:0`` silently dropped everything past the first.
+    map_flags = ["-map", "0:v:0", "-map", "1:a", "-shortest"]
    out = [str(out_path)]
 
     # First try stream-copy (no re-encode) — fast and lossless when the
```