BoxOfColors Claude Opus 4.7 (1M context) committed on
Commit b859009 · 1 Parent(s): 5d79cd0

fix: list-audit fixes — LaMa duration, multi-stream audio, prewarm wait

Acted on the items from the gap-list audit that were real bugs/risks:

C. LaMa @spaces.GPU duration was too tight for high-fps clips
The old 180 s budget didn't cover the worst case: 60 fps × 15 s =
900 frames at ~0.3 s/frame ≈ 270 s. Bumped to 240 s, which covers
typical 60 fps loads with ~30 s headroom. (VACE stays at 300 s; the
per-mode budget split is intact.)
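
A quick back-of-envelope check of those numbers (the ~0.3 s/frame
figure is the audit's rough estimate, not a measured constant):

```python
# Worst-case LaMa wall-clock estimate for a high-fps clip.
fps, clip_seconds = 60, 15
sec_per_frame = 0.3                      # rough per-frame LaMa cost (estimate)

frames = fps * clip_seconds              # 900 frames
worst_case = frames * sec_per_frame      # ~270 s, past the old 180 s budget
print(f"{frames} frames -> ~{worst_case:.0f} s worst case")
```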

H. attach_audio dropped extra audio streams
``-map 1:a:0`` only kept the first audio stream — sources with
commentary tracks / alternate languages / 5.1 + stereo dual-mix
silently lost everything but the main mix. Now uses ``-map 1:a``
(preserves all audio streams). Re-encode fallback path applies
AAC@192k to each independently.
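
The real change lives in attach_audio (see the pipeline/video.py diff
below); as a standalone illustration of the mapping behaviour, with
hypothetical file names and the stream-copy-first strategy the helper
uses:

```python
import subprocess

# Hypothetical paths, purely to illustrate the -map behaviour.
silent_video, source_video, out_path = "silent.mp4", "source.mkv", "out.mkv"

cmd = [
    "ffmpeg", "-y",
    "-i", silent_video,        # stream 0: re-rendered (silent) video
    "-i", source_video,        # stream 1: audio donor
    "-map", "0:v:0",           # video from input 0
    "-map", "1:a",             # ALL audio streams from input 1 (old: "1:a:0")
    "-c", "copy",              # try stream-copy first; re-encode is the fallback
    "-shortest",
    out_path,
]
subprocess.run(cmd, check=True)
```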

N. Prewarm wait happened inside the GPU lease
_get_pipe()'s ``_prewarm_thread.join()`` could block for several
minutes if the user clicked Quality before the 75 GB prewarm
finished — eating into the @spaces.GPU(duration=300) budget and
risking timeout. New wait_for_prewarm() / is_prewarm_done() helpers
let run_pipeline do the wait CPU-side before acquiring the GPU,
surfacing a "Waiting for VACE checkpoint cache to finish prewarming"
progress message instead of an apparent hang. Backstop join inside
_get_pipe is kept for direct-call safety.
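
Condensed, the ordering the fix establishes looks like this (a sketch,
not the verbatim run_pipeline code; the function name and the
gpu_inpaint parameter are illustrative):

```python
from pipeline.vace import is_prewarm_done, wait_for_prewarm

def run_quality_mode(frame_paths, progress, gpu_inpaint):
    """gpu_inpaint stands in for the @spaces.GPU(duration=300) VACE call."""
    # CPU-side wait: join the prewarm thread *before* the GPU lease is
    # acquired, so a fresh-deploy download no longer burns the GPU budget.
    if not is_prewarm_done():
        progress(0.16, desc="Waiting for VACE checkpoint cache to finish prewarming…")
        wait_for_prewarm()
    # Only now does the GPU duration clock start.
    gpu_inpaint(frame_paths)
```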

AC. README upstream-protection section
Updated to reflect the local_files_only=True enforcement and the
hardcoded LAMA_MODEL_URL default added in 5d79cd0. Now explicitly
states that runtime fetches never reach upstream Wan-AI / lightx2v /
GitHub releases regardless of their state.
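
The enforcement boils down to every runtime fetch hitting the local
cache only. A minimal sketch of that behaviour, using the mirror paths
named in the README (the Space's actual loader code may differ):

```python
from huggingface_hub import hf_hub_download
from huggingface_hub.utils import LocalEntryNotFoundError

try:
    lora_path = hf_hub_download(
        "JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints",
        "loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors",
        local_files_only=True,   # cache hit or loud error, never the network
    )
except LocalEntryNotFoundError:
    raise RuntimeError("LoRA not in the local cache; did prewarm run?")
```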

Skipped from the audit list: theoretical concurrency races (ZeroGPU
serializes GPU calls), purely-cosmetic items (font sizes, color-only
signaling), missing UI features (this is a personal-use Space),
and items that can only be tested with real GPU + 75 GB cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (4)
  1. README.md +7 -5
  2. app.py +13 -2
  3. pipeline/vace.py +21 -4
  4. pipeline/video.py +4 -1
README.md CHANGED
@@ -49,13 +49,15 @@ V-Log / HDR colour metadata is preserved via FFmpeg flag passthrough (10-bit H.2
 
 ## Upstream protection
 
-Both models are served from a private mirror at [JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints](https://huggingface.co/JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints) so the Space stays functional even if upstream Wan-AI / lightx2v / GitHub releases disappear:
+All model files come from a private mirror at [JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints](https://huggingface.co/JackIsNotInTheBox/Video_Watermark_Remover_Checkpoints):
 
-- LaMa: `lama/big-lama.pt` — pre-fetched into `torch.hub` cache via `LAMA_MODEL_URL`
-- VACE-14B: `vace-14b/` — full diffusers package
-- Distill LoRA: `loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors`
+- LaMa: `lama/big-lama.pt` — `LAMA_MODEL_URL` defaults to the mirror, prefetched into `torch.hub` cache before `simple_lama_inpainting` can reach for its hardcoded GitHub release URL
+- VACE-14B: `vace-14b/` — full diffusers package, loaded with `local_files_only=True` so any cache miss errors loudly instead of silently fetching from upstream HF Hub
+- Distill LoRA: `loras/wan2.1_t2v_14b_lora_rank64_lightx2v_4step.safetensors` — same `local_files_only=True` enforcement
 
-The first time the Space starts, ~75 GB of VACE weights are downloaded to the persistent cache in a background thread (Quality mode is unavailable until that finishes; Fast mode works immediately).
+The Space stays functional even if upstream Wan-AI / lightx2v / GitHub release sources are deleted: at runtime nothing reaches for them.
+
+On the first deploy, ~75 GB of VACE weights are downloaded from the mirror to the persistent cache in a background thread. Fast mode works immediately; Quality mode blocks until prewarm finishes (the UI shows a progress message during the wait).
 
 ## License
 
app.py CHANGED
@@ -45,7 +45,10 @@ from pipeline.crop import (
     mask_to_bbox,
 )
 from pipeline.lama import inpaint_frames_lama_stream
-from pipeline.vace import inpaint_frames_vace_stream, prewarm_vace_cache
+from pipeline.vace import (
+    inpaint_frames_vace_stream, is_prewarm_done, prewarm_vace_cache,
+    wait_for_prewarm,
+)
 from pipeline.video import (
     VideoMeta, VideoWorkspace,
     attach_audio, extract_first_frame_array, extract_frames, frames_to_video, probe,
@@ -395,7 +398,7 @@ def on_snap_to_rectangle(editor_value: dict | None):
     )
 
 
-@spaces.GPU(duration=180)
+@spaces.GPU(duration=240)
 def _gpu_inpaint_lama(
     frame_paths: list,
     crop_region: CropRegion,
@@ -508,6 +511,14 @@ def run_pipeline(
             ws.out_frames_dir, total, progress,
         )
     else:  # MODE_QUALITY (already validated above)
+        # If the prewarm thread is still downloading, wait for it
+        # CPU-side rather than burning the @spaces.GPU(duration=300)
+        # budget on the wait. On a fresh deploy where the user clicks
+        # Quality before prewarm finishes, this could be several
+        # minutes; the progress message tells them what's happening.
+        if not is_prewarm_done():
+            progress(0.16, desc="Waiting for VACE checkpoint cache to finish prewarming…")
+            wait_for_prewarm()
         _gpu_inpaint_vace(
             frame_paths, crop_region, inpaint_mask,
             ws.out_frames_dir, progress,
pipeline/vace.py CHANGED
@@ -167,6 +167,22 @@ def prewarm_vace_cache() -> None:
     _prewarm_thread.start()
 
 
+def wait_for_prewarm(timeout: float | None = None) -> None:
+    """Block (CPU-side) until the prewarm thread finishes.
+
+    Call this from the orchestrator *before* acquiring the GPU lease, so the
+    download time isn't billed against the @spaces.GPU duration budget.
+    No-op if prewarm wasn't started or already finished.
+    """
+    if _prewarm_thread is not None and _prewarm_thread.is_alive():
+        _prewarm_thread.join(timeout=timeout)
+
+
+def is_prewarm_done() -> bool:
+    """True if prewarm wasn't started, or has already finished."""
+    return _prewarm_thread is None or not _prewarm_thread.is_alive()
+
+
 # ---------------------------------------------------------------------------
 # Pipeline singleton (cold load is expensive — keep it warm across calls)
 # ---------------------------------------------------------------------------
@@ -183,11 +199,12 @@ def _get_pipe():
     if _vace_pipe is not None and _vace_device == current_device:
         return _vace_pipe
 
-    # If a prewarm thread is in flight, wait for it before loading. The
-    # download is the expensive part; the from_pretrained calls below
-    # become near-instant disk reads once the cache is populated.
+    # Backstop: app.run_pipeline already calls wait_for_prewarm() on the
+    # CPU side before invoking _gpu_inpaint_vace, so this should be a
+    # no-op in practice. Kept defensively in case someone calls _get_pipe
+    # directly without going through the orchestrator.
     if _prewarm_thread is not None and _prewarm_thread.is_alive():
-        print("[VACE] Waiting for prewarm thread to finish…")
+        print("[VACE] _get_pipe waiting on prewarm (should have been done CPU-side)…")
         _prewarm_thread.join()
 
     from diffusers import AutoencoderKLWan, WanVACEPipeline
pipeline/video.py CHANGED
@@ -376,7 +376,10 @@ def attach_audio(
         "-i", str(silent_video),  # stream 0: video
         "-i", str(source_video),  # stream 1: audio donor
     ]
-    map_flags = ["-map", "0:v:0", "-map", "1:a:0", "-shortest"]
+    # ``-map 1:a`` (no stream index) preserves *all* audio streams from the
+    # source — main mix, commentary, alternate language tracks, etc. The
+    # earlier ``-map 1:a:0`` silently dropped everything past the first.
+    map_flags = ["-map", "0:v:0", "-map", "1:a", "-shortest"]
     out = [str(out_path)]
 
     # First try stream-copy (no re-encode) — fast and lossless when the