BoxOfColors Claude Sonnet 4.6 committed
Commit aa53ba5 · 1 Parent(s): 01d72dd

Fix HunyuanFoley: save text_feats to disk inside GPU worker

ZeroGPU forbids CUDA tensor deserialization in the main process. The previous
fix resolved the ModuleNotFoundError but text_feats contains CUDA tensors;
unpickling them in main triggers torch.cuda._lazy_init() which ZeroGPU blocks.

Fix: save text_feats via torch.save() inside the GPU worker, return the file
path string instead. Main process receives only numpy arrays + a string path.
Update _hunyuan_extras to use the pre-saved path directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (1):
  app.py +11 -5
app.py CHANGED

@@ -1322,7 +1322,13 @@ def _hunyuan_gpu_infer_impl(video_file, prompt, negative_prompt, seed_val,
 
         _log_inference_timing("HunyuanFoley", time.perf_counter() - _t_hny_start,
                               len(segments), int(num_steps), HUNYUAN_SECS_PER_STEP)
-        results.append((seg_wavs, sr, text_feats))
+
+        # Save text_feats to disk inside the GPU worker so we never pickle a CUDA
+        # tensor back to the main process (ZeroGPU forbids CUDA init in main process).
+        text_feats_path = os.path.join(tmp_dir, f"hunyuan_{sample_idx}_text_feats.pt")
+        torch.save(text_feats, text_feats_path)
+        print(f"[HunyuanFoley] text_feats saved to {text_feats_path}")
+        results.append((seg_wavs, sr, text_feats_path))
 
         # Free GPU memory between samples to prevent VRAM fragmentation
         if torch.cuda.is_available():
@@ -1352,10 +1358,10 @@ def generate_hunyuan(video_file, prompt, negative_prompt, seed_val,
 
     # ── CPU post-processing (no GPU needed) ──
     def _hunyuan_extras(sample_idx, result, td):
-        _, _sr, text_feats = result
-        path = os.path.join(td, f"hunyuan_{sample_idx}_text_feats.pt")
-        torch.save(text_feats, path)
-        return {"text_feats_path": path}
+        # text_feats was saved to disk inside the GPU worker (to avoid pickling CUDA
+        # tensors across the ZeroGPU process boundary); result[2] is the file path.
+        _, _sr, text_feats_path = result
+        return {"text_feats_path": text_feats_path}
 
     outputs = _post_process_samples(
         results, model="hunyuan", tmp_dir=tmp_dir,
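The boundary-safe pattern this commit adopts (serialize tensors to disk inside the worker, hand only a path string back, load on CPU in the consumer) can be sketched standalone. This is a minimal illustration, not the app's real code: `gpu_worker` and `main_process` are hypothetical names, and a CPU `torch.randn` stands in for the actual CUDA `text_feats`.

```python
import os
import tempfile

import torch


def gpu_worker(tmp_dir: str) -> str:
    """Runs in the GPU process: serialize features, return only a path string."""
    text_feats = torch.randn(4, 8)  # stand-in for the real CUDA text_feats
    path = os.path.join(tmp_dir, "text_feats.pt")
    torch.save(text_feats, path)
    return path  # a plain str crosses the process boundary without CUDA pickling


def main_process(path: str) -> torch.Tensor:
    """Runs in the main process: map_location='cpu' avoids torch.cuda._lazy_init()."""
    return torch.load(path, map_location="cpu")


with tempfile.TemporaryDirectory() as td:
    feats = main_process(gpu_worker(td))
```

The key detail is `map_location="cpu"`: without it, unpickling a tensor that was saved from a CUDA device would try to restore it onto the GPU, triggering exactly the CUDA initialization that ZeroGPU blocks in the main process.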