# License: CC0
# SceneMachine model stack manifest
#
# Lists the exact upstream weight repos this version of SceneMachine
# depends on. All weights are hosted in the WindstormLabs HF org
# (Windstorm Labs is SceneMachine's parent organization; the same
# weight mirrors are shared by other Windstorm sub-projects).
#
# Update this file when the application code requires a new model
# version. The application reads this manifest at boot to know which
# WindstormLabs/* repos to download from.
manifest_version: 1
generated: 2026-05-13
scenemachine_min_version: "0.1"
# The canonical mirror org for SceneMachine's model dependencies.
# If you fork SceneMachine and want a different mirror, override this.
mirror_org: WindstormLabs
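# A minimal sketch of a fork override, assuming a hypothetical mirror org
# named "my-fork-org" (illustrative only, not a real org). How the app
# combines mirror_org with the per-stack hf_repo values is not specified
# in this file, so a fork may also need to update each hf_repo entry:
#
#   mirror_org: my-fork-org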
stacks:
  # Wan 2.2 family: the primary video generation stack.
  wan22:
    description: |
      Alibaba's Wan 2.2 14B family. Three sibling models share the
      same VAE / text encoder / CLIP vision encoders.
    text_to_video:
      hf_repo: WindstormLabs/wan22-t2v-fp8
      files:
        - wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
        - wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
      vram_gb: 22
      use_case: |
        Establishing shots, prompt-only scenes, anything without a
        character reference or prior-frame continuity.
    image_to_video:
      hf_repo: WindstormLabs/wan22-i2v-fp8
      files:
        - wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors
        - wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors
      vram_gb: 24
      use_case: |
        Shot-to-shot continuity. Feed the last frame of the prior shot
        as the seed image; produces a video that flows visually from it.
    animate:
      hf_repo: WindstormLabs/wan22-animate-bf16
      files:
        - wan2.2_animate_14B_bf16.safetensors
      vram_gb: 32
      use_case: |
        Character-ID-preserving generation. Requires a reference image
        of the character; the model preserves their identity across the
        shot. Validated at 1.7 min/shot with the Lightx2v speed LoRA.
    shared_encoders:
      hf_repo: WindstormLabs/wan22-encoders
      files:
        - wan_2.1_vae.safetensors                 # used by all 3 Wan stacks
        - umt5_xxl_bf16_from_pth.safetensors      # T5 text encoder, all 3
        - sigclip_vision_patch14_384.safetensors  # CLIP vision, I2V only
        - clip_vision_h.safetensors               # CLIP-ViT-H, Animate only (1280-dim)
    speed_loras:
      hf_repo: WindstormLabs/wan22-loras
      files:
        - Wan_2_2_I2V_A14B_HIGH_lightx2v_4step_lora_260412_rank_64_fp16.safetensors
        - wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
      use_case: |
        Kijai's Lightx2v 4-step distillation. When enabled, drops the
        sampler from 30 steps to 4 with cfg=1.0, an 8.3× wall-clock
        speedup on Wan Animate. Verified to transfer cleanly to Animate
        when paired with the correct embed chain + CLIP-ViT-H.
  # LTX-2: alternate cinematic stack (slower, comparable quality).
  ltx2:
    description: |
      Lightricks LTX-2 19B Dev FP8 plus the Gemma text encoder.
    dev_fp8:
      hf_repo: WindstormLabs/ltx2-19b-fp8
      files:
        - ltx-2-19b-dev-fp8.safetensors
        - model-00001-of-00005.safetensors  # Gemma encoder shards
        - model-00002-of-00005.safetensors
        - model-00003-of-00005.safetensors
        - model-00004-of-00005.safetensors
        - model-00005-of-00005.safetensors
      vram_gb: 28
  # Hunyuan: reserved for Stack B implementation (not yet wired into the app).
  hunyuan:
    description: |
      Tencent HunyuanVideo + HunyuanVideo-I2V + HunyuanCustom partial
      mirror. Reserved for SceneMachine Stack B (alternate
      character-consistency path via Hunyuan's built-in identity
      preservation, no LoRA needed). Provider workflow not yet
      implemented in the app; weights are mirrored for future use.
    hf_repo: WindstormLabs/hunyuan
    license_note: |
      Tencent HunyuanVideo Community License. Check the upstream
      repo for current terms before any commercial use.