upload stack.yaml (license: CC0)

# SceneMachine model stack manifest
#
# Lists the exact upstream weight repos this version of SceneMachine
# depends on. All weights are hosted in the WindstormLabs HF org
# (Windstorm Labs is SceneMachine's parent organization; the same
# weight mirrors are shared by other Windstorm sub-projects).
#
# Update this file when the application code requires a new model
# version. The application reads this manifest at boot to know which
# WindstormLabs/* repos to download from.

manifest_version: 1
generated: 2026-05-13
scenemachine_min_version: "0.1"

# The canonical mirror org for SceneMachine's model dependencies.
# If you fork SceneMachine and want a different mirror, override this.
mirror_org: WindstormLabs
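# Example override for a fork (a sketch only: "my-fork-org" is a
# hypothetical placeholder, and this manifest does not say whether the
# hf_repo entries below are derived from mirror_org or must be edited
# by hand as well):
#
#   mirror_org: my-fork-org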

stacks:

  # Wan 2.2 family: the primary video generation stack.
  wan22:
    description: |
      Alibaba's Wan 2.2 14B family. Three sibling models share the
      same VAE / text encoder / CLIP vision encoders.

    text_to_video:
      hf_repo: WindstormLabs/wan22-t2v-fp8
      files:
        - wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
        - wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
      vram_gb: 22
      use_case: |
        Establishing shots, prompt-only scenes, anything without a
        character reference or prior-frame continuity.

    image_to_video:
      hf_repo: WindstormLabs/wan22-i2v-fp8
      files:
        - wan2.2_i2v_high_noise_14B_fp8_scaled.safetensors
        - wan2.2_i2v_low_noise_14B_fp8_scaled.safetensors
      vram_gb: 24
      use_case: |
        Shot-to-shot continuity. Feed the last frame of the prior shot
        as the seed image; produces a video that flows visually from it.

    animate:
      hf_repo: WindstormLabs/wan22-animate-bf16
      files:
        - wan2.2_animate_14B_bf16.safetensors
      vram_gb: 32
      use_case: |
        Character-ID-preserving generation. Requires a reference image
        of the character; the model preserves their identity across the
        shot. Validated 1.7 min/shot with the Lightx2v speed LoRA.

    shared_encoders:
      hf_repo: WindstormLabs/wan22-encoders
      files:
        - wan_2.1_vae.safetensors                 # used by all 3 Wan stacks
        - umt5_xxl_bf16_from_pth.safetensors      # T5 text encoder, all 3
        - sigclip_vision_patch14_384.safetensors  # CLIP vision, I2V only
        - clip_vision_h.safetensors               # CLIP-ViT-H, Animate only (1280-dim)

    speed_loras:
      hf_repo: WindstormLabs/wan22-loras
      files:
        - Wan_2_2_I2V_A14B_HIGH_lightx2v_4step_lora_260412_rank_64_fp16.safetensors
        - wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
      use_case: |
        Kijai's Lightx2v 4-step distillation. When enabled, drops the
        sampler from 30 steps to 4 with cfg=1.0, an 8.3× wallclock speedup
        on Wan Animate. Verified to transfer cleanly to Animate when
        paired with the correct embed chain + CLIP-ViT-H.

  # LTX-2: alternate cinematic stack (slower, comparable quality).
  ltx2:
    description: |
      Lightricks LTX-2 19B Dev FP8 plus the Gemma text encoder.

    dev_fp8:
      hf_repo: WindstormLabs/ltx2-19b-fp8
      files:
        - ltx-2-19b-dev-fp8.safetensors
        - model-00001-of-00005.safetensors  # Gemma encoder shards
        - model-00002-of-00005.safetensors
        - model-00003-of-00005.safetensors
        - model-00004-of-00005.safetensors
        - model-00005-of-00005.safetensors
      vram_gb: 28

  # Hunyuan: reserved for Stack B implementation (not yet wired in app)
  hunyuan:
    description: |
      Tencent HunyuanVideo + HunyuanVideo-I2V + HunyuanCustom partial
      mirror. Reserved for SceneMachine Stack B (alternate
      character-consistency path via Hunyuan's built-in identity
      preservation, no LoRA needed). Provider workflow not yet
      implemented in the app; weights are mirrored for future use.
    hf_repo: WindstormLabs/hunyuan
    license_note: |
      Tencent HunyuanVideo Community License. Check the upstream
      repo for current terms before any commercial use.