metadata
license: apache-2.0
tags:
- diffusion-single-file
- comfyui
- distillation
- NVFP4
- video
- video generation
base_model:
- Wan-AI/Wan2.1-T2V-1.3B
- gdhe17/Self-Forcing
pipeline_tags:
- text-to-video
library_name: diffusers
Self-Forcing-NVFP4-4Steps Models
NVFP4 Quantization-Aware Step Distillation for Blackwell Architecture
Table of Contents
- Features
- Quick Start
- Generation Results
- Installation
- Usage
- Project Structure
- Notes
- Community
Features
- 4-Step Inference: End-to-end generation in four steps, approaching real-time speed (tested on a single RTX 5090 GPU)
- NVFP4 Quantization: Reduced memory and bandwidth usage, optimized for the Blackwell architecture
- LightX2V Integration: Runs on the official LightX2V framework for best performance and stability
- High-Quality Generation: Preserves Self-Forcing's video quality while running much faster
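The memory saving from NVFP4 can be estimated with simple arithmetic. The sketch below assumes the commonly described NVFP4 layout (4-bit values with one 8-bit scale shared by each block of 16 values, so about 4.5 bits per weight) and takes the nominal 1.3B parameter count from the base model name. `weight_mb` is a hypothetical helper, not part of LightX2V.

```shell
#!/bin/sh
# Hypothetical helper: approximate weight storage in MB (10^6 bytes)
# for a parameter count and an average number of bits per weight.
weight_mb() {
  awk -v params="$1" -v bits="$2" \
    'BEGIN { printf "%.0f\n", params * bits / 8 / 1e6 }'
}

PARAMS=1300000000   # nominal 1.3B, taken from the base model name

# BF16 stores 16 bits per weight.
echo "BF16 : $(weight_mb "$PARAMS" 16) MB"

# NVFP4 (assumed layout): 4-bit values plus one 8-bit scale per
# 16-value block, i.e. 4 + 8/16 = 4.5 bits per weight on average.
echo "NVFP4: $(weight_mb "$PARAMS" 4.5) MB"
```

Under these assumptions the weights shrink by a factor of about 16 / 4.5, roughly 3.6. Treat this as an upper bound: a real checkpoint may keep some layers in higher precision, and activations and the T5 / CLIP / VAE components are not counted here.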
Quick Start
# 1. Install LightX2V
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .
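# 2. Install NVFP4 Kernel (run from the LightX2V repository root)
pip install scikit_build_core uv
# CUTLASS is needed at build time; its clone location is passed as CUTLASS_PATH below
git clone https://github.com/NVIDIA/cutlass.git
cd lightx2v_kernel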
MAX_JOBS=$(nproc) CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) \
uv build --wheel \
-Cbuild-dir=build . \
-Ccmake.define.CUTLASS_PATH=/path/to/cutlass \
--verbose --color=always --no-build-isolation
pip install dist/*whl --force-reinstall --no-deps
# 3. Run inference
# Use the NVFP4 Self-Forcing config:
# https://github.com/ModelTC/LightX2V/blob/main/configs/self_forcing/wan_t2v_sf_nvfp4.json
Generation Results
"A leprechaun, with green hat and traditional Irish attire, standing in a lush forest filled with vib..."
| Self-Forcing-1.3B-BF16 | Self-Forcing-1.3B-NVFP4 |
|---|---|
"A mystical and spiritual scene filled with loving energy emanating from the heavens. The sky is bath..."
| Self-Forcing-1.3B-BF16 | Self-Forcing-1.3B-NVFP4 |
|---|---|
Notes
System Requirements
- Required Hardware: NVIDIA RTX 50-series GPUs (RTX 5090/5080/5070/5060) or other Blackwell architecture GPUs
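
One way to check the hardware requirement before building the kernel is to read the GPU's compute capability. This is a minimal sketch: it assumes a recent NVIDIA driver whose `nvidia-smi` supports the `compute_cap` query field, and it assumes Blackwell GPUs report major version 10 (data-center parts) or 12 (RTX 50-series). `is_blackwell_cc` is a hypothetical helper name, not part of LightX2V.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a compute capability such as "12.0"
# belongs to a Blackwell GPU. Assumption: Blackwell reports major
# version 10 (data-center) or 12 (RTX 50-series).
is_blackwell_cc() {
  case "${1%%.*}" in
    10|12) return 0 ;;
    *)     return 1 ;;
  esac
}

# Query the installed GPUs only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
  while IFS=, read -r name cc; do
    cc="${cc# }"  # drop the space that follows the comma
    if is_blackwell_cc "$cc"; then
      echo "OK: $name (compute capability $cc)"
    else
      echo "Not Blackwell: $name (compute capability $cc)"
    fi
  done
else
  echo "nvidia-smi not found; cannot check the GPU" >&2
fi
```

If the major-version assumption does not hold for your GPU, adjust the `case` pattern rather than relying on this check.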
Dependencies
- Prepare the T5 / CLIP / VAE components yourself (same structure as Self-Forcing)
Performance Tips
- Use Blackwell + NVFP4 for best performance
- Enable CPU offload for GPUs with limited memory
Community
- Issues: GitHub Issues
- Models: HuggingFace Hub
- Documentation: LightX2V Docs
If you find this project helpful, please give us a star on GitHub