How to use Felldude/Wan2.1-Diffusers-HDR-VAE with Diffusers:

    pip install -U diffusers transformers accelerate

    import torch
    from diffusers import DiffusionPipeline

    # switch to "mps" for apple devices
    pipe = DiffusionPipeline.from_pretrained("Felldude/Wan2.1-Diffusers-HDR-VAE", dtype=torch.bfloat16, device_map="cuda")

    prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
    image = pipe(prompt).images[0]

VAE Reconstruction & Color Expansion Evaluation
Wan2.1 Base VAE vs HDR VAE
At-a-Glance Summary
Color expansion (HDR VAE vs GT):
- LAB volume: ~+50%
- Saturation: ~+27%
- Unique colors: ~+30%
- Gradient strength: ~+7%
Structural cost (HDR VAE vs Wan2.1 Base VAE):
- SSIM: ~-2%
- MSE: ~+54%
Key takeaway
HDR VAE significantly increases chromatic richness and color diversity,
but reduces structural fidelity and increases reconstruction error relative to Wan2.1 Base VAE.
🧪 Evaluation Setup
- Task: Image reconstruction (VAE decoding comparison)
- Models:
  - Wan2.1 Base VAE
  - HDR VAE
  - Ground Truth (GT)
- Mode: Deterministic image-to-image reconstruction
- Metrics:
  - Color statistics (HSV / LAB / RGB)
  - Structural metrics (SSIM, MSE, PSNR)
  - Frequency features (edges, gradients)
  - Entropy + color diversity
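The card does not include its evaluation script, so the following is only a minimal sketch of how the MSE and PSNR entries in the metrics list could be computed for one GT/reconstruction pair. The function names, the 8-bit peak value of 255, and the synthetic test images are assumptions made for illustration. SSIM is left out because it needs a windowed implementation; `skimage.metrics.structural_similarity` is a common choice for that.

```python
import numpy as np

def mse(reference: np.ndarray, reconstruction: np.ndarray) -> float:
    """Mean squared error over all pixels and channels."""
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(reference: np.ndarray, reconstruction: np.ndarray,
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (max_value=255 assumes 8-bit images)."""
    err = mse(reference, reconstruction)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))

# Synthetic stand-ins for a GT image and a VAE reconstruction (not real data).
rng = np.random.default_rng(0)
gt = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
noise = rng.normal(0.0, 10.0, size=gt.shape)
recon = np.clip(gt.astype(np.float64) + noise, 0, 255).astype(np.uint8)

print(f"MSE:  {mse(gt, recon):.2f}")
print(f"PSNR: {psnr(gt, recon):.2f} dB")
```

In a real comparison, `gt` would be a source frame and `recon` the output of encoding and decoding it with each VAE, with the scores averaged over the test set.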
Key Findings
🎨 Color behavior
- Wan2.1 Base VAE: slight compression of GT color space
- HDR VAE: strong expansion of color space (+30% to +50%)
🧱 Structure
- Wan2.1 Base VAE: closer to GT, smoother output
- HDR VAE: sharper, more high-frequency artifacts
👁️ Perceptual behavior
- Wan2.1 Base VAE: higher fidelity, lower distortion
- HDR VAE: more vivid but less faithful reconstruction
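The "sharper, more high-frequency" observation corresponds to the gradient strength and edge density metrics. The card does not define them, so the sketch below uses one plausible definition: mean gradient magnitude of the luminance channel, and the fraction of pixels whose gradient magnitude exceeds a threshold. The Rec. 601 luminance weights, the threshold of 20, and all function names are assumptions.

```python
import numpy as np

def luminance(rgb: np.ndarray) -> np.ndarray:
    """Rec. 601 luma from an (H, W, 3) RGB array."""
    x = rgb.astype(np.float64)
    return 0.299 * x[..., 0] + 0.587 * x[..., 1] + 0.114 * x[..., 2]

def gradient_magnitude(gray: np.ndarray) -> np.ndarray:
    """Per-pixel gradient magnitude from finite differences."""
    gy, gx = np.gradient(gray)  # derivatives along rows, then columns
    return np.hypot(gx, gy)

def gradient_strength(rgb: np.ndarray) -> float:
    """Mean gradient magnitude of the luminance channel."""
    return float(gradient_magnitude(luminance(rgb)).mean())

def edge_density(rgb: np.ndarray, threshold: float = 20.0) -> float:
    """Fraction of pixels with gradient magnitude above the threshold."""
    return float((gradient_magnitude(luminance(rgb)) > threshold).mean())

# A hard vertical step edge: left half black, right half white.
step = np.zeros((8, 8, 3), dtype=np.uint8)
step[:, 4:, :] = 255

print(f"gradient strength: {gradient_strength(step):.3f}")
print(f"edge density:      {edge_density(step):.3f}")
```

Under these definitions a decoder that oversharpens raises both numbers relative to GT, while a decoder that smooths lowers them.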
Quantitative Results vs GT
Wan2.1 Base VAE
- Brightness: -0.71%
- Contrast: -0.23%
- Saturation: -0.11%
- Entropy: -0.08%
- Dynamic range: -0.46%
- Edge density: -15.35%
- Gradient strength: -5.90%
- LAB color volume: -5.00%
- Quantized colors: -6.60%
- Unique colors: +7.90%
- Sharpness: -45.65%
HDR VAE
- Brightness: +0.90%
- Contrast: +0.93%
- Saturation: +26.90%
- Entropy: +0.17%
- Dynamic range: -0.07%
- Edge density: +14.98%
- Gradient strength: +7.10%
- LAB color volume: +48.60%
- Quantized colors: +45.60%
- Unique colors: +29.50%
- Sharpness: +32.15%
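Each figure above is a percentage change of a reconstruction statistic relative to the same statistic on GT. As a rough sketch of how three of the color statistics might be computed, the code below measures mean HSV saturation, the number of unique RGB colors, and the number of colors left after quantization. The 5-bit quantization depth and all function names are assumptions, since the card does not give its definitions.

```python
import numpy as np

def mean_saturation(rgb: np.ndarray) -> float:
    """Mean HSV saturation, (max - min) / max per pixel, 0 for black pixels."""
    x = rgb.astype(np.float64) / 255.0
    cmax = x.max(axis=-1)
    cmin = x.min(axis=-1)
    safe_max = np.where(cmax > 0, cmax, 1.0)  # avoid division by zero
    sat = np.where(cmax > 0, (cmax - cmin) / safe_max, 0.0)
    return float(sat.mean())

def unique_colors(rgb: np.ndarray) -> int:
    """Number of distinct RGB triplets in the image."""
    return int(np.unique(rgb.reshape(-1, 3), axis=0).shape[0])

def quantized_colors(rgb: np.ndarray, bits: int = 5) -> int:
    """Distinct colors after keeping only the top `bits` bits per channel."""
    return unique_colors(rgb >> (8 - bits))

def percent_change(value: float, reference: float) -> float:
    """Signed change of `value` relative to `reference`, in percent."""
    return 100.0 * (value - reference) / reference

# Tiny synthetic example: a gray pixel and a reddish pixel.
gt = np.array([[[100, 100, 100], [200, 100, 100]]], dtype=np.uint8)
# A more saturated "reconstruction" of the same two pixels.
recon = np.array([[[100, 100, 100], [200, 50, 50]]], dtype=np.uint8)

change = percent_change(mean_saturation(recon), mean_saturation(gt))
print(f"saturation change vs GT: {change:+.1f}%")
```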
🧮 Reconstruction Quality
Wan2.1 Base VAE
- SSIM: 0.822
- PSNR: 25.86 dB
- MSE: 180.88
➡️ Best structural fidelity and lowest reconstruction error
HDR VAE
- SSIM: 0.806
- PSNR: 24.06 dB
- MSE: 278.08
➡️ Higher distortion but stronger chromatic expansion
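The relative costs quoted in the summary can be checked directly from these numbers. The short calculation below derives the SSIM and MSE changes of the HDR VAE relative to the Wan2.1 Base VAE, and also converts each MSE into the PSNR it would imply for 8-bit images (peak value 255, an assumption). The helper names are illustrative.

```python
import math

# Reported reconstruction metrics from the card.
base = {"ssim": 0.822, "psnr": 25.86, "mse": 180.88}
hdr = {"ssim": 0.806, "psnr": 24.06, "mse": 278.08}

def percent_change(value: float, reference: float) -> float:
    """Signed change of `value` relative to `reference`, in percent."""
    return 100.0 * (value - reference) / reference

def psnr_from_mse(mse: float, max_value: float = 255.0) -> float:
    """PSNR in dB implied by an MSE value, assuming an 8-bit peak of 255."""
    return 10.0 * math.log10(max_value ** 2 / mse)

ssim_delta = percent_change(hdr["ssim"], base["ssim"])
mse_delta = percent_change(hdr["mse"], base["mse"])

print(f"SSIM change: {ssim_delta:+.2f}%")
print(f"MSE change:  {mse_delta:+.2f}%")
for name, m in (("Base", base), ("HDR", hdr)):
    implied = psnr_from_mse(m["mse"])
    print(f"{name}: reported PSNR {m['psnr']:.2f} dB, implied by MSE {implied:.2f} dB")
```

The SSIM change comes out just under -2% and the MSE change just under +54%. The reported PSNR values sit slightly above the values implied by the aggregate MSE. One plausible explanation, which the card does not confirm, is that PSNR was averaged per image: the mean of per-image PSNR is never lower than the PSNR of the mean MSE.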
🧠 Interpretation
Color axis → Winner: HDR VAE
- Largest increase in LAB volume, saturation, and color diversity
Structure axis → Winner: Wan2.1 Base VAE
- Higher SSIM (0.822 vs 0.806) and lower MSE (180.88 vs 278.08)
Fidelity axis → Winner: Wan2.1 Base VAE
- More faithful reconstruction of GT distribution
Final Conclusion
HDR VAE acts as a chromatic expansion decoder, increasing color space occupancy and high-frequency detail.
Wan2.1 Base VAE remains closer to the ground-truth manifold, prioritizing structural and perceptual fidelity over color amplification.
Model tree for Felldude/Wan2.1-Diffusers-HDR-VAE
- Base model: Wan-AI/Wan2.1-T2V-14B