How to use Felldude/Wan2.2-Diffusers-HDR-VAE with Diffusers:

```shell
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained(
    "Felldude/Wan2.2-Diffusers-HDR-VAE",
    dtype=torch.bfloat16,
    device_map="cuda",
)

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
```
VAE Reconstruction & Color Expansion Evaluation
Wan 2.2 Base HDR VAE vs Ground Truth (GT)
At-a-Glance Summary
Color expansion (Wan 2.2 Base HDR VAE vs GT):
- LAB volume: ~+34.09%
- Saturation: ~+22.27%
- Unique colors: ~+23.68%
- Quantized colors: ~+30.32%
- Sharpness: ~+4.25%
Structural cost:
- SSIM: ~-1.42%
- PSNR: ~-2.31%
- MSE: ~+13.09%
Key takeaway
Wan 2.2 Base HDR VAE substantially increases measured color richness (LAB volume, saturation, color counts) and slightly increases sharpness, at the cost of a mild reduction in structural fidelity (SSIM, PSNR) and a moderate increase in reconstruction error (MSE) relative to ground truth.
Evaluation Setup
- Task: Image reconstruction (VAE decode comparison)
- Model: Wan 2.2 Base HDR VAE
- Reference: Ground Truth (GT)
- Mode: Deterministic image reconstruction evaluation
- Metrics:
  - Color statistics (HSV / LAB / RGB)
  - Structural similarity (SSIM)
  - Pixel fidelity (PSNR, MSE)
  - Texture features (edges, gradients, sharpness)
  - Entropy and color diversity measures
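The pixel-fidelity metrics and the percentage deltas reported in this card can be sketched as follows. This is a minimal illustration, not the evaluation script behind the numbers: the function names, the 8-bit peak value, and the synthetic test images are assumptions, and SSIM is left out because it is usually taken from an image library.

```python
import numpy as np

def mse(reference, test):
    """Mean squared error between two images of identical shape."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    return float(np.mean((ref - tst) ** 2))

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB (assumes 8-bit data by default)."""
    err = mse(reference, test)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))

def percent_change(baseline, value):
    """Relative change of `value` against `baseline`, in percent."""
    return 100.0 * (value - baseline) / baseline

# Synthetic stand-ins for a ground-truth image and a reconstruction.
rng = np.random.default_rng(0)
gt = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
noise = rng.normal(0.0, 5.0, size=gt.shape)
recon = np.clip(gt.astype(np.float64) + noise, 0, 255).round().astype(np.uint8)

print("MSE :", mse(gt, recon))
print("PSNR:", psnr(gt, recon), "dB")
```

`percent_change` is the kind of helper that would turn two raw metric values into figures such as "+13.09%"; which baseline the card's percentages use is not stated, so the baseline argument is left to the caller.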
Key Findings
Color behavior
- Wan 2.2 Base HDR VAE significantly expands the occupied color space relative to GT
- Strong increases in saturation and LAB distribution volume
- Noticeably richer and more expressive chromatic output
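As a rough illustration of how saturation and color-count statistics like the ones above might be measured, the sketch below uses plain NumPy. The function names and the 5-bit quantization level are assumptions for illustration; the card does not say how "quantized colors" or LAB volume were computed, so LAB volume is omitted here.

```python
import numpy as np

def mean_saturation(rgb):
    """Mean HSV saturation in [0, 1] for a uint8 RGB image."""
    x = np.asarray(rgb, dtype=np.float64) / 255.0
    cmax = x.max(axis=-1)
    cmin = x.min(axis=-1)
    # Black pixels (cmax == 0) get saturation 0 by dividing 0 by 1.
    denom = np.where(cmax > 0, cmax, 1.0)
    return float(((cmax - cmin) / denom).mean())

def count_colors(rgb, bits=8):
    """Count distinct RGB triples after keeping the top `bits` bits per channel.

    bits=8 counts unique colors; a smaller value (5 here is an assumed
    choice) counts colors on a coarser, quantized grid.
    """
    x = np.asarray(rgb, dtype=np.uint8).reshape(-1, 3).astype(np.uint32)
    x = x >> (8 - bits)
    packed = (x[:, 0] << 16) | (x[:, 1] << 8) | x[:, 2]
    return int(np.unique(packed).size)

# Tiny example: two nearly identical grays collapse into one quantized color.
img = np.array([[[10, 10, 10], [12, 12, 12]]], dtype=np.uint8)
print(count_colors(img, bits=8), count_colors(img, bits=5))
```

Applying both functions to a reconstruction and to its ground-truth image, then taking the relative difference, gives deltas of the same kind as the saturation and color-count figures reported here.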
Structure
- Slight reduction in SSIM and PSNR indicates mild structural deviation from GT
- Increased sharpness (+4.25%) suggests enhanced edge emphasis and micro-detail amplification, although edge density (-1.41%) and gradient strength (-0.68%) dip slightly
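Sharpness, gradient strength, and edge density can be defined in several ways, and the card does not say which definitions were used. The sketch below shows one common set, under stated assumptions: variance of a 4-neighbor Laplacian for sharpness, mean gradient magnitude for gradient strength, and the fraction of pixels above a hypothetical threshold for edge density.

```python
import numpy as np

def to_gray(rgb):
    """Luma-weighted grayscale (Rec. 601 weights) as float64."""
    return np.asarray(rgb, dtype=np.float64) @ np.array([0.299, 0.587, 0.114])

def laplacian_variance(gray):
    """Sharpness proxy: variance of the 4-neighbor Laplacian on interior pixels."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def gradient_stats(gray, edge_threshold=20.0):
    """Return (mean gradient magnitude, fraction of pixels above the threshold).

    `edge_threshold` is an assumed value chosen for illustration.
    """
    g = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(g)
    mag = np.hypot(gx, gy)
    return float(mag.mean()), float((mag > edge_threshold).mean())

# A vertical step edge: left half black, right half white.
step = np.zeros((16, 16), dtype=np.float64)
step[:, 8:] = 255.0
print(laplacian_variance(step), gradient_stats(step))
```

Because these three quantities respond to different things (second-derivative energy versus first-derivative magnitude versus a thresholded count), they can move in opposite directions on the same image pair, which is consistent with sharpness rising while edge density and gradient strength fall slightly.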
Perceptual behavior
- Outputs appear more vivid and detailed than GT
- Trade-off between fidelity to GT and perceptual enhancement
Quantitative Results vs GT
Wan 2.2 Base HDR VAE
- Brightness: +0.79%
- Contrast: +0.73%
- Saturation: +22.27%
- Entropy: +0.13%
- Dynamic range: -0.13%
- Edge density: -1.41%
- Gradient strength: -0.68%
- LAB color volume: +34.09%
- Quantized colors: +30.32%
- Unique colors: +23.68%
- Sharpness: +4.25%
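For the entropy and dynamic-range rows, a minimal sketch of one plausible definition is given below. Both definitions are assumptions: Shannon entropy of the 256-bin intensity histogram, and dynamic range as the spread between two percentiles (the 1st and 99th by default, a hypothetical choice that ignores outlier pixels).

```python
import numpy as np

def shannon_entropy(gray_u8):
    """Shannon entropy in bits of the 8-bit intensity histogram."""
    values = np.asarray(gray_u8, dtype=np.uint8).ravel()
    hist = np.bincount(values, minlength=256)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def dynamic_range(gray, low=1.0, high=99.0):
    """Spread between the `low` and `high` intensity percentiles."""
    lo, hi = np.percentile(np.asarray(gray, dtype=np.float64), [low, high])
    return float(hi - lo)

# A ramp that uses every 8-bit level exactly once.
ramp = np.arange(256, dtype=np.uint8).reshape(16, 16)
print(shannon_entropy(ramp), dynamic_range(ramp, 0.0, 100.0))
```

Under these definitions, near-zero deltas such as +0.13% entropy and -0.13% dynamic range would mean the reconstruction spreads intensities over almost the same range and histogram shape as GT, with the large changes confined to chroma.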
Reconstruction Quality
Wan 2.2 Base HDR VAE vs GT
- SSIM: -1.42%
- PSNR: -2.31%
- MSE: +13.09%
→ Slight reduction in structural fidelity with a moderate increase in reconstruction error
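Because PSNR is a logarithmic function of MSE, a relative MSE increase maps to a fixed PSNR shift in decibels regardless of the baseline. The sketch below works that out for the +13.09% MSE figure. It assumes both figures are derived from the same aggregate MSE, which the card does not confirm; averaging per-image PSNR values would break the exact relation. The baseline line is a hypothetical consistency check, not a reported result.

```python
import math

def psnr_shift_db(mse_ratio):
    """PSNR change in dB when MSE is multiplied by `mse_ratio`.

    Follows from PSNR = 10 * log10(MAX^2 / MSE): the peak value cancels out.
    """
    return -10.0 * math.log10(mse_ratio)

# MSE up by 13.09% means the ratio new/old is 1.1309.
shift = psnr_shift_db(1.1309)
print(f"PSNR shift: {shift:.3f} dB")  # about -0.53 dB

# Hypothetical check: if that shift equals the reported -2.31% PSNR change,
# the implied baseline PSNR would be shift / -0.0231.
implied_baseline = shift / -0.0231
print(f"Implied baseline PSNR: {implied_baseline:.1f} dB")
```

A drop of roughly half a decibel is small in absolute terms, which fits the card's description of the fidelity loss as mild even though the MSE percentage looks larger.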
Interpretation
Color axis → Strong winner: Wan 2.2 Base HDR VAE
- Large gains in LAB volume, saturation, and color diversity
Structure axis → Slight trade-off
- Small degradation in SSIM/PSNR balanced by increased sharpness
Perceptual axis → Winner: Wan 2.2 Base HDR VAE
- More visually rich and detailed outputs compared to GT
Final Conclusion
Wan 2.2 Base HDR VAE acts as a perceptual and chromatic enhancement VAE, expanding color richness and visible detail density beyond ground truth.
It trades a small amount of structural fidelity for significantly improved perceptual richness, which suits it to downstream generative pipelines that prioritize visual expressiveness over strict reconstruction accuracy.