
VAE Reconstruction & Color Expansion Evaluation

Wan2.1 Base VAE vs HDR VAE


📊 At-a-Glance Summary

Color expansion (HDR VAE vs GT):

  • LAB volume: ~+50%
  • Saturation: ~+27%
  • Unique colors: ~+30%
  • Gradient strength: ~+7%

Structural cost (HDR VAE vs Wan2.1 Base VAE):

  • SSIM: ~-2%
  • MSE: ~+54%

Key takeaway

HDR VAE significantly increases chromatic richness and color diversity,
but reduces structural fidelity and increases reconstruction error relative to Wan2.1 Base VAE.


🧪 Evaluation Setup

  • Task: Image reconstruction (VAE decoding comparison)
  • Models:
    • Wan2.1 Base VAE
    • HDR VAE
  • Reference: Ground Truth (GT) images
  • Mode: Deterministic image-to-image reconstruction
  • Metrics:
    • Color statistics (HSV / LAB / RGB)
    • Structural metrics (SSIM, MSE, PSNR)
    • Frequency features (edges, gradients)
    • Entropy + color diversity
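
As a rough illustration of the error metrics listed above, the sketch below computes MSE and PSNR with NumPy. It assumes 8-bit images on a 0-255 scale, which the card does not state explicitly; the function names `mse` and `psnr` are illustrative and not taken from the evaluation code.

```python
import numpy as np


def mse(reference: np.ndarray, reconstruction: np.ndarray) -> float:
    """Mean squared error over all pixels and channels."""
    # Cast first so uint8 inputs do not wrap around on subtraction.
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    return float(np.mean(diff ** 2))


def psnr(reference: np.ndarray, reconstruction: np.ndarray,
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (max_value=255 is an assumption)."""
    err = mse(reference, reconstruction)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))
```

Note that averaging PSNR per image generally gives a different number than computing PSNR from a dataset-wide mean MSE, so the two reported values need not match this formula exactly.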

📈 Key Findings

🎨 Color behavior

  • Wan2.1 Base VAE: slight compression of GT color space
  • HDR VAE: strong expansion of color space (+30% to +50%)

🧱 Structure

  • Wan2.1 Base VAE: closer to GT, smoother output
  • HDR VAE: sharper, more high-frequency artifacts
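
The card does not define how gradient strength or sharpness were measured. One common choice, shown here purely as an assumed sketch, is the mean gradient magnitude and the variance of a discrete Laplacian on a grayscale image. The helper names and the Rec. 601 luma weights are assumptions.

```python
import numpy as np


def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image to grayscale using Rec. 601 weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(np.float64) @ weights


def gradient_strength(gray: np.ndarray) -> float:
    """Mean magnitude of the finite-difference image gradient."""
    gy, gx = np.gradient(gray)
    return float(np.mean(np.hypot(gx, gy)))


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels."""
    lap = (gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:]
           - 4.0 * gray[1:-1, 1:-1])
    return float(np.var(lap))
```

Under these definitions a smoother reconstruction scores lower on both measures, and a reconstruction with added high-frequency content scores higher, whether that content is real detail or artifact.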

👁️ Perceptual behavior

  • Wan2.1 Base VAE: higher fidelity, lower distortion
  • HDR VAE: more vivid but less faithful reconstruction

📊 Quantitative Results vs GT

Wan2.1 Base VAE

  • Brightness: -0.71%
  • Contrast: -0.23%
  • Saturation: -0.11%
  • Entropy: -0.08%
  • Dynamic range: -0.46%
  • Edge density: -15.35%
  • Gradient strength: -5.90%
  • LAB color volume: -5.00%
  • Quantized colors: -6.60%
  • Unique colors: +7.90%
  • Sharpness: -45.65%

HDR VAE

  • Brightness: +0.90%
  • Contrast: +0.93%
  • Saturation: +26.90%
  • Entropy: +0.17%
  • Dynamic range: -0.07%
  • Edge density: +14.98%
  • Gradient strength: +7.10%
  • LAB color volume: +48.60%
  • Quantized colors: +45.60%
  • Unique colors: +29.50%
  • Sharpness: +32.15%
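
To make the percentages above concrete, here is a minimal sketch of how a few of the color statistics could be computed and expressed relative to GT. It assumes 8-bit RGB input; the 5-bit quantization depth and all function names are hypothetical, since the card does not specify them.

```python
import numpy as np


def mean_saturation(rgb: np.ndarray) -> float:
    """Mean HSV saturation, S = (max - min) / max, with S = 0 for black."""
    x = rgb.astype(np.float64) / 255.0
    cmax = x.max(axis=-1)
    cmin = x.min(axis=-1)
    safe_max = np.where(cmax > 0, cmax, 1.0)  # avoid division by zero
    sat = np.where(cmax > 0, (cmax - cmin) / safe_max, 0.0)
    return float(sat.mean())


def unique_colors(rgb: np.ndarray) -> int:
    """Number of distinct RGB triplets in the image."""
    pixels = rgb.reshape(-1, rgb.shape[-1])
    return int(np.unique(pixels, axis=0).shape[0])


def quantized_colors(rgb: np.ndarray, bits: int = 5) -> int:
    """Distinct colors after keeping the top `bits` bits per channel."""
    return unique_colors(rgb >> (8 - bits))


def percent_change(value: float, reference: float) -> float:
    """Signed change of `value` relative to `reference`, in percent."""
    return 100.0 * (value - reference) / reference
```

For example, `percent_change(mean_saturation(recon), mean_saturation(gt))` would yield a figure comparable in form to the Saturation rows above, for hypothetical arrays `recon` and `gt`.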

🧮 Reconstruction Quality

Wan2.1 Base VAE

  • SSIM: 0.822
  • PSNR: 25.86 dB
  • MSE: 180.88

➡️ Best structural fidelity and lowest reconstruction error


HDR VAE

  • SSIM: 0.806
  • PSNR: 24.06 dB
  • MSE: 278.08

➡️ Higher distortion but stronger chromatic expansion
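
The relative structural cost of HDR VAE can be checked directly from the two sets of figures reported in this section. The short calculation below uses only those reported values; the variable names are illustrative.

```python
# Values copied from the Reconstruction Quality figures in this card.
base = {"ssim": 0.822, "psnr_db": 25.86, "mse": 180.88}
hdr = {"ssim": 0.806, "psnr_db": 24.06, "mse": 278.08}

# Relative change of HDR VAE with respect to Wan2.1 Base VAE, in percent.
ssim_change = 100.0 * (hdr["ssim"] - base["ssim"]) / base["ssim"]
mse_change = 100.0 * (hdr["mse"] - base["mse"]) / base["mse"]

# PSNR is logarithmic, so the gap is reported as a difference in dB.
psnr_gap_db = hdr["psnr_db"] - base["psnr_db"]

print(f"SSIM: {ssim_change:+.2f}%")    # about -1.95%
print(f"MSE:  {mse_change:+.2f}%")     # about +53.74%
print(f"PSNR: {psnr_gap_db:+.2f} dB")  # about -1.80 dB
```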


🧠 Interpretation

Color axis → Winner: HDR VAE

  • Largest increase in LAB volume, saturation, and color diversity

Structure axis → Winner: Wan2.1 Base VAE

  • Higher SSIM (0.822 vs 0.806) and lower MSE (180.88 vs 278.08)

Fidelity axis → Winner: Wan2.1 Base VAE

  • More faithful reconstruction of GT distribution

🏁 Final Conclusion

HDR VAE acts as a chromatic expansion decoder, increasing color space occupancy and high-frequency detail.

Wan2.1 Base VAE remains closer to the ground-truth manifold, prioritizing structural and perceptual fidelity over color amplification.
