Pacific-Prime/diffusion-vae

@@ -1,79 +1,90 @@
----
-license: cc-by-nc-4.0
-tags:
-  - vae
-  - image-generation
-  - diffusion
-  - inl-diffusion
-library_name: pytorch
-pipeline_tag: image-to-image
----
-# INL-Diffusion VAE
-Variational Autoencoder for INL-Diffusion image generation pipeline.
-## Architecture
-**89M parameters** | 256x256 images | 4-channel latent space
-### Encoder
-$$z = \mathcal{E}(x) \in \mathbb{R}^{32 \times 32 \times 4}$$
-Compresses 256x256x3 images to 32x32x4 latents (8x spatial compression).
-### Decoder
-$$\hat{x} = \mathcal{D}(z) \in \mathbb{R}^{256 \times 256 \times 3}$$
-### Loss Function
-$$\mathcal{L} = \mathcal{L}_{\text{recon}} + \beta \cdot D_{KL}(q(z|x) \| p(z)) + \lambda \cdot \mathcal{L}_{\text{perceptual}}$$
-Where:
-- $\mathcal{L}_{\text{recon}} = \|x - \hat{x}\|_1$ (L1 reconstruction)
-- $D_{KL}$ regularizes latent to $\mathcal{N}(0, I)$
-- $\mathcal{L}_{\text{perceptual}}$ uses VGG features
-## Config
-| Parameter | Value |
-|-----------|-------|
-| Image size | 256x256 |
-| Latent dim | 4 |
-| Base channels | 128 |
-| Channel mult | [1, 2, 4, 4] |
-| Res blocks | 2 |
-## Usage
-```python
-from safetensors.torch import load_file
-from inl_diffusion.vae import INLVAE
-# Load
-state_dict = load_file("model.safetensors")
-vae = INLVAE(image_size=256, base_channels=128, latent_dim=4)
-vae.load_state_dict(state_dict)
-# Encode
-latents = vae.encode(images)  # [B, 4, 32, 32]
-# Decode
-reconstructed = vae.decode(latents)  # [B, 3, 256, 256]
-```
-## Training
-Trained on WikiArt (81K images) for 15K steps with:
-- Batch size: 16
-- Learning rate: 1e-4
-- Mixed precision: bf16
-### Training Curves
-![Training Curves](training_curves.png)
-## License
-CC BY-NC 4.0 - Attribution-NonCommercial
-Commercial use requires explicit permission from the author.

+---
+license: cc-by-nc-4.0
+tags:
+  - vae
+  - image-generation
+  - diffusion
+  - complexity-diffusion
+library_name: pytorch
+pipeline_tag: image-to-image
+---
+# Complexity-Diffusion VAE
+Variational Autoencoder for the Complexity-Diffusion image generation pipeline.
+## Architecture
+**89M parameters** | 256x256 images | 4-channel latent space
+### Encoder
+$$z = \mathcal{E}(x) \in \mathbb{R}^{32 \times 32 \times 4}$$
+Compresses 256x256x3 images to 32x32x4 latents (8x downsampling along each spatial dimension, 48x fewer values overall).
+### Decoder
+$$\hat{x} = \mathcal{D}(z) \in \mathbb{R}^{256 \times 256 \times 3}$$
+### Loss Function
+$$\mathcal{L} = \mathcal{L}_{\text{recon}} + \beta \cdot D_{KL}(q(z|x) \| p(z)) + \lambda \cdot \mathcal{L}_{\text{perceptual}}$$
+Where:
+- $\mathcal{L}_{\text{recon}} = \|x - \hat{x}\|_1$ (L1 reconstruction)
+- $D_{KL}$ regularizes latent to $\mathcal{N}(0, I)$
+- $\mathcal{L}_{\text{perceptual}}$ uses VGG features
+## Config
+| Parameter | Value |
+|-----------|-------|
+| Image size | 256x256 |
+| Latent dim | 4 |
+| Base channels | 128 |
+| Channel mult | [1, 2, 4, 4] |
+| Res blocks | 2 |
+## Usage
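+```python
+import torch
+from safetensors.torch import load_file
+from complexity_diffusion.vae import ComplexityVAE
+# Load the weights and switch to inference mode
+state_dict = load_file("model.safetensors")
+vae = ComplexityVAE(image_size=256, base_channels=128, latent_dim=4)
+vae.load_state_dict(state_dict)
+vae.eval()
+# Placeholder batch; replace with real preprocessed images of shape [B, 3, 256, 256]
+images = torch.rand(1, 3, 256, 256)
+with torch.no_grad():
+    # Encode
+    latents = vae.encode(images)  # [B, 4, 32, 32]
+    # Decode
+    reconstructed = vae.decode(latents)  # [B, 3, 256, 256]
+```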
+## Training
+Trained on WikiArt (81K images) for 15K steps with:
+- Batch size: 16
+- Learning rate: 1e-4
+- Mixed precision: bf16
+### Training Curves
+![Training Curves](training_curves.png)
+## Part of the Complexity Deep Ecosystem
+This VAE is designed to work with the Complexity-Diffusion pipeline, leveraging:
+- **INL Dynamics** for stable latent space training
+- **Token-Routed architecture** for efficient processing
+## Links
+- [Complexity Deep](https://huggingface.co/Pacific-Prime)
+- [PyPI Package](https://pypi.org/project/complexity-deep/)
+## License
+CC BY-NC 4.0 - Attribution-NonCommercial
+Commercial use requires explicit permission from the author.
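
The shape figures quoted in the Architecture and Config sections of the new README can be cross-checked with a few lines of plain Python. This is only a sketch, not code from the `complexity_diffusion` package: the helper names `latent_shape` and `compression_ratio` are hypothetical, and it assumes that every entry of `Channel mult` except the last halves the height and width, which is a common VAE layout and matches the stated 8x spatial compression.

```python
def latent_shape(image_size, channel_mult, latent_dim):
    """Return (height, width, channels) of the latent for a square input image."""
    # Assumption: each resolution level except the last halves height and width.
    num_downsamples = len(channel_mult) - 1
    factor = 2 ** num_downsamples
    if image_size % factor != 0:
        raise ValueError("image_size must be divisible by the downsampling factor")
    side = image_size // factor
    return side, side, latent_dim


def compression_ratio(image_size, image_channels, latent):
    """Return the ratio of input values to latent values."""
    height, width, channels = latent
    return (image_size * image_size * image_channels) / (height * width * channels)


# Values taken from the Config table in the README.
latent = latent_shape(image_size=256, channel_mult=[1, 2, 4, 4], latent_dim=4)
ratio = compression_ratio(256, 3, latent)
print(latent, ratio)  # (32, 32, 4) 48.0
```

Under that assumption the config reproduces the 32x32x4 latent stated in the Encoder section, and the total number of values per image drops by a factor of 48.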