🎨 Custom 8-Channel VAE (f8)

This is a custom-trained Variational Autoencoder (VAE) with an 8-channel latent space and an f8 downsampling factor, meaning each spatial dimension of the input is reduced by a factor of 8 in the latent. It was trained from scratch on a combination of the ImageNet and CelebA datasets, targeting detailed image reconstruction and robust latent representations.

While originally developed as the latent backbone for the NovaFace-DiT model, this VAE is entirely independent and can be used as a drop-in component for any custom Latent Diffusion Model (LDM) or Flow Matching architecture.

Top row: Original Images (Unseen data). Bottom row: 8-Channel VAE Reconstructions.

📊 Model Details

Model Type: Variational Autoencoder (VAE)
Latent Channels: 8
Downsample Factor: 8 (f8)
Parameters: ~100 Million
Training Datasets: ImageNet (1.3M) + CelebA
Max Supported Resolution: 1024x1024
License: Creative Commons BY-NC 4.0 (Non-commercial)
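
The latent channel count and downsample factor listed above determine the latent tensor shape for any input size. The following is a minimal sketch of that arithmetic. The function names `latent_shape` and `compression_ratio` are illustrative and not part of the official codebase, and the sketch assumes the input height and width are divisible by 8.

```python
def latent_shape(height, width, z_channels=8, downsample_factor=8):
    """Return (channels, height, width) of the latent for one input image."""
    if height % downsample_factor or width % downsample_factor:
        raise ValueError("height and width must be divisible by the downsample factor")
    return (z_channels, height // downsample_factor, width // downsample_factor)


def compression_ratio(height, width, in_channels=3, z_channels=8, downsample_factor=8):
    """Ratio of the number of input values to the number of latent values."""
    c, h, w = latent_shape(height, width, z_channels, downsample_factor)
    return (in_channels * height * width) / (c * h * w)


print(latent_shape(256, 256))        # (8, 32, 32)
print(latent_shape(1024, 1024))      # (8, 128, 128)
print(compression_ratio(256, 256))   # 24.0
```

With 3 input channels, 8 latent channels, and f8 downsampling, the ratio is 24 regardless of resolution, since 3 * 64 / 8 = 24.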

🏗️ Architecture Configuration

If you are initializing this model in PyTorch using the official codebase, the architecture parameters are as follows:

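model_architecture_config = {
    'in_channels': 3,                               # RGB input
    'out_channels': 3,                              # RGB reconstruction
    'base_channels': 128,                           # channel width before multipliers
    'channel_multipliers': [1, 2, 4, 4],            # one entry per resolution level
    'num_residual_blocks_per_level': [2, 2, 2, 4],  # one entry per resolution level
    'z_channels': 8                                 # latent channels
}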
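
As a rough guide to reading this configuration, the sketch below derives the per-level channel widths and the implied downsample factor. It assumes the common LDM-style convention that each level's width is `base_channels` times its multiplier and that resolution is halved between consecutive levels. The official codebase may differ in detail, and the helper `summarize_levels` is hypothetical.

```python
model_architecture_config = {
    'in_channels': 3,
    'out_channels': 3,
    'base_channels': 128,
    'channel_multipliers': [1, 2, 4, 4],
    'num_residual_blocks_per_level': [2, 2, 2, 4],
    'z_channels': 8
}


def summarize_levels(cfg):
    """Derive per-level widths and the implied spatial downsample factor.

    Assumption: one 2x downsample between consecutive levels, so a config
    with N levels downsamples by 2 ** (N - 1).
    """
    widths = [cfg['base_channels'] * m for m in cfg['channel_multipliers']]
    blocks = cfg['num_residual_blocks_per_level']
    if len(widths) != len(blocks):
        raise ValueError("multipliers and block counts must have one entry per level")
    downsample_factor = 2 ** (len(widths) - 1)
    return widths, blocks, downsample_factor


widths, blocks, factor = summarize_levels(model_architecture_config)
print(widths)   # [128, 256, 512, 512]
print(factor)   # 8
```

Under these assumptions, four levels imply three halvings, which matches the f8 factor stated in the model details.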

🚀 How to Use

The weights provided here (Nova_ae_f8.safetensors) are intended to be loaded into the custom VAE architecture defined in our GitHub repository.

🔗 Official GitHub Repository (Code & UI): devbnamdar/MM-DiT-From-Scratch

Using with NovaFace-DiT:

1. Download the .safetensors file from this repository.
2. Place it in the vae_models/ directory of your cloned GitHub project.
3. Update the vae_path in config.py (or select it in the Gradio UI).
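
Before pointing `vae_path` at the file, it can help to confirm that the download is intact. The sketch below reads only the header of a `.safetensors` file (an 8-byte little-endian length followed by a JSON header) using the Python standard library, then counts tensors and parameters. It is not taken from the official repository: the helper names are hypothetical, and the path simply combines the `vae_models/` directory and the file name mentioned above.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict of tensor entries."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # Optional free-form metadata is not a tensor entry.
    header.pop("__metadata__", None)
    return header


def summarize_checkpoint(path):
    """Return (number of tensors, total parameter count) from the header alone."""
    header = read_safetensors_header(path)
    n_params = 0
    for info in header.values():
        count = 1
        for dim in info["shape"]:
            count *= dim
        n_params += count
    return len(header), n_params


if __name__ == "__main__":
    # Assumed location, following the steps above.
    ckpt = Path("vae_models") / "Nova_ae_f8.safetensors"
    if ckpt.exists():
        n_tensors, n_params = summarize_checkpoint(ckpt)
        print(f"{n_tensors} tensors, {n_params / 1e6:.1f}M parameters")
    else:
        print(f"{ckpt} not found; download the weights first")
```

The reported parameter count should be in the neighborhood of the ~100 million stated in the model details. Loading the weights into the actual model still requires the VAE class from the GitHub repository.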

📄 Citation

If you use this model in your research, please cite:

@misc{namdar2026mmdit,
  author       = {Namdar, Bunyamin},
  title        = {MM-DiT From Scratch: High-Fidelity Diffusion Training on Limited Dataset},
  year         = {2026},
  publisher    = {GitHub},
  url          = {https://github.com/devbnamdar/MM-DiT-From-Scratch}
}
