
REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers

Xingjian Leng¹* · Jaskirat Singh¹* · Yunzhong Hou¹ · Zhenchang Xing² · Saining Xie³ · Liang Zheng¹

¹ Australian National University   ² Data61-CSIRO   ³ New York University
*Project Leads 

🌐 Project Page   🤗 Models   📃 Paper



We address a fundamental question: Can latent diffusion models and their VAE tokenizer be trained end-to-end? While training both components jointly with the standard diffusion loss is observed to be ineffective (it often degrades final performance), we show that this limitation can be overcome with a simple representation-alignment (REPA) loss. Our proposed method, REPA-E, enables stable and effective joint training of both the VAE and the diffusion model.
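
To make the idea concrete, below is a minimal sketch of a REPA-style alignment term combined with the diffusion loss for joint training. It assumes a small trainable projection head `projector` and token features `encoder_features` from a frozen pretrained vision encoder (e.g., DINOv2); the function names and the weight `lam` are illustrative, not the exact REPA-E implementation (see the GitHub repo for that).

```python
import torch.nn.functional as F

def repa_alignment_loss(diffusion_features, encoder_features, projector):
    """REPA-style alignment: match intermediate diffusion-transformer features
    to features from a frozen, pretrained vision encoder via a projection head."""
    projected = projector(diffusion_features)                    # (B, N, D)
    sim = F.cosine_similarity(projected, encoder_features, dim=-1)
    return -sim.mean()                                           # maximize similarity

def joint_loss(diffusion_loss, alignment_loss, lam=1.0):
    """Hypothetical joint objective for end-to-end tuning: the standard diffusion
    loss plus a weighted alignment term (lam is an illustrative weight)."""
    return diffusion_loss + lam * alignment_loss
```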


REPA-E significantly accelerates training, achieving over 17× speedup compared to REPA and 45× over the vanilla training recipe. Interestingly, end-to-end tuning also improves the VAE itself: the resulting E2E-VAE provides better latent structure and serves as a drop-in replacement for existing VAEs (e.g., SD-VAE), improving convergence and generation quality across diverse LDM architectures. Our method achieves state-of-the-art FID scores on ImageNet 256×256: 1.26 with CFG and 1.83 without CFG.
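
Because E2E-VAE is a drop-in replacement for VAEs such as SD-VAE, trying it out would typically look like the sketch below. This assumes the released weights are exported in the diffusers `AutoencoderKL` format; the checkpoint id `REPA-E/e2e-vae` is a placeholder, so check the Models link above for the actual repository name.

```python
import torch
from diffusers import AutoencoderKL

# Placeholder checkpoint id; see the Models link above for the released weights.
vae = AutoencoderKL.from_pretrained("REPA-E/e2e-vae")
vae.eval()

images = torch.rand(1, 3, 256, 256) * 2 - 1  # dummy batch scaled to [-1, 1]
with torch.no_grad():
    latents = vae.encode(images).latent_dist.sample()  # images -> latents
    recon = vae.decode(latents).sample                 # latents -> images
```

In an LDM pipeline, the same object would be swapped in wherever SD-VAE is used to encode training images and decode generated latents.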

Usage and Training

Please refer to our GitHub repository for detailed notes on end-to-end training and inference with REPA-E.

📚 Citation

@article{leng2025repae,
  title={REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers},
  author={Xingjian Leng and Jaskirat Singh and Yunzhong Hou and Zhenchang Xing and Saining Xie and Liang Zheng},
  year={2025},
  journal={arXiv preprint arXiv:2504.10483},
}