arXiv:2412.08781

Generative Modeling with Explicit Memory

Published on Dec 11, 2024

Abstract

Recent studies indicate that the denoising process in deep generative diffusion models implicitly learns and memorizes semantic information from the data distribution. These findings suggest that capturing more complex data distributions requires larger neural networks, leading to a substantial increase in computational demands, which in turn becomes the primary bottleneck in both training and inference of diffusion models. To address this, we introduce Generative Modeling with Explicit Memory (GMem), which leverages an external memory bank in both the training and sampling phases of diffusion models. This approach preserves semantic information from data distributions, reducing the reliance on neural network capacity for learning and generalizing across diverse datasets. The results are significant: GMem improves training efficiency, sampling efficiency, and generation quality. For instance, on ImageNet at 256×256 resolution, GMem accelerates SiT training by over 46.7×, matching the performance of a SiT model trained for 7M steps in fewer than 150K steps. Compared to the most efficient existing method, REPA, GMem still offers a 16× speedup, attaining an FID of 5.75 within 250K steps, whereas REPA requires over 4M steps. Additionally, our method achieves state-of-the-art generation quality, with an FID of 3.56 without classifier-free guidance on ImageNet 256×256. Our code is available at https://github.com/LINs-lab/GMem.
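To make the core idea concrete, here is a minimal sketch of what "an external memory bank in both training and sampling" could look like in PyTorch. This is an illustration under assumptions, not the authors' implementation: it assumes the bank holds precomputed semantic embeddings of training images, that the denoiser accepts a memory vector as conditioning, and that SiT-style linear-interpolant flow matching is used. The names MemoryBank, denoiser, and encoder are hypothetical; consult the linked repository for the actual method.

import torch
import torch.nn.functional as F

class MemoryBank:
    """Fixed bank of semantic feature vectors built once from training data
    (hypothetical structure; the paper's bank may differ)."""
    def __init__(self, features: torch.Tensor):
        # features: (num_entries, dim), e.g. encoder embeddings of images
        self.bank = F.normalize(features, dim=-1)

    def retrieve(self, queries: torch.Tensor) -> torch.Tensor:
        # Return the nearest bank entry per query (cosine similarity),
        # so the network can read semantics instead of memorizing them.
        sims = F.normalize(queries, dim=-1) @ self.bank.T  # (batch, entries)
        return self.bank[sims.argmax(dim=-1)]

    def sample(self, n: int) -> torch.Tensor:
        # Draw random entries to condition generation at sampling time.
        idx = torch.randint(self.bank.shape[0], (n,))
        return self.bank[idx]

def training_step(denoiser, bank, encoder, x0):
    # One flow-matching step with external memory as conditioning
    # (a sketch, assuming a linear interpolant x_t = (1-t) x0 + t eps).
    t = torch.rand(x0.shape[0], device=x0.device)
    eps = torch.randn_like(x0)
    xt = (1 - t).view(-1, 1, 1, 1) * x0 + t.view(-1, 1, 1, 1) * eps
    mem = bank.retrieve(encoder(x0))       # look up semantics externally
    pred = denoiser(xt, t, mem)            # network conditions on memory
    return F.mse_loss(pred, eps - x0)      # velocity target for this interpolant

At sampling time the same bank would supply the conditioning, e.g. mem = bank.sample(n) before running the ODE/SDE solver, which is how the memory participates in both phases described in the abstract.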
