arxiv:2402.13144

Neural Network Diffusion

Published on Feb 20

Submitted by akhaliq on Feb 21

#1 Paper of the day

Authors: Kai Wang, Yukun Zhou, Trevor Darrell, Zhuang Liu

Abstract

Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a standard latent diffusion model. The autoencoder extracts latent representations of a subset of the trained network parameters. A diffusion model is then trained to synthesize these latent parameter representations from random noise. It then generates new representations that are passed through the autoencoder's decoder, whose outputs are ready to use as new subsets of network parameters. Across various architectures and datasets, our diffusion process consistently generates models whose performance is comparable to or better than that of trained networks, at minimal additional cost. Notably, we empirically find that the generated models perform differently from the trained networks. Our results encourage more exploration of the versatile use of diffusion models.
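The pipeline described in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the helper names (`flatten_subset`, `unflatten_subset`, `linear_beta_schedule`, `add_noise`, `predict_z0`), the linear noise schedule, and the toy parameter dictionary are all hypothetical. The learned autoencoder and the learned denoising network are omitted; the flattened parameter vector stands in for the latent, and the true noise stands in for the denoiser's prediction.

```python
import math
import random


def flatten_subset(params, names):
    """Concatenate the chosen parameter lists into one flat vector.

    Returns the vector plus a layout so the subset can be restored later.
    """
    vec, layout = [], []
    for name in names:
        values = params[name]
        layout.append((name, len(values)))
        vec.extend(values)
    return vec, layout


def unflatten_subset(vec, layout):
    """Inverse of flatten_subset: split a flat vector back into named lists."""
    out, start = {}, 0
    for name, size in layout:
        out[name] = list(vec[start:start + size])
        start += size
    return out


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] is the running product of (1 - beta) up to step t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(z0, t, alpha_bars, rng):
    """Forward diffusion: z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


def predict_z0(zt, eps_hat, t, alpha_bars):
    """Recover the clean latent from z_t given a noise estimate eps_hat."""
    a = alpha_bars[t]
    return [(z - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for z, e in zip(zt, eps_hat)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "trained network": only the bn.* entries form the diffused subset.
    params = {"bn.weight": [1.0, 0.5], "bn.bias": [0.1, -0.2], "fc.weight": [9.0]}
    z0, layout = flatten_subset(params, ["bn.weight", "bn.bias"])
    alpha_bars = cumulative_alphas(linear_beta_schedule(100))
    zt, eps = add_noise(z0, 50, alpha_bars, rng)
    # A trained denoiser would predict eps from (zt, t); here we pass the true eps.
    restored = unflatten_subset(predict_z0(zt, eps, 50, alpha_bars), layout)
    print(restored)
```

In a real system, a trained encoder would map the flattened subset to a lower-dimensional latent before noising, a trained network would predict the noise at each reverse step starting from pure Gaussian noise, and a trained decoder would map the sampled latent back to parameter space before `unflatten_subset` writes it into the model.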


Community

Feb 21

Wow... 🤯

Feb 21

Predicting LoRAs on the fly, routing in MoE, etc. could all use this.

JavedA

Feb 21

I read some of the paper and wrote a very brief summary of it: https://twitter.com/JavArButt/status/1760273030540869868

I also pose a question that might be interesting for further research.


VictorKai1996NUS

Paper author Feb 21

Maybe in the next version we will explore cross-architecture parameter generation. Thanks for your question!

Feb 21

interesting

dwidlee

Feb 22


I'm thinking about using this to get a good initial state of the parameters ahead of training, which would reduce pre-training cost.

lettersandpatterns

Feb 22

Shouldn't they compare with https://openreview.net/forum?id=JXkz3zm8gJ more thoroughly?


VictorKai1996NUS

Paper author Feb 22

I hope this reply (https://x.com/liuzhuang1234/status/1760363128309600607?s=20) addresses your question.


