CodeFusion: A Pre-trained Diffusion Model for Code Generation

Published on Oct 26, 2023
· Featured in Daily Papers on Oct 30, 2023


Imagine a developer who can only change their last line of code: how often would they have to start writing a function from scratch before getting it right? Auto-regressive models for code generation from natural language have a similar limitation: they do not easily allow reconsidering earlier generated tokens. We introduce CodeFusion, a pre-trained diffusion code generation model that addresses this limitation by iteratively denoising a complete program conditioned on the encoded natural language. We evaluate CodeFusion on the task of natural language to code generation for Bash, Python, and Microsoft Excel conditional formatting (CF) rules. Experiments show that CodeFusion (75M parameters) performs on par with state-of-the-art auto-regressive systems (350M-175B parameters) in top-1 accuracy and outperforms them in top-3 and top-5 accuracy due to its better balance of diversity versus quality.


Amazing. It would also be intriguing to see the code being inferred non-linearly; that would make for some interesting UI effects. Imagine the Matrix green-rain effect morphing into real source code.

Also, according to this paper, GPT-3.5 has 20B parameters.

Could someone share the Python code for deleting unwanted cookies?

Proposes CodeFusion: a diffusion-based code generation model (combined with an encoder-decoder model), conditioned on natural language. Diffusion for text: an embedding layer converts discrete tokens to continuous embeddings, which are iteratively denoised and then mapped back to the closest discrete embedding. Architecture has an encoder, diffusion module, decoder, and classification head (for code tokens). Two-stage training: unsupervised pre-training of the denoiser and decoder, then supervised fine-tuning of the encoder, denoiser, and decoder on (utterance, code) pairs. Loss adapted from GENIE. Benchmarked on Python (CoNaLa), Bash, and conditional formatting rules in MS Excel. Encoder initialized from CodeT5. Better performance than StarCoder, CodeT5+, and GPT-3.5 (CodeBERT score for Python, template match for Bash, and execution match for CF rules); also generates more diverse outputs. Appendix has implementation and training details, baseline details, a visualization of the diffusion process (code at each time step), and background (auto-regression and diffusion). From Microsoft.
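The embed-denoise-round loop summarized above can be sketched as a toy example. This is only an illustration of the general diffusion-for-text idea, not the paper's actual model: the denoiser here is a hypothetical stand-in for the learned network, and all names and dimensions are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: tiny vocabulary, embedding dim, and step count.
VOCAB_SIZE, EMBED_DIM, T = 8, 4, 10
embedding_table = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))

def embed(token_ids):
    """Discrete tokens -> continuous embeddings."""
    return embedding_table[token_ids]

def toy_denoiser(x_t, t, target):
    """Stand-in for the learned denoiser: nudges x_t toward the clean
    embedding. A real model would predict this step from (x_t, t, NL)."""
    return x_t + (target - x_t) / (t + 1)

def round_to_tokens(x):
    """Continuous embeddings -> nearest discrete tokens (L2 distance)."""
    dists = np.linalg.norm(x[:, None, :] - embedding_table[None, :, :], axis=-1)
    return dists.argmin(axis=-1)

# "Generate" a 3-token program: start from pure noise, denoise for T steps,
# then round back to discrete tokens.
clean = embed(np.array([2, 5, 1]))   # pretend this is the target program
x = rng.normal(size=clean.shape)     # x_T ~ N(0, I)
for t in reversed(range(T)):
    x = toy_denoiser(x, t, clean)
print(round_to_tokens(x))            # -> [2 5 1]
```

The key point the sketch shows is why diffusion avoids the "last line only" limitation from the abstract: every denoising step revises the entire token sequence at once, rather than committing to earlier tokens.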

Links: PapersWithCode

Has anyone made a model like this available?


I'm wondering if the 20B is a typo.
If it's true, it would be big news.

Maybe it's three open source 7B models stuck together and Microsoft is trolling them 😂

The paper just got withdrawn... Anyone have an archive?

EDIT: Found it on

You can also check the previous version:
