The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Published on Nov 16, 2023
· Featured in Daily Papers on Nov 17, 2023


Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, these models struggle with generating consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study. To conclude, we showcase several practical applications of our approach. Project page is available at
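The core of the iterative procedure described above is to find, among a batch of generated images, the subset that most coherently shares one identity. The paper's actual pipeline uses learned feature extractors and personalization fine-tuning; the sketch below is only a toy illustration of that "find the most cohesive cluster" step, using random placeholder vectors in place of real image embeddings and a minimal k-means written from scratch (`most_cohesive_cluster` is a hypothetical name, not from the paper).

```python
import numpy as np

def most_cohesive_cluster(embeddings, n_clusters=3, n_iters=10, seed=0):
    """Toy k-means over image-feature vectors, returning the members and
    center of the tightest cluster. `embeddings` stands in for learned
    identity features; this is an illustration, not the paper's method."""
    rng = np.random.default_rng(seed)
    # initialize centers from randomly chosen embeddings
    centers = embeddings[rng.choice(len(embeddings), n_clusters, replace=False)]
    for _ in range(n_iters):
        # assign each embedding to its nearest center
        dists = np.linalg.norm(embeddings[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its assigned members
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = embeddings[labels == k].mean(axis=0)
    # cohesion = mean member-to-center distance (lower is tighter)
    cohesion = [
        np.linalg.norm(embeddings[labels == k] - centers[k], axis=1).mean()
        if (labels == k).any() else np.inf
        for k in range(n_clusters)
    ]
    best = int(np.argmin(cohesion))
    return embeddings[labels == best], centers[best]
```

In the full method, the returned cohesive set would then be used to refine the character's identity representation, and the loop repeats until the identity converges.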



I've had people contact me through Freelancer to do exactly this. Whoever figures it out is going to be rich. For instance, a client wanted a dress on a model, the exact same dress, in different poses. A magazine generator is what she wanted. I told her she would be better served by a photographer. But what if you had a magic-wand tool to exclude certain pixels from being manipulated in subsequent generations, a way to keep elements dialed in? Never mind rich, this is the crux of inventing a whole new angle of generative imaging, a big milestone.
What if those pixels could be made not just static but dynamic, so that they stay the same for design purposes yet can still be manipulated internally for scene continuity?
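The simplest version of the "exclude certain pixels" idea above is inpainting-style compositing: a binary mask marks which pixels must survive unchanged, and everything else is taken from the new generation. This is a minimal sketch of that compositing step only (`composite_with_mask` is a hypothetical helper, not an API of any diffusion library; real inpainting pipelines apply the mask inside the denoising loop rather than as a final blend).

```python
import numpy as np

def composite_with_mask(original, generated, keep_mask):
    """Blend two images: pixels where keep_mask == 1 are kept from
    `original`; the rest come from `generated`. Both images are
    H x W x C float arrays; keep_mask is H x W with values in {0, 1}."""
    keep = keep_mask[..., None].astype(float)  # broadcast over channels
    return keep * original + (1.0 - keep) * generated
```

A "dynamic" version, as the comment suggests, would relax the hard mask into a soft or time-varying one, so protected regions can still be nudged for scene continuity instead of being frozen outright.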
Life Story is hilarious btw




I would assume someone intends to patent this work.
