---
title: WeavePrompt
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: 5.44.1
pinned: false
license: mit
app_file: app.py
app_port: 7860
---
# WeavePrompt

Iterative prompt refinement for image generation models. Given a target image, WeavePrompt automatically generates and refines text prompts so that a model's output resembles the target, using vision-language models and perceptual metrics.
## Introduction
WeavePrompt is a research and development project designed to evaluate and refine text-to-image generation prompts across multiple state-of-the-art image generation models. The primary goal is to optimize prompts such that the generated images align closely with a given reference image, improving both fidelity and semantic consistency.
**Procedure/Implementation:** The process involves generating images from identical prompts using various image generation models, comparing the results to a reference image through a recognition and similarity evaluation pipeline, and iteratively adjusting the prompt to minimize perceptual differences. This feedback loop continues for a set number of iterations, progressively enhancing prompt effectiveness.
To achieve this, WeavePrompt integrates advanced tools:

- Image recognition is powered by meta-llama/Llama-4-Scout-17B-16E-Instruct.
- Similarity evaluation uses the LPIPS (alex) metric for perceptual comparison.
- Image generation models under evaluation include:
  - FLUX family: FLUX.1 [pro], [dev], and [schnell]
  - Google models: Imagen 4, Imagen 4 Ultra, and Gemini 2.5 Flash Image
  - Other models: Stable Diffusion 3.5 Large and Qwen Image
By systematically combining prompt optimization with multi-model evaluation, WeavePrompt aims to advance the understanding of cross-model prompt effectiveness and improve controllability in image generation tasks.
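The feedback loop described above can be sketched in a few lines of Python. This is an illustrative outline only, not the project's actual code: `generate_image`, `describe_difference`, and `rewrite_prompt` are hypothetical stand-ins for the image-generation and vision-language calls WeavePrompt makes, while the LPIPS usage follows the public `lpips` package.

```python
import lpips
import torch
from PIL import Image
from torchvision import transforms

# Perceptual distance used for similarity evaluation (LPIPS with the AlexNet backbone).
loss_fn = lpips.LPIPS(net="alex")

def to_lpips_tensor(img: Image.Image) -> torch.Tensor:
    # LPIPS expects NCHW tensors scaled to [-1, 1].
    t = transforms.ToTensor()(img.convert("RGB").resize((256, 256)))
    return t.unsqueeze(0) * 2 - 1

def optimize_prompt(target: Image.Image, prompt: str, steps: int = 5):
    """Run the generate -> compare -> refine loop for a fixed number of steps."""
    target_t = to_lpips_tensor(target)
    history = []
    for _ in range(steps):
        candidate = generate_image(prompt)                 # hypothetical backend call (FLUX, Imagen, SD 3.5, ...)
        distance = loss_fn(to_lpips_tensor(candidate), target_t).item()
        history.append((prompt, candidate, distance))
        feedback = describe_difference(target, candidate)  # hypothetical vision-language call (e.g. Llama-4-Scout)
        prompt = rewrite_prompt(prompt, feedback)          # hypothetical: ask the VLM for an improved prompt
    return min(history, key=lambda h: h[2])                # best (prompt, image, distance) found
```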
## Features
- Upload a target image
- Step-by-step prompt optimization
- View prompt and generated image at each iteration
- Full optimization history
## Installation
- Clone the repository:

  ```bash
  git clone https://github.com/kevin1kevin1k/WeavePrompt.git
  cd WeavePrompt
  ```

- Install dependencies:

  ```bash
  uv venv
  uv sync
  source .venv/bin/activate
  ```

- Setup `.env`

  Put the following inside `.env`:

  - API keys: `WANDB_API_KEY` and `FAL_KEY`
  - Weave project name: `WEAVE_PROJECT`
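For reference, a minimal `.env` might look like this (the values below are placeholders, not real keys):

```
WANDB_API_KEY=<your-wandb-api-key>
FAL_KEY=<your-fal-key>
WEAVE_PROJECT=<your-weave-project-name>
```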
## Usage
Run the demo app:

```bash
streamlit run src/app.py
```
Follow the instructions in the browser to upload an image and step through the optimization process.
## Architecture Diagram
## Outcome

When given the same prompt as the standard model, the target model yields similarly high-quality output.
## References
- [The Unreasonable Effectiveness of Deep Features as a Perceptual Metric](https://arxiv.org/abs/1801.03924)
- [Image Reconstruction from Highly Undersampled Data](https://arxiv.org/abs/2510.06335)
- [Product-Quantised Image Representation for High-Quality Image Synthesis](https://arxiv.org/abs/2510.03191)

