---
title: WeavePrompt
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: 5.44.1
pinned: false
license: mit
app_file: app.py
app_port: 7860
---
# WeavePrompt
Iterative prompt refinement for image generation models. Given a target image, **WeavePrompt** automatically generates and refines text prompts so that a model's output resembles the target, using vision-language models and perceptual metrics.
## Introduction
**WeavePrompt** is a research and development project designed to evaluate and refine text-to-image generation prompts across multiple state-of-the-art image generation models.
The primary goal is to optimize prompts such that the generated images align closely with a given reference image, improving both fidelity and semantic consistency.
**Procedure/Implementation**:
The process involves generating images from identical prompts using various image generation models, comparing the results to a reference image through a recognition and similarity evaluation pipeline, and iteratively adjusting the prompt to minimize perceptual differences.
This feedback loop continues for a set number of iterations, progressively enhancing prompt effectiveness.
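The feedback loop described above can be sketched roughly as follows. This is a minimal illustration for a single image model, not the project's actual implementation: `refine_prompt`, `Step`, `best_step`, and the four callables (`describe`, `generate`, `distance`, `revise`) are hypothetical names standing in for the vision-language model, the image generator, and the similarity metric.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Step:
    """One iteration of the loop: the prompt used and its distance to the target."""
    iteration: int
    prompt: str
    distance: float


def refine_prompt(
    target: Any,
    describe: Callable[[Any], str],          # hypothetical: VLM writes an initial prompt
    generate: Callable[[str], Any],          # hypothetical: image model renders a prompt
    distance: Callable[[Any, Any], float],   # perceptual distance, lower is better
    revise: Callable[[str, Any, Any], str],  # hypothetical: VLM rewrites the prompt
    iterations: int = 5,
) -> List[Step]:
    """Run a fixed number of generate/compare/revise rounds and return the history."""
    prompt = describe(target)
    history: List[Step] = []
    for i in range(iterations):
        image = generate(prompt)
        history.append(Step(i, prompt, distance(target, image)))
        # Skip the revision after the final round; its result would never be used.
        if i < iterations - 1:
            prompt = revise(prompt, target, image)
    return history


def best_step(history: List[Step]) -> Step:
    """Pick the iteration whose image was closest to the target."""
    return min(history, key=lambda step: step.distance)
```

Passing the model calls in as plain callables keeps the loop independent of any particular provider, so the same function could be run once per image model when comparing them.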
To achieve this, **WeavePrompt** integrates advanced tools:
- **Image recognition** is powered by meta-llama/Llama-4-Scout-17B-16E-Instruct.
- **Similarity evaluation** uses the **LPIPS (alex)** metric for perceptual comparison.
- **Image generation models** under evaluation include:
  - FLUX family: FLUX.1 [pro], [dev], and [schnell]
  - Google models: Imagen 4, Imagen 4 Ultra, and Gemini 2.5 Flash Image
  - Other models: Stable Diffusion 3.5 Large and Qwen Image
By systematically combining prompt optimization with multi-model evaluation, **WeavePrompt** aims to advance the understanding of cross-model prompt effectiveness and improve controllability in image generation tasks.
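For the similarity step, a minimal sketch of computing LPIPS with the AlexNet backbone through the `lpips` Python package might look like the following. It assumes `lpips` and `torch` are installed (the package's image loader also relies on OpenCV for common formats) and that both images have the same dimensions. The function names `to_lpips_range` and `lpips_alex_distance` are illustrative and are not taken from the WeavePrompt source.

```python
def to_lpips_range(value_0_255: float) -> float:
    """Map an 8-bit channel value (0..255) to [-1, 1], the input range LPIPS expects.

    This is the same scaling that lpips.im2tensor applies, written out for clarity.
    """
    return value_0_255 / 127.5 - 1.0


def lpips_alex_distance(path_a: str, path_b: str) -> float:
    """Return the LPIPS (AlexNet) distance between two image files; lower is more similar."""
    # Third-party imports are kept local so this module imports without them installed.
    import lpips
    import torch

    model = lpips.LPIPS(net="alex")
    # load_image returns an HxWx3 RGB array; im2tensor converts it to 1x3xHxW in [-1, 1].
    img_a = lpips.im2tensor(lpips.load_image(path_a))
    img_b = lpips.im2tensor(lpips.load_image(path_b))
    with torch.no_grad():
        return model(img_a, img_b).item()
```

In a real loop the model would normally be constructed once and reused across iterations rather than rebuilt on every call.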
## Features
- Upload a target image
- Step-by-step prompt optimization
- View prompt and generated image at each iteration
- Full optimization history
## Installation
1. Clone the repository:
```bash
git clone https://github.com/kevin1kevin1k/WeavePrompt.git
cd WeavePrompt
```
2. Install dependencies:
```bash
uv venv
uv sync
source .venv/bin/activate
```
3. Set up `.env`:
Put the following inside `.env`:
- API keys `WANDB_API_KEY` and `FAL_KEY`
- Weave project name `WEAVE_PROJECT`
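Before launching the app, it can help to confirm that `.env` defines all three variables. The sketch below is an assumption-laden helper, not part of the repository: `parse_dotenv` and `missing_vars` are hypothetical names, and the parser only handles simple `KEY=VALUE` lines, not the full syntax that dotenv loaders accept.

```python
REQUIRED_VARS = ("WANDB_API_KEY", "FAL_KEY", "WEAVE_PROJECT")


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional single or double quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_vars(env: dict, required=REQUIRED_VARS) -> list:
    """Return the required names that are absent or empty in env."""
    return [name for name in required if not env.get(name)]
```

For example, `missing_vars(parse_dotenv(open(".env").read()))` returns an empty list when every variable is set.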
## Usage
Run the demo app:
```bash
streamlit run src/app.py
```
Follow the instructions in the browser to upload an image and step through the optimization process.
## Architecture Diagram
![diagram](./diagram.png)
## Outcome
![outcome](./outcome.png)
Given the same prompt as the standard model, the target model produces a similar, high-quality output.
## References
- https://arxiv.org/abs/1801.03924 - The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
- https://arxiv.org/abs/2510.06335 - Image Reconstruction from Highly Undersampled Data
- https://arxiv.org/abs/2510.03191 - Product-Quantised Image Representation for High-Quality Image Synthesis