metadata
title: WeavePrompt
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: 5.44.1
pinned: false
license: mit
app_file: app.py
app_port: 7860

WeavePrompt

Iterative prompt refinement for image generation models. Given a target image, WeavePrompt automatically generates and refines text prompts so that a model's output resembles the target, using vision-language models and perceptual metrics.

Introduction

WeavePrompt is a research and development project designed to evaluate and refine text-to-image generation prompts across multiple state-of-the-art image generation models. The primary goal is to optimize prompts such that the generated images align closely with a given reference image, improving both fidelity and semantic consistency.

Procedure: WeavePrompt generates images from identical prompts with several image generation models, compares each result to the reference image through a recognition and similarity evaluation pipeline, and iteratively adjusts the prompt to reduce the perceptual difference. This feedback loop runs for a set number of iterations, progressively improving the prompt.

To achieve this, WeavePrompt integrates the following components:

  • Image recognition is powered by meta-llama/Llama-4-Scout-17B-16E-Instruct.

  • Similarity evaluation uses the LPIPS (alex) metric for perceptual comparison.

  • Image generation models under evaluation include:

    • FLUX family: FLUX.1 [pro], [dev], and [schnell]
    • Google models: Imagen 4, Imagen 4 Ultra, and Gemini 2.5 Flash Image
    • Other models: Stable Diffusion 3.5 Large and Qwen Image

By systematically combining prompt optimization with multi-model evaluation, WeavePrompt aims to advance the understanding of cross-model prompt effectiveness and improve controllability in image generation tasks.
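
The feedback loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `generate`, `distance`, and `refine` are hypothetical callables standing in for the image generation model, the LPIPS comparison, and the vision-language model that rewrites the prompt, and `optimize_prompt`, `best_step`, and `Step` are names invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Step:
    """One iteration of the loop: the prompt tried, its image, and its score."""
    prompt: str
    image: Any
    score: float  # perceptual distance to the target; lower is better


def optimize_prompt(
    target: Any,
    initial_prompt: str,
    generate: Callable[[str], Any],          # hypothetical: prompt -> image
    distance: Callable[[Any, Any], float],   # hypothetical: (image, target) -> distance
    refine: Callable[[str, Any, Any], str],  # hypothetical: (prompt, image, target) -> new prompt
    num_iterations: int = 5,
) -> List[Step]:
    """Run a fixed number of generate/compare/refine iterations and keep the full history."""
    history: List[Step] = []
    prompt = initial_prompt
    for _ in range(num_iterations):
        image = generate(prompt)
        score = distance(image, target)
        history.append(Step(prompt=prompt, image=image, score=score))
        # Ask the (assumed) refinement step for a prompt that should reduce the distance.
        prompt = refine(prompt, image, target)
    return history


def best_step(history: List[Step]) -> Step:
    """Return the iteration whose image was closest to the target."""
    return min(history, key=lambda step: step.score)
```

Keeping every `Step` rather than only the best one mirrors the "full optimization history" feature: a refinement is not guaranteed to improve the score, so the closest iteration is picked afterwards.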

Features

  • Upload a target image
  • Step-by-step prompt optimization
  • View prompt and generated image at each iteration
  • Full optimization history

Installation

  1. Clone the repository:
    git clone https://github.com/kevin1kevin1k/WeavePrompt.git
    cd WeavePrompt
    
  2. Install dependencies:
    uv venv
    uv sync
    source .venv/bin/activate
    
  3. Set up .env. Put the following inside .env:
    • API keys WANDB_API_KEY and FAL_KEY
    • Weave project name WEAVE_PROJECT
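
A quick way to confirm the three settings are in place before launching the app is a small check like the one below. It is only a sketch: `missing_settings` is a hypothetical helper, not part of WeavePrompt, and it assumes the values from .env have already been loaded into the process environment (for example by your shell or a dotenv loader).

```python
import os
from typing import List, Mapping

# Variable names taken from the setup step above.
REQUIRED_VARS = ("WANDB_API_KEY", "FAL_KEY", "WEAVE_PROJECT")


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```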

Usage

Run the demo app:

streamlit run src/app.py

Follow the instructions in the browser to upload an image and step through the optimization process.

Architecture Diagram

diagram

Outcome

outcome

Given the same prompt as the standard model, the target model yields a similar, high-quality output.

References