metadata
title: Text to Image Generator
emoji: 🎨
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: mit
ImageGenerator
GEMINI 2.5 PRO PROMPT ENHANCEMENT
Original prompt: a dog in front of a desk
Enhanced prompt: A photorealistic, eye-level shot of a Golden Retriever sitting by a wooden desk. Soft, warm morning light streams from a window, creating gentle highlights and deep shadows. The background is softly blurred, focusing on the dog's detailed fur. A cozy, cinematic atmosphere with an earthy color palette.
Word count: 47
Processing sample 1/1: A photorealistic, eye-level shot of a Golden Retriever sitting by a wooden desk. Soft, warm morning light streams from a window, creating gentle highlights and deep shadows. The background is softly blurred, focusing on the dog's detailed fur. A cozy, cinematic atmosphere with an earthy color palette.
100%|██████████| 30/30 [00:05<00:00, 5.08it/s]
Saved as: generated_image_0.png
============================================================
DETAILED SAMPLE SCORES
============================================================
Average CLIPScore: 0.2753
Average GenEval Score: 0.6883
Inception Score: 1.0000 ± 0.0000
Inception Entropy: 7.5517
Text to Image Generation Web App
A Flask-based web application that generates images from text prompts using Stable Diffusion and evaluates them using CLIP scores.
Features
- 🎨 Generate images from text descriptions
- CLIP score evaluation for image-text alignment
- 💬 Chat-style interface
- Modern dark theme UI
Requirements
- Python 3.8+
- CUDA-capable GPU (recommended) or CPU
- ~10GB disk space for models
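The requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function name `check_environment` and the constants are illustrative and not part of the project, and the thresholds are simply the figures from the list above.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)   # "Python 3.8+" from the Requirements list
MIN_FREE_GB = 10      # "~10GB disk space for models"


def check_environment(path=".", min_python=MIN_PYTHON, min_free_gb=MIN_FREE_GB):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper: it only checks the Python version and free disk
    space at `path`. It does not look for a GPU.
    """
    problems = []

    # Compare (major, minor) of the running interpreter with the minimum.
    if sys.version_info[:2] < min_python:
        found = ".".join(str(n) for n in sys.version_info[:2])
        wanted = ".".join(str(n) for n in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")

    # Free space on the filesystem that contains `path`, in GiB.
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(
            f"About {min_free_gb} GB of free disk space needed, "
            f"found {free_gb:.1f} GB"
        )

    return problems
```

Calling `check_environment()` from the project directory returns the list of problems, which can be printed as warnings before running `pip install`.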
Installation
Clone or download the project files
Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows
Install dependencies
pip install -r requirements.txt
Running the Application
Start the Flask server
python app.py
Open your browser and go to:
http://localhost:5000
Enter a prompt and click "Generate" to create an image!
Project Structure
project/
├── app.py              # Main Flask application
├── templates/
│   └── index.html      # Web interface
├── static/
│   └── images/         # Static assets (optional)
├── requirements.txt    # Python dependencies
└── README.md           # This file
Notes
- First run: The application will download required AI models (~5-7GB). This happens only once.
- Generation time: Each image takes 30-60 seconds on GPU, longer on CPU.
- Memory: Requires ~8GB VRAM (GPU) or ~16GB RAM (CPU).
Troubleshooting
Out of Memory
If you get CUDA out of memory errors, try:
- Closing other GPU applications
- Reducing num_inference_steps in app.py
- Using CPU mode (slower but works with less memory)
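The last two suggestions could be wired together roughly as follows. This is a sketch, not the code in `app.py`: `pick_device` and `generation_settings` are hypothetical helpers, the default of 30 steps is taken from the `30/30` progress bar in the sample log, and halving the step count is an arbitrary choice.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def generation_settings(low_memory=False, default_steps=30):
    """Build illustrative settings for one generation run.

    With low_memory=True the run is forced onto the CPU and the number of
    inference steps is halved (never below 1).
    """
    device = "cpu" if low_memory else pick_device()
    steps = max(1, default_steps // 2) if low_memory else default_steps
    return {"device": device, "num_inference_steps": steps}
```

After an out-of-memory error, `generation_settings(low_memory=True)` gives `{"device": "cpu", "num_inference_steps": 15}`; how these values reach the pipeline depends on how `app.py` is written.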
Slow Generation
- CPU mode is significantly slower than GPU
- Consider using a cloud GPU service for better performance
API Endpoints
- GET / - Web interface
- POST / - Generate image (form data: prompt)
- GET /health - Health check
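The endpoints can also be called from a script instead of the browser. The sketch below uses only the standard library and assumes the server is running at the address given under Running the Application; the helper names are illustrative, and the response format is not documented here, so the body is returned as raw bytes.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:5000"  # address from "Running the Application"


def build_generate_request(prompt, base_url=BASE_URL):
    """Build the POST / request carrying the prompt as form data."""
    data = urlencode({"prompt": prompt}).encode("utf-8")
    return Request(base_url + "/", data=data, method="POST")


def build_health_request(base_url=BASE_URL):
    """Build the GET /health request."""
    return Request(base_url + "/health", method="GET")


def send(request, timeout=120):
    """Send a request and return (status code, raw response body).

    The generous timeout is an assumption, chosen because generation can
    take a while according to the Notes section.
    """
    with urlopen(request, timeout=timeout) as response:
        return response.status, response.read()
```

With the server running, `send(build_health_request())` checks that it is up, and `send(build_generate_request("a dog in front of a desk"))` submits a prompt.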
Scores Explained
- CLIP Score: Measures how well the image matches the text (0-1, higher is better)
- GenEval Score: Derived metric (CLIP × 2.5) for easier interpretation
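As a worked example of the two scores, here is a small sketch. It assumes the CLIP score is the cosine similarity between the image embedding and the text embedding, which is the usual basis for CLIP-based alignment scores but is not confirmed here for `app.py`. The function names are illustrative; only the 2.5 factor comes from the list above.

```python
import math

GENEVAL_SCALE = 2.5  # factor stated in "Scores Explained"


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def geneval_score(clip_score, scale=GENEVAL_SCALE):
    """Derived score: the CLIP score multiplied by the scale factor."""
    return clip_score * scale


# Toy 2-D "embeddings"; real CLIP embeddings have hundreds of dimensions.
clip = cosine_similarity([1.0, 0.0], [1.0, 0.0])  # identical direction
derived = geneval_score(0.2753)                   # CLIP score from the sample run
```

Applied to the sample run, 0.2753 × 2.5 = 0.68825, which agrees with the reported Average GenEval Score of 0.6883 to within rounding.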