---
language: en
license: mit
library_name: diffusers
tags:
- stable-diffusion-xl
- text-to-image
- diffusers
- huggingface-inference-endpoints
- custom-inference
pipeline_tag: text-to-image
inference: true
---
# Hugging Face Inference API for Stable Diffusion XL
This repository contains a text-to-image generation API designed for deployment on Hugging Face Inference Endpoints, backed by Stable Diffusion XL models.
## Features
- Compatible with Hugging Face Inference Endpoints
- Stable Diffusion XL (SDXL) model for high-quality image generation
- Content filtering for safe image generation
- Configurable image dimensions (default: 1024x768)
- Base64-encoded image output
- Performance optimizations (torch.compile, attention processors)
## Project Structure
The codebase has been simplified to a single file:
- `handler.py`: Contains the `EndpointHandler` class that implements the Hugging Face Inference Endpoints interface. This file also includes a built-in FastAPI server for local development.
## Configuration
The service is configured via the `app.conf` JSON file with the following parameters:
```json
{
  "model_id": "your-huggingface-model-id",
  "prompt": "template with {prompt} placeholder",
  "negative_prompt": "default negative prompt",
  "inference_steps": 30,
  "guidance_scale": 7,
  "use_safetensors": true,
  "width": 1024,
  "height": 768
}
```
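As a rough illustration, a handler can read this file and expand the `{prompt}` placeholder in the configured template. The function names below are hypothetical helpers for the sketch, not the actual code in `handler.py`:

```python
import json

def load_config(path="app.conf"):
    """Load the JSON configuration file described above."""
    with open(path) as f:
        return json.load(f)

def build_prompt(cfg, user_prompt):
    """Expand the {prompt} placeholder in the configured prompt template.

    Falls back to passing the user prompt through unchanged when no
    template is configured.
    """
    template = cfg.get("prompt", "{prompt}")
    return template.format(prompt=user_prompt)
```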
## API Usage
### Hugging Face Inference Endpoints Format
When deployed to Hugging Face Inference Endpoints, the API accepts requests in the following format:
```json
{
  "inputs": "your prompt here",
  "parameters": {
    "negative_prompt": "optional negative prompt",
    "seed": 12345,
    "inference_steps": 30,
    "guidance_scale": 7,
    "width": 1024,
    "height": 768
  }
}
```
Response format:
```json
[
  {
    "generated_image": "base64-encoded-image",
    "seed": 12345
  }
]
```
### Local Development Format
When running locally, you can use the same format as above, or a simplified format:
```json
{
  "prompt": "your prompt here",
  "negative_prompt": "optional negative prompt",
  "seed": 12345,
  "inference_steps": 30,
  "guidance_scale": 7,
  "width": 1024,
  "height": 768
}
```
Response format from the local server:
```json
[
  {
    "generated_image": "base64-encoded-image",
    "seed": 12345
  }
]
```
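The two request shapes can be reconciled with a small normalization step. The sketch below is a hypothetical helper showing one way to map either shape onto a single flat parameter dict, not the actual code in `handler.py`:

```python
def normalize_request(payload):
    """Map either request shape onto one flat parameter dict.

    Inference Endpoints format: {"inputs": ..., "parameters": {...}}
    Simplified local format:    {"prompt": ..., <other keys inline>}
    """
    if "inputs" in payload:
        # Hugging Face Inference Endpoints format
        params = dict(payload.get("parameters", {}))
        params["prompt"] = payload["inputs"]
        return params
    # Simplified local format: already flat
    return dict(payload)
```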
## Deployment on Hugging Face Inference Endpoints
### Step 1: Push this repository to Hugging Face Hub
1. Create a new repository on Hugging Face Hub:
```bash
huggingface-cli repo create your-repo-name
```
2. Add the Hugging Face repository as a remote:
```bash
git remote add huggingface https://huggingface.co/username/your-repo-name
```
3. Push your code to the Hugging Face repository:
```bash
git push huggingface your-branch:main
```
### Step 2: Create an Inference Endpoint
1. Go to your repository on Hugging Face Hub: https://huggingface.co/username/your-repo-name
2. Click on "Deploy" in the top menu, then select "Inference Endpoints"
3. Click "Create a new endpoint"
4. Configure your endpoint with the following settings:
- Name: Give your endpoint a name
- Region: Choose a region close to your users (e.g., us-east-1)
- Instance Type: Choose a GPU instance (recommended: at least 16GB VRAM for SDXL)
- Replicas: Start with 1 replica
- Autoscaling: Configure as needed
**IMPORTANT: IF YOU SEE THIS WARNING**:
> "Warning: deploying this model will probably fail because the model's Diffusers pipeline is not set"
5. Click "Continue anyway" - this is expected because you're using a custom handler implementation
6. Under Advanced configuration:
- Make sure "Framework" is set to "Custom"
- Configure "Task" as "Text-to-Image"
7. Click "Create endpoint"
The Hugging Face Inference Endpoints service will automatically detect and use your `EndpointHandler` class in the `handler.py` file.
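The expected shape of that class is roughly the following. This is a structural sketch only: the SDXL pipeline call is stubbed out with a placeholder, and the real `handler.py` differs in detail:

```python
import base64
import random

class EndpointHandler:
    """Structural sketch of the interface Inference Endpoints expects:
    __init__ receives the model directory, __call__ receives the request
    payload and returns a JSON-serializable result."""

    def __init__(self, path: str = ""):
        # The real handler loads the SDXL pipeline from `path` here.
        self.path = path

    def _generate(self, prompt: str, seed: int) -> bytes:
        # Placeholder standing in for the actual diffusers pipeline call.
        return f"image-bytes-for:{prompt}@{seed}".encode()

    def __call__(self, data: dict) -> list:
        prompt = data["inputs"]
        params = data.get("parameters", {})
        seed = params.get("seed", random.randrange(2**32))
        image_bytes = self._generate(prompt, seed)
        # Same response shape as documented above
        return [{
            "generated_image": base64.b64encode(image_bytes).decode(),
            "seed": seed,
        }]
```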
### Step 3: Test your Inference Endpoint
Once deployed, you can test your endpoint using:
```python
import requests
import base64
from PIL import Image
import io

# Your Hugging Face API token and the endpoint URL shown on the
# endpoint's page after deployment
API_TOKEN = "your-hugging-face-api-token"
API_URL = "https://your-endpoint.endpoints.huggingface.cloud"

# Headers for the request
headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json"
}

# Request payload
payload = {
    "inputs": "a beautiful landscape with mountains and a lake",
    "parameters": {
        "negative_prompt": "blurry, low quality",
        "seed": 42,
        "inference_steps": 30,
        "guidance_scale": 7
    }
}

# Send the request
response = requests.post(API_URL, headers=headers, json=payload)
response.raise_for_status()
result = response.json()

# Convert the base64-encoded image to a PIL Image
image_bytes = base64.b64decode(result[0]["generated_image"])
image = Image.open(io.BytesIO(image_bytes))
image.save("generated_image.jpg")
print(f"Image saved with seed: {result[0]['seed']}")
```
### Required Files
For deployment on Hugging Face Inference Endpoints, you need:
- `handler.py` - Contains the `EndpointHandler` class implementation
- `requirements.txt` - Lists the Python dependencies
- `app.conf` - Contains configuration parameters
Note: A `Procfile` is not needed for Hugging Face Inference Endpoints deployment, as the service automatically detects and uses the `EndpointHandler` class.
## Local Development
1. Install dependencies: `pip install -r requirements.txt`
2. Run the API locally: `python handler.py [--port PORT] [--host HOST]`
3. The API will be available at http://localhost:8000
The local server uses the FastAPI implementation included in `handler.py`, which provides the same functionality as the Hugging Face Inference Endpoints interface.
## Environment Variables
- `PORT`: Port to run the server on (default: 8000)
- `USE_TORCH_COMPILE`: Set to "1" to enable torch.compile for performance (default: "0")
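A hypothetical sketch of how these variables might be consulted, with defaults mirroring the documentation (the real `handler.py` may read them differently):

```python
import os

def read_settings():
    """Read server settings from the environment, using the documented defaults."""
    return {
        "port": int(os.environ.get("PORT", "8000")),
        "use_torch_compile": os.environ.get("USE_TORCH_COMPILE", "0") == "1",
    }
```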
## License
This project is licensed under the terms of the MIT license.
## Testing Your Inference Endpoint
We've included a test script `test_endpoint.py` to help you test your deployed endpoint.
### Prerequisites
- Python 3.7+
- Your Hugging Face API token
- An active Hugging Face Inference Endpoint
### Installation
```bash
pip install requests pillow
```
### Usage
```bash
python test_endpoint.py --token "YOUR_HF_API_TOKEN" --url "YOUR_ENDPOINT_URL" --prompt "your test prompt here"
```
#### Additional Options
```
--negative_prompt TEXT Negative prompt to guide generation
--seed INTEGER Random seed for reproducibility
--steps INTEGER Number of inference steps (default: 30)
--guidance FLOAT Guidance scale (default: 7.0)
--width INTEGER Image width (default: 1024)
--height INTEGER Image height (default: 768)
--output_dir TEXT Directory to save generated images (default: "generated_images")
```
#### Example
```bash
python test_endpoint.py \
    --token "hf_..." \
    --url "https://your-endpoint.endpoints.huggingface.cloud" \
    --prompt "beautiful sunset over mountains" \
    --negative_prompt "blurry, low quality" \
    --seed 42 \
    --steps 30 \
    --guidance 7.5
```
This will:
1. Send a request to your endpoint
2. Download the generated image
3. Save it to the specified output directory
4. Display the seed used for generation
## Troubleshooting
### Error: "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available"
If you encounter this error when deploying your endpoint, it means the model you're trying to use doesn't have an fp16 variant explicitly available. To fix this:
1. Open `handler.py`
2. Find the `StableDiffusionXLPipeline.from_pretrained` call
3. Remove the `variant="fp16"` parameter
The corrected code should look like:
```python
pipe = StableDiffusionXLPipeline.from_pretrained(
    ckpt_dir,
    vae=vae,
    torch_dtype=torch.float16,
    use_safetensors=self.cfg.get("use_safetensors", True)
)
```
This change allows the model to be loaded with fp16 precision without requiring a specific fp16 variant of the model weights. |