---
language: en
license: mit
library_name: diffusers
tags:
- stable-diffusion-xl
- text-to-image
- diffusers
- huggingface-inference-endpoints
- custom-inference
pipeline_tag: text-to-image
inference: true
---

# Hugging Face Inference API for Stable Diffusion XL

This repository contains a text-to-image generation API designed to be deployed on Hugging Face Inference Endpoints, using Stable Diffusion XL models for image generation.

## Features

- Compatible with Hugging Face Inference Endpoints
- Stable Diffusion XL (SDXL) model for high-quality image generation
- Content filtering for safe image generation
- Configurable image dimensions (default: 1024x768)
- Base64-encoded image output
- Performance optimizations (torch.compile, attention processors)

## Project Structure

The codebase has been simplified to a single file:

- `handler.py`: Contains the `EndpointHandler` class that implements the Hugging Face Inference Endpoints interface. This file also includes a built-in FastAPI server for local development.

## Configuration

The service is configured via the `app.conf` JSON file with the following parameters:

```json
{
  "model_id": "your-huggingface-model-id",
  "prompt": "template with {prompt} placeholder",
  "negative_prompt": "default negative prompt",
  "inference_steps": 30,
  "guidance_scale": 7,
  "use_safetensors": true,
  "width": 1024,
  "height": 768
}
```
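
For orientation, a minimal sketch of how `handler.py` might read this file is shown below; the filename comes from above, but the loading code and template substitution are assumptions, not the literal implementation:

```python
import json

# Load the service configuration (assumes app.conf sits next to handler.py)
with open("app.conf") as f:
    cfg = json.load(f)

# The "prompt" entry is a template with a {prompt} placeholder;
# substitute the user's prompt into it (hypothetical usage)
full_prompt = cfg["prompt"].format(prompt="a castle at dusk")
steps = cfg.get("inference_steps", 30)
```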

## API Usage

### Hugging Face Inference Endpoints Format

When deployed to Hugging Face Inference Endpoints, the API accepts requests in the following format:

```json
{
  "inputs": "your prompt here",
  "parameters": {
    "negative_prompt": "optional negative prompt",
    "seed": 12345,
    "inference_steps": 30,
    "guidance_scale": 7,
    "width": 1024,
    "height": 768
  }
}
```

Response format:

```json
[
  {
    "generated_image": "base64-encoded-image",
    "seed": 12345
  }
]
```

### Local Development Format

When running locally, you can use the same format as above, or a simplified format:

```json
{
  "prompt": "your prompt here",
  "negative_prompt": "optional negative prompt",
  "seed": 12345,
  "inference_steps": 30,
  "guidance_scale": 7,
  "width": 1024,
  "height": 768
}
```

Response format from the local server:

```json
[
  {
    "generated_image": "base64-encoded-image",
    "seed": 12345
  }
]
```
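
For a quick smoke test against the local server, a client along these lines should work (a sketch: the root path `/` and the output filename are assumptions, so check the FastAPI routes in `handler.py`):

```python
import base64
import io

import requests
from PIL import Image

# Simplified local request format (see above)
payload = {
    "prompt": "a beautiful landscape with mountains and a lake",
    "negative_prompt": "blurry, low quality",
    "seed": 42,
}

# Assumes the FastAPI app in handler.py accepts POSTs at the root path
response = requests.post("http://localhost:8000/", json=payload)
response.raise_for_status()
result = response.json()

# Decode the base64-encoded image from the first result and save it
image = Image.open(io.BytesIO(base64.b64decode(result[0]["generated_image"])))
image.save("local_test.png")
```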

## Deployment on Hugging Face Inference Endpoints

### Step 1: Push this repository to Hugging Face Hub

1. Create a new repository on Hugging Face Hub:

   ```bash
   huggingface-cli repo create your-repo-name
   ```

2. Add the Hugging Face repository as a remote:

   ```bash
   git remote add huggingface https://huggingface.co/username/your-repo-name
   ```

3. Push your code to the Hugging Face repository:

   ```bash
   git push huggingface your-branch:main
   ```

### Step 2: Create an Inference Endpoint

1. Go to your repository on Hugging Face Hub: https://huggingface.co/username/your-repo-name
2. Click "Deploy" in the top menu, then select "Inference Endpoints"
3. Click "Create a new endpoint"
4. Configure your endpoint with the following settings:
   - Name: Give your endpoint a name
   - Region: Choose a region close to your users (e.g., us-east-1)
   - Instance Type: Choose a GPU instance (recommended: at least 16GB VRAM for SDXL)
   - Replicas: Start with 1 replica
   - Autoscaling: Configure as needed

**IMPORTANT: IF YOU SEE THIS WARNING**:

> "Warning: deploying this model will probably fail because the model's Diffusers pipeline is not set"

5. Click "Continue anyway" - this warning is expected because you're using a custom handler implementation
6. Under Advanced configuration:
   - Make sure "Framework" is set to "Custom"
   - Set "Task" to "Text-to-Image"
7. Click "Create endpoint"

The Hugging Face Inference Endpoints service will automatically detect and use the `EndpointHandler` class in `handler.py`.
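
For reference, a custom handler follows the interface sketched below; the repository's `handler.py` implements these methods with the full SDXL pipeline, so this skeleton is illustrative only:

```python
from typing import Any, Dict, List

class EndpointHandler:
    def __init__(self, path: str = ""):
        # path points at the repository contents on the endpoint;
        # load the model, app.conf, and any optimizations here
        ...

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # data carries "inputs" (the prompt) plus optional "parameters"
        prompt = data["inputs"]
        params = data.get("parameters", {})
        # ... run the pipeline, then return base64-encoded results
        return [{"generated_image": "<base64>", "seed": params.get("seed")}]
```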

### Step 3: Test your Inference Endpoint

Once deployed, you can test your endpoint using:

```python
import base64
import io

import requests
from PIL import Image

# Your Hugging Face API token and endpoint URL (for a dedicated
# Inference Endpoint, use the URL shown on the endpoint's page)
API_TOKEN = "your-hugging-face-api-token"
API_URL = "https://api-inference.huggingface.co/models/username/your-repo-name"

# Headers for the request
headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json"
}

# Request payload
payload = {
    "inputs": "a beautiful landscape with mountains and a lake",
    "parameters": {
        "negative_prompt": "blurry, low quality",
        "seed": 42,
        "inference_steps": 30,
        "guidance_scale": 7
    }
}

# Send the request and fail fast on HTTP errors
response = requests.post(API_URL, headers=headers, json=payload)
response.raise_for_status()
result = response.json()

# Convert the base64-encoded image to a PIL Image and save it
image_bytes = base64.b64decode(result[0]["generated_image"])
image = Image.open(io.BytesIO(image_bytes))
image.save("generated_image.jpg")
print(f"Image saved with seed: {result[0]['seed']}")
```

### Required Files

For deployment on Hugging Face Inference Endpoints, you need:

- `handler.py` - Contains the `EndpointHandler` class implementation
- `requirements.txt` - Lists the Python dependencies
- `app.conf` - Contains configuration parameters

Note: A `Procfile` is not needed for Hugging Face Inference Endpoints deployment, as the service automatically detects and uses the `EndpointHandler` class.

## Local Development

1. Install dependencies: `pip install -r requirements.txt`
2. Run the API locally: `python handler.py [--port PORT] [--host HOST]`
3. The API will be available at http://localhost:8000

The local server uses the FastAPI implementation included in `handler.py`, which provides the same functionality as the Hugging Face Inference Endpoints interface.

## Environment Variables

- `PORT`: Port to run the server on (default: 8000)
- `USE_TORCH_COMPILE`: Set to "1" to enable torch.compile for performance (default: "0"); see the sketch below
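
As a rough illustration of what `USE_TORCH_COMPILE` might toggle (the exact wiring lives in `handler.py`; `pipe` here stands for the loaded SDXL pipeline, and compiling the UNet is the usual diffusers optimization target):

```python
import os

import torch

# Hypothetical wiring: compile the UNet when USE_TORCH_COMPILE=1
if os.environ.get("USE_TORCH_COMPILE", "0") == "1":
    pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```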

## License

This project is licensed under the terms of the MIT license.

## Testing Your Inference Endpoint

We've included a test script, `test_endpoint.py`, to help you test your deployed endpoint.

### Prerequisites

- Python 3.7+
- Your Hugging Face API token
- An active Hugging Face Inference Endpoint

### Installation

```bash
pip install requests pillow
```

### Usage

```bash
python test_endpoint.py --token "YOUR_HF_API_TOKEN" --url "YOUR_ENDPOINT_URL" --prompt "your test prompt here"
```

#### Additional Options

```
--negative_prompt TEXT   Negative prompt to guide generation
--seed INTEGER           Random seed for reproducibility
--steps INTEGER          Number of inference steps (default: 30)
--guidance FLOAT         Guidance scale (default: 7.0)
--width INTEGER          Image width (default: 1024)
--height INTEGER         Image height (default: 768)
--output_dir TEXT        Directory to save generated images (default: "generated_images")
```

#### Example

```bash
python test_endpoint.py \
  --token "hf_..." \
  --url "https://api-inference.huggingface.co/models/username/your-repo-name" \
  --prompt "beautiful sunset over mountains" \
  --negative_prompt "blurry, low quality" \
  --seed 42 \
  --steps 30 \
  --guidance 7.5
```

This will:

1. Send a request to your endpoint
2. Download the generated image
3. Save it to the specified output directory
4. Display the seed used for generation

## Troubleshooting

### Error: "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available"

If you encounter this error when deploying your endpoint, it means the model you're trying to use doesn't provide an explicit fp16 variant of its weights. To fix this:

1. Open `handler.py`
2. Find the `StableDiffusionXLPipeline.from_pretrained` call
3. Remove the `variant="fp16"` argument

The corrected code should look like:

```python
pipe = StableDiffusionXLPipeline.from_pretrained(
    ckpt_dir,
    vae=vae,
    torch_dtype=torch.float16,
    use_safetensors=self.cfg.get("use_safetensors", True)
)
```

This change loads the model in fp16 precision (via `torch_dtype=torch.float16`) without requiring a dedicated fp16 variant of the model weights.