FLUX.1 [schnell] -- Flumina Server App (FP8 Version)

This repository contains an FP8 implementation of FLUX.1 [schnell], which uses float8 numerics instead of bfloat16. The switch to FP8 allows for a 2x performance improvement in inference when deployed via Fireworks AI’s Flumina Server App toolkit.

Getting Started -- Serverless deployment on Fireworks

This FP8 Server App is deployed to Fireworks as-is in a "serverless" deployment, so it can be called directly with an API key.

Grab an API Key from Fireworks and set it in your environment variables:

export API_KEY=YOUR_API_KEY_HERE

Text-to-Image Example Call

curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/models/flux-1-schnell-fp8/text_to_image' \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -H "Accept: image/jpeg" \
    -d '{
        "prompt": "Woman laying in the grass",
        "aspect_ratio": "16:9",
        "guidance_scale": 3.5,
        "num_inference_steps": 4,
        "seed": 0
    }' \
    --output output.jpg
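
The call above can be wrapped in a small script so the prompt and seed become parameters. This is a minimal sketch, not part of the Server App: the helper names json_escape, build_payload, and generate_image are hypothetical, the request fields are copied from the example call, and it assumes curl and sed are installed. It adds curl's --fail flag so that an HTTP error produces a non-zero exit status; without it, an error response body would be written to output.jpg as if it were an image.

```shell
# Endpoint copied from the example call above.
FLUX_URL='https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/models/flux-1-schnell-fp8/text_to_image'

# Escape backslashes and double quotes so the prompt fits inside a JSON string.
# Assumption: prompts are single-line text; control characters are not handled.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON request body. $1 = prompt, $2 = seed (defaults to 0).
# The other fields keep the values used in the example call.
build_payload() {
    printf '{"prompt": "%s", "aspect_ratio": "16:9", "guidance_scale": 3.5, "num_inference_steps": 4, "seed": %s}' \
        "$(json_escape "$1")" "${2:-0}"
}

# Send the request. $1 = prompt, $2 = output file (defaults to output.jpg),
# $3 = seed (defaults to 0). Returns non-zero if API_KEY is missing or the
# server answers with an HTTP error.
generate_image() {
    if [ -z "${API_KEY:-}" ]; then
        echo "API_KEY is not set" >&2
        return 1
    fi
    curl --fail --silent --show-error -X POST "$FLUX_URL" \
        -H "Authorization: Bearer $API_KEY" \
        -H "Content-Type: application/json" \
        -H "Accept: image/jpeg" \
        -d "$(build_payload "$1" "${3:-0}")" \
        --output "${2:-output.jpg}"
}
```

After sourcing the script, a call such as generate_image "Woman laying in the grass" output.jpg 0 sends the same request as the curl example. Keeping build_payload separate from the network call makes it possible to inspect the request body without contacting the API.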

What is Flumina?

Flumina is Fireworks.ai’s platform for hosting Server Apps. It lets users deploy deep learning inference to production environments in minutes.

What does Flumina offer for FLUX models?

Flumina provides the following advantages for FLUX models:

- Clear, precise definitions of server-side workloads, which you can review in the server app implementation (right here).
- An extensibility interface for dynamic loading and dispatching of add-ons server-side. For FLUX, this includes:
  - ControlNet (Union) adapters
  - LoRA adapters
- Off-the-shelf support for on-demand capacity scaling with Server Apps on Fireworks.
  - Customization of the deployment logic through modifications to the Server App, with easy redeployment.
- Support for FP8 numerics, for faster, more efficient inference.

Deploying FLUX.1 [schnell] FP8 Version to Fireworks On-Demand

Deploying Custom FLUX.1 [schnell] FP8 Apps to Fireworks On-Demand

Coming soon!