SD Turbo ONNX Q8 Static ARM64
Source model: stabilityai/sd-turbo
Export:
- ONNX export from the original Diffusers model in FP32.
- Static ONNX Runtime Q8 quantization for ARM64-oriented testing.
- Quantized ops: Conv, MatMul, Gemm.
- Quantization format: QOperator (QLinearConv, QLinearMatMul).
- Activation type: QUInt8.
- Weight type: QInt8.
- Per-channel weights: enabled.
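The export script itself is not included here, so the following is only a sketch of how the settings listed above could map onto ONNX Runtime's `quantize_static` call. The function name `quantize_model`, its path arguments, and the calibration reader are hypothetical placeholders; the real export may have used additional options.

```python
# Settings transcribed from the export notes above, kept as plain strings
# so they can be inspected without onnxruntime installed.
QUANT_SETTINGS = {
    "quant_format": "QOperator",
    "op_types_to_quantize": ["Conv", "MatMul", "Gemm"],
    "activation_type": "QUInt8",
    "weight_type": "QInt8",
    "per_channel": True,
}


def quantize_model(fp32_path, q8_path, calibration_reader, settings=QUANT_SETTINGS):
    """Statically quantize one ONNX file with the settings above.

    fp32_path, q8_path and calibration_reader are hypothetical inputs; the
    reader is assumed to subclass onnxruntime.quantization.CalibrationDataReader.
    """
    # Imported lazily so the settings dict is usable without the dependency.
    from onnxruntime.quantization import QuantFormat, QuantType, quantize_static

    quantize_static(
        fp32_path,
        q8_path,
        calibration_reader,
        quant_format=QuantFormat[settings["quant_format"]],
        op_types_to_quantize=settings["op_types_to_quantize"],
        per_channel=settings["per_channel"],
        activation_type=QuantType[settings["activation_type"]],
        weight_type=QuantType[settings["weight_type"]],
    )
```

Keeping the settings in a plain dict makes it easy to log them next to benchmark results when comparing against other quantization variants.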
Known notes:
- This is not an OpenVINO INT8 export.
- It is intended for ONNX Runtime Android benchmarking.
- Calibration was minimal and prompt-oriented; quality should be validated on-device before production use.
- Scheduler parity expects the Diffusers Euler ancestral scheduler with trailing timestep spacing, e.g. for 4 steps: [999, 749, 499, 249].
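For anyone reimplementing the scheduler outside Diffusers (for example on Android), the trailing timestep schedule can be reproduced roughly as follows. This is a sketch that assumes 1000 training timesteps, the usual Diffusers default; the helper name `trailing_timesteps` is illustrative and not part of any library.

```python
def trailing_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Return trailing-spaced timesteps, highest first.

    Assumes num_train_timesteps=1000; check the scheduler config of the
    exported model if parity matters.
    """
    step_ratio = num_train_timesteps / num_inference_steps
    # Start at the last training timestep and walk down in equal strides,
    # then shift by one so the indices are zero-based.
    return [
        round(num_train_timesteps - i * step_ratio) - 1
        for i in range(num_inference_steps)
    ]


print(trailing_timesteps(4))  # [999, 749, 499, 249]
```

Comparing this list against the timesteps actually fed to the UNet is a quick first check when on-device output diverges from the desktop reference.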
Local smoke test:
.venv-onnx/bin/python onnx/onnx_txt2img_sdturbo_official.py \
--model-dir onnx/pipeline_runs/sd-turbo-q8-ort/onnx-q8-static-arm64 \
--prompt "A red sports car on a mountain road at sunrise" \
--scheduler euler-ancestral \
--seed 1234 \
--width 512 \
--height 512 \
--steps 4 \
--latent-rng torch