YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

This is an unavoidable double quantization due to the release state of Ideogram4.

The FP8 weights were cast to FP32 with the FP8 scales, then downcast to BF16 before being converted to INT8.

Speed is 1.78x faster(2.03s/it) than FP8(3.62s/it) on my 3090, without compile.

~2x faster with torch compile.

Quick comparison:

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support