For those who are wondering why Anima FP8 in ComfyUI only grants a minor speed increase
#4
by Zhijie-Chen - opened
See this issue
FWIW, I tested one of my suggested changes locally, and the generation speed increase goes from ~15% to ~50%, which is what one would expect from FP8 in a compute-bound application.
Great. I tested with a simple AI-generated monkey patch, and it does boost generation speed.
Hope to see this optimization integrated soon.
| Model Quantization | Configuration | Before | After | Improvement |
|---|---|---|---|---|
| fp8tensorwise | 832×1216, 30 steps | 4.47s (6.95it/s) | 3.58s (8.74it/s) | +25.3% |
| fp8tensorwise | 1216×1856, 30 steps | 11.78s (2.62it/s) | 9.62s (3.22it/s) | +18.3% |
| mxfp8 | 832×1216, 30 steps | 4.69s (6.60it/s) | 3.75s (8.34it/s) | +20.0% |
| mxfp8 | 1216×1856, 30 steps | 11.95s (2.58it/s) | 9.99s (3.10it/s) | +16.4% |
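The post does not say how the Improvement column was derived. As a rough sketch, two common ways to express a speedup from before/after measurements are the relative reduction in wall time and the relative gain in throughput. The helper names below are hypothetical; the numbers are the fp8tensorwise 1216×1856 row from the table.

```python
def time_reduction_pct(before_s: float, after_s: float) -> float:
    """Relative reduction in wall time, in percent."""
    return (before_s - after_s) / before_s * 100.0


def throughput_gain_pct(before_its: float, after_its: float) -> float:
    """Relative gain in iterations per second, in percent."""
    return (after_its - before_its) / before_its * 100.0


# fp8tensorwise, 1216×1856, 30 steps (values copied from the table)
reduction = time_reduction_pct(11.78, 9.62)
gain = throughput_gain_pct(2.62, 3.22)
print(f"time reduction: {reduction:.1f}%  throughput gain: {gain:.1f}%")
```

For this row the time reduction comes out at about 18.3%, which matches the listed value, while the throughput gain is about 22.9%. The two definitions always differ, so it is worth stating which one a benchmark reports.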
ComfyUI/custom_nodes/patch_anima_mlp_patch.py
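import torch


def patch_fp8_mlp():
    """Replace GPT2FeedForward.forward with a version that feeds the MLP a 2D input."""
    try:
        from comfy.ldm.cosmos.predict2 import GPT2FeedForward
    except ImportError:
        print("[FP8 Patch] GPT2FeedForward class not found. Skipping patch.")
        return

    def patched_forward(self, x: torch.Tensor) -> torch.Tensor:
        original_shape = x.shape
        # Collapse all leading dimensions so both linear layers see a 2D
        # (tokens, features) input. reshape also handles non-contiguous inputs.
        x_reshaped = x.reshape(-1, original_shape[-1])
        x_out = self.layer1(x_reshaped)
        x_out = self.activation(x_out)
        x_out = self.layer2(x_out)
        # Restore the original shape. This assumes layer2 maps back to the
        # input feature size.
        x_out = x_out.view(*original_shape)
        return x_out

    GPT2FeedForward.forward = patched_forward
    print("[FP8 Patch] Successfully patched GPT2FeedForward.forward for FP8 GEMM support.")


# Apply the patch when ComfyUI imports this file from custom_nodes.
patch_fp8_mlp()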
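As a sanity check on the idea behind the patch, the sketch below compares a plain MLP forward pass against the flatten-then-restore variant on a multi-dimensional input. `FeedForward` is a hypothetical stand-in, not the ComfyUI class: only the attribute names `layer1`, `activation`, and `layer2` are taken from the patch, while the bias-free linear layers, the GELU activation, and the tensor sizes are assumptions made for illustration. It runs in FP32 on CPU, so it checks numerical equivalence only and says nothing about FP8 speed.

```python
import torch
from torch import nn


class FeedForward(nn.Module):
    """Hypothetical stand-in for the patched class: Linear -> activation -> Linear."""

    def __init__(self, d_model: int, d_ff: int) -> None:
        super().__init__()
        self.layer1 = nn.Linear(d_model, d_ff, bias=False)
        self.activation = nn.GELU()  # assumed activation, for illustration only
        self.layer2 = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer2(self.activation(self.layer1(x)))


def flattened_forward(module: FeedForward, x: torch.Tensor) -> torch.Tensor:
    """Same computation, but the linear layers only ever see a 2D input."""
    original_shape = x.shape
    x2d = x.reshape(-1, original_shape[-1])
    out = module.layer2(module.activation(module.layer1(x2d)))
    return out.view(*original_shape)


torch.manual_seed(0)
mlp = FeedForward(d_model=16, d_ff=64)
x = torch.randn(2, 3, 4, 5, 16)  # arbitrary 5D input; last dim is d_model

with torch.no_grad():
    reference = mlp(x)
    flattened = flattened_forward(mlp, x)

same_shape = reference.shape == flattened.shape
max_abs_diff = (reference - flattened).abs().max().item()
print("same shape:", same_shape, "max abs diff:", max_abs_diff)
```

Because a linear layer acts only on the last dimension, flattening the leading dimensions should change the result by no more than floating-point rounding, which is why the patch can swap the forward method without retraining or touching the weights.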