Please use the distilled LTX model

#3
by shivshankar - opened

Please use the distilled LTX model, which can generate audio in 2-4 steps.

Scenema AI org

We're already using the distilled model (8 steps). The bottleneck in our pipeline isn't denoising. It's Gemma 3 12B text encoding and audio VAE decoding. Even if we halved the diffusion steps, you wouldn't see a meaningful difference in wall-clock time.
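The reasoning here is essentially Amdahl's law: speeding up one stage only helps in proportion to that stage's share of total time. A minimal sketch follows. The stage names mirror the reply, but the timings are hypothetical placeholders chosen for illustration, not measurements of this pipeline, and `speedup_from_stage` is an illustrative helper rather than part of any library.

```python
def speedup_from_stage(stage_seconds, stage, factor):
    """Overall speedup when a single stage becomes `factor` times faster."""
    total = sum(stage_seconds.values())
    # Only the chosen stage shrinks; every other stage keeps its cost.
    new_total = total - stage_seconds[stage] + stage_seconds[stage] / factor
    return total / new_total


# Hypothetical placeholder timings in seconds (NOT measured values).
timings = {
    "text_encoding": 10.0,     # e.g. Gemma 3 12B text encoder
    "denoising": 2.0,          # e.g. 8 distilled diffusion steps
    "audio_vae_decode": 8.0,   # audio VAE decoding
}

# Halving the diffusion steps is modeled as a 2x speedup of denoising only.
overall = speedup_from_stage(timings, "denoising", 2.0)
print(f"Overall speedup from halving denoising: {overall:.3f}x")
```

With real per-stage timings substituted in, the same function shows how much any single optimization can contribute to end-to-end latency.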
