MotionLCM supports inference pipelines of 1 to 4 steps, with almost no difference in effectiveness between 1 and 4 steps. Generating approximately 200 frames of motion takes only about 30 ms, which corresponds to a throughput of roughly 6k frames per second.
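The throughput figure follows directly from the two numbers quoted above (about 200 frames in about 30 ms). As a quick sanity check, the arithmetic can be sketched as below; the helper name `frames_per_second` is ours for illustration and is not part of the MotionLCM codebase.

```python
def frames_per_second(num_frames: int, latency_ms: float) -> float:
    """Return throughput in frames per second for a single generation call."""
    # Convert milliseconds to seconds before dividing.
    return num_frames / (latency_ms / 1000.0)


# Approximate figures quoted in the text: ~200 frames generated in ~30 ms.
fps = frames_per_second(200, 30.0)
print(f"{fps:.0f} frames per second")  # prints "6667 frames per second"
```

Both inputs are rounded, so the result is best read as an order of magnitude (several thousand frames per second), not as a precise benchmark.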
MotionLCM achieves high-quality text-to-motion generation and precise motion control (under both sparse and dense conditions) in ∼30 ms.
To achieve controllable motion generation, we integrate a control module, named Motion ControlNet, into the latent-space diffusion model. Our control algorithm is approximately 1,000 times faster than the best-performing baseline while delivering comparable quality.