arXiv:2411.19527

DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Published on Nov 29, 2024

Submitted by junwann on Dec 2, 2024

Authors: Jungbin Cho, Junwan Kim, Minseo Kim, Mingu Kang, Tae-Hyun Oh

Abstract

Human motion, inherently continuous and dynamic, presents significant challenges for generative models. Despite their dominance, discrete quantization methods, such as VQ-VAEs, suffer from inherent limitations, including restricted expressiveness and frame-wise noise artifacts. Continuous approaches, while producing smoother and more natural motions, often falter due to high-dimensional complexity and limited training data. To resolve this "discord" between discrete and continuous representations, we introduce DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding, a novel method that decodes discrete motion tokens into continuous motion through rectified flow. By employing an iterative refinement process in the continuous space, DisCoRD captures fine-grained dynamics and ensures smoother and more natural motions. Compatible with any discrete-based framework, our method enhances naturalness without compromising faithfulness to the conditioning signals. Extensive evaluations demonstrate that DisCoRD achieves state-of-the-art performance, with an FID of 0.032 on HumanML3D and 0.169 on KIT-ML. These results solidify DisCoRD as a robust solution for bridging the divide between discrete efficiency and continuous realism. Our project page is available at: https://whwjdqls.github.io/discord.github.io/.
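To make the decoding idea in the abstract concrete, here is a minimal sketch of rectified-flow-style sampling conditioned on discrete tokens. It is not the authors' implementation: `CODEBOOK`, `toy_velocity`, and `decode_tokens` are hypothetical names, the two-dimensional "frames" are toy values, and the closed-form velocity stands in for the learned network a real decoder would use. It only illustrates the iterative refinement loop: start from Gaussian noise and integrate a token-conditioned velocity field from t = 0 to t = 1 with Euler steps.

```python
import random

# Hypothetical codebook: token id -> coarse per-frame feature.
# In a real system these would be learned VQ-VAE code vectors.
CODEBOOK = {0: [0.0, 0.0], 1: [1.0, -1.0], 2: [0.5, 2.0]}


def toy_velocity(x, t, cond):
    """Stand-in for a learned velocity network v(x, t | cond).

    On a straight path from noise to a target, the velocity at time t
    is (target - x) / (1 - t). Assumption for illustration: the target
    is simply the conditioning vector itself.
    """
    return [(c - xi) / (1.0 - t) for xi, c in zip(x, cond)]


def decode_tokens(tokens, num_steps=8, seed=0):
    """Decode token ids into continuous frames by Euler integration."""
    rng = random.Random(seed)
    dt = 1.0 / num_steps
    motion = []
    for tok in tokens:
        cond = CODEBOOK[tok]
        # Each frame starts as Gaussian noise at t = 0.
        x = [rng.gauss(0.0, 1.0) for _ in cond]
        for k in range(num_steps):
            t = k * dt  # t stays below 1, so (1 - t) is never zero
            v = toy_velocity(x, t, cond)
            x = [xi + dt * vi for xi, vi in zip(x, v)]
        motion.append(x)
    return motion


if __name__ == "__main__":
    for frame in decode_tokens([1, 2, 0]):
        print([round(value, 6) for value in frame])
```

Because the toy velocity field describes a perfectly straight path, the Euler steps land on the conditioning vector regardless of `num_steps`. A learned field would only approximate such paths, so the number of integration steps would then trade speed against fidelity.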

Community

junwann (paper author, paper submitter), Dec 2, 2024, edited Dec 2, 2024

🚀 We are excited to present “DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding,” a novel approach to human motion generation that combines the strengths of both discrete and continuous methods.

🏃🏻‍♂️ Our approach achieves state-of-the-art performance on HumanML3D and introduces a novel metric for evaluating the naturalness of generated motion. Feel free to explore our project page: https://whwjdqls.github.io/discord.github.io/