Direct Preference Optimization Using Sparse Feature-Level Constraints Paper • 2411.07618 • Published Nov 12 • 15
m3hrdadfi/wav2vec2-base-100k-gtzan-music-genres Automatic Speech Recognition • Updated Jul 6, 2021 • 150 • 20
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Paper • 2410.13863 • Published Oct 17 • 36
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression Paper • 2410.08584 • Published Oct 11 • 12
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29 • 47