14 mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding · 13 authors
7 Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers · 4 authors
6 SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference · 6 authors