deepdml/faster-whisper-large-v3-turbo-ct2 Automatic Speech Recognition • Updated 28 days ago • 383k • 72
Generative Multimodal Models are In-Context Learners Paper • 2312.13286 • Published Dec 20, 2023 • 34
CapsFusion: Rethinking Image-Text Data at Scale Paper • 2310.20550 • Published Oct 31, 2023 • 25