Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model Paper • 2502.08820 • Published 13 days ago • 4
CIDAR: Culturally Relevant Instruction Dataset For Arabic Paper • 2402.03177 • Published Feb 5, 2024 • 6
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models Paper • 2309.14509 • Published Sep 25, 2023 • 18
DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention Paper • 2309.14327 • Published Sep 25, 2023 • 22