BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 27
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Paper • 1910.02054 • Published Oct 4, 2019 • 4
ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning Paper • 2104.07857 • Published Apr 16, 2021
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale Paper • 2201.05596 • Published Jan 14, 2022 • 2
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies Paper • 2310.04610 • Published Oct 6, 2023 • 1
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference Paper • 2401.08671 • Published Jan 9, 2024 • 14
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales Paper • 2308.01320 • Published Aug 2, 2023 • 44