Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment • Paper • arXiv:2405.03594
DBRX Collection • DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items
Sparse Foundational Llama 2 Models Collection • Sparse pre-trained and fine-tuned Llama 2 models made by Neural Magic and Cerebras. • 27 items
Cerebras LLaVA Collection • Cerebras implementation and training recipes for multimodal LLaVA models. • 4 items