Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras

Paper: Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment (arXiv 2405.03594)
Space: 🏃 Llama 2 Sparse Transfer Chat Deepsparse
Models:
- neuralmagic/Llama-2-7b-pruned50-retrained (Text Generation)
- neuralmagic/Llama-2-7b-pruned70-retrained (Text Generation)
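
As a minimal sketch (not part of the collection itself), the checkpoints listed above are hosted as standard Hugging Face text-generation models, so one way to try them is through the `transformers` API; the prompt and generation settings below are illustrative assumptions only, and sparsity-aware CPU inference would instead go through Neural Magic's DeepSparse runtime, as in the Space above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# One of the models listed in this collection.
model_id = "neuralmagic/Llama-2-7b-pruned50-retrained"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt and generation length, not the authors' recommended settings.
inputs = tokenizer("Sparse pretraining of Llama models allows", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```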