Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras
Neural Magic
AI & ML interests
LLMs, optimization, compression, sparsification, quantization, pruning, distillation, NLP, CV
Organization Card
Software-Delivered AI Inference
Neural Magic helps developers accelerate deep learning performance using automated model sparsification technologies and inference engines. Download our sparsity-aware inference engines and open-source tools for fast model inference.
- nm-vllm: A high-throughput and memory-efficient inference engine for LLMs, incorporating the latest LLM optimizations like quantization and sparsity
- DeepSparse: Inference runtime offering GPU-class performance on CPUs, with APIs to integrate ML into your application
- SparseML: Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
- SparseZoo: Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Collections (5)
LLMs compressed using SparseGPT and GPTQ for optimized inference with nm-vllm https://github.com/neuralmagic/nm-vllm
- neuralmagic/OpenHermes-2.5-Mistral-7B-pruned50 • Text Generation • 1.8k • 1
- neuralmagic/OpenHermes-2.5-Mistral-7B-pruned2.4 • Text Generation • 1.5k
- neuralmagic/OpenHermes-2.5-Mistral-7B-marlin • Text Generation • 942 • 1
- neuralmagic/phi-2-pruned50 • Text Generation • 36
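The model names above suggest two sparsity patterns: "pruned50" appears to mean 50% of the weights are zeroed, and "pruned2.4" appears to mean 2:4 semi-structured sparsity (two non-zero values in every group of four). Both readings are assumptions based on the naming, not something this page states. The sketch below illustrates the difference between the two patterns on a toy weight row. It uses plain magnitude pruning as a simple stand-in and is not SparseGPT, which selects and updates weights using additional calibration information. The function names are hypothetical.

```python
# Illustrative only: magnitude pruning on a toy weight row (not SparseGPT).

def prune_unstructured(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of all weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices in ascending order of magnitude; the first n_prune are zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def prune_2_of_4(weights):
    """Keep the 2 largest-magnitude values in each consecutive group of 4."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        order = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:2])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


row = [0.9, -0.8, 0.7, 0.6, 0.1, -0.2, 0.3, -0.4]
unstructured = prune_unstructured(row, 0.5)
semi_structured = prune_2_of_4(row)

print(unstructured, sparsity_of(unstructured))
print(semi_structured, sparsity_of(semi_structured))
```

Both results are 50% sparse, but the unstructured version may place its zeros anywhere in the row, while the 2:4 version constrains every group of four to hold exactly two zeros. That regular layout is what hardware with semi-structured sparsity support can exploit.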
Models (121)
- neuralmagic/Llama-2-7b-pruned50-retrained-ultrachat • Text Generation • 151
- neuralmagic/Llama-2-7b-pruned70-retrained-ultrachat • Text Generation • 28
- neuralmagic/Llama-2-7b-pruned50-retrained-instruct • Text Generation • 9
- neuralmagic/Llama-2-7b-pruned70-retrained-instruct-quant-ds • Text Generation • 9
- neuralmagic/Llama-2-7b-pruned50-retrained-instruct-quant-ds • Text Generation • 11
- neuralmagic/Llama-2-7b-pruned70-retrained-instruct • Text Generation • 13
- neuralmagic/Llama-2-7b-evolcodealpaca • Text Generation • 15
- neuralmagic/Llama-2-7b-pruned70-retrained-evolcodealpaca-quant-ds • Text Generation • 9
- neuralmagic/Llama-2-7b-pruned50-retrained-evolcodealpaca-quant-ds • Text Generation • 14
- neuralmagic/Llama-2-7b-pruned70-retrained-evolcodealpaca • Text Generation • 7
Datasets
None public yet.