Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras

Paper: Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment (arXiv 2405.03594)
Space: 🏃 Llama 2 Sparse Transfer Chat Deepsparse
Models:
- neuralmagic/Llama-2-7b-pruned50-retrained (Text Generation)
- neuralmagic/Llama-2-7b-pruned70-retrained (Text Generation)
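
As a minimal sketch (not part of the collection itself), the checkpoints listed above are hosted as standard Hugging Face text-generation models, so one way to try them is through the `transformers` API; the prompt and generation settings below are illustrative assumptions only, and sparsity-aware CPU inference would instead go through Neural Magic's DeepSparse runtime, as in the Space above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# One of the models listed in this collection.
model_id = "neuralmagic/Llama-2-7b-pruned50-retrained"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt and generation length, not the authors' recommended settings.
inputs = tokenizer("Sparse pretraining of Llama models allows", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```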