neuralmagic's Collections

Compressed LLMs for nm-vllm

LLMs compressed with SparseGPT (pruning) and GPTQ (quantization) for optimized inference with nm-vllm: https://github.com/neuralmagic/nm-vllm
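
As a rough illustration, the models in this collection can be served with nm-vllm through the vLLM-style Python API it exposes. The sketch below assumes nm-vllm is installed (so `vllm` is importable) and uses a placeholder model ID; substitute an actual model from this collection.

```python
# Minimal sketch, assuming nm-vllm is installed and provides the standard vLLM API.
# "neuralmagic/<compressed-model-id>" is a placeholder, not a specific model reference.
from vllm import LLM, SamplingParams

# Load a compressed model from the collection (placeholder ID).
llm = LLM(model="neuralmagic/<compressed-model-id>")

# Standard sampling settings; tune as needed.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

# Run a single prompt and print the generated text.
outputs = llm.generate(["Explain sparsity-aware inference in one sentence."], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```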