Compressed LLMs for nm-vllm

A collection of 17 LLMs compressed with SparseGPT and GPTQ for optimized inference with nm-vllm (https://github.com/neuralmagic/nm-vllm).