SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models Paper • 2405.14917 • Published 30 days ago • 1
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study Paper • 2404.14047 • Published Apr 22 • 38
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs Paper • 2402.04291 • Published Feb 6 • 48