Collections
Collections including paper arxiv:2402.17764
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 571
- Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
  Paper • 2310.19102 • Published • 7
- AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning
  Paper • 2311.00257 • Published • 8
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
  Paper • 2402.04291 • Published • 48