- Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
  Paper • 2306.12929 • Published • 12
- Norm Tweaking: High-performance Low-bit Quantization of Large Language Models
  Paper • 2309.02784 • Published • 1
- QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
  Paper • 2310.08041 • Published • 1
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 603
Martin Fan (perfectoid)
AI & ML interests: None yet
Organizations: none
Collections: 1
Models: none public yet
Datasets: none public yet