Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11 • 31
H_2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models Paper • 2306.14048 • Published Jun 24, 2023 • 12