GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM Paper • 2403.05527 • Published Mar 8, 2024 • 2