-
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 134 -
SparQ Attention: Bandwidth-Efficient LLM Inference
Paper • 2312.04985 • Published • 37 -
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Paper • 2401.04658 • Published • 24 -
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
Paper • 2401.06951 • Published • 24
Collections
Discover the best community collections!
Collections including paper arxiv:2403.00071