Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks • Paper • 2402.04248 • Published Feb 6 • 30
Scavenging Hyena: Distilling Transformers into Long Convolution Models • Paper • 2401.17574 • Published Jan 31 • 15
SambaMixer: State of Health Prediction of Li-ion Batteries using Mamba State Space Models • Paper • 2411.00233 • Published Oct 31 • 7
Hymba: A Hybrid-head Architecture for Small Language Models • Paper • 2411.13676 • Published Nov 20 • 38
Gated Delta Networks: Improving Mamba2 with Delta Rule • Paper • 2412.06464 • Published 13 days ago • 9