Memory-Efficient LLM Training with Online Subspace Descent Paper • 2408.12857 • Published Aug 23
Longhorn: State Space Models are Amortized Online Learners Paper • 2407.14207 • Published Jul 19