toread FLM-101B: An Open LLM and How to Train It with $100K Budget Paper • 2309.03852 • Published Sep 7, 2023 • 42
FLM-101B: An Open LLM and How to Train It with $100K Budget Paper • 2309.03852 • Published Sep 7, 2023 • 42
Papers to read Jamba: A Hybrid Transformer-Mamba Language Model Paper • 2403.19887 • Published Mar 28 • 99 The Unreasonable Ineffectiveness of the Deeper Layers Paper • 2403.17887 • Published Mar 26 • 75