OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 126
Speculative Streaming: Fast LLM Inference without Auxiliary Models Paper • 2402.11131 • Published Feb 16, 2024 • 42