GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints Paper • 2305.13245 • Published May 22, 2023 • 5
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Paper • 2404.12387 • Published 21 days ago • 34
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published 17 days ago • 118