LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models • Paper • 2405.18377 • Published 19 days ago • 15
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory • Paper • 2405.08707 • Published May 14 • 27
Phi-3 • Collection • Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 17 days ago • 331
Llamafied Models • Collection • A collection of llamafied models, such as Qwen. • 5 items • Updated Apr 19 • 1