Collections
Collections including paper arxiv:2401.04088
- Lost in the Middle: How Language Models Use Long Contexts
  Paper • 2307.03172 • Published • 36
- Efficient Estimation of Word Representations in Vector Space
  Paper • 1301.3781 • Published • 6
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 14
- Attention Is All You Need
  Paper • 1706.03762 • Published • 44

- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
  Paper • 2401.04081 • Published • 70
- TinyLlama: An Open-Source Small Language Model
  Paper • 2401.02385 • Published • 89
- LLaMA Pro: Progressive LLaMA with Block Expansion
  Paper • 2401.02415 • Published • 53