-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 96 -
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper • 2404.14219 • Published • 253 -
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper • 2404.16710 • Published • 75
Collections
Discover the best community collections!
Collections including paper arxiv:2404.14219
-
Nemotron-4 15B Technical Report
Paper • 2402.16819 • Published • 42 -
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 52 -
RWKV: Reinventing RNNs for the Transformer Era
Paper • 2305.13048 • Published • 15 -
Reformer: The Efficient Transformer
Paper • 2001.04451 • Published
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 605 -
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper • 2403.03507 • Published • 183 -
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Paper • 2402.19427 • Published • 52 -
ResLoRA: Identity Residual Mapping in Low-Rank Adaption
Paper • 2402.18039 • Published • 11
-
Nemotron-4 15B Technical Report
Paper • 2402.16819 • Published • 42 -
InternLM2 Technical Report
Paper • 2403.17297 • Published • 30 -
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 12 -
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper • 2402.14905 • Published • 126
-
Neural Network Diffusion
Paper • 2402.13144 • Published • 95 -
Genie: Generative Interactive Environments
Paper • 2402.15391 • Published • 70 -
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published • 88 -
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper • 2403.00522 • Published • 44
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 145 -
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • 2305.18290 • Published • 50 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 82 -
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Paper • 2402.01739 • Published • 26
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 145 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 12 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 52 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 45
-
TinyLlama: An Open-Source Small Language Model
Paper • 2401.02385 • Published • 90 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 45 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 69 -
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper • 2401.16380 • Published • 48