μLO: Compute-Efficient Meta-Generalization of Learned Optimizers • Paper • 2406.00153 • Published 27 days ago • 9
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model • Paper • 2406.04333 • Published 21 days ago • 36
An Introduction to Vision-Language Modeling • Paper • 2405.17247 • Published about 1 month ago • 77
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning • Paper • 2405.18386 • Published 30 days ago • 17
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models • Paper • 2405.18377 • Published 30 days ago • 16
Layer-Condensed KV Cache for Efficient Inference of Large Language Models • Paper • 2405.10637 • Published May 17 • 17
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention • Paper • 2405.12981 • Published May 21 • 26
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • Paper • 2404.14219 • Published Apr 22 • 240
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding • Paper • 2404.16710 • Published Apr 25 • 56
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • Paper • 2402.17764 • Published Feb 27 • 572
A Survey on Hardware Accelerators for Large Language Models • Paper • 2401.09890 • Published Jan 18 • 1
RewardBench: Evaluating Reward Models for Language Modeling • Paper • 2403.13787 • Published Mar 20 • 19
The Unreasonable Ineffectiveness of the Deeper Layers • Paper • 2403.17887 • Published Mar 26 • 75
MoAI: Mixture of All Intelligence for Large Language and Vision Models • Paper • 2403.07508 • Published Mar 12 • 73
Common Corpus • Collection • The largest public domain dataset for training LLMs. • 27 items • Updated 10 days ago • 106
MPIrigen: MPI Code Generation through Domain-Specific Language Models • Paper • 2402.09126 • Published Feb 14 • 11
GPTVQ: The Blessing of Dimensionality for LLM Quantization • Paper • 2402.15319 • Published Feb 23 • 19
TinyLLaVA: A Framework of Small-scale Large Multimodal Models • Paper • 2402.14289 • Published Feb 22 • 17
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models • Paper • 2402.19481 • Published Feb 29 • 17
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models • Paper • 2402.19427 • Published Feb 29 • 50