Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published 7 days ago • 24
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 23 days ago • 61
A Careful Examination of Large Language Model Performance on Grade School Arithmetic Paper • 2405.00332 • Published 22 days ago • 24
view article Article Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! By lyogavin • Apr 21 • 35
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published about 1 month ago • 235
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 173
Learning to Learn Faster from Human Feedback with Language Model Predictive Control Paper • 2402.11450 • Published Feb 18 • 20
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25 • 17
Generative Agents: Interactive Simulacra of Human Behavior Paper • 2304.03442 • Published Apr 7, 2023 • 11
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 61
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 253
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention Paper • 2312.07987 • Published Dec 13, 2023 • 39
CLEX: Continuous Length Extrapolation for Large Language Models Paper • 2310.16450 • Published Oct 25, 2023 • 9
RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models Paper • 2308.07922 • Published Aug 15, 2023 • 16
LongNet: Scaling Transformers to 1,000,000,000 Tokens Paper • 2307.02486 • Published Jul 5, 2023 • 79
Extending Context Window of Large Language Models via Positional Interpolation Paper • 2306.15595 • Published Jun 27, 2023 • 52
Orca: Progressive Learning from Complex Explanation Traces of GPT-4 Paper • 2306.02707 • Published Jun 5, 2023 • 45