Layer-Condensed KV Cache for Efficient Inference of Large Language Models Paper • 2405.10637 • Published 6 days ago • 13
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 23 days ago • 61
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 92
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Paper • 2404.07544 • Published Apr 11 • 15
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9 • 62
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU Paper • 2403.06504 • Published Mar 11 • 52
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6 • 61
Common 7B Language Models Already Possess Strong Math Capabilities Paper • 2403.04706 • Published Mar 7 • 16
Do Large Language Models Latently Perform Multi-Hop Reasoning? Paper • 2402.16837 • Published Feb 26 • 24
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models Paper • 2402.14848 • Published Feb 19 • 18
A Touch, Vision, and Language Dataset for Multimodal Alignment Paper • 2402.13232 • Published Feb 20 • 11
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 104
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 44
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon Paper • 2401.03462 • Published Jan 7 • 25
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models Paper • 2401.03506 • Published Jan 7 • 12
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5 • 38
LLM Leaderboard best models ❤️🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard • 70 items • Updated 7 days ago • 308
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 23