MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications Paper • 2409.07314 • Published Sep 11 • 50
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 604
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 96
RoFormer: Enhanced Transformer with Rotary Position Embedding Paper • 2104.09864 • Published Apr 20, 2021 • 11
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 112
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness Paper • 2205.14135 • Published May 27, 2022 • 11
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 253
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning Paper • 2307.08691 • Published Jul 17, 2023 • 8
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published May 1 • 25