Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 25 days ago • 90
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published Dec 23, 2024 • 40 • 26
Position Information Emerges in Causal Transformers Without Positional Encodings via Similarity of Nearby Embeddings Paper • 2501.00073 • Published Dec 30, 2024 • 1
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models Paper • 2412.07171 • Published Dec 10, 2024 • 1 • 1
Lee's RoPE Tricks / Context Extension Reads Collection Set of Long Context (RoPE or otherwise) I'm collecting off of HF • 45 items • Updated 30 days ago • 3
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding Paper • 2501.00712 • Published Jan 1 • 6 • 4
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing Paper • 2501.00658 • Published Dec 31, 2024 • 7