- LLoCO: Learning Long Contexts Offline
  Paper • 2404.07979 • Published • 20
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 112
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 16
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 20

Collections including paper arxiv:2408.14906
- Writing in the Margins: Better Inference Pattern for Long Context Retrieval
  Paper • 2408.14906 • Published • 138
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 135
- Towards a Unified View of Preference Learning for Large Language Models: A Survey
  Paper • 2409.02795 • Published • 71
- Attention Heads of Large Language Models: A Survey
  Paper • 2409.03752 • Published • 88

- EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
  Paper • 2408.04259 • Published
- HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction
  Paper • 2408.04948 • Published • 1
- Graph Retrieval-Augmented Generation: A Survey
  Paper • 2408.08921 • Published • 4
- Writing in the Margins: Better Inference Pattern for Long Context Retrieval
  Paper • 2408.14906 • Published • 138

- Writing in the Margins: Better Inference Pattern for Long Context Retrieval
  Paper • 2408.14906 • Published • 138
- DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
  Paper • 2410.10819 • Published • 6
- LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models
  Paper • 2410.09342 • Published • 37
- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 53

- A Comparative Study on Automatic Coding of Medical Letters with Explainability
  Paper • 2407.13638 • Published • 5
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
  Paper • 2407.07061 • Published • 26
- AgentInstruct: Toward Generative Teaching with Agentic Flows
  Paper • 2407.03502 • Published • 48
- Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions
  Paper • 2407.06723 • Published • 10

- RLHF Workflow: From Reward Modeling to Online RLHF
  Paper • 2405.07863 • Published • 66
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 126
- Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
  Paper • 2405.15574 • Published • 53
- An Introduction to Vision-Language Modeling
  Paper • 2405.17247 • Published • 86

- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 86
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
  Paper • 2404.05961 • Published • 64
- Compression Represents Intelligence Linearly
  Paper • 2404.09937 • Published • 27
- Multi-Head Mixture-of-Experts
  Paper • 2404.15045 • Published • 59

- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 78
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 104
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 91
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 60