- Communicative Agents for Software Development
  Paper • 2307.07924 • Published • 2
- Self-Refine: Iterative Refinement with Self-Feedback
  Paper • 2303.17651 • Published • 2
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 31
- ReAct: Synergizing Reasoning and Acting in Language Models
  Paper • 2210.03629 • Published • 12
Collections including paper arxiv:2402.11550

- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 10
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 13
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 24
- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
  Paper • 2402.13064 • Published • 45

- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 106
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 18
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 12
- The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
  Paper • 2401.07872 • Published • 2

- InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory
  Paper • 2402.04617 • Published • 4
- BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
  Paper • 2403.09347 • Published • 20
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 19
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 18

- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 39
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 12
- A Neural Conversational Model
  Paper • 1506.05869 • Published • 2
- Data Engineering for Scaling Language Models to 128K Context
  Paper • 2402.10171 • Published • 18

- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 62
- FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
  Paper • 2401.14112 • Published • 17
- WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
  Paper • 2401.13919 • Published • 22
- Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation
  Paper • 2401.14257 • Published • 9

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 50
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 17
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 18
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 5

- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 35
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 12
- A Multimodal Automated Interpretability Agent
  Paper • 2404.14394 • Published • 19

- FLM-101B: An Open LLM and How to Train It with $100K Budget
  Paper • 2309.03852 • Published • 42
- Extending LLMs' Context Window with 100 Samples
  Paper • 2401.07004 • Published • 14
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 12
- The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
  Paper • 2401.07872 • Published • 2