-
Can large language models explore in-context?
Paper • 2403.15371 • Published • 30 -
Long-context LLMs Struggle with Long In-context Learning
Paper • 2404.02060 • Published • 33 -
PIQA: Reasoning about Physical Commonsense in Natural Language
Paper • 1911.11641 • Published • 2 -
AQuA: A Benchmarking Tool for Label Quality Assessment
Paper • 2306.09467 • Published • 1
Collections
Discover the best community collections!
Collections including paper arxiv:2404.11912
-
LIMA: Less Is More for Alignment
Paper • 2305.11206 • Published • 19 -
Garment3DGen: 3D Garment Stylization and Texture Generation
Paper • 2403.18816 • Published • 19 -
EgoLifter: Open-world 3D Segmentation for Egocentric Perception
Paper • 2403.18118 • Published • 7 -
The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published • 75
-
Accelerating LLM Inference with Staged Speculative Decoding
Paper • 2308.04623 • Published • 21 -
An Emulator for Fine-Tuning Large Language Models using Small Language Models
Paper • 2310.12962 • Published • 13 -
The Curious Case of Neural Text Degeneration
Paper • 1904.09751 • Published • 2 -
On Speculative Decoding for Multimodal Large Language Models
Paper • 2404.08856 • Published • 12
-
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Paper • 2403.09636 • Published • 2 -
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Paper • 2404.11912 • Published • 16 -
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
Paper • 2401.02669 • Published • 11 -
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper • 2404.16710 • Published • 56
-
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 8 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 91 -
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 104
-
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
Paper • 2402.05140 • Published • 18 -
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
Paper • 2311.11315 • Published • 6 -
Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task
Paper • 2310.06504 • Published -
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
Paper • 2402.10466 • Published • 16
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 135 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 10 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 47 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 41