- Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
  Paper • 2310.04406 • Published • 8
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 99
- ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
  Paper • 2402.09320 • Published • 6
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 109
Collections including paper arxiv:2402.10200
- Lossless Acceleration for Seq2seq Generation with Aggressive Decoding
  Paper • 2205.10350 • Published • 2
- Blockwise Parallel Decoding for Deep Autoregressive Models
  Paper • 1811.03115 • Published • 2
- Fast Transformer Decoding: One Write-Head is All You Need
  Paper • 1911.02150 • Published • 6
- Sequence-Level Knowledge Distillation
  Paper • 1606.07947 • Published • 2

- PALO: A Polyglot Large Multimodal Model for 5B People
  Paper • 2402.14818 • Published • 23
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 111
- User-LLM: Efficient LLM Contextualization with User Embeddings
  Paper • 2402.13598 • Published • 18
- Coercing LLMs to do and reveal (almost) anything
  Paper • 2402.14020 • Published • 12

- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 99
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- PERL: Parameter Efficient Reinforcement Learning from Human Feedback
  Paper • 2403.10704 • Published • 57
- MathScale: Scaling Instruction Tuning for Mathematical Reasoning
  Paper • 2403.02884 • Published • 15

- Speculative Streaming: Fast LLM Inference without Auxiliary Models
  Paper • 2402.11131 • Published • 41
- Generative Representational Instruction Tuning
  Paper • 2402.09906 • Published • 51
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 99
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17

- JudgeLM: Fine-tuned Large Language Models are Scalable Judges
  Paper • 2310.17631 • Published • 32
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 53
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 99
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17