Collections
Collections including paper arxiv:2309.02427

- JetMoE: Reaching Llama2 Performance with 0.1M Dollars
  Paper • 2404.07413 • Published • 32
- Allowing humans to interactively guide machines where to look does not always improve a human-AI team's classification accuracy
  Paper • 2404.05238 • Published • 1
- Cognitive Architectures for Language Agents
  Paper • 2309.02427 • Published • 2
- Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
  Paper • 2305.13571 • Published • 2

- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 37
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 91
- MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
  Paper • 2403.14624 • Published • 50
- Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
  Paper • 2402.12875 • Published • 2

- Cognitive Architectures for Language Agents
  Paper • 2309.02427 • Published • 2
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 37
- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 69
- Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
  Paper • 2311.00871 • Published • 2

- RA-DIT: Retrieval-Augmented Dual Instruction Tuning
  Paper • 2310.01352 • Published • 6
- Self-Consistency Improves Chain of Thought Reasoning in Language Models
  Paper • 2203.11171 • Published • 1
- MemGPT: Towards LLMs as Operating Systems
  Paper • 2310.08560 • Published • 6
- Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
  Paper • 2310.06117 • Published • 3

- Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering
  Paper • 2204.04581 • Published • 1
- Retrieval-Augmented Multimodal Language Modeling
  Paper • 2211.12561 • Published • 1
- When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
  Paper • 2212.10511 • Published • 1
- Memorizing Transformers
  Paper • 2203.08913 • Published • 2