- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 35
- Generative Representational Instruction Tuning
  Paper • 2402.09906 • Published • 50
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
  Paper • 2305.02301 • Published • 1
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 44
Collections including paper arxiv:2401.02412

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 135
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 26
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 19
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 62

- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 35
- Mixtral of Experts
  Paper • 2401.04088 • Published • 152
- Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
  Paper • 2401.02994 • Published • 44
- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 35

- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 35
- LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  Paper • 2402.11550 • Published • 12
- A Multimodal Automated Interpretability Agent
  Paper • 2404.14394 • Published • 19

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 59
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 173
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 50
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 24