-
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 257 -
Magicoder: Source Code Is All You Need
Paper • 2312.02120 • Published • 80 -
Mixtral of Experts
Paper • 2401.04088 • Published • 158 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 105
Collections
Discover the best community collections!
Collections including paper arxiv:2402.10200
-
TinyLlama: An Open-Source Small Language Model
Paper • 2401.02385 • Published • 90 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 45 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 69 -
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper • 2401.16380 • Published • 48
-
LLaMA Pro: Progressive LLaMA with Block Expansion
Paper • 2401.02415 • Published • 53 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 105 -
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Paper • 2402.10193 • Published • 20 -
Generative Representational Instruction Tuning
Paper • 2402.09906 • Published • 54
-
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
Paper • 2210.14986 • Published • 5 -
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
Paper • 2311.10702 • Published • 18 -
Large Language Models as Optimizers
Paper • 2309.03409 • Published • 75 -
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Paper • 2309.04269 • Published • 32
-
VideoBooth: Diffusion-based Video Generation with Image Prompts
Paper • 2312.00777 • Published • 21 -
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
Paper • 2312.03641 • Published • 20 -
GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
Paper • 2312.04557 • Published • 12 -
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Paper • 2312.04433 • Published • 9
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 71 -
Learning From Mistakes Makes LLM Better Reasoner
Paper • 2310.20689 • Published • 28 -
Let's Verify Step by Step
Paper • 2305.20050 • Published • 10 -
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
Paper • 2308.00436 • Published • 22
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 71 -
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper • 2311.10775 • Published • 7 -
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Paper • 2311.11077 • Published • 24 -
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper • 2311.11501 • Published • 33
-
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
Paper • 2310.18628 • Published • 7 -
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
Paper • 2310.19019 • Published • 9 -
Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
Paper • 2311.02262 • Published • 10 -
Thread of Thought Unraveling Chaotic Contexts
Paper • 2311.08734 • Published • 6