-
TinyLlama: An Open-Source Small Language Model
Paper • 2401.02385 • Published • 81 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 41 -
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 62 -
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
Paper • 2401.16380 • Published • 46
Collections
Discover the best community collections!
Collections including paper arxiv:2402.10200
-
LLaMA Pro: Progressive LLaMA with Block Expansion
Paper • 2401.02415 • Published • 50 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 91 -
BitDelta: Your Fine-Tune May Only Be Worth One Bit
Paper • 2402.10193 • Published • 17 -
Generative Representational Instruction Tuning
Paper • 2402.09906 • Published • 50
-
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
Paper • 2210.14986 • Published • 4 -
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
Paper • 2311.10702 • Published • 17 -
Large Language Models as Optimizers
Paper • 2309.03409 • Published • 72 -
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Paper • 2309.04269 • Published • 29
-
VideoBooth: Diffusion-based Video Generation with Image Prompts
Paper • 2312.00777 • Published • 19 -
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
Paper • 2312.03641 • Published • 19 -
GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
Paper • 2312.04557 • Published • 12 -
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Paper • 2312.04433 • Published • 9
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 69 -
Learning From Mistakes Makes LLM Better Reasoner
Paper • 2310.20689 • Published • 24 -
Let's Verify Step by Step
Paper • 2305.20050 • Published • 3 -
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
Paper • 2308.00436 • Published • 20
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 69 -
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper • 2311.10775 • Published • 7 -
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Paper • 2311.11077 • Published • 24 -
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper • 2311.11501 • Published • 32
-
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
Paper • 2310.18628 • Published • 6 -
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
Paper • 2310.19019 • Published • 9 -
Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
Paper • 2311.02262 • Published • 9 -
Thread of Thought Unraveling Chaotic Contexts
Paper • 2311.08734 • Published • 4
-
A Zero-Shot Language Agent for Computer Control with Structured Reflection
Paper • 2310.08740 • Published • 14 -
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Paper • 2310.12823 • Published • 33 -
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
Paper • 2308.10848 • Published • 1 -
CLEX: Continuous Length Extrapolation for Large Language Models
Paper • 2310.16450 • Published • 9