- PockEngine: Sparse and Efficient Fine-tuning in a Pocket
  Paper • 2310.17752 • Published • 11
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 27
- Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
  Paper • 2311.06243 • Published • 17
- Fine-tuning Language Models for Factuality
  Paper • 2311.08401 • Published • 27
Collections including paper arxiv:2401.02415
- Textbooks Are All You Need
  Paper • 2306.11644 • Published • 141
- LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model
  Paper • 2401.02330 • Published • 14
- Textbooks Are All You Need II: phi-1.5 technical report
  Paper • 2309.05463 • Published • 84
- Visual Instruction Tuning
  Paper • 2304.08485 • Published • 10
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 92
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 37
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 35
- Mixtral of Experts
  Paper • 2401.04088 • Published • 156
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
  Paper • 2401.04081 • Published • 69
- TinyLlama: An Open-Source Small Language Model
  Paper • 2401.02385 • Published • 84
- LLaMA Pro: Progressive LLaMA with Block Expansion
  Paper • 2401.02415 • Published • 50
- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 60
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 178
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 52
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 26
- LLaMA Pro: Progressive LLaMA with Block Expansion
  Paper • 2401.02415 • Published • 50
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 92
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17
- Generative Representational Instruction Tuning
  Paper • 2402.09906 • Published • 50