Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.17764

Small LLM approaches

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585

Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping

Paper • 2402.14083 • Published Feb 21 • 43
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585
Genie: Generative Interactive Environments

Paper • 2402.15391 • Published Feb 23 • 70
Humanoid Locomotion as Next Token Prediction

Paper • 2402.19469 • Published Feb 29 • 25

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585
Nemotron-4 15B Technical Report

Paper • 2402.16819 • Published Feb 26 • 42
Extreme Compression of Large Language Models via Additive Quantization

Paper • 2401.06118 • Published Jan 11 • 12

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 181
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29 • 52
ResLoRA: Identity Residual Mapping in Low-Rank Adaption

Paper • 2402.18039 • Published Feb 28 • 11

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26 • 67
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

Paper • 2402.15627 • Published Feb 23 • 33
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Paper • 2402.16822 • Published Feb 26 • 15
FuseChat: Knowledge Fusion of Chat Models

Paper • 2402.16107 • Published Feb 25 • 36
Multi-LoRA Composition for Image Generation

Paper • 2402.16843 • Published Feb 26 • 28

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 255
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 585

Previous
1
...
12
13
14
15
16
...
19
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs