Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.02791

Approximating Two-Layer Feedforward Networks for Efficient Transformers

Paper • 2310.10837 • Published Oct 16, 2023 • 10
BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 94
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

Paper • 2310.16795 • Published Oct 25, 2023 • 26
LLM-FP4: 4-Bit Floating-Point Quantized Transformers

Paper • 2310.16836 • Published Oct 25, 2023 • 10

Previous
1
2
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs