Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2311.09578

LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

Paper • 2310.08659 • Published Oct 12, 2023 • 21
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models

Paper • 2309.14717 • Published Sep 26, 2023 • 43
ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers

Paper • 2309.16119 • Published Sep 28, 2023 • 1
LoRA ensembles for large language model fine-tuning

Paper • 2310.00035 • Published Sep 29, 2023 • 2

Tied-Lora: Enhacing parameter efficiency of LoRA with weight tying

Paper • 2311.09578 • Published Nov 16, 2023 • 12

Weekly reading selection

Contrastive Chain-of-Thought Prompting

Paper • 2311.09277 • Published Nov 15, 2023 • 31
Tied-Lora: Enhacing parameter efficiency of LoRA with weight tying

Paper • 2311.09578 • Published Nov 16, 2023 • 12
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation

Paper • 2311.08877 • Published Nov 15, 2023 • 5
Fusion-Eval: Integrating Evaluators with LLMs

Paper • 2311.09204 • Published Nov 15, 2023 • 5

HuggingFaceH4/zephyr-7b-alpha

Text Generation • Updated Nov 21, 2023 • 71.8k • • 1.08k
Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 117
Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 69
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning

Paper • 2311.11501 • Published Nov 20, 2023 • 32

FlashDecoding++: Faster Large Language Model Inference on GPUs

Paper • 2311.01282 • Published Nov 2, 2023 • 32
S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 27
Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Paper • 2311.06243 • Published Nov 10, 2023 • 17
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Paper • 2311.05908 • Published Nov 10, 2023 • 12

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 120
Controlled Decoding from Language Models

Paper • 2310.17022 • Published Oct 25, 2023 • 12
Tied-Lora: Enhacing parameter efficiency of LoRA with weight tying

Paper • 2311.09578 • Published Nov 16, 2023 • 12

Large Language Models for Compiler Optimization

Paper • 2309.07062 • Published Sep 11, 2023 • 22
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Paper • 2310.17157 • Published Oct 26, 2023 • 9
FP8-LM: Training FP8 Large Language Models

Paper • 2310.18313 • Published Oct 27, 2023 • 31
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Paper • 2310.19102 • Published Oct 29, 2023 • 8

OmnimatteRF: Robust Omnimatte with 3D Background Modeling

Paper • 2309.07749 • Published Sep 14, 2023 • 6
AudioSR: Versatile Audio Super-resolution at Scale

Paper • 2309.07314 • Published Sep 13, 2023 • 23
Generative Image Dynamics

Paper • 2309.07906 • Published Sep 14, 2023 • 51
MagiCapture: High-Resolution Multi-Concept Portrait Customization

Paper • 2309.06895 • Published Sep 13, 2023 • 27

about 9 hours ago

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 21
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 8
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 6

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs