Collections
Discover the best community collections!
Collections including paper arxiv:2311.09578
-
Contrastive Chain-of-Thought Prompting
Paper • 2311.09277 • Published • 34 -
Tied-Lora: Enhacing parameter efficiency of LoRA with weight tying
Paper • 2311.09578 • Published • 14 -
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation
Paper • 2311.08877 • Published • 6 -
Fusion-Eval: Integrating Evaluators with LLMs
Paper • 2311.09204 • Published • 5
-
HuggingFaceH4/zephyr-7b-alpha
Text Generation • Updated • 19.7k • • 1.1k -
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 117 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 71 -
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper • 2311.11501 • Published • 33
-
FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper • 2311.01282 • Published • 35 -
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Paper • 2311.03285 • Published • 28 -
Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization
Paper • 2311.06243 • Published • 17 -
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores
Paper • 2311.05908 • Published • 12
-
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
Paper • 2310.08659 • Published • 23 -
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper • 2309.14717 • Published • 44 -
ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers
Paper • 2309.16119 • Published • 1 -
LoRA ensembles for large language model fine-tuning
Paper • 2310.00035 • Published • 2
-
Large Language Models for Compiler Optimization
Paper • 2309.07062 • Published • 23 -
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Paper • 2310.17157 • Published • 12 -
FP8-LM: Training FP8 Large Language Models
Paper • 2310.18313 • Published • 32 -
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
Paper • 2310.19102 • Published • 10
-
OmnimatteRF: Robust Omnimatte with 3D Background Modeling
Paper • 2309.07749 • Published • 7 -
AudioSR: Versatile Audio Super-resolution at Scale
Paper • 2309.07314 • Published • 25 -
Generative Image Dynamics
Paper • 2309.07906 • Published • 52 -
MagiCapture: High-Resolution Multi-Concept Portrait Customization
Paper • 2309.06895 • Published • 27
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 10