LLMQ (LLM-Quantization)

Welcome to the official Hugging Face organization for LLMQ. This organization hosts LLMs quantized with state-of-the-art quantization methods. To get started, pick the model that best fits your use case from the list below.

We are dedicated to advancing artificial intelligence with a focus on efficiency. Our primary research interests include quantization, binarization, and efficient learning. We develop techniques that make large language models (LLMs) more accessible and sustainable by reducing computational cost while preserving performance. Our interdisciplinary approach draws on global expertise to push the boundaries of efficient AI.

Recent Works:

[22.04.2024] How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study. arXiv, 2024. (arXiv, GitHub)

Collections 1

LLMQ/LLaMA-3-8B-GPTQ-4bit-b128

LLMQ/LLaMA-3-8B-SmoothQuant-4bit-4bit

LLMQ/LLaMA-3-8B-AWQ-4bit-b128

LLMQ/LLaMA-3-8B-SmoothQuant-8bit-8bit

Models 10

LLMQ/LLaMA-3-70B-GPTQ-4bit-b128

LLMQ/LLaMA-3-8B-AWQ-4bit-b128

LLMQ/LLaMA-3-8B-DB-LLM-2bit-fake

LLMQ/LLaMA-3-8B-QuIP-2bit

LLMQ/LLaMA-3-8B-IR-QLoRA

LLMQ/LLaMA-3-8B-SmoothQuant-4bit-4bit

LLMQ/LLaMA-3-8B-SmoothQuant-8bit-8bit

LLMQ/LLaMA-3-8B-PB-LLM-1.7bit-fake

LLMQ/LLaMA-3-8B-BiLLM-1.1bit-fake

LLMQ/LLaMA-3-8B-GPTQ-4bit-b128
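As a rough aid for choosing among the models listed above, the following Python sketch parses each repository ID into a model size, a method name, and a bit-width, then filters on those fields. The naming interpretation is an assumption inferred from the IDs themselves: only the first `<n>bit` token is read, and suffixes such as `b128` and `fake` are ignored because this page does not define them. `parse_model_id`, `find_models`, and `download` are hypothetical helper names. The optional `download` helper relies on `snapshot_download` from the `huggingface_hub` package and needs network access.

```python
import re
from typing import List, NamedTuple, Optional

# Repository IDs copied from the model list above.
MODEL_IDS = [
    "LLMQ/LLaMA-3-70B-GPTQ-4bit-b128",
    "LLMQ/LLaMA-3-8B-AWQ-4bit-b128",
    "LLMQ/LLaMA-3-8B-DB-LLM-2bit-fake",
    "LLMQ/LLaMA-3-8B-QuIP-2bit",
    "LLMQ/LLaMA-3-8B-IR-QLoRA",
    "LLMQ/LLaMA-3-8B-SmoothQuant-4bit-4bit",
    "LLMQ/LLaMA-3-8B-SmoothQuant-8bit-8bit",
    "LLMQ/LLaMA-3-8B-PB-LLM-1.7bit-fake",
    "LLMQ/LLaMA-3-8B-BiLLM-1.1bit-fake",
    "LLMQ/LLaMA-3-8B-GPTQ-4bit-b128",
]


class ModelInfo(NamedTuple):
    repo_id: str
    size: str               # e.g. "8B" or "70B"
    method: str             # e.g. "GPTQ", "DB-LLM" (text before the first bit token)
    bits: Optional[float]   # first "<n>bit" token, or None if the ID has none


# Assumed naming scheme: LLMQ/LLaMA-3-<size>-<method>[-<n>bit[-<suffix>]]
_ID_PATTERN = re.compile(r"^LLMQ/LLaMA-3-(?P<size>\d+B)-(?P<rest>.+)$")
_BITS_PATTERN = re.compile(r"(\d+(?:\.\d+)?)bit")


def parse_model_id(repo_id: str) -> ModelInfo:
    """Split a repository ID into size, method, and bit-width (assumed scheme)."""
    match = _ID_PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository ID: {repo_id}")
    rest = match.group("rest")
    bits_match = _BITS_PATTERN.search(rest)
    if bits_match is None:
        return ModelInfo(repo_id, match.group("size"), rest, None)
    method = rest[: bits_match.start()].rstrip("-")
    return ModelInfo(repo_id, match.group("size"), method, float(bits_match.group(1)))


def find_models(method: Optional[str] = None,
                size: Optional[str] = None,
                max_bits: Optional[float] = None) -> List[ModelInfo]:
    """Return the listed models that match every filter that is not None."""
    results = []
    for repo_id in MODEL_IDS:
        info = parse_model_id(repo_id)
        if method is not None and info.method.lower() != method.lower():
            continue
        if size is not None and info.size.lower() != size.lower():
            continue
        # Models without a bit token are excluded when a bit limit is requested.
        if max_bits is not None and (info.bits is None or info.bits > max_bits):
            continue
        results.append(info)
    return results


def download(repo_id: str) -> str:
    """Download a repository snapshot and return its local path (needs network)."""
    from huggingface_hub import snapshot_download  # imported lazily; optional dependency
    return snapshot_download(repo_id=repo_id)
```

For example, `find_models(method="GPTQ", size="8B")` returns the single 8B GPTQ entry, whose `repo_id` can then be passed to `download`. How a downloaded checkpoint is loaded depends on the quantization method, so consult the individual model card before use.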

Team members 6
