HaotongQin posted an update Apr 23
We release an empirical study showing "How Good Are Low-bit Quantized #LLaMA3 🦙 Models" with existing LLM quantization techniques!

In this study, the low-bit LLaMA3 models (especially LLaMA3-70B) perform impressively well. 🚀 However, the results also expose significant performance degradation in existing quantization techniques when applied to LLaMA3, especially at ultra-low bit-widths.
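To make the trade-off concrete, here is a minimal round-to-nearest (RTN) weight-quantization sketch, one of the simplest baselines evaluated in quantization studies. It is an illustrative toy on random weights, not the study's actual pipeline; the function name and settings are my own:

```python
import numpy as np

def quantize_rtn(w, n_bits=4):
    """Round-to-nearest (RTN) uniform quantization of a weight tensor.

    Per-tensor asymmetric quantization: map floats onto n_bits integer
    levels, then dequantize. Returns the dequantized weights and the
    mean absolute quantization error.
    """
    qmax = 2 ** n_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / qmax if w_max > w_min else 1.0
    q = np.clip(np.round((w - w_min) / scale), 0, qmax)
    w_hat = q * scale + w_min
    return w_hat, float(np.abs(w - w_hat).mean())

# Toy weight matrix: the reconstruction error grows as the bit-width
# shrinks, mirroring the degradation the study reports at ultra-low bits.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
for bits in (8, 4, 2):
    _, err = quantize_rtn(w, bits)
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

Stronger methods (e.g. GPTQ or AWQ, which the study also covers) improve on this baseline by calibrating the rounding against activations, but the same bit-width-vs-error tension remains.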

We hope this study can serve as a reference for the LLM quantization community and promote the emergence of stronger LLM quantization methods in the context of LLaMA3's release. More work is on the way...

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study (2404.14047)

LLMQ/llama3-quantization-66251258525135aeda16513c