Younes Belkada


AI & ML interests

Large Language Models, Quantization, Vision, Multimodality, Diffusion models



Posts
Check out quantized weights from ISTA-DAS Lab directly on their organisation page! It includes official weights for AQLM (2-bit quantization) and QMoE (sub-1-bit MoE compression).

Read more about these techniques below:

AQLM paper: Extreme Compression of Large Language Models via Additive Quantization (2401.06118)
QMoE paper: QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models (2310.16795)
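The core idea behind AQLM is additive quantization: each small group of weights is stored as the sum of codewords drawn from several compact codebooks, so only the codeword indices need to be kept. Here is a toy NumPy sketch of that idea using greedy codeword selection over random codebooks — the names and sizes are illustrative, not the AQLM library's actual API:

```python
import numpy as np

# Toy additive quantization sketch; illustrative only, not the AQLM API.
rng = np.random.default_rng(0)

def quantize_additive(w, codebooks):
    """Greedily pick one codeword per codebook so their sum approximates w."""
    residual = w.copy()
    codes = []
    for cb in codebooks:
        # choose the codeword closest to the current residual
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        codes.append(idx)
        residual = residual - cb[idx]
    return codes

def dequantize(codes, codebooks):
    """Reconstruct the weight group as the sum of the selected codewords."""
    return sum(cb[i] for i, cb in zip(codes, codebooks))

d = 8  # weights per group
codebooks = [rng.normal(size=(256, d)) for _ in range(2)]  # 2 codebooks, 256 codewords each
w = rng.normal(size=d)

codes = quantize_additive(w, codebooks)
w_hat = dequantize(codes, codebooks)

# Storage cost: 2 indices x 8 bits each, covering 8 weights -> 2 bits/weight
bits_per_weight = len(codes) * 8 / d
print(bits_per_weight)  # 2.0
```

With two 256-entry codebooks per group of 8 weights, each group costs 16 bits of indices, i.e. 2 bits per weight — the extreme-compression regime the AQLM paper targets (the real method also learns the codebooks rather than sampling them randomly).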

Some useful links below:

AQLM repo:
How to use AQLM & transformers:
How to use AQLM & PEFT:

Great work from @BlackSamorez and the team!