The QuIP# ecosystem is growing :)
I saw a QuIP# 2-bit Qwen-72B-Chat model on the Hub today, which shows there is now support for vLLM inference.
This will speed up inference and make high-performing 2-bit models more practical. I'm considering quipping MoMo now, as otherwise I can only use a brief context window of Qwen-72B on my system, even with bnb double quantization (roughly the setup sketched below).
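For reference, this is roughly the bitsandbytes double-quant fallback I mean, using the standard transformers API. It's just a sketch; the exact model id and device mapping depend on your setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 with double quantization of the quantization constants
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-72B-Chat",          # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-72B-Chat", trust_remote_code=True)
```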
keyfan/Qwen-72B-Chat-2bit
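Loading that checkpoint with vLLM could look something like the sketch below. Whether the stock vLLM build suffices or you need the QuIP-for-all integration, and the exact `quantization` flag value, are assumptions on my part; check the repo's docs:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="keyfan/Qwen-72B-Chat-2bit",
    quantization="quip",       # assumed flag name; see the QuIP-for-all docs
    trust_remote_code=True,    # Qwen checkpoints ship custom modeling code
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Give me a short introduction to QuIP# quantization."], params)
print(outputs[0].outputs[0].text)
```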
Also note the easier-to-use QuIP-for-all library :)
https://github.com/chu-tianxiang/QuIP-for-all