llama2-chat-70b-w4 weights

by sandrobovelli - opened

Hi. I would like to know when the 4-bit quantized 70B weights will be released.
I'm trying to quantize the original HF checkpoint, but the script keeps running out of memory (my system RAM is around 96GB).
What is the approximate memory requirement to quantize the 70B model?

Add more swap and run again. But still, having the weights here would be nice.
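
For a rough sense of scale, here is a back-of-the-envelope sketch of the weight sizes involved. It assumes a nominal 70e9 parameters and 16-bit weights in the original checkpoint, and it counts only the raw weights: any working memory the quantization script needs comes on top, and whether the script loads the whole model at once is not something this estimate knows. The function and variable names are made up for illustration.

```python
def weight_footprint_gb(num_params: float, bits_per_param: float) -> float:
    """Raw size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: nominal "70B" parameter count; the exact figure differs slightly.
NUM_PARAMS = 70e9
# From the question above: roughly 96GB of system RAM.
SYSTEM_RAM_GB = 96

fp16_gb = weight_footprint_gb(NUM_PARAMS, 16)  # 16-bit source checkpoint
w4_gb = weight_footprint_gb(NUM_PARAMS, 4)     # 4-bit result, ignoring scales/zero points

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{w4_gb:.0f} GB")
print(f"16-bit weights fit in {SYSTEM_RAM_GB} GB RAM: {fp16_gb <= SYSTEM_RAM_GB}")
```

If the script holds the full 16-bit model in memory, the weights alone would exceed 96GB, which is consistent with needing swap unless it can process the model piece by piece.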

unsubscribe changed discussion status to closed
