MetaIX/GPT4-X-Alpasta-30b-4bit · Please reconvert to new GGML format

May 16, 2023

llama.cpp now includes GPU offloading support, but it requires the model file to be in the new GGML file format.

MetaIX

Owner May 18, 2023

Updating today

b-t

May 19, 2023

Can't wait!

vdruts

Jun 21, 2023

Seconding this. Please convert to GGMLv3 with the new k-quants.

Jun 23, 2023

I tried to do k-quants for this model myself the other day because I was asked to, but it's not currently possible.

There's currently an issue that prevents making k-quants for certain models: those with tensors whose dimensions aren't divisible by 256.

That affects two types of Llama models:

- Ones that had a vocab size of 32001 instead of 32000 (because of the addition of a PAD token, which I think was an early hack that got copied even where it's not needed)
- Models based on OpenAssistant, which have a vocab of 32016 tokens.

This model is an example of the latter, so it won't be possible to make k-quants until this is resolved: https://github.com/ggerganov/llama.cpp/issues/1919#issuecomment-1599484900
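
To make the divisibility point concrete, here is a rough Python sketch of the arithmetic. It assumes the k-quant super-block size is 256 (QK_K in llama.cpp) and only looks at the vocab dimension. The function name and the labels are made up for illustration, and this is not the actual check llama.cpp performs.

```python
# Assumed k-quant super-block size (QK_K in llama.cpp at the time of this thread).
QK_K = 256


def k_quant_compatible(dim: int, block: int = QK_K) -> bool:
    """Return True if a tensor dimension splits evenly into k-quant super-blocks."""
    return dim % block == 0


# Vocab sizes mentioned in this thread; the labels are illustrative only.
vocab_sizes = {
    "stock Llama": 32000,
    "Llama + PAD token": 32001,
    "OpenAssistant-based": 32016,
}

for label, size in vocab_sizes.items():
    status = "ok" if k_quant_compatible(size) else "not divisible"
    print(f"{label}: {size} % {QK_K} = {size % QK_K} ({status})")
```

Only the stock 32000 vocab divides evenly (32000 = 125 * 256). The other two leave a remainder of 1 and 16 respectively, which is why they trip the check.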