When will there be a GGML version?

#3
by CUIGuy - opened

Is it possible to have a GGML version?

There is already one from TheBloke (https://huggingface.co/TheBloke/Llama-2-7B-32K-Instruct-GGML), but unfortunately it only outputs gibberish for me.

What prompt are you using? People say this model uses a different prompt than the original Llama chat prompt. @pbkowalski

@CUIGuy I've tried both the specified variant, [INST]...[\INST], and others, but the output is just symbols regardless.

Got it.

Together org

@pbkowalski For which quantization levels did you observe this?

@mauriceweber I've only tried 2_K, 4_0 and 4_1

The output I get from 4_1:

'[INST]\nWrite a poem about cats\n[\INST]\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n',

I tried different prompts as well and only get long sequences of "\n". Could it be that something breaks in the tokenization of the input?
Can someone with access to the unquantized model verify the token sequence for the following?

m.tokenize("[INST]\nWrite a poem about cats\n[/INST]\n\n".encode('utf8'))
[1, 29961, 25580, 29962, 13, 6113, 263, 26576, 1048, 274, 1446, 13, 29961, 29914, 25580, 29962, 13, 13]
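
One way to run the check asked for above is to tokenize the same prompt with both the GGML build and the unquantized model's tokenizer, then look for the first position where the IDs differ. A minimal sketch in Python; `first_mismatch` is a hypothetical helper (not part of llama.cpp or any library), and the reference IDs are a placeholder that would have to come from someone with the original tokenizer:

```python
from itertools import zip_longest
from typing import Optional, Sequence


def first_mismatch(a: Sequence[int], b: Sequence[int]) -> Optional[int]:
    """Return the first index where two token ID sequences differ, or None if identical."""
    # zip_longest pads the shorter sequence with None, so a length
    # difference is reported at the first missing position.
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return i
    return None


# Token IDs reported above for the GGML model.
ggml_ids = [1, 29961, 25580, 29962, 13, 6113, 263, 26576, 1048, 274, 1446,
            13, 29961, 29914, 25580, 29962, 13, 13]

# Placeholder (assumption): replace with the IDs produced by the
# unquantized model's tokenizer for the same prompt.
reference_ids = list(ggml_ids)

idx = first_mismatch(ggml_ids, reference_ids)
print("identical" if idx is None else f"first difference at position {idx}")
```

If the two sequences match, tokenization is probably not the culprit and the problem is more likely in the quantized weights or the runtime settings.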

In my experience, the Q2 to Q4 quantizations are too small for proper output: even when they generate "useful" text (rather than just newlines), these models hallucinate far too much. The Q8_0 quantization, however, works pretty well. With llama.cpp, 16GB of RAM allows context lengths up to 16k and 24GB of RAM allows lengths up to 32k (tested on a 15" MacBook Air with 24GB of unified RAM).
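
Those RAM figures are roughly what a back-of-the-envelope estimate gives. The sketch below assumes the Llama-2-7B shape (32 layers, embedding width 4096, no grouped-query attention), an fp16 KV cache, about 6.74 billion parameters, and Q8_0 storing each block of 32 weights in 34 bytes. It ignores compute buffers and OS overhead, so treat the totals as a lower bound, not a measurement.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_ctx, n_layers=32, n_embd=4096, bytes_per_elem=2):
    """Size of the key/value cache: K and V each hold n_embd values per layer per token."""
    return 2 * n_layers * n_embd * bytes_per_elem * n_ctx


def q8_0_weight_bytes(n_params=6.74e9):
    """Approximate weight size, assuming 34 bytes per block of 32 weights (assumption)."""
    return n_params * 34 / 32


for n_ctx in (16 * 1024, 32 * 1024):
    kv = kv_cache_bytes(n_ctx)
    total = kv + q8_0_weight_bytes()
    print(f"n_ctx={n_ctx}: KV cache {kv / GIB:.1f} GiB, total about {total / GIB:.1f} GiB")
```

Under these assumptions the KV cache alone comes to 8 GiB at 16k and 16 GiB at 32k, so at long context lengths the cache, not the weights, dominates memory use.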
