I can't get any logical completion out of this

#1
by andreariboni - opened

I am using llama.cpp (via LangChain) and trying to prompt this LLM:

from langchain_community.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
self.llm = LlamaCpp(
    model_path="./LLMs/Aether/Cerebrum-1.0-8x7b-Q6_K.gguf",
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_batch=16,
    f16_kv=True,
    n_ctx=32000,
    temperature=0.0,
    callback_manager=callback_manager,
    verbose=True,
)

However, none of the generated completions is logical or meaningful at all. Most of the time, the model just repeats itself.

input: "<s> You are an ai assistant\nUser: I have 3 apple and i eat one. How many apples do i have now?\nAI: "
output: "102456789"

input: "<s> You are an ai assistant\nUser: Create a json describing a cat\nAI: "
output: "1.json\n{\n \"name\": \"John\",\n\"age\":30,\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n<<<<<<<brain'LBL_\n<<<<<<<a\nLBL_<<<<<<<div>\n\n\n\n\n\n\n<<<<<<<div>"
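One thing worth double-checking alongside the file itself is the prompt template: the bare "<s> ...\nUser: ...\nAI: " format above is not the instruction template most Mixtral-derived instruct models are trained on, and a template mismatch alone can cause repetitive or off-topic output. As a hedged sketch (the [INST] format is an assumption here; the Cerebrum model card is the authority on its actual template), the prompt could be built like this:

```python
# Hedged sketch: builds a Mistral/Mixtral-style [INST] prompt.
# The template is an assumption; verify it against the model card.
def build_prompt(system: str, user: str) -> str:
    # llama.cpp prepends the <s> BOS token itself, so it is omitted here.
    return f"[INST] {system}\n\n{user} [/INST]"

prompt = build_prompt(
    "You are an AI assistant.",
    "I have 3 apples and I eat one. How many apples do I have now?",
)
print(prompt)
```

Passing a string in this shape to `self.llm.invoke(prompt)` would then exercise the template the model expects, assuming the format above is correct.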

Interesting; even the 3.0 bpw exl2 model was giving good answers. Let me try redownloading the GGUF to check.
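Before re-testing, it's worth ruling out a corrupted download by comparing the file's hash against the one shown in the repo's file listing. A minimal sketch, assuming the expected SHA-256 is copied from the model page (the commented-out path and hash below are placeholders, not real values):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    # Stream the file in 1 MiB chunks so a multi-GB GGUF fits in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Placeholders -- substitute the real file path and the hash from the repo:
# expected = "..."
# assert sha256_of("./LLMs/Aether/Cerebrum-1.0-8x7b-Q6_K.gguf") == expected
```

If the hashes match, the download is intact and the problem lies elsewhere (quantization or inference), which narrows things down considerably.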

I also checked Q5_K_M and I get nonsensical responses with it too.

Yup, I'm getting the same as you. That's very interesting; I can't imagine why the exl2 would work but the GGUF doesn't. I'll have to investigate, sorry about that!

Nice to know that it's not my setup, haha. Thanks!
