Report
I used the Q6_K version.
When I asked: "Tell me how to write sympy code to ... fitted to some data. The data might be randomly generated, ... using numpy or sympy."
Answer: blah blah ... here's a link to sympy ...
Ask 2: "I meant without referring to a link; show me your Python code."
Answer 2: "Sorry, I don't have source code ..." blah blah ...
Ask 3: "Why can't you show me code even though llama 8b can generate sympy code?"
Answer 3: "Sorry, I cannot generate code ..." blah blah ...
I think the model is somehow corrupted.
First, it seems you are using this like a chat model, which it isn't; using a base model as if it were a chat model is likely to give disappointing results. Second, since this is a llama-3 model, make sure your inference engine fully supports it (e.g. a llama.cpp or kobold.cpp build no older than a day or so; most other engines do not yet support llama 3). Also, you need to use the correct prompt template for llama 3. And since this model was quantized a while ago, you need to manually set the pretokenizer to llama3.
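For reference, the llama 3 prompt template mentioned above can be sketched as follows. This is a minimal illustration based on the published Llama 3 special tokens (`<|begin_of_text|>`, `<|start_header_id|>`, `<|eot_id|>`), not something specific to this repo; your inference engine may apply it for you if configured correctly.

```python
def format_llama3(messages):
    """Build a Llama 3-style prompt string from a list of
    {'role': ..., 'content': ...} dicts (sketch, verify against
    your engine's template handling)."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        # Each turn is wrapped in header tokens and terminated with <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open the assistant header so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Show me sympy code to fit a curve."},
])
print(prompt)
```

Note that this only helps for instruction-tuned llama 3 variants; a base model has no chat template at all, which is the first point above.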
If it's still bad after that, then it might be a model issue :)