Porting model to Ollama

#7 opened by mars-ars

Hi guys, I'm trying to run the leo-hessianai-7B model on Ollama. I'm using the Q4_K_M GGUF file from https://huggingface.co/TheBloke/leo-hessianai-7B-GGUF/tree/main and following Ollama's import instructions (https://github.com/ollama/ollama/blob/main/docs/import.md). I can already generate answers with the model, but they are completely wrong and full of hallucinations (you could say crazy). Unfortunately, I don't know what I'm doing wrong; I assume the parameters or the template in the Modelfile you have to create for Ollama are incorrect.
Hope you can help me out πŸ™‚
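
For context, after writing each Modelfile I built and ran the model with the standard commands from the import guide (the local model name is just what I picked):

# create the local Ollama model from the Modelfile in the current directory
ollama create leo-hessianai-7b -f Modelfile

# then chat with it
ollama run leo-hessianai-7b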

I tried the following two Modelfiles.

First attempt, with a ChatML-style template:

FROM ./leo-hessianai-7b.Q4_K_M.gguf
TEMPLATE """{{- if .System }}
<|im_start|>system {{ .System }}<|im_end|>
{{- end }}
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

SYSTEM """"""

PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"

Second attempt, with a plain Llama-2-style instruct template:

FROM ./leo-hessianai-7b.Q4_K_M.gguf
TEMPLATE "[INST] {{ .Prompt }} [/INST]"

(The same problem occurred when I used the safetensors from this repo and converted and quantized the model with the Ollama tools.)
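
From memory, that conversion path was roughly this, per the import guide (the path and local model name are placeholders; the quantization type matches the GGUF I tried):

# Modelfile pointing at the downloaded safetensors directory
FROM /path/to/leo-hessianai-7B

# build and quantize in one step
ollama create --quantize q4_K_M leo-hessianai-7b -f Modelfile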
