Will there be a quantized GGUF for the instruct model as well?

by rsolva - opened Apr 12

Apr 12

Having briefly tested the warm model, I can see that it is a bit all over the place (but very funny!). It would be great if the instruct model were released as GGUF as well.

davda54

Norwegian Large Language Models org Apr 12

I believe Lucas is also working on that :) We found the default hyperparameters of llama-cpp a bit strange: the model behaved very chaotically unless we turned off the repetition penalty (set it to 1.0) and lowered the temperature. I'm sure more can be done with these (and other) hyperparameters; they influence the outputs more than I'd like.
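To illustrate why those two settings matter, here is a minimal sketch in plain Python. It assumes the common CTRL-style penalty rule (divide positive logits, multiply negative ones) and ordinary temperature scaling before the softmax; the logits, the token indices, and the comparison value 1.1 are made-up toy values, not measurements from this model or guaranteed llama-cpp behaviour.

```python
import math

def apply_repetition_penalty(logits, recent_tokens, penalty):
    """Push down the logits of tokens that already appeared (assumed CTRL-style rule)."""
    out = list(logits)
    for tok in set(recent_tokens):
        # Positive logits are divided and negative ones multiplied,
        # so a penalty above 1.0 always lowers the score.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, -1.0]  # toy values, not from any real model
recent = [0, 3]                 # hypothetical indices of recently generated tokens

unchanged = apply_repetition_penalty(logits, recent, 1.0)  # penalty 1.0 is a no-op
penalised = apply_repetition_penalty(logits, recent, 1.1)  # illustrative value only
probs_hot = softmax_with_temperature(logits, 1.0)
probs_cool = softmax_with_temperature(logits, 0.3)

print("penalty 1.0:", unchanged)
print("penalty 1.1:", penalised)
print("top-token probability at T=1.0:", round(probs_hot[0], 3))
print("top-token probability at T=0.3:", round(probs_cool[0], 3))
```

With a penalty of 1.0 the logits come back untouched, and a lower temperature concentrates probability on the highest-scoring token, which is one plausible reason the outputs become less chaotic.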

fuzzbin

Apr 14

First: Thank you very much for providing GGUF files for the instruct model. It made life a little easier for us amateurs. 🤩

I am a total beginner and am experimenting a little with the instruct model on Ollama. Does anyone have some tips for parameter settings that work well?

Currently my Ollama Modelfile looks like this:

TEMPLATE """
{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>{{ end }}
{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
SYSTEM """Du er en vennlig assistent som skal svare på spørsmål? Svar kort og konkret på spørsmål"""
PARAMETER num_ctx 4096
PARAMETER stop "<|im_end|>"
PARAMETER temperature 1

The responses are OK-ish, but I wonder if the settings in the Modelfile can be improved.

rsolva

Apr 14

Here is my Modelfile using the suggestions from @davda54 and the README. I made the following changes to the Modelfile:

added a space before system, user and assistant (as suggested in the README)
set the repeat penalty to 1.0
set the temperature to 0.3
added extra stop parameters

TEMPLATE """
{{ if .System }}<|im_start|> system
{{ .System }}<|im_end|>{{ end }}
{{ if .Prompt }}<|im_start|> user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|> assistant
"""
SYSTEM """Du er Kari Nordmann, en jovial og hjelpsom assistent som svarer kort og konsist på spørsmål."""
PARAMETER num_ctx 4096
PARAMETER temperature 0.3
PARAMETER repeat_penalty 1.0
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
PARAMETER stop " user"
PARAMETER stop " assistant"

It behaves quite well, but it is still a bit too verbose for my taste, so I will experiment further with different system messages.
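For anyone who wants to try different values without rebuilding the model each time, the same parameters can, as far as I know, also be sent per request through Ollama's local REST API. Below is a rough sketch using only the Python standard library. The model name `normistral-instruct` is a hypothetical placeholder for whatever name was used with `ollama create`, and the URL assumes Ollama is listening on its default local port.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address
MODEL_NAME = "normistral-instruct"  # hypothetical; use the name given to `ollama create`

def build_request(prompt, temperature=0.3, repeat_penalty=1.0):
    """Build a request body that mirrors the PARAMETER lines of the Modelfile above."""
    return {
        "model": MODEL_NAME,
        "prompt": prompt,
        "stream": False,  # ask for one JSON reply instead of a token stream
        "options": {
            "num_ctx": 4096,
            "temperature": temperature,
            "repeat_penalty": repeat_penalty,
            "stop": ["<|im_end|>", "<|im_start|>", " user", " assistant"],
        },
    }

def generate(prompt, **overrides):
    """Send the prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_request(prompt, **overrides)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("Hva er hovedstaden i Norge?", temperature=0.1)` would then override the temperature for that one request only, which makes it easier to compare settings side by side.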

davda54

Norwegian Large Language Models org Apr 15

Thanks for the response, @rsolva! Verbosity is definitely a problem. We tried to augment the open instruction datasets with step-by-step reasoning, detailed descriptions, etc., and apparently we overdid it :) It's on our list of things to improve in the next release. By the way, if you notice any other recurring unwanted patterns in the outputs, please let us know!

davda54 changed discussion status to closed Apr 18
