prompt format and text-gen

#1
by gandolfi - opened

Hello,
how can I configure this prompt format in text-generation-webui?

{system_prompt}
Human: {prompt}
Assistant: <|EOT|>
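For reference, older text-generation-webui releases defined instruction templates as YAML files under `instruction-templates/`. A hedged sketch of what that file might look like for the format above (the `user`/`bot`/`turn_template`/`context` keys follow the legacy template schema; newer releases use a Jinja2 chat template pasted into the Instruction template box instead):

```yaml
# Hypothetical legacy instruction template for this prompt format.
# <|user-message|> / <|bot-message|> are the placeholders the legacy
# schema substitutes with the actual turn contents.
user: "Human:"
bot: "Assistant:"
context: ""   # put the literal system prompt text here, if any
turn_template: "<|user|> <|user-message|>\n<|bot|> <|bot-message|><|EOT|>\n"
```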

thanks

It should work, yes; it was extracted from the model's chat template.

Try instruct instead of chat-instruct. I personally dislike chat-instruct because it wraps everything in "the following is a chat between a user and a bot" or something similar.

Your rope frequency also looks off; try setting it to 4, for alpha_value I think (it may be the compress one). I've found that even when not pushing context, some models NEED the rope frequency set or they become incoherent.
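The two knobs mentioned above do different things. A minimal sketch of the distinction, assuming the usual NTK-aware formula for alpha (base multiplied by `alpha^(d/(d-2))`) and plain linear position interpolation for the compress option; the function names here are illustrative, not TGW API:

```python
def rope_base_ntk(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    """NTK-aware scaling (alpha_value style): raise the rotary base frequency
    so high-dimension rotations slow down and longer contexts stay coherent."""
    return base * alpha ** (head_dim / (head_dim - 2))

def scaled_position(pos: int, compress: float) -> float:
    """Linear interpolation (compress_pos_emb style): shrink position indices
    so a long context maps back into the trained position range."""
    return pos / compress

# alpha leaves positions alone but changes the base; compress does the opposite.
# e.g. scaled_position(8192, 4) maps position 8192 back to 2048.
```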

I have tried with instruct only and alpha_value set to 4, but it's still incoherent. I will try with Ollama.

AutoCoder models are based on DeepSeek-Coder, which needs compress_pos_emb = 4 to extend context from 4k to 16k.
TGW seems to properly read GGUF rope scaling metadata from old DeepSeek-Coder GGUFs (e.g. TheBloke's) but doesn't recognize it in newer ones.
Most likely the GGUF metadata format changed since then and the change wasn't reflected in TGW's code: it only recognizes the old "rope.scale_linear" entry, but newer GGUFs define scaling in two entries that it doesn't check, "rope.scaling.factor" and "rope.scaling.type".
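The fallback logic described above can be sketched as follows. This is a hypothetical helper, not TGW code; note that real GGUF keys are architecture-prefixed (e.g. `llama.rope.scaling.factor`), which is an assumption about how the entries appear in the metadata dict:

```python
from typing import Optional

def rope_linear_factor(metadata: dict, arch: str = "llama") -> Optional[float]:
    """Return the linear rope scaling factor, checking the old single-key
    form first, then the newer two-key (type + factor) form."""
    old = metadata.get(f"{arch}.rope.scale_linear")
    if old is not None:
        return float(old)
    if metadata.get(f"{arch}.rope.scaling.type") == "linear":
        factor = metadata.get(f"{arch}.rope.scaling.factor")
        if factor is not None:
            return float(factor)
    return None
```

Checking both forms would let a loader handle old and new GGUF files alike.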

Imho, since the model was trained with linear scaling, compress_pos_emb will most likely work better here than alpha_value.

Thanks @cgus, I couldn't remember which was correct, and your comment seems likely to be true for the rest as well!

Thanks. It works now with compress_pos_emb = 4 and an update of text-gen.

Awesome glad to hear it!!
