What is the prompt format?

#1 opened by siddhesh22

Is it ChatML, or something else?

That's a great question. From the original model page I would have guessed ChatML (it uses the im_end token), but from their hosted demo it looks like Llama 2 or Mistral format (it uses [INST]).

From the original repo's tokenizer_config.json:

...
  "chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
...
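That template is plain ChatML. As a sanity check, here is a minimal sketch that renders it with transformers' apply_chat_template; the repo id HuggingFaceH4/starchat2-15b-v0.1 is my assumption for the original model this GGUF was converted from:

    from transformers import AutoTokenizer

    # Assumed original repo id; adjust if your GGUF came from elsewhere.
    tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/starchat2-15b-v0.1")

    messages = [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write hello world in Python."},
    ]

    # tokenize=False returns the rendered prompt string instead of token ids.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    print(prompt)
    # <|im_start|>system
    # You are a helpful coding assistant.<|im_end|>
    # <|im_start|>user
    # Write hello world in Python.<|im_end|>
    # <|im_start|>assistant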

Have you guys gotten the model to run?

Sure. I use it with ollama. See here https://ollama.com/sskostyaev/starchat2-15b/tags
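If you'd rather call it from Python than the CLI, here is a minimal sketch using the ollama Python client (pip install ollama); it assumes a local ollama server and that you pulled the model first with ollama pull sskostyaev/starchat2-15b:

    import ollama

    # Talks to the local ollama server; the model tag must already be
    # pulled (ollama pull sskostyaev/starchat2-15b).
    response = ollama.chat(
        model="sskostyaev/starchat2-15b",
        messages=[{"role": "user", "content": "Write a quicksort in Python."}],
    )
    print(response["message"]["content"])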

@NeoDim Yeah, that is definitely ChatML, but it's odd then that their demo has the [INST] tag. Have you found that performance with ChatML works well?

Also, is it possible to submit my own GGUFs to ollama?

Yes and yes.

To submit your own models to ollama, you need to register on the ollama hub (ollama.ai), configure ollama to use your key for pushes, and create a model named with your username as a prefix; then you can push the model to the ollama hub.
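A minimal sketch of that workflow, driven from Python for consistency with the other snippets here; youruser and the Modelfile path are placeholders, and it assumes you already added the public key from ~/.ollama/id_ed25519.pub to your ollama account:

    import subprocess

    # The username prefix is required for pushing to the ollama hub.
    model_name = "youruser/starchat2-15b"

    # Build a local model from a Modelfile that points at your GGUF,
    # then push it to the hub under your account.
    subprocess.run(["ollama", "create", model_name, "-f", "Modelfile"], check=True)
    subprocess.run(["ollama", "push", model_name], check=True)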

Hi,

I currently use llama-cpp-python for codellama and mistral, and this is my demo code for the prompt format. I want to know how to use the starcoder model the same way. What is the prompt format and stop token?

Reference snippet:

        input_prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n "
    else:
        input_prompt = f"[INST] "
    for interaction in history:
        input_prompt = input_prompt + str(interaction[0]) + " [/INST] " + str(interaction[1]) + " </s><s> [INST] "

    input_prompt = input_prompt + str(message) + " [/INST] "

Output snippet below, with the stop tokens I currently use:

    output = llm(
        input_prompt,
        temperature=Env.TEMPERATURE,
        top_p=Env.TOP_P,
        top_k=Env.TOP_K,
        repeat_penalty=Env.REPEAT_PENALTY,
        max_tokens=max_tokens_input,
        stop=[
            "<|prompter|>",
            "<|endoftext|>",
            "<|endoftext|> \n",
            "ASSISTANT:",
            "USER:",
            "SYSTEM:",
        ],
        stream=True,
    )

@madhucharan for me this works fine:

{
    "stop": [
        "<|im_start|>",
        "<|im_end|>"
    ]
}

Hi @NeoDim

Thanks for the response. But I also want to know the input prompt format. This is my new format snippet; please let me know if it is correct.

    if use_system_prompt:
        input_prompt = f"<|im_start|> system\n{system_prompt} <|im_end|>\n"
    else:
        input_prompt = f"<|im_start|>"

    input_prompt = f"{input_prompt}user\n{str(message)}<|im_end|>\n<|im_start|>assistant\n"

    output = llm(
        input_prompt,
        temperature=Env.TEMPERATURE,
        top_p=Env.TOP_P,
        top_k=Env.TOP_K,
        repeat_penalty=Env.REPEAT_PENALTY,
        max_tokens=max_tokens_input,
        stop=[
        "<|im_start|>",
        "<|im_end|>"
        ],
        stream=True,
    )

@madhucharan See this comment: https://huggingface.co/bartowski/starchat2-15b-v0.1-GGUF/discussions/1#65fb102fbb78d93852b6a3ba
I don't use [INST] tags or <s> tags, and I don't insert a space between <|im_start|> and the role (like the space before system in your message). See the template here: https://ollama.com/sskostyaev/starchat2-15b

@NeoDim I have followed your ollama template and added \n wherever your template had a line break. In the template there is a line break right after system, so now I'm confused about whether I have to remove the \n after system. There is no \n between <|im_start|> and system, right? It comes after system?

I removed the redundant lines where [INST] was; I forgot to remove them from the comment above before posting.

    if use_system_prompt:
        input_prompt = f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
    else:
        input_prompt = f"<|im_start|>"

    input_prompt = f"{input_prompt}user\n{str(message)}<|im_end|>\n<|im_start|>assistant\n"

@madhucharan you don't need to remove the \n after system.

There is no \n between <|im_start|> and system, right? It comes after system?

Right.

Now your template looks right to me.
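Since the earlier llama2-style snippet also handled conversation history, here is a minimal sketch extending the confirmed ChatML template to multi-turn chats; it assumes the same llama-cpp-python setup as above, with history being a list of (user, assistant) pairs:

    def build_prompt(message, history, system_prompt=None):
        # Optional system turn, then alternating user/assistant turns.
        prompt = f"<|im_start|>system\n{system_prompt}<|im_end|>\n" if system_prompt else ""
        for user_turn, assistant_turn in history:
            prompt += f"<|im_start|>user\n{user_turn}<|im_end|>\n"
            prompt += f"<|im_start|>assistant\n{assistant_turn}<|im_end|>\n"
        # Leave the assistant turn open so generation continues from here;
        # <|im_start|> and <|im_end|> stay in the stop list as above.
        return prompt + f"<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n"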

Thanks a lot for your time and support. I was a bit confused, and now it's cleared up. I will test this and let you know.
