System prompts ignored in chat completions

#51
by joshuaturner - opened

From https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/discussions/11 :

As of the most recent upload, the published quants list the chat template as:

{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') %}{{'<|user|>' + '
' + message['content'] + '<|end|>' + '
' + '<|assistant|>' + '
'}}{% elif (message['role'] == 'assistant') %}{{message['content'] + '<|end|>' + '
'}}{% endif %}{% endfor %}

...which has the net result of ignoring any system prompt passed in.
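To make the effect concrete, here is a minimal pure-Python sketch that mirrors the branches of the template quoted above. It is not the Jinja renderer itself; the function name and the `<s>` value for `bos_token` are assumptions made for illustration.

```python
def render_published_template(messages, bos_token="<s>"):
    """Pure-Python mirror of the quoted template (bos_token value is assumed)."""
    out = [bos_token]
    for message in messages:
        if message["role"] == "user":
            # User turn, followed immediately by the assistant opener.
            out.append("<|user|>\n" + message["content"] + "<|end|>\n<|assistant|>\n")
        elif message["role"] == "assistant":
            out.append(message["content"] + "<|end|>\n")
        # Any other role, including "system", matches neither branch
        # and therefore contributes nothing to the prompt.
    return "".join(out)


messages = [
    {"role": "system", "content": "Answer only in French."},
    {"role": "user", "content": "Hello!"},
]
prompt = render_published_template(messages)
print(prompt)
```

The system message never reaches the rendered prompt, because the template has no branch for that role and no error is raised for it.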

The breaking change is commit 300945e90b6f55d3cb88261c8e5333fae696f672.

I also have this problem!

Microsoft org

The model has not been optimized for the system instruction and produces better generations without it.

That’s why we opted to remove any reference to system altogether. Try appending it to your first user prompt; that should work better than a separate system instruction.
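One way to apply this suggestion on the client side is to fold the system text into the first user message before the chat template runs. The sketch below is a hypothetical helper, not part of any library; placing the system text before the user text and joining with a blank line are both assumptions.

```python
def fold_system_into_first_user(messages, separator="\n\n"):
    """Return a copy of messages with system text merged into the first user turn.

    Hypothetical helper: placement (system text first) and separator are assumptions.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    # Copy the remaining messages so the caller's list is not mutated.
    rest = [dict(m) for m in messages if m["role"] != "system"]
    if not system_parts:
        return rest
    system_text = separator.join(system_parts)
    for m in rest:
        if m["role"] == "user":
            m["content"] = system_text + separator + m["content"]
            break
    else:
        # No user turn at all: send the system text as a user turn instead.
        rest.insert(0, {"role": "user", "content": system_text})
    return rest


original = [
    {"role": "system", "content": "Answer only in French."},
    {"role": "user", "content": "Hello!"},
]
folded = fold_system_into_first_user(original)
print(folded)
```

The folded list contains only user and assistant roles, so it can be passed to a template that has no system branch without losing the instruction.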

gugarosa changed discussion status to closed

Perhaps a discussion rather than simply closing the issues is in order.

Why do you feel that ignoring parameters from the user is better than conforming to the API contract? Would revising the template to treat the system prompt as an additional user prompt not achieve the goal you set out in the thread on the GGUF repo?

I second this^

The user has an expectation that system prompts will be used if they are included in a given dataset. I’d prefer an approach like the one outlined above for GGUF, or, if you’re going to break this contract completely, it should be widely publicized on the model card.

Microsoft org
This comment has been hidden
gugarosa changed discussion status to open

@gugarosa Thanks for the follow-up - very eager to hear the report from the MSFT team responsible for finetuning of Phi-3.

(FYI, I believe only repository admins are able to re-open closed Discussions)

@jrc is correct; we don't have the ability to re-open closed discussions.

In my application, I've used "microsoft/Phi-3" as a magic string to change behaviour: I place the system prompt in a <|user|> block before the rest of the conversation. It seems to work acceptably, and could be implemented in the Jinja template by swapping out:

{% if (message['role'] == 'user') %}

with

{% if (message['role'] == 'user' or message['role'] == 'system') %}
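As a rough illustration of what that one-line swap would produce, the following pure-Python sketch mirrors the modified condition. It stands in for the Jinja template rather than rendering it; the function name and the `<s>` value for `bos_token` are assumptions.

```python
def render_with_system_as_user(messages, bos_token="<s>"):
    """Pure-Python mirror of the template with the proposed condition swap."""
    out = [bos_token]
    for message in messages:
        if message["role"] in ("user", "system"):
            # System turns now take the same path as user turns,
            # including the trailing assistant opener.
            out.append("<|user|>\n" + message["content"] + "<|end|>\n<|assistant|>\n")
        elif message["role"] == "assistant":
            out.append(message["content"] + "<|end|>\n")
    return "".join(out)


messages = [
    {"role": "system", "content": "Answer only in French."},
    {"role": "user", "content": "Hello!"},
]
prompt = render_with_system_as_user(messages)
print(prompt)
```

Because the user branch always ends with `<|assistant|>`, the system block is followed by an assistant opener with no content before the real user turn begins. Whether that empty turn affects generation quality is not established in this thread.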


@jrc is correct; we don't have the ability to re-open closed discussions.

Oh god, 100% my bad then, I thought everyone was able to re-open a discussion. Well, now that I know this, I will stop closing them lol

Microsoft org

We are running ablations comparing two options: including the system prompt as an additional <|user|> conversation, and prepending it to the first <|user|> conversation.

Will let you know the results soon!

Following up on this @gugarosa - any results to share?
