Fine-tuned model generates both user and assistant dialogues during inference

#63
by sabber - opened

Hello there!!
I'm working on fine-tuning a large language model (Mistral-7B-Instruct-v0.2) for conversational tasks using the Hugging Face Transformers library. I processed my conversational dataset by treating each conversation as a list of dialog turns, with separate fields for user utterances and assistant responses. I then formatted my conversational dataset using the setup_chat_format function from the trl library, which prepares the model and tokenizer for chat-style (ChatML) inputs and outputs.

However, during inference, the fine-tuned model generates both user and assistant utterances, instead of just the assistant's response. Additionally, the model tends to generate repeated information. I've tried adjusting the inference settings and exploring different sampling strategies, but the issue persists.
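A common cause of this symptom is that the training examples never end with an end-of-turn/EOS token after the assistant's reply, so the model simply keeps completing the transcript into a new user turn. Until that is fixed in training, the output can be truncated at inference time. A hedged sketch, assuming the ChatML markers that `setup_chat_format` installs:

```python
# Post-process generated text: keep only the assistant's first reply by
# cutting at <|im_end|>, and also at any hallucinated next user turn.
# Assumes ChatML-style markers, as configured by trl's setup_chat_format.

def extract_assistant_reply(generated: str) -> str:
    # Cut at the assistant's end-of-turn token, if the model emitted one.
    reply = generated.split("<|im_end|>")[0]
    # Guard against the model continuing into a new user turn.
    reply = reply.split("<|im_start|>user")[0]
    return reply.strip()

raw = "1+1 equals 2.<|im_end|>\n<|im_start|>user\nWhat is 2+2?"
print(extract_assistant_reply(raw))  # prints "1+1 equals 2."
```

The more robust fix is on the generation side: pass the ID of the end-of-turn token as `eos_token_id` to `generate()` so decoding stops there instead of relying on post-processing.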

Did anybody face this problem? Appreciate your help/feedback/suggestions.

Yeah, I faced it when trying this out (that's why I came here), but partially solved it after reading your question ;)

It seems that the inference interface on the Hugging Face website uses a similar backend approach for this model.

This model behaves more like a completion model to me here.
When I ask "what is 1+1", it just asks that back to me, with no change in phrasing.
But then I asked "USER: what is 1+1 ASSISTANT:", and it completed the answer! I just need some scripting to pull the assistant part out of the text, separated from the user's ;)
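That scripting can be as simple as splitting on the role labels. A minimal sketch, assuming the plain-text "USER: ... ASSISTANT:" prompt format from the example above (not an official API):

```python
# Extract only the assistant's part from a completion that echoes the
# whole "USER: ... ASSISTANT: ..." transcript. Assumes the plain-text
# role labels used in the prompt; stops at any follow-up "USER:" turn.

def assistant_part(completion: str) -> str:
    _, _, after = completion.partition("ASSISTANT:")
    # Drop anything after a hallucinated next user turn.
    return after.split("USER:")[0].strip()

text = "USER: what is 1+1 ASSISTANT: 1+1 is 2. USER: thanks"
print(assistant_part(text))  # prints "1+1 is 2."
```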

It seems this is implemented so that the output can be joined with the user's input to easily build a chat interface.
