Fine-tuned model generates both user and assistant dialogues during inference

#63
by sabber - opened

Hello there!!
I'm working on fine-tuning a large language model (Mistral-7B-Instruct-v0.2) for conversational tasks using the Hugging Face Transformers library. I processed my conversational dataset by treating each conversation as a list of dialog turns, with separate fields for user utterances and assistant responses. I then formatted my conversational dataset using the setup_chat_format function from the trl library, which prepares the model and tokenizer for chat-style (ChatML) inputs and outputs.

However, during inference, the fine-tuned model generates both user and assistant utterances, instead of just the assistant's response. Additionally, the model tends to generate repeated information. I've tried adjusting the inference settings and exploring different sampling strategies, but the issue persists.
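A common cause of this symptom is that the training examples never end with an end-of-turn/EOS token after the assistant's reply, so the model simply keeps completing the transcript into a new user turn. Until that is fixed in training, the output can be truncated at inference time. A hedged sketch, assuming the ChatML markers that `setup_chat_format` installs:

```python
# Post-process generated text: keep only the assistant's first reply by
# cutting at <|im_end|>, and also at any hallucinated next user turn.
# Assumes ChatML-style markers, as configured by trl's setup_chat_format.

def extract_assistant_reply(generated: str) -> str:
    # Cut at the assistant's end-of-turn token, if the model emitted one.
    reply = generated.split("<|im_end|>")[0]
    # Guard against the model continuing into a new user turn.
    reply = reply.split("<|im_start|>user")[0]
    return reply.strip()

raw = "1+1 equals 2.<|im_end|>\n<|im_start|>user\nWhat is 2+2?"
print(extract_assistant_reply(raw))  # prints "1+1 equals 2."
```

The more robust fix is on the generation side: pass the ID of the end-of-turn token as `eos_token_id` to `generate()` so decoding stops there instead of relying on post-processing.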

Did anybody face this problem? Appreciate your help/feedback/suggestions.

Yeah, I faced it when trying this out (that's why I came here), but partially solved it after reading your question ;)

It seems that the inference interface on the Hugging Face website uses a similar backend approach for this model.

This model behaves more like a completion model to me here.
When I ask "what is 1+1", it just asks that back to me, with no change in phrasing.
But then I asked "USER: what is 1+1 ASSISTANT:", and it completed the answer! I just need some scripting to pull the assistant part out of the text, separated from the user's ;)
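That scripting can be as simple as splitting on the role labels. A minimal sketch, assuming the plain-text "USER: ... ASSISTANT:" prompt format from the example above (not an official API):

```python
# Extract only the assistant's part from a completion that echoes the
# whole "USER: ... ASSISTANT: ..." transcript. Assumes the plain-text
# role labels used in the prompt; stops at any follow-up "USER:" turn.

def assistant_part(completion: str) -> str:
    _, _, after = completion.partition("ASSISTANT:")
    # Drop anything after a hallucinated next user turn.
    return after.split("USER:")[0].strip()

text = "USER: what is 1+1 ASSISTANT: 1+1 is 2. USER: thanks"
print(assistant_part(text))  # prints "1+1 is 2."
```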

It seems this is implemented so that the output can be joined with the user's input to easily build a chat interface.
