Seeking guidance on enhancing the output of a fine-tuned model

#12
by huytungst - opened

I recently fine-tuned the model using the QLoRA method. The fine-tuning run completed without issue, but the responses did not meet my expectations: the generated content was quite odd. I suspect the issue stems from the training dataset, which may not have been prepared adequately for the model to learn effectively. The model I fine-tuned was trained using only the 'text' column from https://huggingface.co/datasets/heegyu/namuwiki-extracted, with no question-answering format. So, to improve the fine-tuned model's performance, I am wondering whether the training data should follow this format:

<|system|>\n{system prompt sample}<|end|>\n<|user|>\n{user prompt sample}<|end|>\n<|assistant|>\n{assistant prompt sample}<|end|>\n
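A minimal sketch of how one training sample could be rendered into the template above. The function name `build_prompt` and the example strings are hypothetical, and it assumes each `\n` in the template stands for a real newline character rather than a literal backslash followed by `n`.

```python
# Hypothetical helper: renders one (system, user, assistant) triple
# into the chat template quoted above.
def build_prompt(system: str, user: str, assistant: str) -> str:
    return (
        f"<|system|>\n{system}<|end|>\n"
        f"<|user|>\n{user}<|end|>\n"
        f"<|assistant|>\n{assistant}<|end|>\n"
    )


if __name__ == "__main__":
    # Placeholder strings, not taken from any real dataset.
    sample = build_prompt(
        system="You are a helpful assistant.",
        user="What is QLoRA?",
        assistant="QLoRA fine-tunes a quantized model with LoRA adapters.",
    )
    print(sample)
```

Each row of a question-answering dataset would be passed through a helper like this before tokenization, so the model sees the same markers during training that it will see at inference time.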

As each training session for my model incurs considerable costs due to the GPU rental fees, I am hesitant to proceed with further training without advice. Therefore, I'm reaching out for opinions from professionals before training again. I greatly appreciate any guidance or recommendations that can help me optimize my process, my data and potentially save on these significant costs.

Hi, did this way of training help improve the results?

Could you maybe give some details regarding your fine tuning script?

@sunsx0810 @LukasSchmidt

Here is my script based on QLoRA method. However, the model’s responses became more obscure after fine-tuning. I hope it helps.
https://colab.research.google.com/drive/1gz6w2W8OLXpZeAew36RUFr2cmn3Z12vT?usp=sharing

Does anyone have a suggestion for an effective way to pre-process the fine-tuning data to enhance the model’s performance in new foreign languages?

The access is currently restricted, so we can't access it

could this be because you are missing the special tokens {"additional_special_tokens": ["<|system|>", "<|assistant|>", "<|user|>", "<|end|>"]}?
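A rough sketch of what registering those tokens might look like, assuming a Hugging Face `transformers` tokenizer and model are passed in. `add_special_tokens` and `resize_token_embeddings` are existing `transformers` methods; the helper name `register_chat_tokens` is hypothetical, and whether the tokens are actually missing depends on the checkpoint being used.

```python
# Markers from the suggestion above.
CHAT_TOKENS = ["<|system|>", "<|user|>", "<|assistant|>", "<|end|>"]


# Hypothetical helper: expects a transformers tokenizer and model
# (for example from AutoTokenizer / AutoModelForCausalLM).
def register_chat_tokens(tokenizer, model) -> int:
    # add_special_tokens returns how many tokens were new to the vocabulary.
    num_added = tokenizer.add_special_tokens(
        {"additional_special_tokens": CHAT_TOKENS}
    )
    if num_added > 0:
        # Grow the embedding matrix so the new token ids have rows.
        model.resize_token_embeddings(len(tokenizer))
    return num_added
```

If this returns 0, the tokenizer already knew the markers and the problem likely lies elsewhere. If it returns a positive number, the new embedding rows start out untrained, so they would probably need to be trainable during fine-tuning rather than frozen behind the adapter.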

@huytungst were you able to find a way to solve this issue?
I would like to learn how to fine-tune StarChat Beta on my own coding scenarios.
