PEFT-based fine-tuned model hallucinates values from the fine-tuning training data during inference.

#111
by Pradeep1995 - opened

I have fine-tuned the Mistral model on a set of training data using the PEFT method (instruction fine-tuning).
After fine-tuning, during inference, the model hallucinates values drawn from the fine-tuning dataset. How can I solve this issue?

What learning rate and number of epochs did you use? What is the size of your dataset? This sounds like model overfitting to me.

These are the parameters:

lora_r = 8
lora_alpha = 24
lora_dropout = 0.2
size of the dataset = 507 rows
learning_rate = 2e-4
max_steps=5700
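
For reference, these settings map onto a standard peft + transformers setup roughly as follows. This is a minimal sketch: the base-model id, output_dir, and task_type are my assumptions, not values from the thread.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Base checkpoint is an assumption; substitute the actual model id.
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# LoRA adapter settings as reported above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=24,
    lora_dropout=0.2,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Training settings as reported above; everything else left at defaults.
training_args = TrainingArguments(
    output_dir="mistral-lora-out",  # placeholder
    learning_rate=2e-4,
    max_steps=5700,
)
```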

@cekal

@Pradeep1995 were you able to find a fix for your problem? I am having the same issue.

Not yet, @aditico.

If you're training for 5700 steps on only 507 examples, then you are training for 5700/507 ≈ 11.24 epochs, which is massively overfitting your data. SFT should usually run for only 1-3 epochs.
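
To make the arithmetic explicit (the effective-batch-size assumption is mine; the thread doesn't report it):

```python
# Epochs implied by the reported settings, assuming an effective
# batch size of 1 (one example per optimizer step). A larger
# effective batch size would push the epoch count even higher.
max_steps = 5700
dataset_size = 507
print(max_steps / dataset_size)  # -> ~11.24 epochs
```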

@Rhiz0morph
But my loss does not decrease for the first 3-4 epochs. I only get a reasonable loss (~0.04 to ~0.02) after the 9th or 10th epoch.
So how can I decrease the epoch count in PEFT?
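
The epoch count isn't a PEFT setting at all; it comes from the training arguments. A minimal sketch assuming the standard transformers Trainer (output_dir is a placeholder):

```python
from transformers import TrainingArguments

# Cap training by epochs rather than a fixed step count. Leave
# max_steps at its default (-1) so it does not override
# num_train_epochs.
training_args = TrainingArguments(
    output_dir="mistral-lora-out",  # placeholder
    learning_rate=2e-4,
    num_train_epochs=3,
)
```

With 507 rows, 3 epochs corresponds to roughly 1,500 optimizer steps at an effective batch size of 1.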

I fine-tuned Mistral 7B on only 21 samples for 256 epochs and it's working fine.
But admittedly its task is pretty straightforward.

