Did the LoRA fine-tuned model end up performing the same as full fine-tuning?

#30 opened by timlim123

Hi there,

There are two beta models that were published:

  1. Full finetuning: https://huggingface.co/alignment-handbook/zephyr-7b-dpo-full
  2. LoRA: https://huggingface.co/alignment-handbook/zephyr-7b-dpo-lora

In the paper, it was mentioned:

"""
All models are trained with the AdamW optimizer and no weight decay. We did not experiment with parameter-efficient techniques such as LoRA (Hu et al., 2021), but expect similar results to hold with these methods.
"""

Any conclusion regarding this now?

The training cost of LoRA and full fine-tuning differs by quite a bit. It seems that the LoRA model achieves a lower loss on the evaluation set. I believe the published model that everyone is using is based on the full SFT run?
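For context on where the cost gap comes from, here is a minimal sketch of attaching LoRA adapters with `peft` to the Mistral-7B base model that Zephyr starts from. The hyperparameters (rank, alpha, target modules) are illustrative assumptions, not necessarily what the alignment-handbook recipe uses; the point is just that only the small adapter weights are trainable, versus all of the roughly 7B parameters in full fine-tuning.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base model that the Zephyr models are fine-tuned from.
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Illustrative LoRA settings (rank/alpha/targets are assumptions,
# not necessarily the alignment-handbook recipe).
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, lora_config)

# Prints the trainable parameter count: on the order of tens of millions
# out of roughly 7B total, which is where the large training-cost
# difference between LoRA and full fine-tuning comes from.
peft_model.print_trainable_parameters()
```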

Following this as well. I did see that the preference accuracy is significantly lower too.
