Train after merging?

#1
by adi-kmt - opened

Other than adding a positive prompt, is it necessary to fine-tune further after merging into an MoE?

Owner

Fine-tuning can improve the score further if the dataset is good.
https://huggingface.co/cloudyu/Pluto_24B_DPO_200/blob/main/dpo-metrics.jpg

Thanks for sharing. I would like to know whether you did SFT before DPO, and if so, on which dataset? If not, could you tell me which comparison training setup you used during DPO training? Thanks again for sharing.
