Any comparison between the embed methods and adding pos/neg prompts?

#9
by adi-kmt - opened

I noticed that you have a phixtral model built with cheap embed and no positive prompts: https://huggingface.co/mlabonne/phixtral-4x2_8/discussions/6
Do you find that the hidden gate mode combined with positive prompts gives better responses?

Also, do you fine-tune after merging?

Yes, pos/neg prompts are a much better way to initialize the gating weights (phixtral's are random). I didn't fine-tune the merged models because it's quite a tricky process and I haven't been successful with it so far, but in theory this should be done.
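
To make the idea of prompt-based gate initialization concrete, here is a minimal sketch in plain Python. It is an illustration of the general technique, not the code used to build these models. `toy_hidden_state` is a hypothetical stand-in for running a prompt through the base model and pooling a hidden state, and `gate_row`, `init_gate`, and `route` are names made up for this example. Each expert's router row is assumed to be the mean representation of its positive prompts minus the mean of its negative prompts, normalized to unit length.

```python
import hashlib
import math

HIDDEN_DIM = 8  # toy size; a real model has thousands of hidden dimensions


def toy_hidden_state(prompt):
    """Hypothetical stand-in for a base-model forward pass.

    Returns a deterministic pseudo-embedding derived from a hash of the
    prompt, so the example runs without loading any model.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [(b - 127.5) / 127.5 for b in digest[:HIDDEN_DIM]]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def gate_row(positive_prompts, negative_prompts=()):
    """Build one router row: mean(positive) minus mean(negative), unit norm."""
    row = mean_vector([toy_hidden_state(p) for p in positive_prompts])
    if negative_prompts:
        neg = mean_vector([toy_hidden_state(p) for p in negative_prompts])
        row = [a - b for a, b in zip(row, neg)]
    norm = math.sqrt(sum(x * x for x in row)) or 1.0  # guard against zero norm
    return [x / norm for x in row]


def init_gate(experts):
    """Stack one row per expert into a gate matrix (list of rows)."""
    return [
        gate_row(e["positive_prompts"], e.get("negative_prompts", ()))
        for e in experts
    ]


def route(gate, hidden):
    """Return the index of the expert whose row scores highest on `hidden`."""
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in gate]
    return max(range(len(scores)), key=scores.__getitem__)


# Assumed example experts; the prompts are illustrative only.
experts = [
    {"positive_prompts": ["Write a Python function"]},
    {"positive_prompts": ["Solve this equation"],
     "negative_prompts": ["Tell me a story"]},
]
gate = init_gate(experts)
print(route(gate, toy_hidden_state("Write a Python function")))
```

With a random gate there is no such link between an expert's specialty and the inputs routed to it, which is why prompt-based initialization tends to be the better starting point when no fine-tuning follows the merge.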
