Is fine-tuning needed after self-merging?
#7 opened by oodgnas
Thanks @oodgnas! This model hasn't been fine-tuned, but fine-tuning it would probably make it better (see https://arxiv.org/abs/2312.15166). It looks like small source models really require fine-tuning after merging, while big models can do without it, although they're kind of insane.