DARE method

#1
by sophosympatheia - opened

Please let me know what you think of it. I haven't tried it yet, and I'm curious whether it improves the results.

I'm not Sao, but I'm curious about what this DARE method is. I found https://arxiv.org/pdf/2011.07713.pdf but that's for image models; and I found https://github.com/GoogleCloudPlatform/generative-ai/pull/122/files but that's prompting, not training. What're you talking about, sophosympatheia?

@Heralax Check out https://github.com/yule-buaa/mergelm

EDIT: By the way, it looks like the fine developers behind cg123/mergekit are already integrating this new DARE approach into their wonderful mergekit tool. Hurray!
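
For anyone else wondering what DARE does, here is a minimal sketch of the drop-and-rescale idea, assuming flat lists of floats in place of real model tensors. The names `dare_delta` and `dare_merge` and the parameters `density` and `weight` are illustrative assumptions, not the API of mergelm or mergekit. Roughly: take the delta between the fine-tuned and base weights, randomly drop each element with probability `1 - density`, rescale the survivors by `1 / density`, and add the result back to the base.

```python
import random


def dare_delta(base, finetuned, density, rng):
    """Randomly drop delta entries, then rescale the survivors.

    Hypothetical helper for illustration only. `density` is the
    fraction of delta entries kept (drop rate = 1 - density).
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    sparse = []
    for b, f in zip(base, finetuned):
        delta = f - b
        # Keep each entry with probability `density`; rescaling by
        # 1 / density keeps the expected value of the delta unchanged.
        if rng.random() < density:
            sparse.append(delta / density)
        else:
            sparse.append(0.0)
    return sparse


def dare_merge(base, finetuned, density=0.5, weight=1.0, seed=0):
    """Add the sparsified, rescaled delta back onto the base weights."""
    rng = random.Random(seed)
    sparse = dare_delta(base, finetuned, density, rng)
    return [b + weight * d for b, d in zip(base, sparse)]


if __name__ == "__main__":
    base = [0.0, 1.0, 2.0, 3.0]
    finetuned = [0.5, 1.5, 2.5, 3.5]
    print(dare_merge(base, finetuned, density=0.5))
```

Each output entry is either the untouched base value or the base plus a scaled-up delta, which is why the choice of density matters less than one might expect when the deltas are small and redundant.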

Please let me know what you think of it. I haven't tried it yet, and I'm curious whether it improves the results.

Honestly, it feels like TIES but with less wrangling needed to find proper densities or weights. Nothing new.

Same thing in terms of results.

I tried both mergelm and mergekit and got similar enough results in my minor tests.

@Sao10K Thanks for sharing, Sao. I appreciate the work you're doing.
