Reddit Comments on Model Weights

#5
by acastanza - opened

Any comment on this post from the /r/LocalLLaMA subreddit claiming that the model weights you've produced are identical to those of another model (Weyaxi/Seraph-7B)?

I find this concerning because the Reddit post was written by Weyaxi, the producer of one of the models you say you used in your merge (Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp). The model that apparently has identical weights (Weyaxi/Seraph-7B) was produced earlier by that same individual, yet was coincidentally omitted from all the testing you've described in your blog post.

No one is necessarily accusing anyone of impropriety, but sha256 hashes of your model weights that are identical to those of a model predating yours by a full week would at least seem to deserve some explanation.
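For reference, the hash comparison itself is easy to reproduce. A minimal sketch, assuming both repos have been downloaded locally and store their weights as *.safetensors shards (the directory names here are hypothetical):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through sha256 so multi-GB weight shards never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical local snapshots of the two repos being compared.
model_a = Path("./mistral-ft-optimized-1218")
model_b = Path("./Seraph-7B")

for shard in sorted(model_a.glob("*.safetensors")):
    twin = model_b / shard.name
    if twin.exists() and sha256_of(shard) == sha256_of(twin):
        print(f"{shard.name}: identical")
    else:
        print(f"{shard.name}: differs (or missing)")
```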

Note that I haven't personally validated the findings here, but I did think the Reddit post merited discussion, especially considering you're attempting to monetize the model.

Edit: forgot to add the link to the Reddit post - https://www.reddit.com/r/LocalLLaMA/comments/18rhv8r/mistralftoptimized1218s_weights_were_already_on/

Hey @acastanza, thanks for bringing this up. I just read that Reddit thread and checked out the Seraph model. It does look like Weyaxi merged the same models using the same mergekit defaults we did here, so assuming he didn't change anything else and was using a similar version of mergekit, it likely is the same model. I'll update the README to credit him. Full response here: https://www.reddit.com/r/LocalLLaMA/comments/18rhv8r/comment/kf1sp75/?utm_source=share&utm_medium=web2x&context=3

Thanks for the response! Given both your explanation and other comments in the thread reaching similar conclusions about the likely cause, I think this is satisfactorily resolved.
Same merge, with the same tools and the same parameters = identical weights. Glad to see this ended up being a friendly misunderstanding!
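For anyone wondering why that equation holds: a SLERP merge is pure, deterministic tensor arithmetic with no sampling or randomness anywhere, so the same input checkpoints and the same interpolation factor always yield bit-identical outputs, and therefore identical file hashes. A simplified sketch of the per-tensor interpolation; this is illustrative only, not mergekit's actual implementation:

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Illustrative sketch only. The key point: every step below is
    deterministic, so identical v0, v1, and t give identical results.
    """
    v0f, v1f = v0.flatten().float(), v1.flatten().float()
    # Cosine of the angle between the flattened weight vectors.
    cos = (torch.dot(v0f, v1f) / (v0f.norm() * v1f.norm() + eps)).clamp(-1.0, 1.0)
    theta = torch.acos(cos)
    if theta.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    sin_theta = torch.sin(theta)
    w0 = torch.sin((1 - t) * theta) / sin_theta
    w1 = torch.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1
```

Apply this to every tensor in the two parent checkpoints with the same t and you get the same merged model every time, which is exactly what the matching sha256 hashes show.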
