TrinityMaid-13b

#5
by Joseph717171 - opened

I would love to see a merge of Noromaid-13b-0.4-DPO + WhiteRabbit-Trinity-13b.
WhiteRabbit's Trinity is surprisingly smart. However, it's based on CodeLlama (a Llama-2 derivative), so I don't know whether Llama-2-13b and CodeLlama are compatible enough to merge. πŸ€”
https://huggingface.co/WhiteRabbitNeo/Trinity-13B
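(For anyone wanting to check compatibility up front: here is a minimal sketch of a pre-merge config comparison. The repo IDs are taken from the models named above, so double-check exact casing on the Hub; the helper itself is my own illustration, not an official tool.)

```python
# Compare the two base configs before attempting a merge.
from transformers import AutoConfig

cfg_a = AutoConfig.from_pretrained("NeverSleep/Noromaid-13b-v0.4-DPO")
cfg_b = AutoConfig.from_pretrained("WhiteRabbitNeo/Trinity-13B")

for key in ("model_type", "hidden_size", "num_hidden_layers",
            "num_attention_heads", "vocab_size", "rope_theta"):
    va, vb = getattr(cfg_a, key, None), getattr(cfg_b, key, None)
    marker = "" if va == vb else "  <-- mismatch"
    print(f"{key}: {va} vs {vb}{marker}")
```

On these two, the vocab_size and rope_theta lines are the ones to watch: CodeLlama-based checkpoints typically ship a 32016-token vocabulary and rope_theta = 1000000, against Llama-2's 32000 and 10000.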

Joseph717171 changed discussion status to closed
Joseph717171 changed discussion status to open
NeverSleep org

Bump for maybe later

Thank you! I'm having trouble getting them to merge myself. The flattened versions behave normally, but when I merge them together, I get completely incoherent symbol rambling from the merge. Any insights from you guys would be most appreciated. πŸ™πŸ˜©
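(For context, the kind of merge being attempted reduces, in its simplest form, to parameter interpolation; below is a minimal sketch, assuming both checkpoints share tensor shapes. Real merges usually go through mergekit with fancier methods, and a shape mismatch, e.g. in the embedding matrix, will make this sketch fail loudly rather than ramble.)

```python
# Naive linear merge: average corresponding weights of two checkpoints.
# Only valid when both models share an architecture and tensor shapes.
import torch
from transformers import AutoModelForCausalLM

def linear_merge(repo_a: str, repo_b: str, alpha: float = 0.5):
    model_a = AutoModelForCausalLM.from_pretrained(repo_a, torch_dtype=torch.float16)
    model_b = AutoModelForCausalLM.from_pretrained(repo_b, torch_dtype=torch.float16)
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    # Interpolate every tensor. Incompatible positional settings (see the
    # rope theta note below) survive this step silently and only show up
    # as gibberish at inference time.
    merged = {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}
    model_a.load_state_dict(merged)
    return model_a
```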

NeverSleep org

So I looked it up and yep, like you said, Trinity is based on CodeLlama, and it uses a different rope theta value; that's why it doesn't work.
I will look later to see if there is a solution, but I don't think there is one right now, sadly.
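(For reference on why the rope theta mismatch matters: RoPE rotates each query/key dimension pair at a frequency set by the base, so the attention weights of the two models are trained against different positional geometries. A sketch of the standard formula:)

```latex
\theta_i = \mathrm{base}^{-2i/d}, \quad i = 0, \dots, \tfrac{d}{2} - 1,
\qquad
\text{Llama-2: } \mathrm{base} = 10^{4}, \quad
\text{CodeLlama: } \mathrm{base} = 10^{6}
```

Averaging weights that expect rotations at base-10^4 frequencies with weights that expect base-10^6 frequencies leaves neither positional scheme intact, which matches the incoherent output described above.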

Thanks for looking into it @Undi95

I know from experience that merging models that aren't compatible (and, in my case, not knowing that and trying anyway) can lead to some weird results, to say the least.

Here is the full power of the 70B model that I tried to create by merging two Japanese LLMs:
[image.png: screenshot of the merged model's garbled output]
It is definitely a good bloated entropy generator, I guess. Maybe if I were Google, you would see stories about how the AI is evil and developing its own language and how I needed to shut it down... how spooky, ooooooooo. But it's just brain damage beyond repair. I don't know if there is a pattern besides its love for the letter Q and the word "Question"; definitely an interesting case study, perhaps.
