Question

#1 opened by dillfrescott

I've been messing with merging and combining models. Do you think the 0-32 and 24-32 layer-range pattern could be continued to form a coherent, even larger model?
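Roughly what I have in mind, as a sketch with transformers (the base model id, slice ranges, and output path below are just placeholders for illustration, not what this repo actually did):

```python
# Rough sketch of extending a layer-stacking ("passthrough"/frankenmerge) pattern by hand.
# Base model id, slice ranges, and output directory are placeholders.
import copy

import torch
from transformers import AutoModelForCausalLM

base_id = "mistralai/Mistral-7B-v0.1"  # placeholder 32-layer base model
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

layers = model.model.layers  # decoder stack (32 layers for a 7B Llama/Mistral-style model)

# Continue the 0-32 / 24-32 idea by re-appending more overlapping tail slices.
slices = [(0, 32), (24, 32), (24, 32)]  # illustrative only

new_layers = torch.nn.ModuleList()
for start, end in slices:
    for i in range(start, end):
        # deepcopy so repeated layers get their own weights instead of sharing tensors
        new_layers.append(copy.deepcopy(layers[i]))

model.model.layers = new_layers
model.config.num_hidden_layers = len(new_layers)
model.save_pretrained("stacked-model")  # placeholder output dir; reload before generating
```

mergekit's passthrough merge method does essentially this from a YAML config, if you'd rather not hand-roll it.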

I also wonder, to be honest. I want to create an unholy 7B-param version of TinyLlama to try it out.

The Open LLM Leaderboard eval failed on this model, unfortunately. Do you know if there's an issue with bfloat16? I see that you're using float16 for https://huggingface.co/dillfrescott/trinity-medium
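If bfloat16 does turn out to be the culprit, one option would be to re-save the checkpoint in float16 like trinity-medium before resubmitting. A rough sketch (the repo id and output path here are placeholders, not a confirmed fix):

```python
# Sketch: load a bfloat16 checkpoint and re-save it in float16.
# Repo id and output directory are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/your-bf16-model"  # placeholder for the checkpoint that failed the eval
out_dir = "model-fp16"                # placeholder local output directory

model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

model.config.torch_dtype = torch.float16  # make sure the saved config records float16 too
model.save_pretrained(out_dir)
tokenizer.save_pretrained(out_dir)
```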

I have no clue; you'd probably be better off asking a member of HF staff.
