This is not a 7B. It's a ~9B. Please label appropriately.

#7
by georgewalker - opened

Like several of the top '7B' models on the leaderboard, this is actually a 9B, downstream of https://huggingface.co/zyh3826/GML-Mistral-merged-v1, a merge that combined the first 32 layers (ie all of them) of one Mistral-7B finetune with the last 8 layers of another Mistral finetune, creating a model that is about 9B parameters.

It is helpful to label model sizes appropriately. Better would be if Huggingface labeled models based on their file size and bpw, instead of allowing for these sorts of mistakes to occur and proliferate, as one mislabeled model begets others derived from it.

Some responses from this model appear better considered than those of some other 7B models, but this model employs 25% more layers to achieve its winning performance. I agree that this model best be labeled as a 9B model.

Sign up or log in to comment