13B in the future?

#21

This seems great, wondering if future, larger models are coming.

I think the Mistral team said that they would train bigger ones.

Great to hear. No rush or complaints; I was just hoping it was in the future plans.

Why would you want a 13B when the 7B model is outperforming 70B models? It clearly shows that performance isn't reliant on size but on how the model is structured, like an I4 engine matching a V8. What we need is the turbo or supercharger for these models while making them more sophisticated.

A 7B outperforming 70B models doesn't mean that a 13B could not do better.

Like, with the same turbo or supercharger, a V8 will probably outperform an I4.

Agreed. A 13B might perform like another 100B+ model and still be of more than reasonable size.

I understand the ideology, 'make it bigger to get more power'. That's the same approach Google and OpenAI took, and now it requires supercomputers to run those models. Seeing a model with only 70B parameters (Xwin-LM) surpass the 1.7T-parameter GPT-4 on AlpacaEval, and Zephyr 7B match the 1.7T model on some benchmarks, should inspire people to focus more on refinement than on just resizing. The biggest advantage of 7B models is efficiency, especially if you need to run multiple models for different agents at the same time on consumer-grade devices. Let's not forget the main goal of open-source AI: making it accessible for everyone.
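To make the efficiency point concrete, here's a back-of-the-envelope sketch of the memory needed just to hold the weights at different quantization levels. The parameter counts and bits-per-weight are illustrative assumptions, and real usage adds KV cache and activations on top:

```python
# Rough weight-memory estimate. Weights only; the KV cache and activations
# need extra memory on top of this. Parameter counts and bits-per-weight
# below are illustrative assumptions, not measurements.

def weight_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate GiB needed just to hold the model weights."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

for name, params in [("7B", 7), ("13B", 13), ("70B", 70)]:
    print(f"{name}: ~{weight_gib(params, 16):.1f} GiB at fp16, "
          f"~{weight_gib(params, 4):.1f} GiB at 4-bit")
```

By this rough estimate, two or three 4-bit 7B models can share a single consumer GPU, a 4-bit 13B still fits comfortably, while a 70B pushes past most consumer cards even when quantized.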

Adding a 13B isn't going to require a supercomputer; that is still more than approachable for us 'serfs'. And no one said anything about removing the smaller option, just adding more options.

You are right about not needing a supercomputer for a 13B, but my point is not to shift from the current strategy, which is winning against models 100x bigger. If you need more raw power, you have Mixtral 8x7B; that has about 42B parameters, making it more than 3x bigger than the 13B you're asking for.
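One nuance on the Mixtral comparison: for a sparse mixture-of-experts model, memory scales with the total parameter count, but per-token compute scales only with the experts actually routed to, so it can be "bigger" in RAM without being proportionally slower. A minimal sketch; the parameter figures are loose assumptions for illustration, not Mixtral's exact published architecture:

```python
# Rough contrast between a dense model and a sparse mixture-of-experts (MoE)
# model. Memory footprint tracks the TOTAL parameter count, while per-token
# compute tracks only the ACTIVE parameters (the experts routed to for that
# token). Figures are loose illustrative assumptions, not Mixtral's exact
# published numbers.

def per_token_gflops(active_params_billion: float) -> float:
    """Common approximation: a forward pass costs ~2 FLOPs per active parameter."""
    return 2.0 * active_params_billion  # GFLOPs per token

models = {
    # name: (total params in B, active params per token in B)
    "dense 13B":       (13.0, 13.0),
    "dense 42B":       (42.0, 42.0),
    "sparse MoE ~42B": (42.0, 13.0),  # e.g. routing to 2 of 8 experts per token
}

for name, (total, active) in models.items():
    print(f"{name}: memory ~ {total:.0f}B weights, "
          f"compute ~ {per_token_gflops(active):.0f} GFLOPs per token")
```

So on the memory axis the MoE really is roughly 3x a 13B, while its per-token compute sits much closer to a dense 13B.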

Agreed, that is bigger (better, time will tell), but I do think 13B is a magic number for many: not too small, not too large. And like I was saying, I never suggested that 7B go by the wayside... just include the next step up before making the leap to something much larger.
