7b model

#4
by Nexesenex - opened

Hello.
This model is great, thanks to its ability to handle more tokens than usual (I only start to have trouble around 2,500 tokens, and fully lose coherence around 3,000).
Is a 7b version planned for those of us with smaller configurations?
Thanks for the great work.

Edit: Q4_0 quantizations would also be great, to give a little performance boost.
I know, I should learn to do that job myself... ^^
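(For anyone else in the same boat, here's a minimal sketch of how that quantization step might be scripted around the llama.cpp convert and quantize tools, assuming a built local checkout of llama.cpp; the model directory and file names below are hypothetical placeholders:)

```python
# Hypothetical sketch: scripting the llama.cpp quantization step.
# Assumes a local llama.cpp checkout with convert.py present and the
# quantize binary already built; all paths are placeholders.
import subprocess

MODEL_DIR = "models/bluemoon-13b"            # hypothetical HF model directory
F16_FILE = f"{MODEL_DIR}/ggml-model-f16.bin"
Q4_FILE = f"{MODEL_DIR}/ggml-model-q4_0.bin"

# 1. Convert the HF/PyTorch weights to a ggml f16 file.
subprocess.run(["python", "convert.py", MODEL_DIR], check=True)

# 2. Quantize the f16 file down to q4_0.
subprocess.run(["./quantize", F16_FILE, Q4_FILE, "q4_0"], check=True)
```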

We're working on an update to this model, and once it's ready, I think that would be a good moment to train all the variants up to 30b. First there will be a new 30b, then 13b and 7b together, so it wouldn't be too much trouble to train a 7b. The first Bluemoon releases have been a bit experimental, and therefore easier to evaluate with larger models, but as the training is converging there won't be any issues adding a 7b to the selection (probably within a week or so)! I can add the q4_0 quant there too once it's up.

Honestly, this is amazing work. The 2048-token context size is the main downside of all the current Llama models except yours, no matter how good they otherwise are. A larger context matters a lot for creating coherent storylines and dialogues, even with the help of memory tricks like those offered by Koboldcpp, Ooba, and SillyTavern.
Your exclusive 4096-token context, even if it still seems to be a work in progress given the initial limitations of the Llama models, should be the norm.
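(In case it helps anyone trying the full window, a minimal sketch of loading a quant with the extended context, assuming the llama-cpp-python bindings; the model path is a hypothetical placeholder:)

```python
# Minimal sketch: loading a quant with the extended 4096-token context
# via the llama-cpp-python bindings. The model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="models/bluemoon-13b/ggml-model-q5_0.bin",  # hypothetical path
    n_ctx=4096,  # request the extended context instead of the usual 2048
)

out = llm("The moon rose over the harbor, and", max_tokens=64)
print(out["choices"][0]["text"])
```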
As for quants, q5_0 and q4_0 are the best options so far, depending on one's trade-off between quality, speed, and hardware requirements.
Until the next updates (and until I save up for a proper rig!), thanks again for your team's great work, and best regards!

It's cool indeed how well the extended context works. I'm not sure whether one can extend the context with a LoRA, since I've never trained one myself, but maybe that's one reason why we haven't seen 4k or larger contexts much in other models. Glad you like the work; big credit goes to everyone who works on the datasets here on HF, and to all those doing the early testing!

