new model?

#1
by KatyTheCutie - opened

Will you experiment with 3B models anytime soon? I think StableLM's 3B model has potential. The Zephyr model is outstanding for a 3B model, but it's not that good for roleplay. I think Zephyr + Aesir + the toxicQA dataset could make a good model! OpenHermes may also be good at increasing model intelligence. Thank you for all of your work.

I was wondering about this too. Now there are 8x1B MoEs and such (there's a 1B Llama that, on first inspection, seems to do what OPT 30B could do like... last year), and I'm thinking "I've got a 3090", and then I think "learning is hard", and I don't.
Besides the merges, is pretty much all of this the work of HF transformers' built-in stuff? I think I could bring myself to set up a new venv and copy and paste some boilerplate in VS Code.

Would be lovely to see! @ProphetOfBostrom

KatyTheCutie changed discussion status to closed
