Great Local Model
After testing about 50 different models (including all the new Mixtral hotness), this model (Q5_K) seems to give the best output, and it runs amazingly on my M1 Max MacBook and tolerably on my 5900X/3080 PC.
Throwing some light coding tasks and a few other basic things I'd normally give ChatGPT at it, it seems like a really solid all-arounder. I did a side-by-side comparison with two instances of SillyTavern on the same saved chat (200+ messages, done with GPT-4 initially), and everyone I had try it consistently preferred OpenDolphinMaid to GPT-4. Not saying it's a GPT killer or anything, but it's a damned good model with moderate hardware requirements.
Just wanted to say thanks and drop a huge recommendation for anyone wanting a really solid general model that's a bit better at following the script than Noromaid/Hermes/Dolphin on their own. I'd say I still prefer XWin-LM-70b, but I prefer OpenDolphinMaid to XWin-LM-13b.
Damn good job.
Thanks!
I Kneel.
The best model at this quality that I can fit on my GPU, given my system specs.
5900X 3080 PC (Q5_K)
I assume 32GB of system RAM and a 10GB GPU? What context length and GPU layer split do you use, in this case?
64GB total system memory (but only ~30GB used with the model loaded and mlock enabled), and 10GB of VRAM on the GPU.
I found ~18 GPU layers and 24 CPU threads work well, giving me ~1.5s time to first token and ~7-8 tok/s.
(On the Mac, it's actually hilariously faster, which makes me laugh.)
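For anyone wanting to try a similar setup, here's a minimal sketch using llama-cpp-python with roughly the settings above. The model filename, context length, and prompt are my own assumptions for illustration; the poster didn't specify them.

```python
from llama_cpp import Llama

# Settings mirroring the 5900X/3080 config described above.
llm = Llama(
    model_path="./opendolphinmaid.Q5_K_M.gguf",  # hypothetical filename
    n_gpu_layers=18,   # ~18 layers offloaded to the 10GB 3080
    n_threads=24,      # 24 CPU threads on the 5900X
    use_mlock=True,    # pin the model in RAM (the mlock mentioned above)
    n_ctx=4096,        # assumed context length; not stated in the thread
)

out = llm("Write a short haiku about local LLMs.", max_tokens=64)
print(out["choices"][0]["text"])
```

If the model doesn't fit, lowering n_gpu_layers trades VRAM for speed, which is the same knob being tuned above.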