
Great Local Model

#1
by morgul - opened

After testing about 50 different models (including all the new Mixtral hotness), this model (Q5_K) seems to give the best output (and runs amazingly on my M1 Max MacBook, tolerably on my 5900X 3080 PC).

Throwing some light coding tasks and a few other basic things I'd normally give ChatGPT at it, it seems like a really solid all-rounder. I did a side-by-side comparison using two instances of SillyTavern with the same saved chat (200+ messages, done initially with GPT-4), and everyone I had try it consistently preferred OpenDolphinMaid to GPT-4. I'm not saying it's a GPT killer or anything, but it's a damned good model with moderate hardware requirements.

Just wanted to say thanks, and drop a huge recommendation for anyone wanting a really solid general model that's a bit better at following the script than Noromaid/Hermes/Dolphin on their own. I'd say I still prefer XWin-LM-70b, but I prefer OpenDolphinMaid over XWin-LM-13b.

Damn good job. 😁

Owner

Thanks!

I Kneel.
The best model at this quality that I can fit on my GPUs, given my system specs.

5900X 3080 PC (Q5_K)

I assume 32GB of system RAM and a 10GB GPU? What context length and layer split do you use, in that case?

64GB of total system memory (but only ~30GB used with the model loaded and mlock enabled), and 10GB of VRAM on the GPU.

I found that ~18 GPU layers and 24 CPU threads work well, giving me ~1.5s TTFT and ~7-8 tok/s.
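For anyone wanting to reproduce this, the settings above map onto a llama.cpp invocation roughly like this (a sketch; the model filename and the context size are placeholders I picked, not values from this thread):

```shell
# Sketch of a llama.cpp run matching the settings discussed above.
#   -ngl 18  -> offload ~18 layers to the 10GB GPU
#   -t 24    -> 24 CPU threads for the remaining layers
#   --mlock  -> pin model weights in RAM so they can't be swapped out
# Model path and -c (context length) are illustrative placeholders.
./main -m ./model.Q5_K.gguf -ngl 18 -t 24 -c 4096 --mlock -p "Hello"
```

If you run out of VRAM, lowering `-ngl` is the usual first knob to turn.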

(On the Mac, it's actually hilariously faster, which makes me laugh.)
