NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF · More quantization variants

May 16, 2024

You should consider releasing more official quants because the unofficial ones I have found are worse than yours in my logic tests even if I compared quants of the same size. For me Q4S is interesting because of it's speed but other sizes might also have demand.z

bartowski

May 16, 2024

When you say unofficial ones, does that include mine? I'd be highly curious if somehow my quants were worse than the ones provided by Nous

Yuma42

May 16, 2024

When you say unofficial ones, does that include mine? I'd be highly curious if somehow my quants were worse than the ones provided by Nous

Yes, there is a logic test which I have.
Nous Q4_K_M passes that (which is great because my own mistral based model fails it too xD ) but unfortunately your Q4_K_S and Q4_K_M both fail the test. I can share the prompt on request if you want.

And about the prompt format, I used the ChatML one which Nous is publishing, I could try the other one (I have seen your response on that but still need to read it).

bartowski

May 16, 2024

Yeah please do share since i'm curious, and if it's repeatable i'd like to see if there's something I can improve

Yuma42

May 16, 2024

Yeah please do share since i'm curious, and if it's repeatable i'd like to see if there's something I can improve

Bob is faster than John.
John is faster than Erica.
No one older than Erica is faster than her.

Is Bob older than Erica?

Yuma42

May 16, 2024

Nous tries to answer step by step and often gets the correct answer. Yours doesn't go into the step by step mode and says that information is missing.

bartowski

May 17, 2024

Interestingly neither of them got it correct in my testing but both attempted to do COT, they both tend to say the right answer but then immediately say that that info can't be determined which is.. interesting..

Either way, in my testing, these GGUFs and mine seem to perform identical 🤷‍♂️

Yuma42

May 17, 2024

•

edited May 21, 2024

they both tend to say the right answer but then immediately say that that info can't be determined which is.. interesting..

Yeah the task is easy for humans but 7-8b models really struggle with it, I think because it breaks expectations (I made the test up so they haven't seen it anywhere).

Either way, in my testing, these GGUFs and mine seem to perform identical 🤷‍♂️

That's good to know maybe it's the client which I'm using (I'm very limited at what I can run) but I'm not sure. I will test later with setting top k to 1 to have something more deterministic but I did run a lot of runs and the patterns were the same between them with nous either getting it correct or getting it nearly as you observed.

Edit: I did try again with top k = 1 and yes for me the Nous version can solve it while the other version can't.

mishaml77

May 26, 2024

•

edited May 26, 2024

Hi ! @Yuma42 and @bartowski

Just realized this models quants are from a different model than the one they claim , their page says this:

Quantized from
NousResearch/Hermes-2-Pro-Llama-3-8B

Maybe thats why you are getting different results

Yuma42

May 27, 2024

Hi ! @Yuma42 and @bartowski

Just realized this models quants are from a different model than the one they claim , their page says this:

Quantized from
NousResearch/Hermes-2-Pro-Llama-3-8B

Maybe thats why you are getting different results

I think the file name difference has other reasons at one place its theta at anther place its the theta symbol and at another place it includes "merge dpo" which is also what theta is supposed to be. I would guess they had that file name before they decided to name it theta.

mishaml77

May 27, 2024

Oh you are right, still it is a little bit confussing

bartowski

May 27, 2024

Yeah even the link in the model description points to the old URL

https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF#model-description

https://huggingface.co/NousResearch/Instruct-Hermes-2-Pro-Llama-3-8B redirects to https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF

So it's the same model but renamed

Yuma42

Aug 5, 2024

@bartowski I just want to let you know that I myself also can't observe the behavior somehow 😅 nous gets it wrong. Maybe I had an unlikely amount of very lucky runs? Guess I should switch to your versions back.