k-quant models?

#4
by Mykee - opened

I see that the k-quant models have been deleted. Will there be a new version or Llama2 release?

Yeah, that confused me too. You can still get them from an older version of the repo (here).

I deleted them a while ago because they were at risk of producing garbage output. This model uses a non-standard vocab size (32001) and for a while that broke k-quants. The issue was resolved a few weeks ago, but I've not had a chance to go back and re-make k-quants for this or some other older models.

Are you saying that the k-quants I deleted do in fact work? I think it may be the case that they sometimes produce garbage output, as a change was required in llama.cpp to produce valid k-quants for models like this, changing how certain layers of the model were quantised. But maybe they work most of the time?
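For anyone curious why a vocab size of 32001 mattered: as I understand it, k-quants pack weights into super-blocks of 256 values (the `QK_K` constant in llama.cpp), so a tensor dimension that isn't a multiple of 256 couldn't be split cleanly into k-quant blocks until llama.cpp learned to fall back to a different format for those layers. This is just a toy divisibility check of my own to illustrate the idea, not actual llama.cpp code:

```python
# Illustration only (my own sketch, not llama.cpp's logic):
# k-quants group weights into super-blocks of 256 (QK_K in llama.cpp).
QK_K = 256

def kquant_compatible(dim: int) -> bool:
    """True if a tensor dimension divides evenly into k-quant super-blocks."""
    return dim % QK_K == 0

print(kquant_compatible(32000))  # standard Llama vocab size -> True
print(kquant_compatible(32001))  # this model's extended vocab -> False
```

This is consistent with the fix described above: rather than forcing such layers into k-quant blocks, llama.cpp now quantises the incompatible layers differently.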

There hasn't been a Llama 2 Chronos Hermes yet, but there is a Llama 2 Chronos, and a whole bunch of Chronos merges which I quantised over the last 48 hours. So there's a lot of Llama 2 choice now.

In my testing with q6_K in Oobabooga, the model works as intended (tested with the Simple-1 and Mirostat presets), though I haven't tested over longer sessions.

I would think it's worth re-quantising, just in case. I've heard reports of Llama 2 models having repetition issues, including Chronos-Hermes 2, so for some of us LLaMA 1-based models are still a viable option.

If not, we still have the original quants. Either way, keep up the great work.

I noticed TheBloke is requantizing some older models, so a GGUF version has been released with updated k-quants. Thank you!
TheBloke/chronos-hermes-13B-GGUF
