An interesting, if ultimately inconsequential, consideration about whether the FP16 weights are out there or not.
I noticed something interesting: Miqu-1-70b's Q2_K is 25.5 GB.
That matches the recent llama.cpp Q2_K quantizations, from January 2024, at barely 3 bpw.
The previous GGUF Q2_K quants from 2023 were almost the size of a Q3_K_S, at around 3.4 bpw.
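For reference, the bpw figure follows directly from the file size; a quick back-of-the-envelope sketch (the decimal-GB reading of the Hugging Face size and the ~69B parameter count for a Llama-2-70B-class model are my assumptions):

```python
# Back-of-the-envelope bits per weight from the reported file size.
# Assumptions: 25.5 GB is decimal gigabytes (as Hugging Face reports sizes),
# and ~69e9 parameters for a Llama-2-70B-class model.
file_bits = 25.5e9 * 8
params = 69e9
print(round(file_bits / params, 2))  # ~2.96, i.e. "barely 3 bpw"
```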
So, Miqu-1-70b's Q2_K was made in January 2024.
Either Miqudev requantized from an earlier Q5_K_M, or he quantized from a Q8_0... or an FP16.
I'm not an expert on the internals of the GGUF format, but is there a metadata key specifying that a quant is actually a requant?
If there is, we could find out.
In any case, that would lead us nowhere, but still!
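For what it's worth, the metadata KV pairs can be dumped without llama.cpp at all; here's a minimal sketch, assuming the GGUF v2/v3 on-disk layout (little-endian: 4-byte magic, uint32 version, uint64 tensor count, uint64 KV count, then key/value pairs; strings are a uint64 length followed by UTF-8 bytes):

```python
# Minimal GGUF metadata dump (GGUF v2/v3 layout assumed).
import struct
import sys

# GGUF metadata value types, per the spec
(T_UINT8, T_INT8, T_UINT16, T_INT16, T_UINT32, T_INT32,
 T_FLOAT32, T_BOOL, T_STRING, T_ARRAY, T_UINT64, T_INT64, T_FLOAT64) = range(13)
SCALAR = {T_UINT8: "<B", T_INT8: "<b", T_UINT16: "<H", T_INT16: "<h",
          T_UINT32: "<I", T_INT32: "<i", T_FLOAT32: "<f", T_BOOL: "<?",
          T_UINT64: "<Q", T_INT64: "<q", T_FLOAT64: "<d"}

def read_value(f, vtype):
    if vtype == T_STRING:
        (n,) = struct.unpack("<Q", f.read(8))
        return f.read(n).decode("utf-8", errors="replace")
    if vtype == T_ARRAY:
        etype, n = struct.unpack("<IQ", f.read(12))
        return [read_value(f, etype) for _ in range(n)]
    fmt = SCALAR[vtype]
    return struct.unpack(fmt, f.read(struct.calcsize(fmt)))[0]

with open(sys.argv[1], "rb") as f:
    assert f.read(4) == b"GGUF"
    version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
    print(f"version={version} tensors={n_tensors} kv={n_kv}")
    for _ in range(n_kv):
        key = read_value(f, T_STRING)
        (vtype,) = struct.unpack("<I", f.read(4))
        value = read_value(f, vtype)
        if isinstance(value, list) and len(value) > 8:
            value = value[:8] + ["..."]  # tokenizer vocab arrays are huge
        print(key, "=", value)
```

As far as I can tell, llama.cpp doesn't write a dedicated "requant" flag, but keys like `general.file_type` and `general.name` are set at quantization time, so a dump may still hint at the source.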
Considering that this person was an employee of a company which had been given only the quantized versions, I don't think it's possible for it to come from the FP16. Either it was a requantization of the Q5, or Mistral quantized it right before handing the files over to the company.
When that early access was likely given, the Q2_K variant used in Miqudev's quant didn't exist yet (why present an already obsolete quant to a customer while facing ferocious competition?).
Hence the question.
Yeah, makes sense. I didn't realize the early access had been given a while ago; I thought it might have been given recently. I still believe it was a requantization, though, as the Q5 was most likely the one given to them.
We could at least check whether the result of Q5 -> F16 -> Q2 is identical to the uploaded checkpoint. If it is, it's more than likely that it was requantized in that fashion.
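Here's a quick sketch of that comparison, assuming the `gguf` Python package that ships with llama.cpp (`pip install gguf`); the filenames are placeholders:

```python
# Compare per-tensor bytes of two GGUF files, ignoring metadata.
import hashlib

from gguf import GGUFReader

def tensor_hashes(path: str) -> dict:
    """Map each tensor name to a SHA-256 digest of its raw (quantized) bytes."""
    reader = GGUFReader(path)
    return {t.name: hashlib.sha256(t.data.tobytes()).hexdigest()
            for t in reader.tensors}

uploaded = tensor_hashes("miqu-1-70b.q2_K.gguf")    # the uploaded checkpoint
requant = tensor_hashes("miqu-q5-f16-q2_K.gguf")    # our Q5 -> F16 -> Q2 requant
diff = [name for name in uploaded if uploaded.get(name) != requant.get(name)]
print("identical" if not diff else f"{len(diff)} tensors differ")
```

One caveat: the check is only meaningful if the requant is produced with the same llama.cpp revision, since the k-quant code has changed over time.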
All three quants have a `general.name` of "D:\HF", which is strong evidence that all of them were produced from something else for the HF upload. Edit: and in fact, all metadata KVs other than the file type are identical.
This is the first model that could answer all my test questions (GPT-4 included in the comparison). I wish there were a GPTQ or AWQ version (4-bit) so the speed would be more practical...