GGUF quants?

#1
by lemon07r - opened

I tried https://huggingface.co/spaces/ggml-org/gguf-my-repo but it didn't work. Couldn't find gguf quants for this model anywhere.

flammen.ai org

llama.cpp has some quirks with Llama 3; I'll try to build some quants on one of our machines.

flammen.ai org

Not sure if you saw them, but there are a few quants available here: https://huggingface.co/flammenai/Mahou-1.1-llama3-8B-GGUF

:)

Found the 1.2 quant as well (and a lot of your other merges)! So far pretty good results. Any chance you guys can train a phi medium model as well? Would love to see where you guys can take it

flammen.ai org

Sure, I'll try phi medium next.

Huge thanks for the Mahou 1.2 14B ft! It makes me sad to say, but Phi Medium is not very good; none of the fts I tested for it, nor the base instruct model, were very good. I liked Mini a lot, so I tried hard to like this one too. I had hoped a Mahou ft might save it, since I liked the Llama 3 one a lot. At least having Mahou versions of both Llama 3 and Phi Medium helped confirm the base model is just bad.

On another note, can I interest you in finetuning Yi 1.5 34B? It's hands down the best model I've tested in a long time, and I've tested quite a few lately (most of them getting my hopes up and then disappointing). Would love to see what you guys could do with a bigger base model that's actually good. This is coming from someone who disliked the old 34B Yi and most of its finetunes. They really should have named it something else, given how different it is.

flammen.ai org

Appreciate your input! And yeah, we'll do a Yi 1.5 34B for Mahou 1.3.

I've tried most of your models and merges (including Mahou 1.3 8B) now, since there aren't very many good finetunes for writing. I use the same writing prompt and settings for all my testing, then use GPT-4o, Gemini Pro, Claude Opus, and Yi Large preview as judges to evaluate the outputs (thank you, Chatbot Arena). I do this in multiple passes for higher confidence. From what I've gathered and tested (this is fairly shallow testing, so take it with a grain of salt), here are the models that got the most wins from the judges:

  1. llama-3-spicy-abliterated-stella-8B.Q8_0
  2. Mahou-1.2-llama3-8B-Q8_0-imat (tested a Q6_K quant from someone else much earlier, and that one didn't do as well)
  3. llama-3-daredevil-mahou-8b-q8_0*

*Surprisingly, Mahou 1.3 didn't do as well, even though it's a finetuned version of Daredevil Mahou, right?

All of these were tested with the ChatML format.
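For anyone unfamiliar, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers with a role name. A minimal sketch of building such a prompt (the helper name is mine, not from any library):

```python
def chatml_prompt(system: str, user: str) -> str:
    """Wrap a system message and a user message in ChatML markers,
    leaving the prompt open for the assistant's reply."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = chatml_prompt("You are a creative writer.", "Write a short scene.")
```

Frontends like KoboldCpp apply this template for you when you pick the ChatML instruct preset, so no manual formatting is usually needed.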

You can see all the models I've tested and their outputs here: https://discord.com/channels/849937185893384223/1183980242168184832 (you will need to be in the KoboldAI Discord to see it: https://koboldai.org/discord)

Again, my testing is pretty shallow, but with some analysis and data of your own this information might be of interest to you, so I thought I'd share. Feel free to draw from my findings however you like, or to ignore them entirely since they might not be helpful anyway.

flammen.ai org

Thanks for sharing! Maybe I'll do a merge of those 3. Mahou 1.3 was retrained and rereleased btw, not sure if the testing was done with the old version or not.

Here's what I tested: https://huggingface.co/lemon07r/Mahou-1.3-llama3-8B-Q8_0-GGUF, made roughly 13 hours before posting this comment. I believe your 1.3 repo was last updated 13 hours ago too, so I think this is the latest version, right?