Thanks for your quants!

#2
by Cran-May - opened

May you make a quant for this model?
CausalLM/8x7B-MoE-test-NOT-MIXTRAL
https://huggingface.co/CausalLM/8x7B-MoE-test-NOT-MIXTRAL

And can you offer more iMatrix quants for this model, such as IQ3_XS, IQ3_S, IQ3_M, or IQ4_XS?

Thanks for your work!

@Cran-May Sounds like this model isn't quite right... even the author has this mention on their model page:

Only intended for conceptual validation, however the expert models do not seem to be working as expected. The model could output text and complete the conversation normally, but the performance of the expert model was not significant.

I agree with you, but it seems interesting to try a MoE model that isn't fine-tuned from Mixtral 🤓

Ah I see, then what you want is this -> https://huggingface.co/mradermacher/Nous-Hermes-2-Mixtral-8x7B-DPO-i1-GGUF

What about making a quant for this one?
https://huggingface.co/OpenBuddy/openbuddy-qwen1.5-14b-v20.1-32k

It's a model that aims to add multilingual support to Qwen.
Can you offer IQ2~Q4 iMatrix quants?

@Cran-May Isn't Qwen multilingual already?

Multilingual support of both base and chat models;

In fact, I need a ~14B model with Asian-language support, which may require IQ2~IQ3 quants, and the Q2_K officially provided by Alibaba's Qwen team has a big quality loss. It would be great if you could make this quant!
Based on past experience, the fine-tuning done by the OpenBuddy team has been pleasing: their fine-tuned Mistral, Gemma, and Llama models show excellent multilingual performance, good long-context scaling, and strong robustness across different tasks, so I'm guessing they'll do just as well this time around (I'm not sure if that's accurate). Maybe Qwen-1.5-14B has already improved on these points; my info may be behind the curve.
Thanks for your work anyway; your iMatrix quants always turn out better than others'!

There you go, enjoy! https://huggingface.co/dranger003/openbuddy-qwen1.5-14b-v20.1-32k-iMat.GGUF

Sincere thanks.
But the IQ2 quants are missing.
I tested the IQ3_XXS one; it works well but takes all my memory (TAT)

You can give it a try, but I wouldn't expect much below 3-bit for a 14B.
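For anyone trying to gauge whether a given quant will fit in memory, a rough sanity check is: file size ≈ parameters × bits-per-weight ÷ 8. This is a minimal sketch; the bits-per-weight figures below are approximate averages I'm assuming for common llama.cpp quant types, not exact values, and real GGUF files are somewhat larger due to non-quantized tensors and metadata.

```python
# Rough GGUF file-size estimate: parameters * bits-per-weight / 8.
# The bits-per-weight values are approximate averages for llama.cpp
# quant types (assumptions for illustration, not exact figures).
APPROX_BPW = {
    "Q4_K_M": 4.85,
    "IQ4_XS": 4.25,
    "IQ3_XXS": 3.06,
    "IQ2_XXS": 2.06,
}

def est_size_gb(n_params: float, quant: str) -> float:
    """Estimated model file size in GB for a given quant type."""
    return n_params * APPROX_BPW[quant] / 8 / 1e9

# Ballpark sizes for a 14B model at each quant level.
for q in APPROX_BPW:
    print(f"14B @ {q}: ~{est_size_gb(14e9, q):.1f} GB")
```

By this estimate a 14B IQ3_XXS lands around 5.4 GB of weights alone, which explains why it can exhaust memory on smaller machines once the KV cache and runtime overhead are added.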
