Thanks for your quants!

#2
by Cran-May - opened

May you make a quant for this model?
CausalLM/8x7B-MoE-test-NOT-MIXTRAL
https://huggingface.co/CausalLM/8x7B-MoE-test-NOT-MIXTRAL

And can you offer more iMatrix quants for this model, such as IQ3_XS, IQ3_S, IQ3_M, or IQ4_XS?

Thanks for your work!

@Cran-May Sounds like this model isn't quite right... even the author has this mention on their model page:

Only intended for conceptual validation, however the expert models do not seem to be working as expected. The model could output text and complete the conversation normally, but the performance of the expert model was not significant.

I agree with you, but it seems interesting to try a MoE model that isn't fine-tuned from Mixtral 🤓

Ah I see, then what you want is this -> https://huggingface.co/mradermacher/Nous-Hermes-2-Mixtral-8x7B-DPO-i1-GGUF

What about making a quant for this one?
https://huggingface.co/OpenBuddy/openbuddy-qwen1.5-14b-v20.1-32k

It's a model that aims to add multilingual support to Qwen.
Can you offer IQ2~Q4 iMatrix quants?

@Cran-May Isn't Qwen multilingual already?

Multilingual support of both base and chat models;

In fact, I need a ~14B model with Asian-language support, which may require IQ2~IQ3 quants, and the Q2_K officially provided by Alibaba's Qwen team has a big quality loss. It would be great if you could make this quant!
Based on past experience, the fine-tuning done by the OpenBuddy team has been pleasing: their fine-tuned Mistral, Gemma, and Llama models show excellent multilingual performance, good long-context scaling, and strong robustness across different tasks, so I'm guessing they'll do just as well this time around (I'm not sure if that's accurate). Maybe Qwen-1.5-14B has already improved on these points; my info may be behind the curve.
Thanks for your work anyway; your iMatrix quants always turn out better than others'!

There you go, enjoy! https://huggingface.co/dranger003/openbuddy-qwen1.5-14b-v20.1-32k-iMat.GGUF

Sincere thanks.
But the IQ2 quants are missing.
I tested the IQ3_XXS one; it works well but takes all my memory (TAT)

You can give it a try, but I wouldn't expect much below 3-bit for a 14B.
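For anyone trying to gauge whether a given quant will fit in memory, a rough sanity check is: file size ≈ parameters × bits-per-weight ÷ 8. This is a minimal sketch; the bits-per-weight figures below are approximate averages I'm assuming for common llama.cpp quant types, not exact values, and real GGUF files are somewhat larger due to non-quantized tensors and metadata.

```python
# Rough GGUF file-size estimate: parameters * bits-per-weight / 8.
# The bits-per-weight values are approximate averages for llama.cpp
# quant types (assumptions for illustration, not exact figures).
APPROX_BPW = {
    "Q4_K_M": 4.85,
    "IQ4_XS": 4.25,
    "IQ3_XXS": 3.06,
    "IQ2_XXS": 2.06,
}

def est_size_gb(n_params: float, quant: str) -> float:
    """Estimated model file size in GB for a given quant type."""
    return n_params * APPROX_BPW[quant] / 8 / 1e9

# Ballpark sizes for a 14B model at each quant level.
for q in APPROX_BPW:
    print(f"14B @ {q}: ~{est_size_gb(14e9, q):.1f} GB")
```

By this estimate a 14B IQ3_XXS lands around 5.4 GB of weights alone, which explains why it can exhaust memory on smaller machines once the KV cache and runtime overhead are added.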
