cognitivecomputations/dolphin-2.9-llama3-70b · AWQ Quantized version (for use with vllm etc)

29 days ago

We quantized dolphin-2.9-llama3-70b using autoawq (version 0.2.5) and uploaded to hf, in case anyone finds it useful:

https://huggingface.co/julep-ai/dolphin-2.9-llama3-70b-awq

Kearm

Cognitive Computations org 29 days ago

This comment has been hidden

Suparious

Cognitive Computations org 29 days ago

•

edited 28 days ago

No, @Kearm ,actually they've supported it for a long time. As a matter of fact, llama3 is using the same format as llama2 (absolutely nothing needed to be done to support it). We are not slow to support new models, just that often we don't see the need to support some model types (like qwen), and the changing of models is super annoying (there is a example issue in AutoAWQ, when you can add support for gpt2 to see the process).

Let's not promote stigmas like this. I just don't quant the 70b as it take some dedicated effort.

Well done @diwank and thank-you for doing this.

Kearm

Cognitive Computations org 28 days ago

•

edited 28 days ago

No, @Kearm ,actually they've supported it for a long time. As a matter of fact, llama3 is using the same format as llama2 (absolutely nothing needed to be done to support it). We are not slow to support new models, just that often we don't see the need to support some model types (like qwen), and the changing of models is super annoying (there is a example issue in AutoAWQ, when you can add support for gpt2 to see the process).

Let's not promote stigmas like this. I just don't quant the 70b as it take some dedicated effort.

Well done @diwank and thank-you for doing this.

@Suparious

I entirely admit my mistake and acknowledge I was spreading misinformation. I had limited knowledge of the AutoAWQ project and that was my fault for a spurious coment. I have hidden the post as to not spread the information but for transparency sake I commented they seemed slow to support new models which is false. There is enough negativity in HF comments already and I apoligize to casper-hansen for my misinformed comment.