Quantization request

by dillfrescott - opened

@LoneStriker GGUF format please!!! I'd appreciate it so much!

Yes please, thank you

Would be awesome to have this in AWQ for fast inference on serving engines.

It's difficult right now, but I'll try.

Any word on quants or exl2 that can run this on a 24 GB card? Would love to run this locally.
