What about the GGUF quants?

#1
by BernardH - opened

I was surprised to see GGUF quants from 7 months ago, considering that llama.cpp support for T5 only just landed. Are these supposed to work with llama.cpp?
Are there any evaluations of the performance loss incurred by quantization?
Thanks for the models!
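For what it's worth, one quick sanity check on older quants is to read the GGUF header: a GGUF file starts with a 4-byte `GGUF` magic followed by a little-endian uint32 format version, so a 7-month-old file may carry an older version than current llama.cpp expects. A minimal sketch (the function name is mine, not from any library):

```python
import struct

def read_gguf_version(path):
    """Return the GGUF format version of a file, or raise if it isn't GGUF.

    GGUF layout (start of file): 4-byte magic b"GGUF", then a
    little-endian uint32 version. Older quants may report an
    earlier version than recent llama.cpp builds produce.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
        return version
```

This only inspects the container version; whether the T5 architecture metadata inside is what current llama.cpp expects still needs a test run with the actual binary.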
Best Regards