GGUF and exl2 quants for anyone who wants them

#2
by bartowski - opened

Thank you! Any plans on adding AWQ? :)

I wasn't, but I'll look into it today :)

Thank you! GPTQ would be even better; I'm looking for something vLLM-compatible :)