datatab / Yugo45A-GPT-Quantized-GGUF

Yugo45A-GPT-Quantized-GGUF / Yugo45A-GPT-Quantized-GGUF.Q4_K_M.gguf

Commit History

q4_k_m: Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K
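
As a rough illustration of the mixed-precision scheme this commit message describes, the sketch below assigns a quantization type to each tensor by name. It is a toy model, not llama.cpp's actual selection logic: the function names `pick_quant_type` and `plan`, the `layers.{i}.` naming pattern, and the choice of even-numbered layers as the "half" that receives Q6_K are all assumptions made for the example.

```python
# Toy sketch of a q4_k_m-style mixed-precision plan.
# Assumption: "half of the tensors" is modeled as every even-numbered layer;
# the real quantizer may choose layers differently.

# Tensor kinds that the commit message says are partly stored as Q6_K.
MIXED_TENSORS = ("attention.wv", "feed_forward.w2")


def pick_quant_type(tensor_name: str, layer_index: int) -> str:
    """Return the quantization type for one tensor in this toy scheme."""
    is_mixed = any(key in tensor_name for key in MIXED_TENSORS)
    # Hypothetical rule: even layers get the higher-precision type.
    if is_mixed and layer_index % 2 == 0:
        return "Q6_K"
    return "Q4_K"


def plan(n_layers: int) -> dict[str, str]:
    """Build a {tensor name: quantization type} map for n_layers layers."""
    # Hypothetical tensor names, following the naming in the commit message.
    names = ("attention.wq", "attention.wv", "feed_forward.w1", "feed_forward.w2")
    return {
        f"layers.{i}.{name}": pick_quant_type(name, i)
        for i in range(n_layers)
        for name in names
    }


if __name__ == "__main__":
    for tensor, qtype in plan(4).items():
        print(f"{tensor}: {qtype}")
```

Only the `attention.wv` and `feed_forward.w2` tensors ever receive Q6_K here; every other tensor falls through to Q4_K, which mirrors the "else Q4_K" clause of the message.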

1c9b3a7
verified

datatab committed on Feb 29