Q4_K_M for Ollama

by shadovv76 - opened

Hello! Could you show me how to quantize your tune into Ollama's format? An app or a GitHub link?
Ollama already has Q4_K_M builds of the base Llama 3.2 Vision models in its own library.
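
For reference, Ollama's standard quantization flow is a Modelfile pointing at a local safetensors directory, followed by ollama create with --quantize; the path and model name below are placeholders:

# Modelfile (directory path is a placeholder)
FROM ./Llama-3.2-90B-Vision-Instruct-abliterated

ollama create mymodel -f Modelfile --quantize q4_K_M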

I will try to provide a quantized version later.

Thank you, I was blind. Is this approach compatible with mllama? I will try it.

Yes, the Ollama inference module supports Llama-3.2-90B-Vision-Instruct, but the conversion module does not.
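
In practice this means the prequantized base model can be pulled straight from the Ollama library (the tag below is the library's own Q4_K_M 90B build), while a custom safetensors tune of this architecture cannot be converted locally:

ollama pull llama3.2-vision:90b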

ollama create -f c:\Users\1\Downloads\Llama-3.2-90B-Vision-Instruct-abliterated\Modelfile --quantize q4_K_M mymodel
transferring model data 100%
converting model
Error: unsupported architecture
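
If a GGUF of this model already existed, the unsupported conversion step could be bypassed by importing the file directly in the Modelfile (the filename below is a placeholder):

# Modelfile
FROM ./llama-3.2-90b-vision-instruct-abliterated.gguf

ollama create mymodel -f Modelfile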

Converting from safetensors did not succeed. How can I get a GGUF from safetensors? llama.cpp does not support mllama.
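
For text-only models the usual route to a GGUF is llama.cpp's converter, sketched below with placeholder paths; at the time of this thread it rejects the mllama architecture, which is exactly the wall hit here:

python convert_hf_to_gguf.py ./Llama-3.2-90B-Vision-Instruct-abliterated --outfile model-f16.gguf --outtype f16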
