Q4_K_M for Ollama

by shadovv76 - opened

Hello! Could you show me how to quantize your tune into Ollama's format? An app or a GitHub link?
Ollama already has Q4_K_M builds of the base Llama 3.2 Vision models in its own library.
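
For reference, Ollama's standard quantization flow is a Modelfile pointing at a local safetensors directory, followed by ollama create with --quantize; the path and model name below are placeholders:

# Modelfile (directory path is a placeholder)
FROM ./Llama-3.2-90B-Vision-Instruct-abliterated

ollama create mymodel -f Modelfile --quantize q4_K_M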

I will try to provide a quantized version later.

Thank you, I was blind. Is this approach compatible with mllama? I will try it.

Yes, the Ollama inference module supports Llama-3.2-90B-Vision-Instruct, but the conversion module does not.
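
In practice this means the prequantized base model can be pulled straight from the Ollama library (the tag below is the library's own Q4_K_M 90B build), while a custom safetensors tune of this architecture cannot be converted locally:

ollama pull llama3.2-vision:90b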

ollama create -f c:\Users\1\Downloads\Llama-3.2-90B-Vision-Instruct-abliterated\Modelfile --quantize q4_K_M mymodel
transferring model data 100%
converting model
Error: unsupported architecture
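
If a GGUF of this model already existed, the unsupported conversion step could be bypassed by importing the file directly in the Modelfile (the filename below is a placeholder):

# Modelfile
FROM ./llama-3.2-90b-vision-instruct-abliterated.gguf

ollama create mymodel -f Modelfile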

Converting from safetensors did not succeed. How can I get a GGUF from safetensors? llama.cpp does not support mllama.
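
For text-only models the usual route to a GGUF is llama.cpp's converter, sketched below with placeholder paths; at the time of this thread it rejects the mllama architecture, which is exactly the wall hit here:

python convert_hf_to_gguf.py ./Llama-3.2-90B-Vision-Instruct-abliterated --outfile model-f16.gguf --outtype f16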
