I can't figure out how to run this model

#1 opened by simonw

When I run this example from the README:

python -m mlx_vlm.generate \
  --model mlx-community/Llama-3.2-11B-Vision-Instruct-4bit \
  --max-tokens 100 --temp 0.0

This happens:

Image: ['http://images.cocodataset.org/val2017/000000039769.jpg']
Prompt: <|begin_of_text|><|start_header_id|>user<|end_header_id|>

What are these?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>

The image shows two cats lying on a couch, with two remote controls placed nearby. The cats are positioned on a pink blanket, which covers the couch. The larger cat...

Since I haven't prompted the model yet, I don't know why it is doing this.

The script defaults to Qwen2-VL for the model, and to the image you pasted, when you don't pass them explicitly. Try running python -m mlx_vlm.generate --help to see the CLI options!
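
For what it's worth, passing the prompt and image explicitly looks like this (a sketch assuming the --prompt and --image flags that --help lists in recent mlx-vlm releases):

python -m mlx_vlm.generate \
  --model mlx-community/Llama-3.2-11B-Vision-Instruct-4bit \
  --prompt "What are these?" \
  --image http://images.cocodataset.org/val2017/000000039769.jpg \
  --max-tokens 100 --temp 0.0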

Right, I ran that one fine. I'm trying to figure out how to run mlx-community/Llama-3.2-11B-Vision-Instruct-4bit.
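
The Python API is the other route to running this particular model. Here is a minimal sketch following the pattern in the mlx-vlm README; note that the generate() argument order has shifted between mlx-vlm versions, so check it against the release you have installed:

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Llama-3.2-11B-Vision-Instruct-4bit"
model, processor = load(model_path)
config = load_config(model_path)

image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "What are these?"

# Wrap the raw prompt in the model's chat template
formatted_prompt = apply_chat_template(processor, config, prompt, num_images=len(image))

# Assumption: prompt before image, as in recent mlx-vlm versions (older releases reversed the order)
output = generate(model, processor, formatted_prompt, image, verbose=False)
print(output)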
