I can't figure out how to run this model

#1 opened by simonw

When I run this example from the README:

python -m mlx_vlm.generate \
  --model mlx-community/Llama-3.2-11B-Vision-Instruct-4bit \
  --max-tokens 100 --temp 0.0

This happens:

Image: ['http://images.cocodataset.org/val2017/000000039769.jpg']
Prompt: <|begin_of_text|><|start_header_id|>user<|end_header_id|>

What are these?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>

The image shows two cats lying on a couch, with two remote controls placed nearby. The cats are positioned on a pink blanket, which covers the couch. The larger cat...

Since I haven't prompted the model yet, I don't know why it is doing this.

The script defaults to Qwen2-VL for the model, and to the image you pasted, when you don't pass them explicitly. Try running python -m mlx_vlm.generate --help to see the CLI options!
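
For what it's worth, passing the prompt and image explicitly looks like this (a sketch assuming the --prompt and --image flags that --help lists in recent mlx-vlm releases):

python -m mlx_vlm.generate \
  --model mlx-community/Llama-3.2-11B-Vision-Instruct-4bit \
  --prompt "What are these?" \
  --image http://images.cocodataset.org/val2017/000000039769.jpg \
  --max-tokens 100 --temp 0.0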

Right, I ran that one fine. I'm trying to figure out how to run mlx-community/Llama-3.2-11B-Vision-Instruct-4bit.
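
The Python API is the other route to running this particular model. Here is a minimal sketch following the pattern in the mlx-vlm README; note that the generate() argument order has shifted between mlx-vlm versions, so check it against the release you have installed:

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Llama-3.2-11B-Vision-Instruct-4bit"
model, processor = load(model_path)
config = load_config(model_path)

image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "What are these?"

# Wrap the raw prompt in the model's chat template
formatted_prompt = apply_chat_template(processor, config, prompt, num_images=len(image))

# Assumption: prompt before image, as in recent mlx-vlm versions (older releases reversed the order)
output = generate(model, processor, formatted_prompt, image, verbose=False)
print(output)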
