Phi-3.5-vision-instruct-q4f16_1-MLC

This is the Phi-3.5-vision-instruct model in MLC format, quantized as q4f16_1. The weights are identical to mlc-ai/Phi-3.5-vision-instruct-q4f16_1-MLC. The accompanying web-llm model library (model_lib) is compiled from a fork in which num_crops defaults to 4, matching Microsoft's preprocessor_config.json, rather than the 16 baked into mlc-ai's upstream build. This reduces per-image compute by roughly 3-5× and produces a stable 757 image-embed tokens for typical photo aspect ratios. That count matches the upstream HuggingFace token-count formula, (h/336 * w/336 + 1)*144 + 1 + (h/336 + 1)*12: with h = w = 672 (a 2×2 grid of 336-pixel crops), it gives (2*2 + 1)*144 + 1 + (2 + 1)*12 = 757.
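As a quick check of the token-count formula quoted above, the arithmetic can be sketched in a few lines of Python. The function name image_embed_tokens and the constant CROP_SIZE are illustrative only, and the sketch assumes h and w are the image dimensions after preprocessing, so both are multiples of 336. It does not reproduce the preprocessor's resizing logic.

```python
CROP_SIZE = 336  # side length of one crop, taken from the formula's divisor


def image_embed_tokens(h: int, w: int) -> int:
    """Evaluate (h/336 * w/336 + 1)*144 + 1 + (h/336 + 1)*12.

    Assumes h and w are the preprocessed (padded) image dimensions,
    i.e. positive multiples of 336.
    """
    if h <= 0 or w <= 0 or h % CROP_SIZE or w % CROP_SIZE:
        raise ValueError("h and w must be positive multiples of 336")
    rows = h // CROP_SIZE  # number of crops stacked vertically
    cols = w // CROP_SIZE  # number of crops side by side
    return (rows * cols + 1) * 144 + 1 + (rows + 1) * 12


# A 2x2 crop grid, which fits within a budget of 4 crops
print(image_embed_tokens(672, 672))    # 757
# A 3x4 crop grid, which only fits within a larger crop budget
print(image_embed_tokens(1008, 1344))  # 1921
```

The second call shows how the count grows once a larger crop budget allows a bigger grid; which grid a given image actually maps to depends on the preprocessor and is not modeled here.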

The model can be used with MLC-LLM and WebLLM.

Example Usage

Install MLC LLM following the installation documentation.

Chat

mlc_llm chat HF://dixieclick/Phi-3.5-vision-instruct-q4f16_1-MLC

REST Server

mlc_llm serve HF://dixieclick/Phi-3.5-vision-instruct-q4f16_1-MLC
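Once the server is running, it can be queried over HTTP. The following is a minimal sketch, not an official client: it assumes the server listens on http://127.0.0.1:8000 and exposes an OpenAI-style /v1/chat/completions endpoint, and it mirrors the message layout used in the Python API example in this card. The helper names build_payload and post_chat are hypothetical.

```python
import json
import urllib.request

MODEL = "HF://dixieclick/Phi-3.5-vision-instruct-q4f16_1-MLC"


def build_payload(prompt: str, image_url: str, model: str = MODEL) -> dict:
    """Build a non-streaming chat request with one image and one text part."""
    return {
        "model": model,
        "stream": False,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": image_url},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }


def post_chat(payload: dict, base_url: str = "http://127.0.0.1:8000") -> str:
    """POST the payload and return the assistant's reply text.

    The base URL and endpoint path are assumptions; adjust them to match
    how the server was started.
    """
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, a call such as post_chat(build_payload("Describe this image please.", "https://www.ilankelman.org/stopsigns/australia.jpg")) should return the model's description as a string.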

Python API

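from mlc_llm import MLCEngine

# Create the engine; the HF:// prefix fetches the model from Hugging Face
model = "HF://dixieclick/Phi-3.5-vision-instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# Send one user message with an image part and a text part,
# and stream the reply through the OpenAI-style chat completions API
for response in engine.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": "https://www.ilankelman.org/stopsigns/australia.jpg",
                },
                {"type": "text", "text": "Describe this image please."},
            ],
        },
    ],
    model=model,
    stream=True,
):
    # Print each streamed chunk as it arrives
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

# Shut down the engine and release its resources
engine.terminate()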

Documentation

For more information on MLC LLM please visit the documentation and GitHub repo.

Base model: microsoft/Phi-3.5-vision-instruct (this model is a quantized derivative)