This is a torchao W4A8 quantized version of apple/FastVLM-1.5B (4-bit weights, 8-bit activations).
from transformers import AutoProcessor, AutoModelForImageTextToText
import torch

model = AutoModelForImageTextToText.from_pretrained(
    "{REPO_ID}",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained("{REPO_ID}", trust_remote_code=True)
Replace {REPO_ID} with the repo ID of this model.
See apple/FastVLM-1.5B for the original FP16 model.
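To give a feel for what the "W4" in W4A8 means, here is a minimal sketch of symmetric 4-bit weight quantization in plain Python. It is an illustration only: the function names `quantize_int4` and `dequantize` are hypothetical, the sample weights are made up, and torchao's actual scheme (group sizes, zero points, 8-bit activation handling) is not shown and may differ.

```python
def quantize_int4(weights):
    """Map floats to signed 4-bit integers in [-8, 7] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Assumption: symmetric scheme, so the largest magnitude maps to +/-7.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return [v * scale for v in q]


# Made-up example weights, not taken from the model.
weights = [0.7, -0.3, 0.12, 0.0]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)

print(q)         # integer codes, each storable in 4 bits
print(restored)  # approximate reconstruction of the original weights
```

Each weight is stored as a 4-bit integer plus a shared scale, which is where the size reduction relative to the FP16 model comes from. The cost is a rounding error of at most half a scale step per weight.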