How to set up this model

#1
by aswad546 - opened

from unsloth import FastVisionModel  # FastLanguageModel for LLMs
import torch

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-90B-Vision-Instruct-bnb-4bit",
    load_in_4bit = True,  # Use 4bit to reduce memory use. False for 16bit LoRA.
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for long context
)

I ran this code from your Colab demo for the 4-bit 90B model, but apparently it can't find the repository:
FileNotFoundError: unsloth/llama-3.2-90b-vision-instruct-unsloth-bnb-4bit/*.json (repository not found)
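
One thing I noticed: the repo id I pass in and the id in the traceback don't match (the error shows a lowercased name with an extra "-unsloth-" segment), which makes me suspect unsloth is remapping the name internally. A quick check along these lines (assuming a recent huggingface_hub, which has HfApi.repo_exists) should confirm which repo actually exists on the Hub:

from huggingface_hub import HfApi

api = HfApi()
# Compare the id passed to from_pretrained with the id from the traceback;
# they differ in casing and in the extra "-unsloth-" segment.
for repo_id in (
    "unsloth/Llama-3.2-90B-Vision-Instruct-bnb-4bit",
    "unsloth/llama-3.2-90b-vision-instruct-unsloth-bnb-4bit",
):
    print(repo_id, "exists:", api.repo_exists(repo_id))

If only the first id exists, upgrading unsloth (pip install --upgrade unsloth) may fix the lookup.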

Any tips would be appreciated. Thanks

Also, you mention in your blog posts that this makes fine-tuning easier, but is it possible to use the quantized 4-bit 90B model for inference out of the box? I ask because I cannot curate a dataset for my use case, and I want something better than the 11B Llama model at default precision.
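
For context, this is the kind of out-of-the-box inference I have in mind, following the pattern in Unsloth's vision notebooks (the image path and prompt below are placeholders of mine, not from the demo):

from unsloth import FastVisionModel
from PIL import Image

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-90B-Vision-Instruct-bnb-4bit",
    load_in_4bit = True,
)
FastVisionModel.for_inference(model)  # switch the model into inference mode

image = Image.open("example.jpg")  # placeholder image
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},  # placeholder prompt
    ]},
]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens = False,
    return_tensors = "pt",
).to("cuda")

output = model.generate(**inputs, max_new_tokens = 128, use_cache = True)
print(tokenizer.decode(output[0], skip_special_tokens = True))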
