How to set up this model
from unsloth import FastVisionModel  # FastLanguageModel for LLMs
import torch

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-90B-Vision-Instruct-bnb-4bit",
    load_in_4bit = True,  # Use 4bit to reduce memory use. False for 16bit LoRA.
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for long context
)
I ran this code from your Colab demo for the 90B 4-bit model, but apparently it can't find the repository:
FileNotFoundError: unsloth/llama-3.2-90b-vision-instruct-unsloth-bnb-4bit/*.json (repository not found)
Any tips would be appreciated. Thanks!
Also, you mention in your blog posts that this makes fine-tuning easier, but is it possible to use the quantized 4-bit 90B model for inference out of the box? I'm asking because I can't curate a dataset for my use case, and I want something better than the Llama 11B model at default precision.
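
For reference, this is roughly what I would run for inference if it does work out of the box. It's a minimal sketch adapted from the inference cell of the vision notebook; the image file and prompt text are placeholders for my use case:

from PIL import Image
from transformers import TextStreamer

FastVisionModel.for_inference(model)  # switch the model into inference mode

image = Image.open("example.jpg")  # placeholder image for my use case
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},  # placeholder prompt
    ]},
]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(image, input_text, add_special_tokens=False, return_tensors="pt").to("cuda")

streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=128, use_cache=True)

If that pattern is wrong for pure inference (i.e. without the LoRA/fine-tuning setup first), please let me know what the intended flow is.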