---
tags:
- autotrain
- text-generation-inference
- image-text-to-text
- text-generation
- peft
library_name: transformers
base_model: google/paligemma-3b-pt-224
license: other
datasets:
- abhishek/vqa_small
---

# Model Trained Using AutoTrain

This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

# Usage

```python
# You will need to adjust this code if you didn't use PEFT.
from PIL import Image
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor
import torch
import requests
from peft import PeftModel

# Placeholders: replace with model ID strings,
# e.g. "google/paligemma-3b-pt-224" (the base_model listed in the metadata).
base_model_id = BASE_MODEL_ID
peft_model_id = THIS_MODEL_ID
max_new_tokens = 100
text = "What's on the flower?"
img_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/bee.JPG?download=true"
image = Image.open(requests.get(img_url, stream=True).raw)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the base model and processor, then attach the PEFT adapter.
base_model = PaliGemmaForConditionalGeneration.from_pretrained(base_model_id)
processor = PaliGemmaProcessor.from_pretrained(base_model_id)

model = PeftModel.from_pretrained(base_model, peft_model_id)
# merge_and_unload() returns the base model with the adapter weights merged in,
# so keep the returned object.
model = model.merge_and_unload()

model = model.eval().to(device)

inputs = processor(text=text, images=image, return_tensors="pt").to(device)
with torch.inference_mode():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding
    )
result = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(result)
```
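
With the usage code above, the decoded `result` typically contains the prompt text followed by the answer, because `generate` returns the prompt tokens and the newly generated tokens in one sequence. The following is a minimal sketch of trimming the prompt before decoding, assuming that layout holds for your setup. `strip_prompt` is a hypothetical helper name, not part of `transformers` or `peft`, and the demo uses plain lists in place of real token IDs.

```python
def strip_prompt(generated_ids, input_len):
    """Keep only the tokens that follow the first `input_len` positions.

    Hypothetical helper: works on a list of lists and, by the same slicing,
    on a 2-D tensor of shape (batch, sequence_length).
    """
    return [seq[input_len:] for seq in generated_ids]


# Stand-in data: two sequences whose first three IDs play the role of the prompt.
demo_ids = [[1, 2, 3, 40, 50], [1, 2, 3, 60, 70]]
new_ids = strip_prompt(demo_ids, input_len=3)
print(new_ids)

# With the objects from the usage code, this would look like (assumption, untested here):
# input_len = inputs["input_ids"].shape[-1]
# new_ids = strip_prompt(generated_ids, input_len)
# result = processor.batch_decode(new_ids, skip_special_tokens=True)
```

Slicing by the input length rather than searching for the prompt string avoids mismatches when the tokenizer normalizes whitespace or punctuation in the decoded text.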