---
tags:
- autotrain
- text-generation-inference
- image-text-to-text
- text-generation
- peft
library_name: transformers
base_model: google/paligemma-3b-pt-224
license: other
datasets:
- abhishek/vqa_small
---
# Model Trained Using AutoTrain
This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).
# Usage
```python
# note: you will need to adjust this code if you didn't train with PEFT
from PIL import Image
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor
import torch
import requests
from peft import PeftModel
base_model_id = "google/paligemma-3b-pt-224"  # base model from the card metadata
peft_model_id = "THIS_MODEL_ID"  # replace with this repository's model id
max_new_tokens = 100
text = "Whats on the flower?"
img_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/bee.JPG?download=true"
image = Image.open(requests.get(img_url, stream=True).raw)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
base_model = PaliGemmaForConditionalGeneration.from_pretrained(base_model_id)
processor = PaliGemmaProcessor.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, peft_model_id)
model = model.merge_and_unload()  # merge the LoRA weights into the base model for faster inference
model = model.eval().to(device)
inputs = processor(text=text, images=image, return_tensors="pt").to(device)
with torch.inference_mode():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
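# note: the generated ids include the prompt tokens; slice them off
# (generated_ids[:, inputs["input_ids"].shape[1]:]) if you only want the answer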
result = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(result)
```
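If you fine-tuned the full model rather than a PEFT adapter, or have already merged and saved the adapter weights, you can skip the `PeftModel` step and load the checkpoint directly. A minimal sketch, where `MERGED_MODEL_ID` is a hypothetical placeholder for wherever the full weights live:
```python
from PIL import Image
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor
import torch
import requests

merged_model_id = "MERGED_MODEL_ID"  # placeholder: repo id or local path holding full weights
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# load the full (already merged or fully fine-tuned) model; no PEFT wrapper needed
model = PaliGemmaForConditionalGeneration.from_pretrained(merged_model_id).eval().to(device)
processor = PaliGemmaProcessor.from_pretrained(merged_model_id)

img_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/bee.JPG?download=true"
image = Image.open(requests.get(img_url, stream=True).raw)

inputs = processor(text="What's on the flower?", images=image, return_tensors="pt").to(device)
with torch.inference_mode():
    generated_ids = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```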