- this model repo is sharded so it can be easily loaded on low-RAM Colab runtimes :)
- Refer to the original model card for more details about the model description, intended uses, and limitations, as well as instructions for how to use the model on CPU and GPU in different precisions.
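To illustrate why sharding helps on low-RAM runtimes, here is a minimal sketch of the general idea: the checkpoint is split into several smaller files so the loader can read one at a time instead of holding one large file in memory. This is a simplified greedy planner written for illustration, not the transformers implementation and not necessarily how this repo was produced. The function name `plan_shards`, the tensor names, and the sizes are all hypothetical. In transformers itself, sharded checkpoints are usually written with the `max_shard_size` argument of `save_pretrained`.

```python
# Illustrative only: a simplified greedy shard planner,
# NOT the actual transformers sharding code.
def plan_shards(tensor_sizes, max_shard_bytes):
    """Group tensors (name -> size in bytes) into shards of at most max_shard_bytes.

    A single tensor larger than the limit ends up in a shard of its own.
    """
    shards = []
    current = {}
    current_bytes = 0
    for name, size in tensor_sizes.items():
        # Close the current shard if adding this tensor would exceed the limit.
        if current and current_bytes + size > max_shard_bytes:
            shards.append(current)
            current, current_bytes = {}, 0
        current[name] = size
        current_bytes += size
    if current:
        shards.append(current)
    return shards


# Hypothetical tensor names and sizes, chosen only for illustration.
sizes = {"embed": 400, "layer.0": 300, "layer.1": 300, "head": 500}
shards = plan_shards(sizes, max_shard_bytes=700)
for i, shard in enumerate(shards, start=1):
    print(f"shard {i}/{len(shards)}: {list(shard)} ({sum(shard.values())} bytes)")
```

With a smaller shard limit, the largest single file that has to be read at once shrinks, which is what makes loading feasible when RAM is tight.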
Here is how you can use it on CPU (see also this blog post):
Requires the current main of transformers (at time of writing):
pip install accelerate git+https://github.com/huggingface/transformers.git -U -q
Use (this is for CPU; check out the original model card/blog for GPU and other precisions):
import requests
from PIL import Image
from transformers import BlipProcessor, Blip2ForConditionalGeneration

model_name = "ethzanalytics/blip2-flan-t5-xl-sharded"
processor = BlipProcessor.from_pretrained(model_name)
model = Blip2ForConditionalGeneration.from_pretrained(model_name)

# Download the demo image and convert it to RGB
img_url = 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'
raw_image = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')

question = "how many dogs are in the picture?"
inputs = processor(raw_image, question, return_tensors="pt")

# generate() returns a batch of token-ID sequences; decode the first one
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))