This is a fine-tuned version of BLIP for visual question answering on images. This experimental model can be used to answer questions about product images in the retail industry. Product metadata enrichment and validation of human-generated product descriptions are some example use cases.
## Sample model predictions
| Image | Description |
|-------|-------------|
| <img src="https://cdn-uploads.huggingface.co/production/uploads/672d17c98e098bf429c83670/YcSs_CFcRj-Tb4woXlArC.png" width="100" height="100" /> | bush 's best white beans |
| <img src="https://cdn-uploads.huggingface.co/production/uploads/672d17c98e098bf429c83670/lTediQ7Zuez_CQQR7YIY0.png" width="100" height="100" /> | a bottle of milk sitting on a counter |
| <img src="https://cdn-uploads.huggingface.co/production/uploads/672d17c98e098bf429c83670/7r5oJ7BiSFkLt3nmT3RIv.jpeg" alt="image/jpeg" width="100" height="100" /> | a man in a suit walking across a crosswalk |

### How to use the model:
```python
import requests
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# load the fine-tuned processor and model from the Hugging Face Hub
processor = BlipProcessor.from_pretrained("quadranttechnologies/qhub-blip-image-captioning-finetuned")
model = BlipForConditionalGeneration.from_pretrained("quadranttechnologies/qhub-blip-image-captioning-finetuned")

# download a demo image
img_url = 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg'
raw_image = Image.open(requests.get(img_url, stream=True).raw).convert('RGB')

# conditional image captioning: the text prompt is used as a prefix that the model completes
text = "a photography of"
inputs = processor(raw_image, text, return_tensors="pt")

out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))

# unconditional image captioning: generate a caption from the image alone
inputs = processor(raw_image, return_tensors="pt")

out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```

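The description-validation use case mentioned above can be built on the same captioning call. The sketch below is a minimal illustration, not part of this model's official API: the `validate_description` helper is hypothetical, and the `difflib` string-similarity ratio and 0.5 threshold are arbitrary assumed choices.

```python
import difflib

import requests
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("quadranttechnologies/qhub-blip-image-captioning-finetuned")
model = BlipForConditionalGeneration.from_pretrained("quadranttechnologies/qhub-blip-image-captioning-finetuned")

def validate_description(image_url, human_description, threshold=0.5):
    # caption the product image with the fine-tuned model
    raw_image = Image.open(requests.get(image_url, stream=True).raw).convert('RGB')
    inputs = processor(raw_image, return_tensors="pt")
    caption = processor.decode(model.generate(**inputs)[0], skip_special_tokens=True)
    # crude agreement check between the generated caption and the human text;
    # the difflib ratio and the 0.5 threshold are illustrative assumptions
    similarity = difflib.SequenceMatcher(None, caption.lower(), human_description.lower()).ratio()
    return similarity >= threshold
```

In practice a semantic similarity measure (for example, sentence embeddings) would likely be a better fit than raw string overlap; the snippet only shows where the model's caption plugs into such a pipeline.
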
## BibTeX and citation info